
Comparing machine learning and knowledge discovery in databases an application to knowledge.pdf

Published: 2017-04-09, approx. 62,500 words, 20 pages
Comparing Machine Learning and Knowledge Discovery in DataBases: An Application to Knowledge Discovery in Texts

Yves Kodratoff
CNRS, LRI Bat. 490 Univ. Paris-Sud, F-91405 Orsay Cedex
yk@lri.fr

Text associated with a course delivered at the ECCAI summer course, Crete, July 1999. To be published by Springer-Verlag in the Lecture Notes on AI (LNAI) - Tutorials series, 2000.

SUMMARY: This presentation has two goals. The first goal is to compare Machine Learning (ML) and Knowledge Discovery in Data (KDD, also often called Data Mining, DM) in order to emphasize how much they actually differ. To make my ideas somewhat easier to understand, and as an illustration, I will include a description of several research topics that I find relevant to KDD and to KDD only. The second goal is to show that the definition I give of KDD can be applied almost directly to text analysis, which will lead us to a very restrictive definition of Knowledge Discovery in Texts (KDT). I will provide a compelling example of a real-life set of rules obtained by what I call KDT techniques.

1. INTRODUCTION

KDD is better known by the oversimplified name of Data Mining (DM). Actually, most academics are mainly interested in DM, which develops methods for extracting knowledge from a given set of data. Industrialists and experts should be more interested in KDD, which comprises the whole process of data selection, data cleaning, transfer to a DM technique, application of the DM technique, validation of its results, and finally their interpretation for the user. In general, this process is a cycle that improves under the criticism of the expert.

Machine Learning (ML) and KDD have a very strong link in common: they both acknowledge the importance of induction as a normal way of thinking, while other scientific fields are reluctant to accept it, to say the least. We shall first explore this common point. We believe that this reluctance relies on a misuse of apparent contradictions inside the theory of confi