Data Mining Foreign Literature Translation (Chinese and English).doc

Published: 2018-04-18, about 13,900 characters, 14 pages
Data Mining Foreign Literature Translation (including the English original and the Chinese translation)

English original

What is Data Mining?

Simply stated, data mining refers to extracting or “mining” knowledge from large amounts of data. The term is actually a misnomer. Remember that the mining of gold from rocks or sand is referred to as gold mining rather than rock or sand mining. Thus, “data mining” should have been more appropriately named “knowledge mining from data”, which is unfortunately somewhat long. “Knowledge mining”, a shorter term, may not reflect the emphasis on mining from large amounts of data. Nevertheless, mining is a vivid term characterizing the process that finds a small set of precious nuggets from a great deal of raw material. Thus, such a misnomer which carries both “data” and “mining” became a popular choice.

There are many other terms carrying a similar or slightly different meaning to data mining, such as knowledge mining from databases, knowledge extraction, data/pattern analysis, data archaeology, and data dredging.

Many people treat data mining as a synonym for another popularly used term, “Knowledge Discovery in Databases”, or KDD. Alternatively, others view data mining as simply an essential step in the process of knowledge discovery in databases. Knowledge discovery consists of an iterative sequence of the following steps:

· data cleaning: to remove noise or irrelevant data,
· data integration: where multiple data sources may be combined,
· data selection: where data relevant to the analysis task are retrieved from the database,
· data transformation: where data are transformed or consolidated into forms appropriate for mining by performing summary or aggregation operations, for instance,
· data mining: an essential process where intelligent methods are applied in order to extract data patterns,
· pattern evaluation: to identify the truly interesting patterns representing knowledge based on some interestingness measures, and
· knowledge presentation: where visualization and knowledge representation techniques are
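
The knowledge discovery steps listed above can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration and does not come from the original text: the toy purchase records, the function names, the choice of item-pair counting as the "intelligent method", and the 0.5 support threshold as the "interestingness measure". It is one possible reading of the sequence, not a definitive implementation.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical toy data: (customer_id, item, store) records from two sources.
SOURCE_A = [
    ("c1", "bread", "north"), ("c1", "milk", "north"),
    ("c2", "bread", "north"), ("c2", "milk", "north"),
    ("c3", "bread", "north"), (None, "milk", "north"),  # noisy record
]
SOURCE_B = [
    ("c3", "eggs", "south"), ("c4", "bread", "south"),
    ("c4", "milk", "south"), ("c4", "", "south"),       # noisy record
]

def clean(records):
    """Data cleaning: drop records with a missing customer or item."""
    return [r for r in records if r[0] and r[1]]

def integrate(*sources):
    """Data integration: combine multiple sources into one list."""
    return [r for source in sources for r in source]

def select(records):
    """Data selection: keep only the fields relevant to the task."""
    return [(customer, item) for customer, item, _store in records]

def transform(pairs):
    """Data transformation: aggregate purchases into one basket per customer."""
    baskets = defaultdict(set)
    for customer, item in pairs:
        baskets[customer].add(item)
    return dict(baskets)

def mine(baskets):
    """Data mining: count how many baskets contain each pair of items."""
    counts = Counter()
    for items in baskets.values():
        counts.update(combinations(sorted(items), 2))
    return counts

def evaluate(pair_counts, n_baskets, min_support=0.5):
    """Pattern evaluation: keep pairs whose support meets the threshold."""
    return {
        pair: count / n_baskets
        for pair, count in pair_counts.items()
        if count / n_baskets >= min_support
    }

def present(patterns):
    """Knowledge presentation: render patterns as readable lines."""
    return [f"{a} + {b}: support {s:.2f}" for (a, b), s in sorted(patterns.items())]

def run_pipeline():
    records = integrate(clean(SOURCE_A), clean(SOURCE_B))
    baskets = transform(select(records))
    patterns = evaluate(mine(baskets), len(baskets))
    return present(patterns)

if __name__ == "__main__":
    for line in run_pipeline():
        print(line)
```

In practice the sequence is iterative, as the text notes: the output of pattern evaluation would typically feed back into earlier steps (for example, a different selection or transformation) rather than run once from top to bottom as it does here.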