
chapter2.ppt.Convertor.doc

Published: 2017-02-12; about 9,250 characters; 9 pages
Modern Information Retrieval
Chapter 2: Modeling

Review of Chapter 1
Important parts: Sections 1.2 and 1.4; Figures 1.1, 1.2, and 1.3

Chapter 2: Modeling

Teaching objectives and requirements:
1. Students should understand trends and research issues.
2. Students should be familiar with a taxonomy of information retrieval models.
3. Students should master the retrieval models.

Teaching content:
1. A Taxonomy of Information Retrieval Models
2. Retrieval: Ad Hoc and Filtering
3. A Formal Characterization of IR Models
4. Classic Information Retrieval
5. Alternative Set Theoretic Models
6. Alternative Algebraic Models
7. Alternative Probabilistic Models
8. Structured Text Retrieval Models
9. Models for Browsing

Key and difficult points:
- Classical Information Retrieval Models
- The Hypertext Model

2.1 Introduction

Traditional IR systems usually adopt index terms to index and retrieve documents.

Index term:
- a keyword or group of selected words
- any word (more general)

Oversimplification!

[Figure: labels Docs, Information Need, Index Terms, doc, query, match, Ranking]

One central problem regarding information retrieval systems is the issue of predicting which documents are relevant and which are not. Such a decision usually depends on a ranking algorithm, which attempts to establish a simple ordering of the documents retrieved. Ranking algorithms are at the core of information retrieval systems.

A ranking algorithm operates according to basic premises regarding the notion of document relevance. Distinct sets of premises yield distinct information retrieval models. The IR model adopted determines the predictions of what is relevant and what is not. In short: different premises yield different IR models, which determine different ranking algorithms, that is, which documents are judged relevant.

Information Retrieval Models

An information retrieval model is a framework and method for representing queries and documents and then computing the similarity between them.
- In essence, it is a model of relevance.
- The information retrieval model is one of the core topics in IR.

Related Concepts

Index term
- A document is represented as a set of terms.
- Terms are usually words, but other linguistic units can also be used.
- A term can be regarded as a keyword.

Index term weight
- Different index terms do not play the same role.
- Weights are used to distinguish them.

2.2 A Taxonomy of Information Retrieval Models

Three classic models (Boolean, vector, and probabilistic) set
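
The notions introduced in Section 2.1 (documents and queries represented as index terms, a weight per term, and a ranking produced by matching) can be illustrated with a minimal Python sketch. The choices here are assumptions made for illustration and do not come from the slides: whitespace tokenization, raw term frequency as the weight, and a dot product as the similarity. The function names `index_terms`, `similarity`, and `rank` are hypothetical.

```python
from collections import Counter


def index_terms(text):
    """Represent a text as weighted index terms (term -> raw frequency).

    Assumption: terms are lowercase whitespace-separated words.
    """
    return Counter(text.lower().split())


def similarity(query_terms, doc_terms):
    """Dot product of the two weight vectors (an assumed similarity measure)."""
    # Counter returns 0 for terms the document does not contain.
    return sum(weight * doc_terms[term] for term, weight in query_terms.items())


def rank(query, docs):
    """Return (doc_id, score) pairs with score > 0, ordered best first."""
    q = index_terms(query)
    scored = [(doc_id, similarity(q, index_terms(text)))
              for doc_id, text in docs.items()]
    # Sort by descending score; break ties by document id for determinism.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return [pair for pair in scored if pair[1] > 0]


docs = {
    "d1": "information retrieval models rank documents",
    "d2": "boolean model of retrieval",
    "d3": "cooking recipes",
}
ranking = rank("retrieval models", docs)
print(ranking)  # [('d1', 2), ('d2', 1)]
```

Changing the weighting scheme or the similarity function changes the ordering, which mirrors the point that distinct premises yield distinct IR models.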