Lecture 3 retrieval model-I 2011.ppt

Published: 2017-03-25, about 14,100 characters, 50 pages
7/7/2005 Lecture 1 Overview

Lecture 3: Retrieval Models, Part 1
  Reference: James Allan, University of Massachusetts Amherst; Pandu Nayak and Prabhakar Raghavan, Stanford University
  Partially modified by Qingcai Chen, HIT Shenzhen

What is a retrieval model?
  Retrieval models (检索模型) can describe the computational process of IR
    e.g. how documents are ranked
    Note that how documents or indexes (索引) are stored is an implementation matter
  Retrieval models can attempt to describe the human process
    e.g. the information need, interaction
  Retrieval variables
    queries (查询), documents (文档), terms (术语), relevance judgments (相关性判别), users, information needs, …
  Retrieval models have an explicit or implicit definition of relevance (相关度)

Models we'll consider
  Boolean (布尔模型) (exact match)
  Statistical language models (统计语言模型)
  Vector space (向量空间)
  Latent Semantic Indexing (潜层语义分析)
  Inference network
  Classic probabilistic approaches (经典概率模型)
  Other models exist
    Topological
    Generalized vector space
    Logic-based

Exact vs. Best Match
  Exact-match (精确匹配)
    Example: "哈工大" ≠ "哈尔滨工业大学", "哈工程" ≠ "哈尔滨工业大学"
    Query specifies precise retrieval criteria
    Every document either matches or fails to match the query
    Result is a set of documents
      Unordered in pure exact match
  Best-match (最佳匹配)
    Example: "哈工大" ≈ "哈尔滨工业大学" with similarity 80%, "哈工程" ≈ "哈尔滨工业大学" with similarity 50%
    Query describes a good or "best" matching document
    Every document matches the query to some degree
    Result is a ranked list of documents
  Popular approaches often provide some of each
    E.g., some type of ranking of the result set
    E.g., a best-match query language that incorporates exact-match operators

(Unranked) Boolean retrieval
  The Boolean model is the most common exact-match model
    Queries are logic expressions with document features as operands
    In the pure Boolean model, retrieved documents are not ranked
    Most implementations provide some sort of ranking
    Query formulation is difficult for novice users (新用户)

Boolean queries (布尔查询)
  Used by Boolean model and in o
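The contrast between exact match and best match drawn in the slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy collection `DOCS`, the function names, and the overlap-fraction score used for ranking are all hypothetical and do not come from the lecture, which does not say how similarity is computed.

```python
# Hypothetical toy collection; ids and texts are illustrative only.
DOCS = {
    "d1": "harbin institute of technology shenzhen",
    "d2": "harbin engineering university",
    "d3": "information retrieval models lecture",
}


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for term in set(tokenize(text)):
            index.setdefault(term, set()).add(doc_id)
    return index


def boolean_and(index, terms):
    """Exact match: documents containing every term, as an unordered set."""
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings) if postings else set()


def boolean_or(index, terms):
    """Exact match: documents containing at least one of the terms."""
    return set().union(*(index.get(term, set()) for term in terms))


def boolean_not(index, docs, term):
    """Exact match: documents that do not contain the term."""
    return set(docs) - index.get(term, set())


def best_match(docs, query):
    """Best match: score every document and return a ranked list.

    The score is the fraction of query terms found in the document,
    an assumed stand-in for a real similarity measure.
    """
    q_terms = set(tokenize(query))
    if not q_terms:
        return []
    scored = []
    for doc_id, text in docs.items():
        overlap = len(q_terms & set(tokenize(text)))
        scored.append((overlap / len(q_terms), doc_id))
    # Highest score first; ties broken by document id for a stable order.
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


if __name__ == "__main__":
    index = build_index(DOCS)
    print(boolean_and(index, ["harbin", "technology"]))
    print(best_match(DOCS, "harbin institute technology"))
```

The Boolean functions return sets, so a document is either in the result or not, while `best_match` gives every document a score and orders them, which mirrors the "set of documents" versus "ranked list" distinction above.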