
1. Introduction To search and summarize on Internet with Human Language Technology.pdf

Published: 2017-04-08, about 15,700 words, 7 pages
To search and summarize on Internet with Human Language Technology

Hercules DALIANIS
Department of Computer and System Sciences, KTH and Stockholm University, Forum 100, 164 40 Kista, Sweden
Email: hercules@kth.se

Abstract. More and more text is available on the Internet, and we need tools to tame this flow. Automatic text summarization is one solution: a text is given to the computer, which returns a shorter, non-redundant text. Automatic text summarization can also be used in search engines to reduce the time spent finding documents. To further improve search engines, one can use human language technology in the form of word analysis, such as stemming and spell checking. Other methods that can be used are multilingual or cross-language information retrieval, for searching and finding documents written in languages other than those one knows. To understand foreign languages, one can use machine translation techniques, which today have become good enough for practical use. Machine translation (MT) is the technique where the computer translates automatically between natural languages. MT techniques have been under development since the early 1950s.

1. Introduction

The rapid change of our environment, in the form of more and more information available on the Internet, has increased the speed of development of highly advanced tools to extract, filter, retrieve and translate documents. Three research areas are automatic text summarization, information retrieval tools and machine translation.

In automatic text summarization, the most relevant parts of a document are extracted and put together into a non-redundant summary that is shorter than the original document. A good overview of the area can be found in [1]. A more advanced form of summarization is multi-text summarization, where several documents are condensed into one summary.

2. Application areas of automatic text summarization

The application areas for automatic text summarization are extensive. As the amount of information on the