计算机研究与发展 (Journal of Computer Research and Development), ISSN 1000-1239, CN 11-1777/TP, 42(5): 844~848, 2005

Combined Multiple Classifiers Based on a Stacking Algorithm and Their Application to Chinese Text Chunking

Li Heng (李珩), Zhu Jingbo (朱靖波), and Yao Tianshun (姚天顺)
(Institute of Computer Software and Theory, Northeastern University, Shenyang 110004)
(syliheng@sina.com)

Abstract  In contrast to combined multiple classifiers based on a voting algorithm, a two-layer classifier-combination experimental framework is presented for Chinese text chunking, in which four diverse classifiers (transformation-based learning, sparse network of winnows, support vector machine, and memory-based learning) are combined with a stacking algorithm. The relevant contextual information is incorporated into the two-layer framework as input feature vectors to construct more complete contextual models. The chunking experiments are carried out on the HIT Chinese Treebank Corpus. Experimental results show that this is an effective approach, which achieves an F-score of 93.64.

Key words  stacking; multiple classifiers; text chunking

Abstract (Chinese version)  Compared with combined classifiers based on the Voting method, a multiple-classifier combination method based on the Stacking algorithm is proposed. A two-layer stacked framework combines four classifiers (fnTBL, SNoW, SVM, MBL), and all available contextual information is fused into the input feature vectors of the classifiers at each layer, which yields good results in Chinese chunk recognition. Experimental results show that the combined classifier improves both precision and recall, reaching F = 93.64 when tested on the Harbin Institute of Technology treebank corpus.

Key words (Chinese version)  stacking; multiple classifiers; text chunking

Chinese Library Classification number  TP391.1

... a long-standing research topic

1  Introduction
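As a rough illustration of the two-layer stacking scheme summarized in the abstract, the following minimal sketch trains several base classifiers, collects their out-of-fold predictions, and feeds those predictions together with the original context features to a second-layer classifier. It is not the authors' implementation: the toy FeatureLookup and MetaLookup classes stand in for the real learners (fnTBL, SNoW, SVM, MBL), only three stand-ins are used, and the feature layout (previous, current, and next part-of-speech tag), the chunk labels, and every name in the code are assumptions made for this example.

```python
from collections import Counter, defaultdict


class FeatureLookup:
    """Toy base classifier (hypothetical): most frequent label per feature value."""

    def __init__(self, index):
        self.index = index

    def fit(self, X, y):
        counts = defaultdict(Counter)
        for x, label in zip(X, y):
            counts[x[self.index]][label] += 1
        self.table = {v: c.most_common(1)[0][0] for v, c in counts.items()}
        self.default = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.table.get(x[self.index], self.default) for x in X]


class MetaLookup:
    """Toy second-layer classifier: exact lookup, backing off to a vote."""

    def __init__(self, n_base):
        self.n_base = n_base

    def fit(self, Z, y):
        counts = defaultdict(Counter)
        for z, label in zip(Z, y):
            counts[z][label] += 1
        self.table = {z: c.most_common(1)[0][0] for z, c in counts.items()}
        return self

    def predict(self, Z):
        out = []
        for z in Z:
            if z in self.table:
                out.append(self.table[z])
            else:
                # Unseen combination: majority vote over the base predictions.
                out.append(Counter(z[:self.n_base]).most_common(1)[0][0])
        return out


def out_of_fold_predictions(make_model, X, y, k=3):
    """Label every training item with a model that was not trained on it."""
    preds = [None] * len(X)
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        held = [i for i in range(len(X)) if i % k == fold]
        model = make_model().fit([X[i] for i in train], [y[i] for i in train])
        for i, p in zip(held, model.predict([X[i] for i in held])):
            preds[i] = p
    return preds


class Stacker:
    """Two-layer stacking: base predictions plus context feed the meta layer."""

    def __init__(self, base_factories, k=3):
        self.base_factories = base_factories
        self.k = k

    def _meta_features(self, X, base_preds):
        # Layer-2 input vector: base outputs followed by the original features.
        return [tuple(p[i] for p in base_preds) + tuple(x)
                for i, x in enumerate(X)]

    def fit(self, X, y):
        oof = [out_of_fold_predictions(f, X, y, self.k)
               for f in self.base_factories]
        self.bases = [f().fit(X, y) for f in self.base_factories]
        self.meta = MetaLookup(len(self.bases)).fit(self._meta_features(X, oof), y)
        return self

    def predict(self, X):
        base_preds = [b.predict(X) for b in self.bases]
        return self.meta.predict(self._meta_features(X, base_preds))


# Made-up toy data: (previous POS, current POS, next POS) -> chunk tag.
X = [("BOS", "n", "v"), ("n", "v", "n"), ("v", "n", "EOS"),
     ("BOS", "a", "n"), ("a", "n", "v"), ("n", "v", "EOS"),
     ("BOS", "n", "n"), ("n", "n", "v"), ("n", "v", "EOS")]
y = ["B-NP", "B-VP", "B-NP", "B-NP", "I-NP", "B-VP", "B-NP", "I-NP", "B-VP"]

factories = [lambda i=i: FeatureLookup(i) for i in range(3)]
model = Stacker(factories).fit(X, y)
predictions = model.predict(X)
```

Training the second layer on out-of-fold predictions, rather than on predictions the base classifiers make for their own training data, is the usual way to keep the meta classifier from simply trusting overfitted base outputs. A voting combiner, by contrast, would stop after the first layer and take the majority label.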