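Chinese Keyword Extraction Algorithm Based on High-Dimensional Clustering
Cite this article: GAO Xue-dong, WU Ling-yu. Chinese Keywords Extraction Algorithm Based on the High-dimensional Clustering Technique [J]. 中国会计电算化, 2011(9): 23-27.
Authors: GAO Xue-dong, WU Ling-yu
Affiliation: School of Economics and Management, University of Science and Technology Beijing, Beijing 100083
Abstract: Keyword extraction is both an active research topic and a hard problem in Chinese information processing, and methods based on statistical information are an important branch of it. To address the low accuracy of statistics-based keyword extraction, this paper proposes a Chinese keyword extraction algorithm based on high-dimensional clustering. The algorithm extracts keywords in four steps: fast word segmentation with a small dictionary, secondary segmentation, high-dimensional clustering, and keyword selection. Theoretical analysis and experiments indicate that the method is more stable, more efficient, and more accurate.

Keywords: keyword extraction; small-dictionary word segmentation; high-dimensional clustering; CABOSFV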

Chinese Keywords Extraction Algorithm Based on the High-dimensional Clustering Technique
Authors: GAO Xue-dong, WU Ling-yu
Institution: School of Economics and Management, University of Science and Technology Beijing, Beijing 100083, China
Abstract: Keyword extraction is an active and difficult topic in Chinese information processing, and methods based on statistical information form one important branch. To improve on the low accuracy of such methods, a Chinese keyword extraction algorithm based on high-dimensional clustering is proposed. The algorithm finds the keywords of an article in four steps: word segmentation with a small-volume dictionary, secondary word segmentation, high-dimensional clustering, and keyword selection. Theoretical analysis and experiments show that the algorithm has better stability, efficiency, and accuracy.
Keywords: Keywords Extraction; Word Segmentation with Small Volume Dictionary; High-Dimensional Clustering; CABOSFV
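
The abstract names CABOSFV as the high-dimensional clustering step but gives no formulas, so the following is only a minimal sketch of that step, not the authors' implementation. It assumes the definition usually given for CABOSFV: objects are binary (0/1) vectors, and the sparse feature dissimilarity of a set X is e / (|X| * a), where a is the number of attributes equal to 1 in every object and e is the number of attributes on which the objects disagree. The names `Cluster`, `one_pass_cluster`, and `threshold`, and the representation of each object as the set of its 1-valued attribute indices, are hypothetical choices made for illustration.

```python
from typing import List, Set


class Cluster:
    """Running summary of one cluster of binary (0/1) objects."""

    def __init__(self, index: int, attrs: Set[int]) -> None:
        self.members: List[int] = [index]
        self.all_ones: Set[int] = set(attrs)  # attributes that are 1 in every member
        self.any_ones: Set[int] = set(attrs)  # attributes that are 1 in some member

    def sfd_if_added(self, attrs: Set[int]) -> float:
        """Dissimilarity e / (n * a) of the cluster if `attrs` joined it (assumed definition)."""
        n = len(self.members) + 1
        a = len(self.all_ones & attrs)          # attributes still shared by everyone
        if a == 0:
            return float("inf")                 # nothing in common: never merge
        e = len(self.any_ones | attrs) - a      # attributes on which members disagree
        return e / (n * a)

    def add(self, index: int, attrs: Set[int]) -> None:
        self.members.append(index)
        self.all_ones &= attrs
        self.any_ones |= attrs


def one_pass_cluster(objects: List[Set[int]], threshold: float) -> List[List[int]]:
    """Scan the objects once; each joins its closest cluster or starts a new one."""
    clusters: List[Cluster] = []
    for index, attrs in enumerate(objects):
        best = min(clusters, key=lambda c: c.sfd_if_added(attrs), default=None)
        if best is not None and best.sfd_if_added(attrs) <= threshold:
            best.add(index, attrs)
        else:
            clusters.append(Cluster(index, attrs))
    return [c.members for c in clusters]


# Toy data: each set lists the attribute indices where an object has value 1.
objects = [{0, 1, 2}, {0, 1, 2, 3}, {7, 8}, {7, 8, 9}, {0, 1}]
print(one_pass_cluster(objects, threshold=0.5))  # [[0, 1, 4], [2, 3]]
```

Because each cluster is summarized by two attribute sets and a member list, the data is scanned only once and no pairwise distance matrix is built. How the paper maps candidate words to binary attributes is not stated in the abstract and is left open here.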
This article is indexed in databases including VIP (维普).