On Webpage Fingerprint Duplicate Detection Technology (网页指纹查重技术的研究)
Cite this article: ZHANG Xiao-wei. On Webpage Fingerprint Duplicate Detection Technology[J]. Value Engineering (价值工程), 2014(15): 225-226.
Author: ZHANG Xiao-wei (张晓伟)
Institution: Taishan Polytechnic (泰山职业技术学院)
Abstract: Webpage duplicate detection is a key technology for obtaining useful information while browsing. Traditional duplicate-detection methods decide whether two pages are duplicates by the frequency with which selected keywords occur in each page, so pages that merely share similar keywords may be misjudged as duplicates. This paper proposes a fingerprint technique based on features specific to the webpage itself and designs a new duplicate-detection algorithm: a page's fingerprint is compared against the fingerprints stored in a webpage feature library to carry out duplicate detection and improve its accuracy.

Keywords: webpage fingerprint  webpage duplicate detection  position vector
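
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea under stated assumptions, not the paper's method. It assumes a fingerprint built by hashing the token positions of chosen keywords (a guess prompted by the "position vector" keyword) and a feature library modeled as a plain set of fingerprints. The names `keyword_frequencies`, `position_fingerprint`, and `is_duplicate` are hypothetical.

```python
import hashlib
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a stand-in for real page text extraction."""
    return re.findall(r"\w+", text.lower())


def keyword_frequencies(text, keywords):
    """Traditional signal: how often each chosen keyword occurs."""
    counts = Counter(tokenize(text))
    return {k: counts[k] for k in keywords}


def position_fingerprint(text, keywords):
    """Assumed fingerprint: hash of each keyword's list of token positions."""
    tokens = tokenize(text)
    parts = []
    for k in sorted(keywords):
        positions = [str(i) for i, t in enumerate(tokens) if t == k]
        parts.append(k + ":" + ",".join(positions))
    return hashlib.sha256(";".join(parts).encode("utf-8")).hexdigest()


def is_duplicate(text, keywords, feature_library):
    """Check the page's fingerprint against a set of known fingerprints."""
    return position_fingerprint(text, keywords) in feature_library


# Two different pages with identical keyword frequencies.
page_a = "cheap flights to paris book cheap hotels in paris"
page_b = "paris hotels are cheap but paris flights are not cheap"
keywords = ["cheap", "paris"]

# The feature library here holds only page_a's fingerprint.
feature_library = {position_fingerprint(page_a, keywords)}

same_frequencies = (
    keyword_frequencies(page_a, keywords) == keyword_frequencies(page_b, keywords)
)
a_flagged = is_duplicate(page_a, keywords, feature_library)
b_flagged = is_duplicate(page_b, keywords, feature_library)
```

In this toy case a frequency-only comparison cannot tell the two pages apart, while the position-based fingerprint flags only the page already in the library. An exact-match hash like this one detects only pages whose keyword layout is identical; tolerating near-duplicates would need a similarity measure over the position vectors rather than a set lookup.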
This article is indexed in CNKI, VIP (维普), and other databases.