1.
2.
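As internet access becomes ever more widespread, enterprises are paying increasing attention to their own informatization, and the number of enterprises connected to the internet keeps growing. Protecting enterprise information security has therefore become one of the key concerns of enterprise development. This paper applies mathematical statistics and text classification to more than 3 million records from a well-known hacker forum. Using the TF-IDF model and the classification idea of the KNN algorithm, it derives the level of network information security threat faced by different industries and divides them into three grades: lower, moderate, and higher. On this basis, it analyzes in depth, according to the characteristics of each industry, why information security problems arise in different industries, and proposes corresponding improvement measures and recommendations.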
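The abstract does not give the paper's actual pipeline, so the following is only a minimal sketch of the general idea of combining TF-IDF weights with KNN voting. The toy posts, the industry labels, the smoothed IDF formula, the cosine similarity measure, and the function names (`vectorize`, `fit_idf`, `knn_predict`) are all assumptions made for illustration, not details taken from the paper.

```python
import math
from collections import Counter

def fit_idf(docs):
    """Smoothed inverse document frequency for every term in the training docs."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def vectorize(doc, idf):
    """Sparse TF-IDF vector; terms unseen during training are dropped."""
    counts = Counter(doc)
    total = len(doc) or 1
    return {t: (c / total) * idf[t] for t, c in counts.items() if t in idf}

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def knn_predict(query, train_vecs, labels, idf, k=3):
    """Majority vote among the k training docs most similar to the query."""
    q = vectorize(query, idf)
    ranked = sorted(zip(train_vecs, labels),
                    key=lambda pair: cosine(q, pair[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical, hand-written forum posts (NOT data from the paper).
train = [
    ("bank card dump account".split(), "finance"),
    ("bank transfer account login".split(), "finance"),
    ("credit card bank fraud".split(), "finance"),
    ("hospital patient record leak".split(), "healthcare"),
    ("patient record database hospital".split(), "healthcare"),
    ("clinic patient insurance record".split(), "healthcare"),
]
docs = [doc for doc, _ in train]
labels = [label for _, label in train]
idf = fit_idf(docs)
train_vecs = [vectorize(doc, idf) for doc in docs]

finance_guess = knn_predict("stolen bank card account".split(), train_vecs, labels, idf)
health_guess = knn_predict("patient record hospital".split(), train_vecs, labels, idf)
```

In a setting like the one described, per-industry counts of classified posts could then be bucketed into threat grades, but how the paper defines those grades is not stated in the abstract.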
3.
4.
《Enterprise Information Systems》2013,7(1):107-120
Automated information retrieval is critical for enterprise information systems to acquire knowledge from vast amounts of data. One challenge in information retrieval is text classification. Current practice relies heavily on the classical naïve Bayes algorithm because of its simplicity and robustness. However, results from this algorithm are not always satisfactory. This article discusses the limitations of the naïve Bayes algorithm and finds that the assumption that terms are independent is the main reason for unsatisfactory classification in many real-world applications. To overcome these limitations, dependent factors are taken into account by integrating a term frequency–inverse document frequency (TF-IDF) weighting algorithm into naïve Bayes classification. Moreover, the TF-IDF algorithm itself is improved so that both frequency and distribution information are considered. To illustrate the effectiveness of the proposed method, two simulation experiments were conducted, and comparisons with other classification methods showed that the proposed method outperformed existing algorithms in terms of precision and index recall rate.
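The abstract does not spell out the improved TF-IDF formula or how it enters the classifier, so the sketch below shows only one common, generic way to combine the two: a multinomial naïve Bayes model whose raw term counts are replaced by TF-IDF weights. The smoothed IDF, the Laplace parameter `alpha`, the toy documents, and the function names are assumptions for illustration and should not be read as the authors' method.

```python
import math
from collections import Counter, defaultdict

def train_tfidf_nb(docs, labels, alpha=1.0):
    """Multinomial naive Bayes with TF-IDF weights in place of raw counts."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vocab = sorted(idf)

    # Per-class sum of TF-IDF weights for each term.
    class_weight = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        for term, count in Counter(doc).items():
            class_weight[label][term] += count * idf[term]

    model = {"idf": idf, "log_prior": {}, "log_like": {}}
    for label, n_docs in Counter(labels).items():
        model["log_prior"][label] = math.log(n_docs / n)
        # Laplace smoothing over the whole vocabulary.
        total = sum(class_weight[label].values()) + alpha * len(vocab)
        model["log_like"][label] = {
            t: math.log((class_weight[label][t] + alpha) / total) for t in vocab
        }
    return model

def predict(model, doc):
    """Return the class with the highest TF-IDF-weighted log posterior."""
    counts = Counter(doc)
    scores = {}
    for label, log_prior in model["log_prior"].items():
        log_like = model["log_like"][label]
        scores[label] = log_prior + sum(
            c * model["idf"][t] * log_like[t]
            for t, c in counts.items() if t in log_like  # skip unseen terms
        )
    return max(scores, key=scores.get)

# Hypothetical toy corpus (NOT from the article's experiments).
docs = [
    "match goal team".split(),
    "team player goal".split(),
    "server database query".split(),
    "query index database".split(),
]
labels = ["sports", "sports", "tech", "tech"]
model = train_tfidf_nb(docs, labels)

sports_guess = predict(model, "goal team win".split())
tech_guess = predict(model, "database index".split())
```

Weighting by IDF reduces the influence of terms that appear in most documents, which is one way to soften the effect of the independence assumption; it does not model term dependence directly.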