Found 19 similar documents (search time: 203 ms)
1.
2.
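This paper presents a keyword-set-based feature representation method for Chinese text. It introduces the ideas of ontology and word co-occurrence into the feature representation, so that the features of Chinese text are expressed more accurately and the quality of Chinese text clustering improves in turn.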
3.
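To promote cooperation and communication among project participants so that projects are completed with higher quality and efficiency, this paper studies a method for classifying construction project documents based on the IFC standard. Building on an in-depth analysis of the characteristics of construction project management, the paper proposes classifying the large volume of semi-structured or unstructured Chinese text produced over the project life cycle according to the internationally used IFC standard, thereby improving how the text is managed and used. Chinese text is represented with the vector space model, and its similarity to entities in the IFC standard is computed as the cosine of the angle between vectors, which yields a standardized classification of Chinese text. A case study verifies the feasibility of the method. Finally, the paper evaluates the proposed algorithm and outlines directions for further research.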
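The vector space model and cosine similarity step described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the text has already been segmented into tokens, uses raw term frequencies as vector weights, and the keyword lists attached to the IFC entity names `IfcWall` and `IfcBeam` are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(tokens_a, tokens_b):
    """Cosine of the angle between two term-frequency vectors."""
    vec_a, vec_b = Counter(tokens_a), Counter(tokens_b)
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)


def classify(document_tokens, entity_keywords):
    """Return the entity label whose keyword vector is closest to the document."""
    return max(
        entity_keywords,
        key=lambda label: cosine_similarity(document_tokens, entity_keywords[label]),
    )


# Hypothetical keyword lists for two IFC entities (assumption, for illustration only).
ENTITY_KEYWORDS = {
    "IfcWall": ["wall", "partition", "masonry"],
    "IfcBeam": ["beam", "girder", "span"],
}

label = classify(["wall", "masonry", "wall", "thickness"], ENTITY_KEYWORDS)
```

In practice the weights would more likely be TF-IDF values than raw counts, but the abstract does not say which weighting the authors used.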
4.
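Spatial clustering is a challenging research field. This paper introduces the state of research on, and the design ideas behind, various spatial clustering algorithms, and discusses the strengths and weaknesses of each class of algorithm, open problems in clustering, and directions for future research.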
5.
6.
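This paper studies Web text mining based on fuzzy clustering together with a validity evaluation function for fuzzy clustering, and applies the function to evaluating fuzzy clustering validity in Web text mining. Simulation experiments show that the method is reasonably accurate and feasible.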
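The abstract does not name the validity evaluation function it uses. As one well-known example of such a function, the sketch below computes Bezdek's partition coefficient, the mean of the squared membership degrees; it is an assumption chosen for illustration, not the function from the paper. Values near 1 indicate a crisp partition, and values near 1/c (for c clusters) indicate a maximally fuzzy one.

```python
def partition_coefficient(memberships):
    """Bezdek's partition coefficient for a fuzzy membership matrix.

    memberships[i][k] is the degree to which sample i belongs to cluster k;
    each row is expected to sum to 1.
    """
    n_samples = len(memberships)
    if n_samples == 0:
        raise ValueError("membership matrix is empty")
    return sum(u * u for row in memberships for u in row) / n_samples


crisp = [[1.0, 0.0], [0.0, 1.0]]   # every sample fully in one cluster
fuzzy = [[0.5, 0.5], [0.5, 0.5]]   # every sample split evenly
```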
7.
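To build a harmonious and civil online environment and improve the ability to identify and respond to harmful text on the Internet, this paper applies a novel method that combines the growing self-organizing feature map (GSOFM) with latent semantic indexing (LSI) to cluster harmful text. Combining the two algorithms makes it possible to discover both global and local pattern characteristics. Under identical conditions, the experiments compare the combined approach with GSOFM alone. The results show that, compared with GSOFM alone, the combination improves the precision of the clustering results and shortens computation time, offering a better method for clustering harmful web text.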
8.
9.
《上海立信会计学院学报》2021,(1):3-22
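As big data technology matures, research that takes text as its object is attracting academic attention, but it is still at an early stage, so a systematic review of text analysis techniques and the literature is needed. This article gives a detailed account of existing text analysis techniques and literature in finance and accounting along five dimensions: sources of textual information, text analysis techniques, text feature extraction, applications of text analysis, and the state of text analysis research in China. It also points out directions for future research, helping domestic scholars understand the characteristics and progress of text analysis research in finance and accounting and drawing more scholarly attention to it.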
10.
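Cloud platforms based on clustering algorithms are the main application form of today's public clouds. Maximizing overall platform resource utilization, optimizing the performance of individual virtual machines, and the platform's service acceptance rate are the design goals of the clustering algorithm. A cross-ring logistics service model based on the clustering algorithm is abstracted, and the logistics service mode, the cross-ring logistics scheduling server, and related components are designed. On this basis, cross-ring logistics scheduling under the various demand vectors of the function clustering algorithm is optimized and analyzed, and feasibility is finally verified on a self-developed cloud computing platform.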
11.
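Research objective: to construct an industry investor sentiment indicator that reflects industry stock price trends, based on a social network text mining algorithm, and to improve the asset allocation produced by a Black-Litterman model that embeds this indicator. Research method: investor sentiment is measured with a social network text mining algorithm; principal component analysis is used to construct the industry investor sentiment indicator, which is embedded in the Black-Litterman model to build the investor views matrix and determine the industry asset allocation weights. Findings: the BL model based on industry investor sentiment effectively raises the average daily return and the Sharpe ratio of the asset allocation. The empirical results remain robust in out-of-sample validation (except for the period affected by the COVID-19 pandemic), in periods of sharp rises and falls, and after adjusting for short selling and transaction costs, confirming that investor sentiment has a significant effect on asset portfolios. Innovation: constructing the investor sentiment index with a social network text mining algorithm addresses the limitation that forecasting models relying only on expected returns or historical data cannot directly reveal investors' psychological perceptions and behavior, and it offers a new perspective on generating investor views in the Black-Litterman model. Research value: the study extends the theoretical framework of the Black-Litterman model and advances the application of behavioral finance theory to asset allocation.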
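For orientation, the standard Black-Litterman posterior expected return, into which a views matrix of this kind is fed, is usually written as below. The notation is the conventional one rather than the paper's own: $\Pi$ is the vector of equilibrium returns, $\Sigma$ the return covariance matrix, $\tau$ a scaling scalar, $P$ the matrix that selects the assets each view refers to, $Q$ the vector of view returns, and $\Omega$ the covariance of view errors. How the paper maps the sentiment indicator to $P$, $Q$, and $\Omega$ is not stated in the abstract.

$$
\mu_{BL} = \left[ (\tau\Sigma)^{-1} + P^{\top}\Omega^{-1}P \right]^{-1} \left[ (\tau\Sigma)^{-1}\Pi + P^{\top}\Omega^{-1}Q \right]
$$

As $\Omega$ grows (views held with little confidence), $\mu_{BL}$ approaches the equilibrium returns $\Pi$.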
12.
A Survey of Cluster Analysis in Data Mining (total citations: 1; self-citations: 0; citations by others: 1)
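Clustering in data mining is an unsupervised classification technique. This paper outlines the data structures and data types used in cluster analysis algorithms, analyzes the significance and current state of cluster analysis research, compares the strengths and problems of several clustering algorithms, and, drawing on applications in the communications field, points out the absolute advantage of K-Means clustering.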
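For readers unfamiliar with K-Means, the algorithm alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its points. The sketch below is a plain-Python illustration of that loop (Lloyd's algorithm), assuming the caller supplies the initial centroids; the function name `k_means` and the sample data are illustrative and do not come from the survey.

```python
def k_means(points, initial_centroids, max_iter=100):
    """Cluster points (tuples of numbers) around the given initial centroids."""
    centroids = [tuple(c) for c in initial_centroids]
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


points = [(0, 0), (0, 2), (10, 10), (10, 12)]
centroids, clusters = k_means(points, [(0, 0), (10, 10)])
```

The result depends on the initial centroids, which is one of the known weaknesses of K-Means that comparisons of clustering algorithms typically discuss.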
13.
《International Journal of Forecasting》2020,36(4):1563-1578
This study investigates the value added by incorporating textual data into customer churn prediction (CCP) models. It extends the previous literature by benchmarking convolutional neural networks (CNNs) against current best practices for analyzing textual data in CCP, and, using real-life data from a European financial services provider, validates a framework that explains how textual data can be incorporated in a predictive model. First, the results confirm previous research showing that the inclusion of textual data in a CCP model improves its predictive performance. Second, CNNs outperform current best practices for text mining in CCP. Third, textual data are an important source of data for CCP, but unstructured textual data alone cannot create churn prediction models that are competitive with models that use traditional structured data. A calculation of the additional profit obtained from a customer retention campaign through the inclusion of textual information can be used by practitioners directly to help them make more informed decisions on whether to invest in text mining.
14.
WALKING DOWN WALL STREET WITH A TABLET: A SURVEY OF STOCK MARKET PREDICTIONS USING THE WEB
Michela Nardo, Marco Petracco‐Giudici, Minás Naltsidis 《Journal of economic surveys》2016,30(2):356-369
‘A blindfolded chimpanzee throwing darts at The Wall Street Journal could select a portfolio that would do as well as the (stock market) experts’ [Malkiel (2003) The efficient market hypothesis and its critics. Journal of Economic Perspectives 17(1): 59–82]. However, what if this chimpanzee could browse the Internet before throwing any darts? In this paper, we ask whether online news has any influence on the financial market, and we also investigate how much influence it has. We explore the burgeoning literature on the predictability of financial movements using online information and report its mixed findings. In addition, we collate the efforts of various disciplines, including economics, text mining, sentiment analysis and machine learning, and we offer suggestions for future research.
15.
16.
《International Journal of Forecasting》2019,35(4):1548-1560
This study proposes a novel crude oil price forecasting method based on online media text mining, with the aim of capturing the more immediate market antecedents of price fluctuations. Specifically, this is an early attempt to apply deep learning techniques to crude oil forecasting, and to extract hidden patterns within online news media using a convolutional neural network (CNN). While the news-text sentiment features and the features extracted by the CNN model reveal significant relationships with the price change, they need to be grouped according to their topics in the price forecasting in order to obtain a greater forecasting accuracy. This study further proposes a feature grouping method based on the Latent Dirichlet Allocation (LDA) topic model for distinguishing effects from various online news topics. An optimized combination of input variables is constructed using lag order selection and feature selection methods. Our empirical results suggest that the proposed topic-sentiment synthesis forecasting models perform better than the older benchmark models. In addition, text features and financial features are shown to be complementary in producing more accurate crude oil price forecasts.
17.
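This paper proposes a texture segmentation method based on an improved wavelet transform and fuzzy kernel clustering. The method first extracts texture features with an improved discrete wavelet transform, then clusters each pixel in the feature space with fuzzy kernel clustering to segment the texture. Experimental results show that the proposed algorithm segments well.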
18.
《Enterprise Information Systems》2013,7(1):147-165
Information explosion is a critical challenge to the development of modern information systems. In particular, for information systems that operate over the Internet, the amount of information on the web has been increasing exponentially. Search engines, such as Google and Baidu, are essential tools for people to find information on the Internet. Valuable information, however, is still likely to be submerged in the ocean of search results from those tools. By clustering the results into different groups based on subjects automatically, a search engine with the clustering feature allows users to select the most relevant results quickly. In this paper, we propose an online semantics-based method to cluster Chinese web search results. First, we employ the generalised suffix tree to extract the longest common substrings (LCSs) from search snippets. Second, we use the HowNet to calculate the similarities of the words derived from the LCSs, and extract the most representative features by constructing the vocabulary chain. Third, we construct a vector of text features and calculate snippets’ semantic similarities. Finally, we improve the Chameleon algorithm to cluster snippets. Extensive experimental results have shown that the proposed algorithm outperforms the suffix tree clustering method and other traditional clustering methods.
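To make the first step concrete, the sketch below finds the longest common substring of two snippets. Note the substitution: the paper uses a generalised suffix tree, which handles many snippets efficiently, whereas this example uses the simpler dynamic-programming approach for a single pair of strings. The function name and the sample snippets are illustrative assumptions.

```python
def longest_common_substring(a, b):
    """Return the longest contiguous substring shared by a and b.

    Dynamic programming over prefix pairs: curr[j] holds the length of the
    longest common suffix of a[:i] and b[:j]. Runs in O(len(a) * len(b)) time.
    On ties, the substring that ends earliest in `a` is returned.
    """
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


# Two hypothetical search snippets sharing the phrase "文本聚类" (text clustering).
shared = longest_common_substring("中文文本聚类方法", "网页文本聚类结果")
```

Because the comparison is character by character, the function works on unsegmented Chinese text without a tokenizer, which is one reason substring-based features suit search snippets.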
19.
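To address the relatively outdated logistics management system at Guilin University of Electronic Technology, a J2EE-based university logistics management system is built on a B/S architecture with the open-source Struts-Spring-Hibernate frameworks. The system combines the open-source frameworks with Ajax to support asynchronous submission, reducing the load on network communication and on the server. Taking logistics management data as the object of study, it combines data warehouse, OLAP, and data mining techniques to uncover useful information in the logistics management system and provide decision support for senior management, which gives it strong practical value.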