Similar Documents
20 similar documents found.
1.
李苍祺  谢识予 《征信》2021,39(1):67-70
In the era of big data, available data are of uneven quality and lack unified codes and definitions, which makes it difficult to integrate macro-level and micro-level data. Taking macro databases as the benchmark and matching micro databases to them is one way to solve this data-fusion problem. It is recommended to set up both macro and micro databases, and to encourage scholars to upload data, merge the databases periodically, and strengthen cooperation with research institutions and data companies, so as to improve the construction of China's databases.

2.
A golden rule for data modelling for data mining classification models, with special consideration of problems in insurance. We discuss the data modelling needed to build classification models for marketing and sales in insurance companies, aimed at avoiding contract cancellations and at cross selling. Starting from a binary classification variable (cancelled versus active contracts, or customers versus non-customers of a branch), we focus in particular on the importance of historical data: to detect decision patterns for cancellations or for new contracts with data mining tools, one must not use current data for such contracts or customers, but the data as they were at the time of the decision. This obvious but rarely applied principle is presented in detail as a golden rule for correct data modelling in such situations. As a case study, a project and its results for nine branches of the Gothaer Versicherungen are presented.
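A minimal sketch, not taken from the paper, of how the "data as of the decision date" rule can be enforced in practice: each contract is joined to the last feature snapshot taken strictly before its decision (cancellation) date rather than to today's values. Table and column names are hypothetical.

```python
# Illustrative sketch (hypothetical columns): build a training set with
# point-in-time features so each contract is described by the data as it
# was at the decision date, not by today's data.
import pandas as pd

contracts = pd.DataFrame({
    "contract_id": [1, 2, 3],
    "decision_date": pd.to_datetime(["2020-03-01", "2020-06-01", "2020-09-01"]),
    "cancelled": [1, 0, 1],          # binary target
})

# Monthly feature snapshots, one row per contract per snapshot date.
snapshots = pd.DataFrame({
    "contract_id": [1, 1, 2, 3],
    "snapshot_date": pd.to_datetime(["2020-02-29", "2020-05-31",
                                     "2020-05-31", "2020-08-31"]),
    "premium": [120.0, 125.0, 80.0, 200.0],
    "claims_last_12m": [0, 1, 0, 2],
})

# For each contract, keep only the latest snapshot strictly before the
# decision date (an "as of" join): this is the point-in-time principle.
training = pd.merge_asof(
    contracts.sort_values("decision_date"),
    snapshots.sort_values("snapshot_date"),
    by="contract_id",
    left_on="decision_date",
    right_on="snapshot_date",
    allow_exact_matches=False,
)
print(training[["contract_id", "cancelled", "premium", "claims_last_12m"]])
```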

3.
Data is one of the most important factors of production in the digital economy, and data create value only when they flow; financial data sharing is one form of that flow. In financial activities, financial data are shared among financial institutions, between financial institutions and technology companies, and between financial institutions and financial regulators, and data controllers' expectation of property-like interests in the data they control is what motivates sharing. Financial data contain large amounts of personal information, and financial data security bears on the security of the national financial system. On the premise of safeguarding the stability of the national financial system and the security of personal information, granting rights over financial data to financial data controllers helps promote financial data sharing and maximizes the value that can be extracted from financial data. To balance the conflicting interests of the parties to financial data sharing, the determination of data ownership should serve as the underlying logic, personal information rights should be protected first, and the compensation regime for personal information infringement should be improved by introducing punitive damages for infringement, adjusting the mechanism for allocating punitive damages, and refining the rules for public-interest litigation over personal information infringement.

4.
Although managers consider accurate, timely and relevant information as critical to the quality of their decisions, evidence of large variations in data quality abounds. This research examines factors influencing the level of data quality within a target organization. The results indicate that management's commitment to data quality and the presence of data quality champions strongly influence data quality in the target organization. The results also show that the managers of the participating organization are committed to achieving and maintaining high data quality. However, changing work processes and establishing a data quality awareness culture are required to motivate further improvements to data quality.

5.
In recent years, the development and widespread application of big data technology have profoundly affected the national economy and society, and have also brought new opportunities to audit work. The 2014 Opinions of the State Council on Strengthening Audit Work explicitly call for "exploring ways to apply big data technology in audit practice, intensifying the comprehensive use of data, and improving the ability to use information technology to verify problems, evaluate and judge, and conduct macro analysis". Against this background, research by academics and practitioners on the application of big data technology in audit work...

6.
This essay highlights an underutilized source of data in insurance research: The consolidated data insurance holding companies disclose in Form 10‐K to their investors. A reinsurance example demonstrates that using consolidated data, as a complement to individual company statutory data, has the potential to extend insights gleaned from statutory data alone. Consolidated data, however, is limited to publicly traded insurers.

7.
An important role for accountants today is to provide decision support to senior management by assisting them in the analysis of large, complex data sets. Interactive data visualization (IDV) facilitates this process by allowing users to navigate, select, and display data via an easy-to-use interface often used as a component of data analytics. Given the increasing popularity of IDV as a tool for making sense of complex data, it is important that accountants become familiar with and learn how to use this technology. This case provides a hands-on opportunity to organize complex accounting data to create IDVs for decision makers to use. Further, the case enables students to understand the potential impact of IDVs on preparers and users of accounting information. Students will assume the role of a division controller in a hypothetical company and create an IDV to assist the chief executive officer (CEO) in decision making.

8.
The widespread activity involving the Internet and the Web causes large amounts of electronic data to be generated every day. This includes, in particular, semi-structured textual data such as electronic documents, computer programs, log files, transaction records, literature citations, and emails. Storing and manipulating the data thus produced has proven difficult. As conventional DBMSs are not suitable for handling semi-structured data, there is a strong demand for systems that are capable of handling large volumes of complex data in an efficient and reliable way. The Extensible Markup Language (XML) provides such a solution. In this paper, we present the concept of a ‘vertical view model’ and its uses as a mapping mechanism for converting complex XML data to relational database tables, and as a standalone data model for storing complex XML data. Copyright © 2003 John Wiley & Sons, Ltd.
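The vertical view model itself is defined in the paper; as a rough, generic illustration of the broader idea of shredding semi-structured XML into a relational table (a hypothetical schema, not necessarily the paper's), the sketch below stores one row per element with its parent, tag name, and text, so the hierarchy can be queried with ordinary SQL. Attributes are omitted for brevity.

```python
# Generic illustration (hypothetical schema): shred an XML document into one
# relational table with a row per element (id, parent_id, name, text).
import sqlite3
import xml.etree.ElementTree as ET

xml_doc = "<order><customer>Acme</customer><item><sku>A1</sku><qty>3</qty></item></order>"

def shred(root):
    rows, next_id, stack = [], 1, [(root, None)]
    while stack:
        elem, parent_id = stack.pop()
        node_id, next_id = next_id, next_id + 1
        rows.append((node_id, parent_id, elem.tag, (elem.text or "").strip()))
        for child in elem:
            stack.append((child, node_id))
    return rows

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (id INTEGER, parent_id INTEGER, name TEXT, text TEXT)")
conn.executemany("INSERT INTO node VALUES (?, ?, ?, ?)", shred(ET.fromstring(xml_doc)))

# The hierarchy can now be queried relationally, e.g. the children of <item>:
for name, text in conn.execute(
        "SELECT c.name, c.text FROM node c JOIN node p ON c.parent_id = p.id "
        "WHERE p.name = 'item'"):
    print(name, text)
```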

9.
Researchers commonly use industry classifications as a means of identifying peer companies to use as a performance benchmark. We describe the structure of commonly used sources of industry classification data available for Australian listed companies, both static and in time series. Next, we run a series of experiments matching firms according to GICS classification data presented in time series versus static data sources. Our results indicate that performance measures are better specified when matching on GICS data from a dynamic relative to a static source. The results of our power tests also underscore the importance of using dynamic industry data.

10.
This paper explores the application of data mining techniques to fraud detection in the audit of financial statements and proposes a taxonomy to support and guide future research. Currently, the application of data mining to auditing is at an early stage of development and researchers take a scatter-shot approach, investigating patterns in financial statement disclosures, text in annual reports and MD&As, and the nature of journal entries without appropriate guidance being drawn from lessons in known fraud patterns. To develop structure to research in data mining, we create a taxonomy that combines research on patterns of observed fraud schemes with an appreciation of areas that benefit from productive application of data mining. We encapsulate traditional views of data mining that operates primarily on quantitative data, such as financial statement and journal entry data. In addition, we draw on other forms of data mining, notably text and email mining.

11.
Relational databases are the predominant method for storing repetitive data in computers because they allow efficient and flexible storage of that data. While medical directors and underwriters are more likely to use a spreadsheet than a database program to analyze their business, the data they wish to study are often stored in corporate databases. Or the data may be complex enough to require being keyed into or downloaded into a personal computer (PC) database program for storage, even if the data are then output to a spreadsheet for numerical analysis. In many circumstances, one can benefit from an understanding of efficient database design. After a brief overview, the reader is led step-by-step through a practical explanation of database design, from a flat file to a relational model.
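As a small, hedged illustration of the flat-file-to-relational step described above (the tables, columns, and data are invented for the example), the sketch below splits a repetitive flat file of lab results into a normalized applicant table and a result table linked by a key, then joins them back for spreadsheet-style analysis.

```python
# Hypothetical example: normalize a repetitive flat file into two related tables.
import sqlite3

# One row per lab result, with applicant details repeated on every row (a flat file).
flat_rows = [
    ("A-001", "Jane Doe", 1975, "HDL", 62),
    ("A-001", "Jane Doe", 1975, "LDL", 131),
    ("A-002", "John Roe", 1969, "HDL", 48),
]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE applicant (applicant_id TEXT PRIMARY KEY, name TEXT, birth_year INTEGER);
    CREATE TABLE lab_result (applicant_id TEXT REFERENCES applicant(applicant_id),
                             test TEXT, value REAL);
""")

# Applicant details are stored once; results reference them by key.
conn.executemany("INSERT OR IGNORE INTO applicant VALUES (?, ?, ?)",
                 [(r[0], r[1], r[2]) for r in flat_rows])
conn.executemany("INSERT INTO lab_result VALUES (?, ?, ?)",
                 [(r[0], r[3], r[4]) for r in flat_rows])

# A join recovers the original flat view when needed for numerical analysis.
for row in conn.execute("""SELECT a.name, l.test, l.value
                           FROM applicant a JOIN lab_result l USING (applicant_id)"""):
    print(row)
```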

12.
An efficient and effective analysis of business data requires a better understanding of what the data represents, and to what degree. A human‐like way of accomplishing that without being too detailed yet learning more about data content is to summarize and map the data into concepts familiar to a person performing analysis. Processes of summarization help identify the most essential facts that are embedded in the data. All this is of significant importance for analysis of large amounts of business data required to make good and sound financial decisions. There are two aspects enabling more comprehensive yet easier processing of data: a standardized representation format of financial data; and a human‐friendly way of defining concepts and using them for building personalized models representing processing data. The first of the aspects has been addressed by the eXtensible Business Reporting Language (XBRL)—a standardized format of defining, representing and exchanging corporate and financial information. The second aspect is related to providing individuals with the ability to gain understanding of data content via determining a degree of truth of statements summarizing data based on their own perception of concepts they are looking for. In this paper, we introduce a tablet application—Tablet‐based input of Fuzzy Sets (TiFS)—and demonstrate its usefulness for entering personalized definitions of concepts and terms that enable a quick analysis of financial data. Such analysis means utilization of soft queries and operations of aggregation that extract and summarize the data and present it in a form familiar to analysts. The application allows for defining concepts and terms with ‘finger‐made’ drawings representing a person's perception of concepts. Further, these definitions are used to build summarization statements for exploring XBRL data. They are equipped with ‘drawn’ definitions of linguistic terms (e.g. LARGE, SMALL, FAST) and linguistic quantifiers (e.g. ALL, MOSTLY), and enable summarization of data content from the perspective of a user's interests. The ‘drawn’ linguistic terms and quantifiers represent membership functions of fuzzy sets. Utilization of fuzzy sets allows for performing operations of data summarization in a human‐like way. The application of TiFS illustrates ease of inputting personalized definitions of concepts and their influence on the interpretation of data. This introduces aspects of personalization and adaptation of artificial intelligence systems to perceptions and views of individuals. The proposed application is used to perform a basic analysis of an XBRL document.
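A minimal sketch of the fuzzy-set mechanics the abstract refers to: membership functions for a linguistic term and a quantifier, and the degree of truth of a summary such as "MOST revenues are LARGE". The membership functions and figures below are invented for illustration, not taken from TiFS or from any XBRL filing.

```python
# Hedged illustration of a fuzzy linguistic summary: "MOST revenues are LARGE".
# Membership functions and revenue figures are invented for the example.

def large(x, low=50.0, high=200.0):
    """Membership of 'LARGE': 0 below `low`, rising linearly to 1 at `high`."""
    return min(1.0, max(0.0, (x - low) / (high - low)))

def most(p):
    """Membership of the quantifier 'MOST' for a proportion p in [0, 1]."""
    return min(1.0, max(0.0, (p - 0.3) / 0.5))   # 0 at 30%, 1 at 80%

revenues = [35.0, 120.0, 260.0, 90.0, 310.0]      # e.g. segment revenues, in millions

# Zadeh-style truth of the summary: average membership in LARGE,
# passed through the quantifier MOST.
proportion_large = sum(large(r) for r in revenues) / len(revenues)
truth = most(proportion_large)
print(f"proportion 'LARGE' = {proportion_large:.2f}, truth of summary = {truth:.2f}")
```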

13.
This paper discusses a framework that refines an initial object-level rule base and uses rule induction to learn meta-level rules that identify the data sets to which an object-level rule applies. A rule induction process such as ID3 learns the meta-level rules by classifying the given training data sets into positive and negative sets. The rule refinement process then refines the initial object-level rule base on the classified data sets using four refinement strategies. By unifying these two processes, one obtains a refined object-level rule base with high performance, in which a meta-level rule selects the data sets applicable to it. To evaluate the framework, an experiment on real Japanese stock price data shows that a refined object-level rule base, derived from an initial object-level rule base representing Granville's Law, outperforms the average stock price, a level of performance that is difficult for human technical analysts in a stock market to achieve. The result implies that the framework can uncover an anomaly from Granville's Law in stock market technical analysis.
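A rough sketch, not the paper's implementation, of the meta-level idea: data sets are labelled positive or negative according to whether an object-level trading rule performs well on them, and an entropy-based decision tree (standing in for ID3) is induced on data set features to predict where the rule is applicable. The features, labels, and toy rule are assumptions.

```python
# Hedged sketch: learn a meta-level rule that predicts on which data sets an
# object-level rule works. Features, labels and the toy rule are invented.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Meta-level features describing each training data set (e.g. a stock/period):
# [volatility, trend_strength, average_volume]
dataset_features = np.array([
    [0.10, 0.8, 1.2],
    [0.35, 0.1, 0.9],
    [0.12, 0.7, 2.0],
    [0.40, 0.2, 0.7],
    [0.15, 0.9, 1.5],
    [0.38, 0.1, 1.1],
])

# 1 = the object-level rule (e.g. a Granville-style moving-average rule)
# was profitable on this data set, 0 = it was not.
rule_applicable = np.array([1, 0, 1, 0, 1, 0])

# The entropy criterion makes the induced tree ID3-like.
meta_rule = DecisionTreeClassifier(criterion="entropy", max_depth=2, random_state=0)
meta_rule.fit(dataset_features, rule_applicable)

# The learned meta-level rule selects the data sets on which to apply
# (and further refine) the object-level rule base.
print(export_text(meta_rule,
                  feature_names=["volatility", "trend_strength", "avg_volume"]))
```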

14.
Information access at no cost or low cost is becoming vital for academic researchers in finance and economics. As a result of the budget cuts at most universities in the United States, the vast majority of schools of business, other than those at research I institutions, are now unable to purchase readily available financial data on CDs or tapes (such as CRSP or Compustat data). In order to continue their research agenda, researchers at such schools must find alternative low cost sources of data. In addition to the difficulty of locating the data, it is a challenge to download the data that one needs in a timely manner. In fact, locating and downloading a particular set of financial data when time is at a premium can be a source of frustration for researchers and educators. This paper removes some of the difficulties that researchers and educators encounter when trying to locate and access financial data on the Internet. It provides an easy way of accessing and downloading one of the most useful financial data sets available on the Internet. In particular, the author shows how to download sets of selected interest rates from the Federal Reserve Bank of Chicago web site. The same steps can be used to access and download other financial data sets that are available on the Board of Governors web site or any of the remaining eleven Federal Reserve Banks' web sites. The data sets include current and historical daily, weekly and monthly rates for a number of financial securities, including certificates of deposit, commercial paper, federal funds, bankers' acceptances, Eurodollar deposits, Treasury bills, Treasury bonds, finance paper, state and local bonds, conventional mortgages, and rated corporate bonds. The data sets also include the Federal Reserve discount rate, foreign exchange rates and a wide selection of macroeconomic variables.
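A hedged sketch of the kind of download the paper walks through. The exact pages and file layouts on the Federal Reserve sites have changed since the paper was written, so the URL and column names below are placeholders, to be replaced with the current location and layout of the desired rate series.

```python
# Hedged sketch: download a CSV of interest-rate data and load it for analysis.
# The URL is a placeholder, not a real Federal Reserve endpoint; substitute the
# current address of the series you need from the relevant Fed web site.
import pandas as pd

DATA_URL = "https://example.org/selected_interest_rates.csv"   # placeholder
rates = pd.read_csv(DATA_URL, parse_dates=["date"])            # assumed column name
rates = rates.set_index("date").sort_index()

# Typical first steps once the series is local: restrict the sample and summarize.
sample = rates.loc["1995-01-01":"1999-12-31"]
print(sample.describe())
sample.to_csv("selected_interest_rates_1995_1999.csv")         # keep a local copy
```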

15.
Facing the explosive growth in the number of market entities, tax administration can keep pace with the development of tax informatization only by innovating the traditional model of collecting tax assessment information in light of the information characteristics and needs of the big data era. The collection process and management of tax assessment information should be innovated through unstructured data collection, visualization services, and crowdsourced collection; relying on Internet of Things technology, an elastic architecture should be established to cope with the uncertainty of the massive growth of tax assessment data. Building a tax assessment information collection model centered on a unified national tax application platform and supported by a shared-data auxiliary system is an important way to meet the information collection needs of the big data era, strengthen the monitoring of tax sources, and improve the efficiency of tax assessment.

16.
This paper examines the "V-Matrix" and provides a wave theory life cycle model of organizations' adoption of big data. The V-Matrix is based on the five big data "V's" (Volume, Velocity, Variety, Veracity, and Value) and captures and enumerates the different potential states that an organization can go through as part of its adoption of and evolution towards big data. We extend the V-Matrix to a state space approach in order to provide a characterization of the adoption of big data technologies in an organization. We develop and use a wave theory of implementation to accommodate a firm's movement through the V-Matrix. Accordingly, the V-Matrix provides a life cycle model of organizational use of the different aspects of big data. In addition, the model can help organizations plan for decision-making use of big data as they anticipate movement from one state to another and add big data capabilities. As part of this analysis, the paper examines some of the issues that occur in the different states, including synergies and other issues associated with co-occurrence of different V's with each other. Finally, this paper integrates the V-Matrix with other data analytic life cycles and examines some of the implications of those models.

17.
Three government bond futures contracts and their respective 3-month interest rate futures contracts traded on LIFFE are examined. The data period covers three years of observations, January 1994-December 1996, sampled at half-hourly intervals. Borrowing from the calculation of minimum variance hedge ratios, half-hourly minimum variance spread ratios (the ratio of one contract to another that provides the minimum variance) are estimated for the above contracts. The hypothesis under examination is whether there is any value added in estimating minimum variance spread ratios based on intraday data. Three spread ratios are defined: two calculated from daily data and a third based on intraday data. Evidence tends to indicate that spread ratios calculated from intraday data exhibit a substantially lower variance than the other two spread ratio specifications. Thus, it is shown that intraday data, in comparison with daily data, allow for lower hedging costs. Moreover, the use of intraday-based spread ratios might be a contributing factor to reducing the maximum cumulative loss potentially incurred while holding a spread position.
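For reference, the minimum variance ratio that the spread ratios are modelled on is the usual covariance/variance (regression slope) ratio. The small sketch below uses simulated half-hourly price changes, not LIFFE data, to show the calculation and the resulting variance reduction relative to a naive 1:1 spread.

```python
# Hedged sketch: the minimum variance ratio of contract A to contract B is
# Cov(dA, dB) / Var(dB), i.e. the slope of regressing A's price changes on B's.
# The half-hourly price changes are simulated, not LIFFE data.
import numpy as np

rng = np.random.default_rng(0)
d_bond = rng.normal(0.0, 0.08, size=1_000)                   # bond futures changes
d_rate = 0.6 * d_bond + rng.normal(0.0, 0.03, size=1_000)    # 3-month rate futures changes

spread_ratio = np.cov(d_bond, d_rate)[0, 1] / np.var(d_rate, ddof=1)
print(f"minimum variance spread ratio: {spread_ratio:.3f}")

# Variance of the spread position with this ratio versus a naive 1:1 spread.
spread_mv = d_bond - spread_ratio * d_rate
spread_naive = d_bond - d_rate
print(np.var(spread_mv, ddof=1), np.var(spread_naive, ddof=1))
```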

18.
This data insight highlights the Transportation Security Administration (TSA) claims data as an underused data set that would be particularly useful to researchers developing statistical models to analyze claim frequency and severity. Individuals who have been injured or have had items damaged, lost or stolen may make a claim for losses to the TSA. The federal government reports information on every claim from 2002 to 2017 at https://www.dhs.gov/tsa-claims-data . Information collected includes the claim date, type and site, as well as the closed claim amount and disposition (whether it was approved in full, denied, or settled). We provide summary statistics on the frequency and the severity of the data for the years 2003 to 2015. The data set has several unique features: severity is not truncated (there is no deductible), there are significant mass points in the severity data, the frequency data show a high degree of autocorrelation when compiled on a weekly basis, and there are substantial frequency mass points at zero for daily data.
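A hedged sketch of how one might start on the frequency and severity summaries described above, assuming the claims file has been downloaded locally from the page cited; the file name and column names are assumptions and should be matched to the actual TSA export.

```python
# Hedged sketch: basic frequency and severity summaries of TSA claims data.
# Assumes the data from https://www.dhs.gov/tsa-claims-data has been exported
# to a local CSV; the file and column names below are assumptions.
import pandas as pd

claims = pd.read_csv("tsa_claims.csv", parse_dates=["claim_date"])

# Severity: closed claim amounts (no deductible, so no left truncation;
# repeated common amounts show up as mass points).
severity = claims["close_amount"].dropna()
print(severity.describe())
print(severity.value_counts().head())

# Frequency: daily and weekly claim counts (daily counts have a large mass
# point at zero; weekly aggregation smooths it).
daily = claims.set_index("claim_date").resample("D").size()
weekly = claims.set_index("claim_date").resample("W").size()
print(daily.describe(), weekly.describe())
print("lag-1 autocorrelation of weekly counts:", weekly.autocorr(lag=1))
```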

19.
The gradual but marked decline in the correspondence between aggregated accounting numbers and market valuations, such as stock returns, is a well-documented phenomenon in the research literature (Lev and Zarowin, 1999). Rapid advances in technology have paved the way for the collection of unprecedented volumes of data. Currently, the slow speed of information dissemination, laggard accounting systems, and a focus on high levels of aggregation are perhaps the largest contributors to the waning relevance of financial reporting. The fight for trading superiority is leading users to seek relevant data elsewhere and may contribute to these observed effects. This paper proposes an accounting system known as User XBRL (U-XBRL) designed to overcome these issues. This system collects, analyzes, and displays information in such a way that caters to the speed, detail, and customization demands of modern-day stakeholders (Krahel and Titera, 2015). U-XBRL amalgamates all types of data pertinent to a business, including both internal company data and exogenous source data. Each piece of data is assigned to a firm resource according to the resource-based view. Then, U-XBRL standardizes the information according to data standards and feeds it to a central repository. This repository is primarily organized through XBRL tags and is governed secondarily by other standards and taxonomies. A number of applications can be used individually to select data from the repository for analysis. Using U-XBRL, the recognition, monitoring, and assurance of resources are streamlined.

20.
Indexes of commercial property prices face much scarcer transactions data than housing indexes, yet the advent of tradable derivatives on commercial property places a premium on both high frequency and accuracy of such indexes. The dilemma is that with scarce data a low-frequency return index (such as annual) is necessary to accumulate enough sales data in each period. This paper presents an approach to address this problem using a two-stage frequency conversion procedure, by first estimating lower-frequency indexes staggered in time, and then applying a generalized inverse estimator to convert from lower to higher frequency return series. The two-stage procedure can improve the accuracy of high-frequency indexes in scarce data environments. In this paper the method is demonstrated and analyzed by application to empirical commercial property repeat-sales data.
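A stylized sketch, with simulated returns rather than repeat-sales estimates, of the second-stage idea: staggered annual (lower-frequency) log returns are each a sum of four quarterly returns, so a quarterly series consistent with them can be recovered with a Moore-Penrose generalized inverse. The exact estimator in the paper may differ; this only illustrates the frequency conversion step.

```python
# Stylized sketch: recover quarterly log returns from staggered annual returns
# using a Moore-Penrose generalized inverse. Simulated data, not the paper's estimator.
import numpy as np

rng = np.random.default_rng(1)
n_quarters = 12
r_q_true = rng.normal(0.01, 0.02, size=n_quarters)      # "true" quarterly log returns

# Staggered annual returns: each annual return ending in quarter t (t >= 3)
# is the sum of the four quarterly returns t-3 .. t.
A = np.zeros((n_quarters - 3, n_quarters))
for i in range(n_quarters - 3):
    A[i, i:i + 4] = 1.0
r_annual = A @ r_q_true                                  # lower-frequency observations

# Second stage: the generalized inverse converts low- to high-frequency returns.
r_q_hat = np.linalg.pinv(A) @ r_annual

print(np.round(r_q_true, 4))
print(np.round(r_q_hat, 4))   # minimum-norm solution consistent with the annual sums
```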
