Similar Literature
 20 similar documents were retrieved.
1.
There is widespread agreement in research and practice that data governance is an instrumental element in helping organizations leverage and protect data. IS research has observed that both our practical and our scientific knowledge of data governance remains limited, and the increasing ability of organizations to generate, acquire, store, transform, process, and analyze data calls for us to further identify and address issues on the topic. Striving to help answer this pressing need, we argue that understanding the nature and implications of governance mechanisms is of high importance, as it is these mechanisms that effectively instantiate data governance in an organization. Building on our experience preparing and teaching workshops on the topic to 102 executives, we adopt a position of engaged scholarship and provide a translational account of our pedagogical experience with data governance, highlighting four outstanding themes for IS research. We argue that these four themes—(1) embracing data governance without compromising digital innovation; (2) enacting data governance through repertoires of mechanisms; (3) moving away from data governance toward governing data; and (4) moving away from a view of data at rest to adopt a service-based perspective on data governance—are highly relevant for practice and research. In our view, studying these themes will help inform practitioners who often struggle with the implementation of comprehensive data governance programs and frameworks. At the same time, the ability to leverage theory to study these themes can help research generate novel theoretical contributions on data governance, guiding future work on the topic.

2.
This paper describes the data processing workflow of a marine geological database, with the aim of deepening understanding of this foundational step in database construction and providing a reference for practitioners. The paper first introduces the data entities of the marine geological database; it then details the data processing workflow in terms of data grouping and classified processing, batch processing of attribute data, spatial data processing, and metadata processing; finally, it points out that data quality control is a key link in the data processing workflow and the fundamental guarantee of data completeness, reliability, and reusability.
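As an illustration of the batch processing of attribute data described above, here is a minimal sketch (not from the paper; file layout, column names, and validation rules are hypothetical) of cleaning a batch of attribute tables with pandas before loading them into the database.

```python
# Hypothetical batch cleaning of attribute tables for a marine geological
# database; column names and validation rules are illustrative only.
from pathlib import Path
import pandas as pd

REQUIRED = ["station_id", "longitude", "latitude", "sample_depth_m"]

def clean_attribute_table(path: Path) -> pd.DataFrame:
    df = pd.read_csv(path)
    df.columns = [c.strip().lower() for c in df.columns]   # normalise headers
    df = df.dropna(subset=REQUIRED)                        # drop incomplete rows
    df = df[df["longitude"].between(-180, 180) &
            df["latitude"].between(-90, 90)]               # basic range checks
    df["source_file"] = path.name                          # provenance for QC
    return df

frames = [clean_attribute_table(p) for p in Path("attribute_tables").glob("*.csv")]
merged = pd.concat(frames, ignore_index=True)
merged.to_csv("attributes_clean.csv", index=False)
```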

3.
Cooperation between different data owners may lead to an improvement in forecast quality—for instance, by benefiting from spatiotemporal dependencies in geographically distributed time series. Due to business competitive factors and personal data protection concerns, however, said data owners might be unwilling to share their data. Interest in collaborative privacy-preserving forecasting is thus increasing. This paper analyzes the state-of-the-art and unveils several shortcomings of existing methods in guaranteeing data privacy when employing vector autoregressive models. The methods are divided into three groups: data transformation, secure multi-party computations, and decomposition methods. The analysis shows that state-of-the-art techniques have limitations in preserving data privacy, such as (i) the necessary trade-off between privacy and forecasting accuracy, empirically evaluated through simulations and real-world experiments based on solar data; and (ii) iterative model fitting processes, which reveal data after a number of iterations.
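To make the modelling setting concrete, below is a minimal sketch (not taken from the paper) of the non-private baseline that collaborative privacy-preserving methods try to avoid: a centralized vector autoregression fitted on pooled series from two hypothetical data owners, using statsmodels and synthetic solar-like data.

```python
# Minimal sketch: a centralized VAR fit on pooled series from two hypothetical
# data owners, i.e. the non-private baseline that privacy-preserving methods
# aim to replace. Data are synthetic; statsmodels provides the VAR estimator.
import numpy as np
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(0)
t = np.arange(500)
site_a = np.sin(2 * np.pi * t / 24) + 0.1 * rng.standard_normal(500)
site_b = np.roll(site_a, 1) + 0.1 * rng.standard_normal(500)  # lagged cross-site dependence

pooled = np.column_stack([site_a, site_b])    # pooling requires sharing raw data
model = VAR(pooled).fit(maxlags=24)           # exploits spatiotemporal lags
forecast = model.forecast(pooled[-model.k_ar:], steps=6)
print(forecast)
```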

4.
This paper discusses the importance of managing data quality in academic research in its relation to satisfying the customer. The focus is on the data completeness dimension of data quality in relation to recent advancements which have been made in the development of methods for analysing incomplete multivariate data. An overview and comparison of the traditional techniques with the recent advancements are provided. Multiple imputation is also discussed as a method of analysing incomplete multivariate data, which can potentially reduce some of the biases which can occur from using some of the traditional techniques. Despite these recent advancements in the analysis of incomplete multivariate data, evidence is presented which shows that researchers are not using these techniques to manage the data quality of their current research across a variety of academic disciplines. An analysis is then provided as to why these techniques have not been adopted, along with suggestions to improve the frequency of their use in the future. Source-Reference: The ideas for this paper originated from research work on David J. Fogarty's Ph.D. dissertation. The subject area is the use of advanced techniques for the imputation of incomplete multivariate data in corporate data warehouses.
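As a concrete illustration of multiple imputation in general (not the specific procedure from the dissertation cited above), here is a minimal sketch using scikit-learn's IterativeImputer on synthetic data; each of the m completed datasets would be analysed separately and the results pooled.

```python
# Minimal sketch of multiple imputation with scikit-learn's IterativeImputer;
# sample_posterior=True draws imputed values instead of using point estimates.
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
X[:, 2] += 0.8 * X[:, 0]                      # correlated columns aid imputation
X[rng.random(X.shape) < 0.2] = np.nan         # roughly 20% of values missing

m = 5                                         # number of completed datasets
completed = [
    IterativeImputer(sample_posterior=True, random_state=i).fit_transform(X)
    for i in range(m)
]
# Analyses are run on each completed dataset and pooled (e.g. via Rubin's rules).
print([d[:, 2].mean() for d in completed])
```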

5.
In missing data problems, it is often the case that there is a natural test statistic for testing a statistical hypothesis had all the data been observed. A fuzzy p-value approach to hypothesis testing has recently been proposed which is implemented by imputing the missing values in the "complete data" test statistic by values simulated from the conditional null distribution given the observed data. We argue that imputing data in this way will inevitably lead to a loss in power. For the case of a scalar parameter, we show that the asymptotic efficiency of the score test based on the imputed "complete data" relative to the score test based on the observed data is given by the ratio of the observed data information to the complete data information. Three examples involving probit regression, a normal random effects model, and unidentified paired data are used for illustration. For testing linkage disequilibrium based on pooled genotype data, simulation results show that the imputed Neyman–Pearson and Fisher exact tests are less powerful than a Wald-type test based on the observed data maximum likelihood estimator. In conclusion, we caution against the routine use of the fuzzy p-value approach in latent variable or missing data problems and suggest some viable alternatives.
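The efficiency result can be written compactly. In standard missing-information notation (the symbols below are not defined in the abstract itself), with I_obs the observed-data information and I_comp = I_obs + I_mis the complete-data information, the claimed asymptotic relative efficiency of the imputed score test is

```latex
\mathrm{ARE} \;=\; \frac{I_{\mathrm{obs}}(\theta)}{I_{\mathrm{comp}}(\theta)}
\;=\; \frac{I_{\mathrm{obs}}(\theta)}{I_{\mathrm{obs}}(\theta) + I_{\mathrm{mis}}(\theta)} \;\le\; 1,
```

so the imputed test loses power whenever a non-negligible fraction of the information is missing.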

6.
Addressing the massive archival data now generated in criminal investigation, this paper first analyzes its cross-platform, complex, and heterogeneous characteristics and, on that basis, designs the conceptual, logical, and physical models of a criminal investigation data warehouse. It then applies data warehouse and data mining techniques to integrate and mine existing criminal investigation archive data, extracting a large amount of useful knowledge that both advances criminal investigation research and offers substantial reference value for frontline investigative work. Finally, the paper presents a warehouse model oriented toward criminal investigation archive data and proposes corresponding data mining methods for its data mining system framework, supporting further online analytical processing of investigation data, the mining of useful information, and decision-making for public security prevention.

7.
This paper explores the impact of copyrights on firm value and on the demand for firm output. Using panel data on franchise value and ticket sales from the National Football League over the 1991–2000 period, we analyze the effect of copyrights (in this case, team logos) using several parametric estimators, the Arellano and Bond [1991. Some tests of specification for panel data: Monte Carlo evidence and an application to employment equations. Review of Economic Studies 58, 277–297] dynamic panel data estimator, and a semi-non-parametric method based on difference-in-differences propensity score matching. We find a negative effect of logo changes on franchise value that is robust across multiple specifications. In addition, logo changes also appear to have a moderate positive, albeit not particularly robust, impact on ticket sales.

8.
9.
Online communities have become an important source for knowledge and new ideas. This paper considers the potential of crowdsourcing as a tool for data analysis to address the increasing problems faced by companies in trying to deal with "Big Data". By exposing the problem to a large number of participants proficient in different analytical techniques, crowd competitions can very quickly advance the technical frontier of what is possible using a given dataset. The empirical setting of the research is Kaggle, the world's leading online platform for data analytics, which operates as a knowledge broker between companies aiming to outsource predictive modelling competitions and a network of over 100,000 data scientists that compete to produce the best solutions. The paper follows an exploratory case study design and focuses on the efforts by Dunnhumby, the consumer insight company behind the success of the Tesco Clubcard, to find and leverage the enormous potential of the collective brain to predict shopper behaviour. By adopting a crowdsourcing approach to data analysis, Dunnhumby were able to extract information from their own data that was previously unavailable to them. Significantly, crowdsourcing effectively enabled Dunnhumby to experiment with over 2000 modelling approaches to their data rather than relying on the traditional internal biases within their R&D units.

10.
We discuss the impact of volatility estimates from high frequency data on derivative pricing. The principal purpose is to estimate the diffusion coefficient of an Itô process using a nonparametric Nadaraya–Watson kernel approach based on selective estimators of spot volatility proposed in the econometric literature, which are based on high frequency data. The accuracy of different spot volatility estimates is measured in terms of how accurately they can reproduce market option prices. To this aim, we fit a diffusion model to S&P 500 data and subsequently use the calibrated model to price European call options written on the S&P 500 index. The estimation results are compared to well-known parametric alternatives available in the literature. Empirical results not only show that using intra-day data rather than daily data provides better volatility estimates and hence smaller pricing errors, but also highlight that the choice of the spot volatility estimator has a tangible impact on pricing.
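For orientation, a Nadaraya–Watson-type estimator of the squared diffusion coefficient from discretely observed data has the generic form below (a textbook sketch, not necessarily the exact estimator used in the paper); X_{t_1}, ..., X_{t_n} are the observations, Δ the sampling interval, and K_h a kernel with bandwidth h:

```latex
\hat{\sigma}^2(x) \;=\;
\frac{\sum_{i=1}^{n-1} K_h\!\left(X_{t_i}-x\right)\left(X_{t_{i+1}}-X_{t_i}\right)^2}
     {\Delta \sum_{i=1}^{n-1} K_h\!\left(X_{t_i}-x\right)}
```

With high-frequency (intra-day) data, Δ is small and many increments fall near each state x, which is what makes a nonparametric estimate of this kind feasible.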

11.
Qualitative expectational data from business surveys are widely used to construct forecasts. However, based typically on evaluation at the macroeconomic level, doubts persist about the utility of these data. This paper evaluates the ability of the underlying firm-level expectations to anticipate subsequent outcomes. Importantly, this evaluation is not hampered by only having access to qualitative outcome data obtained from subsequent business surveys. Quantitative outcome data are also exploited. This required access to a unique panel dataset which matches firms’ responses from the qualitative business survey with the same firms’ quantitative replies to a different survey carried out by the national statistical office. Nonparametric tests then reveal an apparent paradox. Despite evidence that the qualitative and quantitative outcome data are related, we find that the expectational data offer rational forecasts of the qualitative but not the quantitative outcomes. We discuss the role of “discretisation” errors and the loss function in explaining this paradox.

12.
This paper investigates measurement error biases in estimated poverty transition matrices. We compare transition matrices based on survey expenditure data to transition matrices based on measurement‐error‐free simulated expenditure. The simulation model uses estimates that correct for measurement error in expenditure. We find that time‐varying measurement error in expenditure data magnifies economic mobility. Roughly 45% of households initially in poverty at time t − 1 are found to be out of poverty at time t using data from the Korean Labor and Income Panel Study. When measurement error is removed, this drops to between 26 and 31% of households initially in poverty. Copyright © 2016 John Wiley & Sons, Ltd.

13.
Policy makers must base their decisions on preliminary and partially revised data of varying reliability. Realistic modeling of data revisions is required to guide decision makers in their assessment of current and future conditions. This paper provides a new framework with which to model data revisions. Recent empirical work suggests that measurement errors typically have much more complex dynamics than existing models of data revisions allow. This paper describes a state-space model that allows for richer dynamics in these measurement errors, including the noise, news and spillover effects documented in this literature. We also show how to relax the common assumption that “true” values are observed after a few revisions. The result is a unified and flexible framework that allows for more realistic data revision properties, and allows the use of standard methods for optimal real-time estimation of trends and cycles. We illustrate the application of this framework with real-time data on US real output growth.
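A stylized example of the kind of measurement-error structure involved (a generic sketch, not the authors' specification): let y_t^{(v)} denote the v-th published vintage of period-t growth and x_t the true value; then

```latex
y_t^{(v)} = x_t + \eta_t^{(v)}, \qquad
x_t = \mu + \phi\,(x_{t-1} - \mu) + \varepsilon_t ,
```

where the revision errors η_t^{(v)} may behave as noise (uncorrelated with x_t), as news (correlated with later revisions), or spill over across vintages; richer dynamics of this kind are what the proposed state-space framework is designed to accommodate.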

14.
The missing data problem has been widely addressed in the literature. The traditional methods for handling missing data may not be suited to spatial data, which can exhibit distinctive structures of dependence and/or heterogeneity. As a possible solution to the spatial missing data problem, this paper proposes an approach that combines the Bayesian Interpolation method [Benedetti, R. & Palma, D. (1994) Markov random field-based image subsampling method, Journal of Applied Statistics, 21(5), 495–509] with a multiple imputation procedure. The method is developed in a univariate and a multivariate framework, and its performance is evaluated through an empirical illustration based on data related to labour productivity in European regions.

15.
It is a pleasure to comment on the paper by Katsikopoulos et al. (2021), where they present a provocative and stimulating viewpoint in which they argue that simple forecasting rules based on heuristics frequently outperform big data models and should be used as a benchmark when testing big data models. I argue that it is important not to conflate simplicity with adaptability, and that there is a role for big data models in forecasting.

16.
This paper describes the resolution of an operational fault at an automatic weather station and the problems encountered during the process. It concludes that surface observation staff should, in their daily work, back up automatic station data and station parameters on schedule, perform regular maintenance of the automatic station, check minute-level and on-the-hour data in real time, and strengthen their training to improve station maintenance and computer skills. The paper also provides a reference solution for faults of the same type.
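As an illustration of the routine checks recommended above, the following is a minimal sketch (file layout, paths, and column names are hypothetical) that backs up station data and flags hours with missing minute-level records.

```python
# Hypothetical daily routine for an automatic weather station workstation:
# back up the data directory and check that each hour contains 60 minute records.
import csv
import shutil
from datetime import date
from pathlib import Path

DATA_DIR = Path("aws_data")                       # hypothetical layout
BACKUP_DIR = Path("backup") / date.today().isoformat()

def backup_station_data() -> None:
    shutil.copytree(DATA_DIR, BACKUP_DIR, dirs_exist_ok=True)

def incomplete_hours(minute_file: Path) -> list[str]:
    """Return hours that do not contain a full set of 60 minute records."""
    counts: dict[str, int] = {}
    with minute_file.open(newline="") as f:
        for row in csv.DictReader(f):             # assumes a 'timestamp' column
            hour = row["timestamp"][:13]          # e.g. '2024-05-01T07'
            counts[hour] = counts.get(hour, 0) + 1
    return [hour for hour, n in counts.items() if n < 60]

if __name__ == "__main__":
    backup_station_data()
    print(incomplete_hours(DATA_DIR / "minute.csv"))
```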

17.
Through building and testing theory, the practice of research animates data for human sense-making about the world. The IS field began in an era when research data was scarce; in today's age of big data, it is now abundant. Yet, IS researchers often enact methodological assumptions developed in a time of data scarcity, and many remain uncertain how to systematically take advantage of new opportunities afforded by big data. How should we adapt our research norms, traditions, and practices to reflect newfound data abundance? How can we leverage the availability of big data to generate cumulative and generalizable knowledge claims that are robust to threats to validity? To date, IS academics have largely welcomed the arrival of big data as an overwhelmingly positive development. A common refrain in the discipline is: more data is great, IS researchers know all about data, and we are a well-positioned discipline to leverage big data in research and teaching. In our opinion, many benefits of big data will be realized only with a thoughtful understanding of the implications of big data availability and, increasingly, a deliberate shift in IS research practices. We advocate for a need to re-visit and extend traditional models that are commonly used to guide much of IS research. Based on our analysis, we propose a research approach that incorporates consideration of big data—and associated implications such as data abundance—into a classic approach to building and testing theory. We close our commentary by discussing the implications of this hybrid approach for the organization, execution, and evaluation of theory-informed research. Our recommendations on how to update one approach to IS research practice may have relevance to all theory-informed researchers who seek to leverage big data.

18.
Scholars in our field, Operations and Supply Chain Management (OSCM), are under high pressure to show research productivity. At most schools, this productivity is measured by the number of journal articles published. One possible response to such pressure is to improve research efficiency: publishing more journal articles from each data collection effort. In other words, using one dataset for multiple publications. As long as each publication makes a sufficient contribution, and authors ensure transparency in methods and consistency across publications, generating more than one publication from one data collection effort is possible. The aim of this Notes and Debates article, however, is to draw attention to inappropriate reuse of empirical data in OSCM research, to explain its implications and to suggest ways in which to promote research quality and integrity. Based on two cases of extensive data reuse in OSCM, eighteen problematic practices associated with the reuse of data across multiple journal articles are identified. Recommendations on this issue of data reuse are provided for authors, reviewers, editors and readers.

19.
In forecasting, data mining is frequently perceived as a distinct technological discipline without immediate relevance to the challenges of time series prediction. However, Hand (2009) postulates that when the large cross-sectional datasets of data mining and the high-frequency time series of forecasting converge, common problems and opportunities are created for the two disciplines. This commentary attempts to establish the relationship between data mining and forecasting via the dataset properties of aggregate and disaggregate modelling, in order to identify areas where research in data mining may contribute to current forecasting challenges, and vice versa. To forecasting, data mining offers insights on how to handle large, sparse datasets with many binary variables, in feature and instance selection. Furthermore, data mining and related disciplines may stimulate research into how to overcome selectivity bias using reject inference on observational datasets and, through the use of experimental time series data, how to extend the utility and costs of errors beyond measuring performance, and how to find suitable time series benchmarks to evaluate computer intensive algorithms. Equally, data mining can profit from forecasting’s expertise in handling nonstationary data to counter the out-of-date-data problem, and how to develop empirical evidence beyond the fine tuning of algorithms, leading to a number of potential synergies and stimulating research in both data mining and forecasting.

20.
The treatment of missing data has been overlooked by the OM literature, while other fields such as marketing, organizational behavior, economics, statistics and psychometrics have paid more attention to the issue. A review of 103 survey-based articles published in the Journal of Operations Management between 1993 and 2001 shows that listwise deletion, which is often the least accurate technique of dealing with missing data, is heavily utilized by OM researchers. The paper also discusses the research implications of missing data, types of missing data and concludes with recommendations on which techniques should be used under different circumstances in order to improve the treatment of missing data in OM survey research.
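To illustrate why the choice of technique matters, here is a minimal sketch (synthetic data, not drawn from the reviewed articles) contrasting listwise deletion with a simple imputation alternative.

```python
# Synthetic illustration: listwise deletion discards far more information than
# imputation when missingness is spread across several survey items.
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

rng = np.random.default_rng(7)
df = pd.DataFrame(rng.normal(size=(300, 5)), columns=list("ABCDE"))
df = df.mask(rng.random(df.shape) < 0.10)     # 10% of responses missing per item

listwise = df.dropna()                         # listwise deletion
imputed = pd.DataFrame(
    SimpleImputer(strategy="mean").fit_transform(df), columns=df.columns
)                                              # simple (single) mean imputation

print(f"rows retained under listwise deletion: {len(listwise)} of {len(df)}")
print(f"rows retained under imputation:        {len(imputed)} of {len(df)}")
```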
