Similar Documents
20 similar documents were retrieved.
1.
When proposed new fraud detection systems are tested in revolving credit operations, a straightforward comparison of the observed fraud detection rates is subject to a selectivity bias that tends to favour the existing system. This bias arises from the fact that accounts are terminated when the existing system, but not the proposed new system, detects a fraudulent transaction. This therefore flatters the estimated detection rate of the existing system. We develop more formal estimators that can be used to compare the existing and proposed new systems without risking this effect. We also assess the magnitude of the bias.
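To see how terminating accounts on detection flatters the existing system, consider a minimal Monte Carlo sketch. All detection probabilities, account counts, and fraud-stream lengths are invented for illustration; this is not the estimator developed in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n_accounts, max_frauds = 20_000, 10   # fraudulent accounts; frauds per account if never stopped
p_old, p_new = 0.5, 0.6               # assumed per-transaction detection probabilities

det_old = det_new = 0
for _ in range(n_accounts):
    hit_old = hit_new = False
    for _ in range(max_frauds):
        hit_new = hit_new or (rng.random() < p_new)   # shadow system scores the same stream
        if rng.random() < p_old:                      # live system flags the transaction...
            hit_old = True
            break                                     # ...and the account is terminated
    det_old += hit_old
    det_new += hit_new

print(f"account-level detection rate, existing system: {det_old / n_accounts:.3f}")
print(f"account-level detection rate, proposed system: {det_new / n_accounts:.3f}")
```

Although the shadow system has the higher per-transaction detection probability, the naive account-level comparison favours the live system, because the observed fraud stream is truncated exactly at the live system's own hits.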

2.
Big data is often described as a new frontier of IT-enabled competitive advantage. A limited number of exemplary firms have been used recurrently in the big data debate as successful illustrations of what big data technologies can offer. These firms are well-known, data-driven organizations that often, but not always, are born-digital companies. Comparatively little attention has been paid to the challenges that many incumbent organizations face when they explore a possible adoption of such technologies. This study investigates how incumbents handle such an exploration and what challenges they face. Drawing on a four-year qualitative field study of four large Scandinavian firms, we develop a typology of how incumbents handle the exploration of, and resistance to, adopting big data technologies. Two aspects that separate the adoption of big data technologies from that of other technologies directly affect the incumbents' exploration. First, being an elusive concept, big data technologies can mean different things to different organizations. This makes the technologies difficult to justify to an investing body, while it simultaneously opens up possibilities for creative definitions. Second, big data technologies have a transformative effect on the organization of work in firms. This transformative capability makes managers wary, as it might threaten their position in the firm, and it creates ripple effects, transforming systems beyond those directly connected to the technology.

3.
Fraud in loan application assessment causes significant losses for finance companies worldwide, and much research has focused on machine learning methods to improve the efficacy of fraud detection in financial domains. However, diverse information falsification in individual fraud remains one of the most challenging problems in loan applications. To this end, we conducted an empirical study to explore the relationships between various fraud types and analyzed the factors influencing information fabrication. Weak relationships exist among different falsification types, and some essential factors play the same roles across fraud types, while others have varying or opposing effects. Based on this finding, we propose a novel hierarchical multi-task learning approach to refine fraud-detection systems. Specifically, we first developed a hierarchical fraud categorization that breaks the problem into several subtasks according to the types of information falsified by customers, reducing the difficulty of fraud identification. Second, because of the sophisticated relationships among applicants' information, we address the representation learning problem with a heterogeneous network that uses meta-path-based random walks and a heterogeneous skip-gram model. Furthermore, the final subtasks can be predicted using a multi-task learning approach with two prediction layers: the first layer provides the probabilities of the general fraud categories as auxiliary information for the second layer, which predicts the specific subtasks. Finally, we conducted extensive experiments on a real-world dataset to demonstrate the effectiveness of the proposed approach.
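The two-layer prediction idea can be sketched in a few lines of PyTorch. The encoder, layer sizes, and number of categories and subtasks below are invented for illustration and are not the paper's architecture; in the paper the input representation would come from the heterogeneous-network embedding step:

```python
import torch
import torch.nn as nn

class HierarchicalMultiTaskHead(nn.Module):
    """Coarse fraud-category probabilities are predicted first and then fed,
    together with the shared representation, into the fine-grained subtask heads."""
    def __init__(self, emb_dim=64, n_coarse=3, n_subtasks=6):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(emb_dim, 128), nn.ReLU())
        self.coarse_head = nn.Linear(128, n_coarse)                     # first prediction layer
        self.fine_heads = nn.ModuleList(
            [nn.Linear(128 + n_coarse, 1) for _ in range(n_subtasks)]   # second prediction layer
        )

    def forward(self, x):
        h = self.shared(x)
        coarse_logits = self.coarse_head(h)
        coarse_probs = torch.softmax(coarse_logits, dim=-1)             # auxiliary information
        fine_in = torch.cat([h, coarse_probs], dim=-1)
        fine_logits = torch.cat([head(fine_in) for head in self.fine_heads], dim=-1)
        return coarse_logits, fine_logits

model = HierarchicalMultiTaskHead()
x = torch.randn(32, 64)                                   # toy applicant embeddings
coarse_logits, fine_logits = model(x)
loss = nn.CrossEntropyLoss()(coarse_logits, torch.randint(0, 3, (32,))) + \
       nn.BCEWithLogitsLoss()(fine_logits, torch.randint(0, 2, (32, 6)).float())
loss.backward()
```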

4.
This paper develops a novel time-varying multivariate Copula-MIDAS-GARCH (TVM-Copula-MIDAS-GARCH) model with exogenous explanatory variables to model the joint distribution of returns. The model accounts for mixed-frequency factors that affect the time-varying dependence structure of financial assets. Furthermore, we examine the effectiveness of the proposed model in VaR-based portfolio selection. We conduct an empirical analysis estimating the 90%, 95%, and 99% VaRs of a portfolio composed of the Shanghai Composite Index, the Shanghai SE Fund Index, and the Shanghai SE Treasury Bond Index. The empirical results show that the proposed TVM-Copula-MIDAS-GARCH model effectively captures the nonlinear time-varying dependence among the three indices and performs better in portfolio selection.
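For readers unfamiliar with the VaR calculation itself, the sketch below computes one-day portfolio VaR at the three confidence levels by plain historical simulation on simulated Gaussian returns. It is a back-of-the-envelope illustration only: the weights, means, and covariance are placeholders, and it does not implement the copula-MIDAS-GARCH dependence model of the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
T = 1500
# Simulated daily returns standing in for the equity, fund, and bond indices.
returns = rng.multivariate_normal(
    mean=[0.0004, 0.0003, 0.0001],
    cov=[[2.0e-4, 1.2e-4, 1.0e-6],
         [1.2e-4, 1.5e-4, 1.0e-6],
         [1.0e-6, 1.0e-6, 4.0e-6]],
    size=T,
)
weights = np.array([0.4, 0.4, 0.2])           # assumed portfolio weights
port = returns @ weights

for level in (0.90, 0.95, 0.99):
    var_hist = -np.quantile(port, 1 - level)  # loss threshold exceeded with probability 1 - level
    print(f"{level:.0%} one-day VaR (historical simulation): {var_hist:.4f}")
```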

5.
Through building and testing theory, the practice of research animates data for human sense-making about the world. The IS field began in an era when research data was scarce; in today's age of big data, it is now abundant. Yet IS researchers often enact methodological assumptions developed in a time of data scarcity, and many remain uncertain how to systematically take advantage of new opportunities afforded by big data. How should we adapt our research norms, traditions, and practices to reflect newfound data abundance? How can we leverage the availability of big data to generate cumulative and generalizable knowledge claims that are robust to threats to validity? To date, IS academics have largely welcomed the arrival of big data as an overwhelmingly positive development. A common refrain in the discipline is: more data is great, IS researchers know all about data, and we are a well-positioned discipline to leverage big data in research and teaching. In our opinion, many benefits of big data will be realized only with a thoughtful understanding of the implications of big data availability and, increasingly, a deliberate shift in IS research practices. We advocate revisiting and extending the traditional models that guide much of IS research. Based on our analysis, we propose a research approach that incorporates consideration of big data, and associated implications such as data abundance, into a classic approach to building and testing theory. We close our commentary by discussing the implications of this hybrid approach for the organization, execution, and evaluation of theory-informed research. Our recommendations on how to update one approach to IS research practice may have relevance to all theory-informed researchers who seek to leverage big data.

6.
7.
Forecasts have traditionally served as the basis for planning and executing supply chain activities. Forecasts drive supply chain decisions, and they have become critically important due to increasing customer expectations, shortening lead times, and the need to manage scarce resources. Over the last ten years, advances in technology and data collection systems have resulted in the generation of huge volumes of data on a wide variety of topics and at great speed. This paper reviews the impact that this explosion of data is having on product forecasting and how it is improving forecasting practice. While much of this review focuses on time series data, we also explore how such data can be used to obtain insights into consumer behavior, and the impact of such data on organizational forecasting.

8.
Data with large dimensions bring various problems to the application of data envelopment analysis (DEA). In this study, we focus on a "big data" problem related to the considerably large dimensions of the input-output data. The four most widely used approaches to guide dimension reduction in DEA are compared via Monte Carlo simulation: principal component analysis (PCA-DEA), which is based on the idea of aggregating inputs and outputs; efficiency contribution measurement (ECM); average efficiency measure (AEC); and regression-based detection (RB), which is based on the idea of variable selection. We compare the performance of these methods under different scenarios using a new comparison benchmark for the simulation test. In addition, we discuss the effect of initial variable selection in RB for the first time. Based on the results, we offer more reliable guidelines on how to choose an appropriate method.
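A minimal sketch of the PCA-DEA idea computes input-oriented CCR efficiencies with and without first aggregating the inputs into principal components. The data are random, the shift applied to the components (DEA needs positive data) is a crude fix, and nothing here reproduces the paper's simulation design or benchmark:

```python
import numpy as np
from scipy.optimize import linprog
from sklearn.decomposition import PCA

def ccr_input_efficiency(X, Y):
    """Input-oriented CCR DEA: for each DMU j, min theta s.t. X'lam <= theta*x_j, Y'lam >= y_j."""
    n, m = X.shape
    s = Y.shape[1]
    scores = []
    for j in range(n):
        c = np.r_[1.0, np.zeros(n)]                  # minimize theta over (theta, lambda)
        A_in = np.c_[-X[j], X.T]                     # sum_k lam_k x_k <= theta * x_j
        A_out = np.c_[np.zeros(s), -Y.T]             # sum_k lam_k y_k >= y_j
        res = linprog(c, A_ub=np.vstack([A_in, A_out]),
                      b_ub=np.r_[np.zeros(m), -Y[j]],
                      bounds=[(0, None)] * (n + 1))
        scores.append(res.fun)
    return np.array(scores)

rng = np.random.default_rng(2)
X = np.abs(rng.normal(10, 2, size=(30, 8)))          # 30 DMUs, 8 toy inputs
Y = np.abs(rng.normal(5, 1, size=(30, 2)))           # 2 toy outputs

X_pc = PCA(n_components=2).fit_transform(X)          # aggregate inputs into 2 components
X_pc = X_pc - X_pc.min(axis=0) + 1.0                 # shift to positive values for DEA

print("full-dimension efficiencies:", np.round(ccr_input_efficiency(X, Y)[:5], 3))
print("PCA-reduced efficiencies:   ", np.round(ccr_input_efficiency(X_pc, Y)[:5], 3))
```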

9.
Recent advances in information technology have led to profound changes in global manufacturing. This study focuses on the theoretical and practical challenges and opportunities arising from the Internet of Things (IoT) as it enables new ways of supply-chain operations partially based on big-data analytics and changes the nature of industries. We aim to reveal how the IoT works and what it, together with big-data analytics, implies for supply chain operational performance, particularly the dynamics of operational coordination and optimization for supply chains that leverage big data obtained from smart connected products (SCPs), as well as the governance mechanism of big-data sharing. Building on literature closely related to our focal topic, we analyze and deduce the substantial influence of disruptive technologies and emerging business models, including the IoT, big data analytics, and SCPs, on many aspects of supply chains, such as consumers' value judgments, product development, resource allocation, operations optimization, revenue management, and network governance. Furthermore, we propose several research directions and corresponding research schemes for these new situations. This study aims to promote future research in the field of big data-driven supply chain management with the IoT, help firms improve data-driven operational decisions, and provide governments with a reference to advance and regulate the development of the IoT and big data industry.

10.
The seed of this special section was the workshop "30 Years of Cointegration and Dynamic Factor Models Forecasting and its Future with Big Data", held at FUNCAS in Madrid in February 2019. In this editorial, we describe the main contributions of the 13 papers published within the special section towards forecasting in the context of non-stationary Big Data using cointegration or Dynamic Factor Models.

11.
Nowadays, issues such as limited natural resources, environmental problems, social matters, and the significance of resilience in the agricultural supply chain (ASC) have drawn considerable attention worldwide. In this research, a five-level multi-objective stochastic mixed-integer linear programming model is designed for the tea supply chain (TSC) in Iran. The objective functions of the suggested network are minimizing the total costs of the supply chain (SC), total water consumption, and non-resilience measures, and maximizing the job opportunities created by facilities. Considering uncertainty in SC networks is extremely beneficial because parameters such as demand vary; consequently, a robust possibilistic optimization (RPO) approach is implemented to manage the uncertainty. Given the nature of the multi-objective optimization problem, the weighted-normalized-extended goal programming (WNEGP) approach is employed to solve the model. To validate the model, real data are collected from the tea organization of Iran; the parameters are gathered according to three aspects of big data: volume, velocity, and variety. The results validate the functionality of the model regarding planning strategy and show that spending more on the SC yields an effective sustainable, resilient, and responsive network. In terms of managerial insights, this study offers a far-reaching perspective to managers, especially in the ASC, for developing their industries. Finally, sensitivity analyses are discussed on key parameters such as demand, the robustness coefficients, and the values of the objective functions in various states; these analyses show how sustainability and resilience affect supply chain efficiency.

12.
This paper presents some two-step estimators for a wide range of parametric panel data models with censored endogenous variables and sample selection bias. Our approach is to derive estimates of the unobserved heterogeneity responsible for the endogeneity/selection bias to include as additional explanatory variables in the primary equation. These are obtained through a decomposition of the reduced form residuals. The panel nature of the data allows adjustment, and testing, for two forms of endogeneity and/or sample selection bias. Furthermore, it incorporates roles for dynamics and state dependence in the reduced form. Finally, we provide an empirical illustration which features our procedure and highlights the ability to test several of the underlying assumptions.
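A stripped-down, cross-sectional illustration of the control-function idea that underlies the two-step approach: reduced-form residuals for the endogenous regressor are added as an extra explanatory variable in the primary equation. The data-generating process and variable names are invented, and the paper's panel decomposition of the residuals, dynamics, and selection equations are not reproduced here:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000
z = rng.normal(size=(n, 2))                                  # exogenous instruments
u = rng.normal(size=n)                                       # unobserved heterogeneity
x_endog = z @ np.array([0.8, -0.5]) + 0.9 * u + rng.normal(size=n)
y = 1.0 + 2.0 * x_endog + 1.5 * u + rng.normal(size=n)       # primary equation

ols = lambda X, y: np.linalg.lstsq(X, y, rcond=None)[0]

# Step 1: reduced form for the endogenous regressor; keep the residuals.
Z1 = np.c_[np.ones(n), z]
v_hat = x_endog - Z1 @ ols(Z1, x_endog)

# Step 2: include the residuals as an additional regressor in the primary equation.
naive = ols(np.c_[np.ones(n), x_endog], y)
ctrl = ols(np.c_[np.ones(n), x_endog, v_hat], y)
print("naive OLS slope (biased):          ", round(naive[1], 3))
print("control-function slope (approx. 2):", round(ctrl[1], 3))
```

A significant coefficient on the included residuals can also be read as evidence of endogeneity, which is the sense in which such a procedure allows some of the underlying assumptions to be tested.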

13.
Most economic applications rely on a large number of time series, which typically exhibit a marked clustering structure and are available over different spans. To handle such databases, we combine the expectation-maximization (EM) algorithm outlined by Stock and Watson (JBES, 2002) with the estimation algorithm for large factor models with an unknown number of group structures and unknown membership described by Ando and Bai (JAE, 2016; JASA, 2017). Several Monte Carlo experiments demonstrate the good performance of the proposed method at determining the correct number of clusters, providing the appropriate number of group-specific factors, identifying group membership without error, and obtaining accurate estimates of unobserved missing data. In addition, we find that the proposed method performs substantially better than the standard EM algorithm when the data have a grouped factor structure. Using the FRED-QD Federal Reserve Economic Data, our method detects two distinct groups of macroeconomic indicators, comprising real activity indicators and nominal indicators. We thus demonstrate the usefulness of the group-specific factor model for studies of business cycle chronology and for forecasting purposes.
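The Stock-Watson-style EM step can be sketched as an alternation between principal-component factor extraction on a filled-in panel and refitting the missing cells from the estimated common component. The sketch below uses a toy two-factor panel and does not include the grouped-factor structure of Ando and Bai:

```python
import numpy as np

def em_factor_impute(X, n_factors=2, n_iter=100, tol=1e-8):
    """Alternate PCA factor extraction and refitting of missing cells (EM-style imputation)."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    X_fill = np.where(missing, np.nanmean(X, axis=0), X)              # start from column means
    for _ in range(n_iter):
        mu = X_fill.mean(axis=0)
        U, S, Vt = np.linalg.svd(X_fill - mu, full_matrices=False)
        common = U[:, :n_factors] * S[:n_factors] @ Vt[:n_factors]    # estimated common component
        X_new = np.where(missing, common + mu, X)
        if np.max(np.abs(X_new - X_fill)) < tol:
            break
        X_fill = X_new
    return X_fill

rng = np.random.default_rng(4)
F = rng.normal(size=(200, 2))
L = rng.normal(size=(2, 40))
X_true = F @ L + 0.3 * rng.normal(size=(200, 40))
X_obs = X_true.copy()
X_obs[rng.random(X_true.shape) < 0.15] = np.nan                       # 15% of cells missing at random

X_hat = em_factor_impute(X_obs)
mask = np.isnan(X_obs)
print("RMSE on imputed cells:", round(float(np.sqrt(np.mean((X_hat[mask] - X_true[mask]) ** 2))), 3))
```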

14.
Quality of service (QoS) determines service usability and utility, both of which influence the service selection process. QoS varies from one service provider to another, and each web service has its own methodology for evaluating it. The lack of a transparent QoS evaluation model makes service selection challenging. Moreover, most QoS evaluation processes do not consider historical data, which would not only help produce more accurate QoS values but also support future prediction, recommendation, and knowledge discovery. QoS-driven service selection demands a model in which QoS can be provided as a service to end users. This paper proposes a layered QaaS (quality as a service) model, along the same lines as PaaS and software as a service, where users provide QoS attributes as inputs and the model returns services satisfying the users' QoS expectations. The paper covers all the key aspects in this context: selection of data sources, their transformation, and the evaluation, classification, and storage of QoS. It uses server logs as the source for evaluating QoS values, a common methodology for their evaluation, and big data technologies for their transformation and analysis. The paper also establishes that Spark outperforms Pig with respect to evaluating QoS from logs.
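As a concrete illustration of deriving QoS values from server logs with Spark, the sketch below aggregates a hypothetical access log into per-service response-time and availability metrics. The log schema (service, status, response_ms) and the toy rows are assumptions, not the paper's dataset or its full evaluation pipeline:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("qos-from-logs").getOrCreate()

# Hypothetical access-log rows: service name, HTTP status, response time in ms.
logs = spark.createDataFrame(
    [("weather", 200, 120), ("weather", 500, 900), ("geo", 200, 80),
     ("geo", 200, 95), ("weather", 200, 140), ("geo", 404, 60)],
    ["service", "status", "response_ms"],
)

qos = (
    logs.groupBy("service")
        .agg(
            F.avg("response_ms").alias("avg_response_ms"),                    # response-time QoS
            F.avg((F.col("status") < 400).cast("double")).alias("availability"),
            F.count("*").alias("requests"),
        )
)
qos.show()
```

In a layered QaaS setting, a table like `qos` would be persisted so that user-supplied QoS thresholds can be matched against it at selection time.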

15.
HR and analytics: why HR is set to fail the big data challenge
The HR world is abuzz with talk of big data and the transformative potential of HR analytics. This article takes issue with optimistic accounts, which hail HR analytics as a 'must have' capability that will ensure HR's future as a strategic management function while transforming organisational performance for the better. It argues that unless the HR profession wises up to both the potential and drawbacks of this emerging field and engages operationally and strategically to develop better methods and approaches, it is unlikely that existing practices of HR analytics will deliver transformational change. Indeed, it is possible that current trends will seal the exclusion of HR from strategic, board-level influence while doing little to benefit organisations and actively damaging the interests of employees.

16.
We develop methods for inference in nonparametric time-varying fixed effects panel data models that allow for locally stationary regressors and for both the time series length T and the cross-section size N to be large. We first develop a pooled nonparametric profile least squares dummy variable approach to estimate the nonparametric function, and establish the optimal convergence rate and asymptotic normality of the resulting estimator. We then propose a test statistic to check whether the bivariate nonparametric function is time-varying or the time effect is separable, and derive the asymptotic distribution of the proposed test statistic. We present several simulated examples and two real data analyses to illustrate the finite sample performance of the proposed methods.

17.
We develop a generalized method of moments (GMM) estimator for the distribution of a variable where summary statistics are available only for intervals of the random variable. Without individual data, one cannot calculate the weighting matrix for the GMM estimator. Instead, we propose a simulated weighting matrix based on a first-step consistent estimate. When the functional form of the underlying distribution is unknown, we estimate it using a simple yet flexible maximum entropy density. Our Monte Carlo simulations show that the proposed maximum entropy density is able to approximate various distributions extremely well. The two-step GMM estimator with a simulated weighting matrix improves the efficiency of the one-step GMM considerably. We use this method to estimate the U.S. income distribution and compare these results with those based on the underlying raw income data.
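The two-step logic can be illustrated on binned data. In the sketch below the bin edges, shares, sample size, and the lognormal family are all placeholders (the paper uses a flexible maximum entropy density); the point is the second step, where the weighting matrix is simulated from the first-step estimate because individual data are unavailable:

```python
import numpy as np
from scipy import optimize, stats

edges = np.array([0, 20, 40, 60, 100, np.inf])        # hypothetical income bins (thousands)
shares = np.array([0.15, 0.30, 0.25, 0.20, 0.10])     # invented published shares per bin
n_obs = 5000                                          # assumed sample size behind the table

def model_probs(theta):
    mu, log_sigma = theta
    cdf = stats.lognorm.cdf(edges, s=np.exp(log_sigma), scale=np.exp(mu))
    return np.diff(cdf)

def gmm_obj(theta, W):
    g = model_probs(theta) - shares                   # moment conditions
    return g @ W @ g

# Step 1: GMM with an identity weighting matrix gives a consistent first-step estimate.
res1 = optimize.minimize(gmm_obj, x0=[np.log(40.0), np.log(0.6)],
                         args=(np.eye(len(shares)),), method="Nelder-Mead")

# Step 2: simulate the weighting matrix from the first-step estimate.
mu1, sigma1 = res1.x[0], np.exp(res1.x[1])
draws = stats.lognorm.rvs(s=sigma1, scale=np.exp(mu1), size=(200, n_obs), random_state=0)
bin_shares = np.stack([((draws >= lo) & (draws < hi)).mean(axis=1)
                       for lo, hi in zip(edges[:-1], edges[1:])], axis=1)
W = np.linalg.pinv(np.cov(bin_shares.T) * n_obs)      # inverse of the simulated moment covariance

res2 = optimize.minimize(gmm_obj, res1.x, args=(W,), method="Nelder-Mead")
print("two-step estimate (mu, sigma):", round(res2.x[0], 3), round(np.exp(res2.x[1]), 3))
```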

18.
In this paper we propose a nonparametric kernel-based model specification test that can be used when the regression model contains both discrete and continuous regressors. We employ discrete variable kernel functions and we smooth both the discrete and continuous regressors using least squares cross-validation (CV) methods. The test statistic is shown to have an asymptotic normal null distribution. We also prove the validity of using the wild bootstrap method to approximate the null distribution of the test statistic, the bootstrap being our preferred method for obtaining the null distribution in practice. Simulations show that the proposed test has significant power advantages over conventional kernel tests which rely upon frequency-based nonparametric estimators that require sample splitting to handle the presence of discrete regressors.
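The wild bootstrap step can be seen in isolation with a toy parametric null (a linear model) and a crude misspecification statistic; the kernel smoothing, cross-validation, and the actual test statistic of the paper are not reproduced, and all names below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 400
x = rng.uniform(-2, 2, size=n)
y = 1 + 2 * x + 0.3 * x**2 + rng.normal(scale=0.5, size=n)   # true regression is quadratic

X0 = np.c_[np.ones(n), x]                                    # null model: linear in x
beta0 = np.linalg.lstsq(X0, y, rcond=None)[0]
resid = y - X0 @ beta0
stat = lambda e: abs(np.mean(e * (x**2 - np.mean(x**2))))    # toy misspecification statistic
t_obs = stat(resid)

# Wild bootstrap: keep the fitted null mean, flip residual signs with Rademacher draws.
B = 999
t_boot = np.empty(B)
for b in range(B):
    y_star = X0 @ beta0 + resid * rng.choice([-1.0, 1.0], size=n)
    resid_star = y_star - X0 @ np.linalg.lstsq(X0, y_star, rcond=None)[0]
    t_boot[b] = stat(resid_star)

p_value = (1 + np.sum(t_boot >= t_obs)) / (B + 1)
print("wild-bootstrap p-value for the linear null:", round(p_value, 3))
```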

19.
The increasing richness of data encourages a comprehensive understanding of economic and financial activities, where variables of interest may include not only scalar (point-like) indicators, but also functional (curve-like) and compositional (pie-like) ones. In many research topics, the variables are also collected chronologically across individuals, which falls into the paradigm of longitudinal analysis. The complicated nature of the data, however, increases the difficulty of modeling these variables under the classic longitudinal framework. In this study, we investigate the linear mixed-effects model (LMM) for such complex data. Different types of variables are first consistently represented using the corresponding basis expansions so that the classic LMM can then be conducted on them, which generalizes the theoretical framework of the LMM to complex data analysis. A number of simulation studies indicate the feasibility and effectiveness of the proposed model. We further illustrate its practical utility in a real data study on the Chinese stock market and show that the proposed method can enhance the performance and interpretability of the regression for complex data with diversified characteristics.

20.
In a binary choice panel data model with individual effects and two time periods, Manski proposed the maximum score estimator based on a discontinuous objective function and proved its consistency under weak distributional assumptions. The rate of convergence is low (N^{1/3}) and its limit distribution cannot easily be used for statistical inference. In this paper we apply the idea of Horowitz to smooth Manski's objective function. The resulting smoothed maximum score estimator is consistent and asymptotically normal with a rate of convergence that can be made arbitrarily close to N^{1/2}, depending on the strength of the smoothness assumptions imposed. The estimator can be applied to panels with more than two time periods and to unbalanced panels. We apply the estimator to analyze labour force participation of married Dutch females.
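To make the smoothing idea concrete, the sketch below replaces the indicator in the maximum score objective with a normal CDF and maximizes the smoothed objective for a toy two-regressor design. The scale normalization (first coefficient fixed to 1), bandwidth, and data-generating process are illustrative choices, not those of the paper, and the panel/fixed-effects differencing is omitted:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(6)
n = 2000
X = rng.normal(size=(n, 2))
beta_true = np.array([1.0, -0.5])                     # scale fixed by setting beta1 = 1
y = (X @ beta_true + rng.logistic(size=n) >= 0).astype(float)

def neg_smoothed_score(b2, h=0.1):
    index = X @ np.array([1.0, b2[0]])
    # Manski's objective uses 1[index >= 0]; following Horowitz, replace it with a smooth CDF.
    return -np.mean((2 * y - 1) * norm.cdf(index / h))

res = minimize(neg_smoothed_score, x0=[0.0], method="Nelder-Mead")
print("smoothed maximum score estimate of beta2 (true value -0.5):", round(res.x[0], 3))
```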
