Abstract: The quality of vehicular collision data is crucial for studying the relationship between injury severity and collision factors. Misclassified injury severity records in a crash dataset, however, can produce inaccurate parameter estimates and consequently lead to biased conclusions and poorly designed countermeasures. This is particularly true for imbalanced data, where the number of samples in one class far outnumbers that in the other. To improve the classification of injury severity, the paper presents a robust noise filtering technique that uses machine learning algorithms to deal with mislabels in an imbalanced crash dataset. We examine state-of-the-art filtering algorithms, including Iterative Noise Filtering based on the Fusion of Classifiers (INFFC), the Iterative Partitioning Filter (IPF), and the Saturation Filter (SatF). In a case study of Cairo (Egypt), the empirical results show that (1) mislabels in crash data significantly influence injury severity predictions, and (2) the proposed M-IPF filter outperforms its counterparts in both effectiveness and efficiency in eliminating mislabels from crash data. The test results demonstrate the efficacy of M-IPF in handling data noise and mitigating its impacts.
Water Resources Management - The accurate and efficient identification of effective reservoirs plays an important role in the real-time flood control operation of multireservoir systems. The...
Traditional multi-objective evolutionary algorithms treat all objectives equally and search the entire solution space randomly without using preference information. This can reduce search efficiency and the quality of the solutions preferred by decision makers, especially for problems with complicated properties or many objectives. Three reference-point-based algorithms that incorporate preference information into the optimization process, namely R-NSGA-II, r-NSGA-II and g-NSGA-II, have been shown to be effective in finding more preferred solutions on theoretical test problems. However, more effort is needed to test their effectiveness on real-world problems. This study compares these three algorithms with the standard NSGA-II algorithm on a reservoir operation problem to assess how well they improve search efficiency and the quality of preferred solutions. With the same number of objective function evaluations, the Pareto optimal solutions of the four algorithms are compared empirically in terms of their approximation to the preferred solutions. Three performance indicators are then adopted for further comparison. Results show that R-NSGA-II and r-NSGA-II can improve search efficiency and the quality of preferred solutions. The convergence and diversity of their solutions in the region of interest are better than those of NSGA-II, the closeness to the reference point can be increased by 42.8%, and the number of preferred solutions can be increased by more than 3 times when only some of the objectives are preferred. By contrast, g-NSGA-II performs worse. This study demonstrates the performance of three reference-point-based algorithms and provides insights into algorithm selection for multi-objective reservoir optimization problems.
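To illustrate how a reference point can steer the search, the following is a minimal sketch of the g-dominance relation commonly associated with g-NSGA-II, together with a Euclidean closeness measure to the reference point. It is not taken from the study: the function names are hypothetical, all objectives are assumed to be minimized, and the abstract does not say how closeness or preference was actually computed.

```python
from typing import Sequence


def pareto_dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and strictly
    better in at least one (minimization assumed)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def g_flag(f: Sequence[float], ref: Sequence[float]) -> int:
    """Return 1 if f is componentwise no worse than the reference point,
    or componentwise no better than it; otherwise return 0."""
    no_worse = all(x <= r for x, r in zip(f, ref))
    no_better = all(x >= r for x, r in zip(f, ref))
    return 1 if (no_worse or no_better) else 0


def g_dominates(a: Sequence[float], b: Sequence[float], ref: Sequence[float]) -> bool:
    """a g-dominates b if a has the higher flag, or if the flags are
    equal and a Pareto-dominates b."""
    flag_a, flag_b = g_flag(a, ref), g_flag(b, ref)
    if flag_a != flag_b:
        return flag_a > flag_b
    return pareto_dominates(a, b)


def distance_to_reference(f: Sequence[float], ref: Sequence[float]) -> float:
    """Euclidean distance from an objective vector to the reference point
    (an assumed stand-in for a closeness measure)."""
    return sum((x - r) ** 2 for x, r in zip(f, ref)) ** 0.5


if __name__ == "__main__":
    ref = (0.5, 0.5)      # hypothetical decision-maker reference point
    a = (0.4, 0.4)        # better than ref in both objectives -> flag 1
    b = (0.2, 0.9)        # mixed relative to ref -> flag 0
    print(pareto_dominates(a, b))   # a and b are Pareto-incomparable
    print(g_dominates(a, b, ref))   # but a is preferred under g-dominance
```

In this sketch two solutions that are incomparable under ordinary Pareto dominance can still be ranked once the reference point is taken into account, which is the general mechanism by which preference information narrows the search toward the region the decision maker cares about.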