Applied Sciences, Vol. 10, Issue 1 (2020)
ARTICLE
TITLE

Heterogeneous Defect Prediction Based on Transfer Learning to Handle Extreme Imbalance

Kaiyuan Jiang    
Yutong Zhang    
Haibin Wu    
Aili Wang and Yuji Iwahori    

Abstract

Software systems are now ubiquitous and are used every day for automation purposes in personal and enterprise applications; they are also essential to many safety-critical and mission-critical systems, e.g., air traffic control systems, autonomous cars, and Supervisory Control And Data Acquisition (SCADA) systems. With the availability of massive storage capabilities, high-speed Internet, and the advent of Internet of Things devices, modern software systems are growing in both size and complexity. Maintaining high quality in such complex systems while keeping the error rate to a minimum by manual means is a challenge. This paper proposes a heterogeneous defect prediction method that addresses the extreme class imbalance problem in real software datasets. In the first stage, Sampling WIth the Majority (SWIM), based on the Mahalanobis distance, is used to balance the dataset and reduce the influence of minority samples in the defect data. Because uncorrelated features have a negative impact on the classification algorithm, the second stage uses ensemble learning and joint similarity measurement to select the most relevant and representative features between the source project and the target project. The third stage performs transfer learning from the source project to the target project in the Grassmann manifold space. The experiments were conducted on nine projects from three public-domain software defect libraries, and the proposed method was compared with four existing advanced methods to verify its effectiveness. The experimental results indicate that the proposed method outperforms the compared methods in terms of area under the curve (AUC).
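To make the first stage more concrete, the sketch below illustrates the general idea of Mahalanobis-based oversampling: synthetic minority samples are placed at the same Mahalanobis distance from the majority-class distribution as an existing minority sample. It is a simplified illustration, not the authors' implementation. The function names, the Gaussian jitter, and the `spread` and `ridge` parameters are assumptions introduced here.

```python
import numpy as np


def mahalanobis_to_majority(majority, points, ridge=1e-6):
    """Mahalanobis distance of each row of `points` from the majority class."""
    majority = np.asarray(majority, dtype=float)
    points = np.asarray(points, dtype=float)
    mean = majority.mean(axis=0)
    cov = np.atleast_2d(np.cov(majority, rowvar=False))
    cov = cov + ridge * np.eye(majority.shape[1])  # assumed regularization
    diff = points - mean
    inv_cov = np.linalg.inv(cov)
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, inv_cov, diff))


def swim_like_oversample(majority, minority, n_new,
                         spread=0.25, ridge=1e-6, seed=0):
    """Hypothetical, simplified SWIM-style oversampler (illustration only)."""
    rng = np.random.default_rng(seed)
    majority = np.asarray(majority, dtype=float)
    minority = np.asarray(minority, dtype=float)

    mean = majority.mean(axis=0)
    cov = np.atleast_2d(np.cov(majority, rowvar=False))
    cov = cov + ridge * np.eye(majority.shape[1])

    # With cov = L L^T, z = L^{-1} (x - mean) is a whitened point whose
    # Euclidean norm equals the Mahalanobis distance of x from the majority.
    chol = np.linalg.cholesky(cov)
    z_min = np.linalg.solve(chol, (minority - mean).T).T

    # Pick minority samples at random and jitter them in whitened space.
    seeds = z_min[rng.integers(0, len(z_min), size=n_new)]
    jittered = seeds + spread * rng.standard_normal(seeds.shape)

    # Rescale so each synthetic point keeps the norm of its seed, i.e. the
    # same Mahalanobis distance from the majority class.
    target = np.linalg.norm(seeds, axis=1, keepdims=True)
    norms = np.linalg.norm(jittered, axis=1, keepdims=True)
    jittered = jittered * target / np.maximum(norms, 1e-12)

    # Map back to the original feature space.
    return jittered @ chol.T + mean
```

Calling `swim_like_oversample(majority, minority, n_new=20)` returns 20 synthetic rows that could be appended to the minority class before training; `mahalanobis_to_majority` can be used to check that each synthetic row lies at the same distance as one of the original minority samples.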

 Similar articles

       
 
Marta Hervás, Fernando Martínez-Alzamora, Pilar Conejos and Joan Carles Alonso    
In this paper, several methods for the calculation of water quality evolution in drinking water distribution networks are analysed. The Lagrangian Time-Driven method has been implemented in the Epanet simulation software since version 2.0. In version 2.2...
Journal: Water

 
Jinghua Li, Yidong Chen, Lei Zhou, Ruipu Dong, Wenhao Yin, Wenhao Huang and Fan Zhang    
In the context of increasingly competitive shipbuilding, the flexible multi-level picking system, composed of high-rise shelves, Automated Guided Vehicles (AGVs), and picking stations, has been of gradual interest because of its advantages in operation e...
Journal: Applied Sciences

 
Roque Calvo and Ana Arteaga    
Heterogeneous systems of limited capacity have general applications in manufacturing, but also in logistic or service systems due to the differences in server or workstation performance or work assignment; this is in close relationship with system flexib...
Journal: Applied Sciences

 
Eileen Trampe, Nico Rademacher, Maximilian Wulfmeier, Dominik Büschgens and Herbert Pfeifer    
In industrial plants, metal strips are quenched using convective heat transfer. This involves accelerating gas through a nozzle system onto the material to be quenched, resulting in a fast and uniform cooling process. The efficiency of the heat transfer ...
Journal: Applied Sciences

 
Adelia Darlene Drego, Daniel Andersson and Ingo Staack    
Surveillance aircraft perform long-duration missions (>eight hours) that include detection and identification of objects on the ground, the water, or in the air. They have surveillance systems that require large amounts of cooling power (typically 10 s o...
Journal: Aerospace