A COMPARATIVE ANALYSIS STUDY OF SOFTWARE DEFECT PREDICTION USING MACHINE LEARNING ALGORITHMS ON NASA DATASET
Keywords: software defect, machine learning, support vector machine, linear regression, decision tree.
Software quality is among the most important attributes of a software product. Software defect prediction bears directly on quality and has gained significant popularity in the last few years. Defective software modules have a major impact on software quality, leading to cost overruns, delayed timelines and much higher maintenance costs. In this paper we analyse six popular and widely used machine learning algorithms: Support Vector Machine, Decision Tree, Random Forest, Linear Regression, Naïve Bayes and K-NN. A primary concern of the software development process is to ensure quality at every development stage; each phase therefore shares the goal of checking and improving software quality. Software quality prediction aims to evaluate the quality level periodically and to flag quality problems early. Among software experts, a defect is also commonly called a fault or bug. Managing software quality is not easy because of the rising difficulty and the many restrictions under which software is developed. In this investigation we use three open-source, publicly accessible datasets from the PROMISE Software Engineering Database: JM1, CM1 and PC1. All six algorithms listed above were applied to these datasets. The findings, reported as accuracy, demonstrate the capacity and effectiveness of the SDPD model to locate errors and raise software quality. These algorithms and this procedure gave the most accurate results we obtained. The SDPD model can therefore detect software faults at an early stage by collecting real-time software development data from the target applications.
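The comparison described above, training several classifiers on the same labelled modules and ranking them by accuracy, can be illustrated with a minimal sketch. It is not the procedure used in this paper. It covers only two of the six algorithms (K-NN and Gaussian Naïve Bayes), written in plain Python, and it runs on synthetic data rather than JM1, CM1 or PC1. The two features (lines of code and a complexity score), their distributions, the 30% defect rate, the 70/30 split and all function names are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def make_synthetic_modules(n, seed=0):
    """Hypothetical stand-in for a defect dataset: ([loc, complexity], label)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        defective = 1 if rng.random() < 0.3 else 0  # assumed defect rate
        # Assumption: defective modules are larger and more complex on average.
        loc = rng.gauss(120 if defective else 60, 25)
        complexity = rng.gauss(14 if defective else 6, 3)
        rows.append(([loc, complexity], defective))
    return rows


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of the rows and split it into train and test parts."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def knn_predict(train, x, k=5):
    """Majority vote among the k training rows closest to x (Euclidean)."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def fit_gaussian_nb(train):
    """Estimate a class prior and per-feature mean/variance for each label."""
    model = {}
    for label in {lab for _, lab in train}:
        feats = [x for x, lab in train if lab == label]
        stats = []
        for column in zip(*feats):
            mean = sum(column) / len(column)
            var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-9
            stats.append((mean, var))
        model[label] = (len(feats) / len(train), stats)
    return model


def nb_predict(model, x):
    """Return the label with the highest Gaussian log-posterior for x."""
    def log_posterior(label):
        prior, stats = model[label]
        score = math.log(prior)
        for v, (mean, var) in zip(x, stats):
            score += -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)


def accuracy(predict, test):
    """Fraction of test rows whose predicted label equals the true label."""
    return sum(1 for x, label in test if predict(x) == label) / len(test)


def compare(n=400, seed=0):
    """Train both classifiers on the same split and report test accuracy."""
    train, test = train_test_split(make_synthetic_modules(n, seed), seed=seed)
    nb = fit_gaussian_nb(train)
    return {
        "K-NN": accuracy(lambda x: knn_predict(train, x), test),
        "Naive Bayes": accuracy(lambda x: nb_predict(nb, x), test),
    }


if __name__ == "__main__":
    for name, acc in compare().items():
        print(f"{name}: accuracy = {acc:.3f}")
```

The K-NN distance here is computed on unscaled features, so the feature with the larger numeric range dominates; in practice the features would usually be standardised first. Accuracy alone can also be misleading when defective modules are a small minority of the data.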