This paper provides an extensive review of studies on expert estimation of software development using machine-learning techniques (MLT). Machine learning is showing promise in producing consistently accurate estimates: a machine-learning system effectively “learns” how to estimate from a training set of completed projects. The main goal and contribution of this review is to support research on expert estimation, i.e., to make it easier for other researchers to locate relevant expert estimation studies that use machine-learning techniques. The paper presents the machine-learning techniques most commonly used for expert estimation in software development: neural networks, case-based reasoning, classification and regression trees, rule induction, genetic algorithms, and genetic programming. In each study we reviewed, we found that the results of the various machine-learning techniques depend on the application area in which they are applied. Our review suggests not only that these techniques are competitive with traditional estimators on one data set, but also that they are sensitive to the data on which they are trained.
Keywords: Machine Learning Techniques (MLT), Neural Networks (NN), Case Based Reasoning (CBR), Classification and Regression Trees (CART), Rule Induction, Genetic Algorithms and Genetic Programming.
Poor results from statistical estimation models have pervaded the estimation literature for more than a decade. Their inability to handle categorical data, to cope with missing data points and with the spread of data points, and, most importantly, their lack of reasoning capabilities have triggered an increase in the number of studies using non-traditional methods such as machine-learning techniques.
Machine learning is the study of computational methods for improving performance by mechanizing the acquisition of knowledge from experience. Expert performance requires much domain-specific knowledge, and knowledge engineering has produced hundreds of AI expert systems that are now used regularly in industry. Machine learning aims to provide increasing levels of automation in the knowledge engineering process, replacing much time-consuming human activity with automatic techniques that improve accuracy or efficiency by discovering and exploiting regularities in training data. The ultimate test of machine learning is its ability to produce systems that are used regularly in industry, education, and elsewhere. Most evaluation in machine learning is experimental in nature: it aims to show that, in one or more realistic domains, the learning method performs better on a separate test set than is achieved on that same test set without learning.
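The experimental evaluation described above can be illustrated with a minimal sketch. It assumes a hypothetical data set of completed projects (size in KLOC, effort in person-months; the values are invented for illustration and come from no study reviewed here), a least-squares line as the learned estimator, and a fixed rule of thumb of 2 person-months per KLOC as the estimator that does not learn. The function names `fit_line` and `mean_abs_error` are likewise assumptions of this sketch.

```python
from statistics import mean

# Hypothetical completed projects: (size in KLOC, effort in person-months).
# These values are invented purely for illustration.
projects = [(10, 24), (15, 38), (20, 47), (25, 62),
            (30, 70), (35, 88), (40, 95), (45, 110)]

# Hold out the last two projects as a separate test set.
train, test = projects[:6], projects[6:]


def fit_line(data):
    """Least-squares fit of effort = slope * size + intercept."""
    mx = mean(x for x, _ in data)
    my = mean(y for _, y in data)
    sxy = sum((x - mx) * (y - my) for x, y in data)
    sxx = sum((x - mx) ** 2 for x, _ in data)
    slope = sxy / sxx
    return slope, my - slope * mx


def mean_abs_error(predict, data):
    """Mean absolute error of a predictor over (size, effort) pairs."""
    return mean(abs(predict(x) - y) for x, y in data)


slope, intercept = fit_line(train)


def learned(size):
    # Estimator whose parameters were learned from the training set.
    return slope * size + intercept


def rule_of_thumb(size):
    # Assumed fixed rule (2 person-months per KLOC); uses no training data.
    return 2 * size


learned_mae = mean_abs_error(learned, test)
baseline_mae = mean_abs_error(rule_of_thumb, test)
print(f"learned MAE:  {learned_mae:.2f}")
print(f"baseline MAE: {baseline_mae:.2f}")
```

On this toy data the learned line has the lower error on the held-out projects, which is the kind of comparison the evaluation calls for. A single small split proves little in practice, and the choice of error measure is itself an assumption here.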
At a general level, there are two types of machine learning: inductive and deductive. Deductive learning works on existing facts and knowledge and deduces new knowledge from the old. Inductive machine learning methods create computer programs by extracting rules and patterns from massive data sets. Inductive learning takes examples and generalizes from them rather than starting with existing knowledge. One major subclass of inductive learning is concept learning, which takes examples of a concept and tries to build a general description of that concept. Very often, the examples are described using attribute-value pairs.
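As a concrete illustration of concept learning over attribute-value pairs, the sketch below uses the classic Find-S procedure, chosen here only as a simple example and not as a method taken from the studies reviewed. It assumes a hypothetical concept, "the project overran its estimate", and invented attributes (`team`, `requirements`, `language`). The learner starts from the first positive example and generalizes an attribute to "any value" whenever a later positive example disagrees.

```python
ANY = "?"  # wildcard: the attribute may take any value


def find_s(examples):
    """Return the most specific conjunctive hypothesis covering all positives.

    `examples` is a list of (attributes, label) pairs, where `attributes`
    is a dict of attribute-value pairs and `label` is True for members
    of the concept. Negative examples are ignored by Find-S.
    """
    hypothesis = None
    for attrs, label in examples:
        if not label:
            continue
        if hypothesis is None:
            # Start with the first positive example, taken literally.
            hypothesis = dict(attrs)
        else:
            # Generalize every attribute that disagrees with this positive.
            for key, value in attrs.items():
                if hypothesis[key] != value:
                    hypothesis[key] = ANY
    return hypothesis


def matches(hypothesis, attrs):
    """True if the example satisfies every constraint in the hypothesis."""
    return all(v == ANY or attrs[k] == v for k, v in hypothesis.items())


# Hypothetical training examples; label True means "overran its estimate".
examples = [
    ({"team": "novice", "requirements": "volatile", "language": "C"}, True),
    ({"team": "novice", "requirements": "volatile", "language": "Java"}, True),
    ({"team": "expert", "requirements": "stable", "language": "C"}, False),
    ({"team": "novice", "requirements": "stable", "language": "Java"}, False),
]

hypothesis = find_s(examples)
print(hypothesis)
```

For these four examples the resulting description keeps `team` and `requirements` fixed and generalizes `language`, and it happens to exclude both negative examples. Find-S does not guarantee that in general, since it never inspects the negatives.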
Machine learning overlaps heavily with statistics. In fact, many machine-learning algorithms have been found to have direct counterparts in statistics. For example, boosting is now widely thought to be a form of stagewise regression using a specific type of loss function. Machine learning has a wide spectrum of applications, including natural language processing, search engines, medical diagnosis, bioinformatics and cheminformatics, detecting credit card fraud, stock market analysis, classifying DNA sequences, speech and...