
Finding Latent Code Errors via Machine Learning over Program Executions

Bibliographic Details
Main Authors:	Brun, Yuriy; Ernst, Michael D.
Format:	Conference Proceeding
Language:	English
Subjects:	Applied sciences; Artificial intelligence; Computer science control theory systems; Computing methodologies > Artificial intelligence > Knowledge representation and reasoning > Probabilistic reasoning; Computing methodologies > Artificial intelligence > Knowledge representation and reasoning > Vagueness and fuzzy logic; Exact sciences and technology; Mathematics of computing > Probability and statistics > Probabilistic algorithms; Mathematics of computing > Probability and statistics > Probabilistic reasoning algorithms > Markov-chain Monte Carlo methods; Mathematics of computing > Probability and statistics > Probabilistic reasoning algorithms > Sequential Monte Carlo methods; Software; Software and its engineering > Software creation and management > Software development techniques > Error handling and recovery; Software and its engineering > Software creation and management > Software verification and validation > Software defect analysis > Software testing and debugging; Software engineering; Theory of computation > Semantics and reasoning > Program reasoning > Program analysis; Theory of computation > Semantics and reasoning > Program semantics

Description
Summary:	This paper proposes a technique for identifying program properties that indicate errors. The technique generates machine learning models of program properties known to result from errors, and applies these models to program properties of user-written code to classify and rank properties that may lead the user to errors. Given a set of properties produced by the program analysis, the technique selects a subset of properties that are most likely to reveal an error. An implementation, the Fault Invariant Classifier, demonstrates the efficacy of the technique. The implementation uses dynamic invariant detection to generate program properties. It uses support vector machine and decision tree learning tools to classify those properties. In our experimental evaluation, the technique increases the relevance (the concentration of fault-revealing properties) by a factor of 50 on average for the C programs, and 4.8 for the Java programs. Preliminary experience suggests that most of the fault-revealing properties do lead a programmer to an error.
ISSN:	0270-5257
DOI:	10.5555/998675.999452