My general research interest is in software engineering: designing, specifying, implementing, testing, analyzing, verifying, understanding, and debugging software written in widely used programming languages, such as ANSI C and Java. My focus is on approaches and methods that naturally give rise to (semi-)automated tools for programmers, testers, and other software engineers — including end users when they must act as software engineers! The power of computer science is in automation — from design to testing, we need better tools to help us write better programs.
My work to date has usually involved some variety of model checking, but the specifics have varied widely, from full exploration of highly abstracted systems (with MAGIC) to methods closer to an unusual form of testing (heuristic search with JPF, model-driven verification of C programs with SPIN). Years of frustration attempting to understand the counterexamples produced by model checkers have convinced me that error explanation and fault localization are important (and interesting) topics, and my thesis work investigated how a model checker can be used not only to find an error, but to explain and localize it. I used distance metrics on program executions to find the causes of errors, in a manner inspired by the counterfactual theory of causality proposed by David Lewis.
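As a concrete (if much simplified) illustration of the distance-metric idea, not the actual thesis implementation: one can represent an execution as a sequence of variable valuations, align a failing execution with a passing one step by step, and count the points of disagreement. The representation and names below are hypothetical.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch: an execution is a sequence of steps, each step a
// snapshot of variable values at that point in the run.
final class ExecutionDistance {

    // Count disagreements between a failing execution and a passing one,
    // aligned step by step; unmatched trailing steps each count as one.
    static int distance(List<Map<String, Integer>> failing,
                        List<Map<String, Integer>> passing) {
        int d = 0;
        int shared = Math.min(failing.size(), passing.size());
        for (int i = 0; i < shared; i++) {
            Map<String, Integer> f = failing.get(i);
            Map<String, Integer> p = passing.get(i);
            for (String var : f.keySet()) {
                if (!f.get(var).equals(p.get(var))) {
                    d++;  // same step, different value: a candidate difference
                }
            }
        }
        d += Math.abs(failing.size() - passing.size());
        return d;
    }
}
```

A passing execution that minimizes this distance from the failing one plays the role of the counterfactual "closest possible world" in which the error does not occur, and the remaining differences are candidate causes of the failure.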
More recently, at JPL, I became interested in the connections between model checking and (randomized) testing: using the same models and tools for both, reusing simulation infrastructure, and hybrid approaches such as randomized testing with the complete nondeterminism of model checking. More "frivolously," I have also become fascinated by the taxonomy and nature of actual bugs in programs, particularly in file systems.
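A minimal sketch, under my own assumptions rather than any particular tool's API, of what sharing infrastructure between random testing and model checking can look like: if the harness routes every nondeterministic decision through one choice interface, a random tester and an exhaustive explorer can drive exactly the same code. The Chooser, RandomChooser, and Harness names are illustrative.

```java
import java.util.Random;

// One nondeterministic-choice interface; the driver decides how choices
// are resolved.
interface Chooser {
    int choose(int bound);  // return a value in [0, bound)
}

// Random testing driver: choices come from a seeded PRNG, so a failing
// run can be replayed from its seed.
final class RandomChooser implements Chooser {
    private final Random rng;
    RandomChooser(long seed) { this.rng = new Random(seed); }
    public int choose(int bound) { return rng.nextInt(bound); }
}

// The same harness could instead be run under a model checker (for example,
// resolving each choice with JPF's Verify.getInt) to explore all choices.
final class Harness {
    static void run(Chooser c) {
        int op = c.choose(3);     // pick one of three operations
        int arg = c.choose(100);  // pick an argument for it
        // ...apply the operation to the system under test, check invariants
    }

    public static void main(String[] args) {
        for (long seed = 0; seed < 1000; seed++) {
            run(new RandomChooser(seed));  // pure random testing
        }
    }
}
```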
I see model checking and testing primarily as useful tools for program understanding, the investigative side of software engineering, if you will. Model checking traditionally answers the question: does my (program, protocol, design) have a trace like this? Testing, in a sense, answers the same question. Even at the earliest stages of software design, thinking about how to make that question easier to answer not only eventually aids verification, but also helps us think about how a program will actually execute. One interesting possibility is the use of model checking and other state-space exploration methods to help visualize the executions of a program. Recent work at IBM has shown the value of visualizing memory configurations of large systems; visualizing program trace-spaces seems likely to be similarly useful in software engineering and performance debugging.
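To make the "trace like this" question concrete in code: for both a model checker and a tester it frequently reduces to searching for an execution that reaches an assertion with its condition false. A tiny hedged example (the class and message are invented for illustration):

```java
// Sketch: the same property phrased so that a model checker, a random
// tester, or a hand-written test all answer "is there a trace violating it?"
final class BoundedBuffer {
    private final int[] data;
    private int count = 0;

    BoundedBuffer(int capacity) { data = new int[capacity]; }

    void put(int x) {
        // Any execution reaching this point with a full buffer is exactly
        // the kind of trace a verification or testing tool tries to exhibit.
        assert count < data.length : "put() called on a full buffer";
        data[count++] = x;
    }
}
```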
More specific current topics of interest:
- A “grand unified approach” to efficient, full-coverage test suite generation, combining random testing or model checking with constraint-based directed testing
- Effective testing for machine-learned software, where subject-matter experts (including YOU, the expert on classifying your email and on your movie preferences), rather than software engineers or machine learning experts, must test the code
- New approaches to test case generation, based on machine learning techniques
- Fundamental empirical research into properties of test coverage metrics on very large test suites produced by systematic methods