Debugging Backwards in Time

A software tool written with Java[TM] technology allows developers to step backwards through the execution of a program to determine where and how programming errors occurred. By recording each state change in the target application, it allows the developer to navigate "backwards in time" to see what the values of variables and objects WERE, enormously simplifying the task of debugging programs.

"This is the debugger that you have always dreamed about, but never thought possible," says Bil Lewis, developer of what he calls 'Omniscient Debugging.' "You can see which values are bad, locate them, and learn who set them and why. You don't have to guess where the problems might be. You don't need to set breakpoints or wonder which threads ran or which methods were called. If a problem occurred, you can find it. You don't ever have to rerun the program."


The 'ODB' (Omniscient Debugger) is an implementation of this concept written in Java. "I wrote the original paper in 1984, but it was just too impractical to implement. Doing this in C, C++, or Lisp was more than I could manage. When I read about the design of the JVM, I suddenly realized that it was possible," Bil continues. "Not only possible, but amazingly simple. I wrote the first line of code on January 6th, and it was working on February 6th! And that includes buying Lindholm & Yellin ("The Java Virtual Machine") and writing my first Java byte code!"

The ODB works by inserting byte codes into the application's class files. This code then collects information on each method call and variable assignment. The ODB assigns "time stamps" to each such event, stores them into a log, and then displays them in a graphical user interface (GUI). A programmer can then use the GUI to review the behavior of objects, variables, and method calls.
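
The recording step described above can be pictured with a small sketch. This is not the ODB's code: the class EventLogSketch, its Event record, and the record() and valueAt() methods are all hypothetical names chosen for illustration, and the sketch assumes that the inserted byte codes would simply call something like record() on every method call and assignment.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; every name here is hypothetical, not taken from the ODB.
final class EventLogSketch {

    /** One recorded event: a method call or a variable assignment. */
    static final class Event {
        final int timeStamp;   // position of the event in the log
        final String kind;     // "call" or "assign"
        final String target;   // method name or variable name
        final Object value;    // call argument(s) or newly assigned value

        Event(int timeStamp, String kind, String target, Object value) {
            this.timeStamp = timeStamp;
            this.kind = kind;
            this.target = target;
            this.value = value;
        }
    }

    private final List<Event> log = new ArrayList<>();

    /** Appends an event and returns the time stamp assigned to it. */
    int record(String kind, String target, Object value) {
        int timeStamp = log.size();
        log.add(new Event(timeStamp, kind, target, value));
        return timeStamp;
    }

    /**
     * Value the variable held at the given time stamp, or null if it had not
     * been assigned yet (a real log would distinguish an assigned null).
     */
    Object valueAt(String variable, int timeStamp) {
        for (int t = Math.min(timeStamp, log.size() - 1); t >= 0; t--) {
            Event e = log.get(t);
            if (e.kind.equals("assign") && e.target.equals(variable)) {
                return e.value;
            }
        }
        return null;
    }
}
```

Because nothing is ever overwritten, asking what a variable WAS at some earlier time stamp is just a backwards scan of the log. A practical implementation would presumably index the log per variable rather than scan it.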

Identifying Objects

"Recognizing a specific object is problematic in Java. The toString() method was not intended for this purpose, and the default print strings are awkward," Bil says. "I needed a print string that was short, unique, easily recognizable, and informative. I chose this format: <Employee_12 Jim>, where Employee is the class, 12 is the instantiation count for the class, and 'Jim' is a programmer-selected instance variable." Arrays look like this: Employee[20]_1, while numbers, strings, characters, etc. appear as expected: 23, 1.56, "Typical string", 'X'.
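
One way such print strings could be produced is sketched below. The class PrintStringSketch and its nameOf() method are hypothetical, as is the choice to pass the programmer-selected label in as a plain string. The sketch numbers instances from 0, which is an assumption based on the <Demo_0> and int[20]_0 examples elsewhere in this article, and it covers only ordinary objects, not the array format.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

// Illustrative sketch only; names and structure are hypothetical, not the ODB's.
final class PrintStringSketch {
    // Next instantiation index to hand out, per class.
    private final Map<Class<?>, Integer> counts = new HashMap<>();
    // Names already assigned, keyed by object identity rather than equals().
    private final Map<Object, String> names = new IdentityHashMap<>();

    /**
     * Returns a short, stable name such as <Employee_12 Jim>.
     * 'label' stands in for the programmer-selected instance variable
     * and may be null, in which case it is omitted.
     */
    String nameOf(Object obj, String label) {
        String existing = names.get(obj);
        if (existing != null) {
            return existing;               // same object always prints the same way
        }
        Class<?> cls = obj.getClass();
        int index = counts.merge(cls, 1, Integer::sum) - 1;   // 0 for the first instance
        String name = "<" + cls.getSimpleName() + "_" + index
                + (label == null ? "" : " " + label) + ">";
        names.put(obj, name);
        return name;
    }
}
```

Keying on identity matters here: two Employee objects that compare equal are still different objects to a debugger and need different names.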

"When am I?"

Trying to figure out exactly what point of a program is being displayed is a challenge. With traditional debuggers the programmer generally only knows that the program is stopped at a given breakpoint at a specific line of code, but not its relation to the program as a whole. With Omniscient Debugging, this becomes even more important, because the programmer has to be able to look at the history of an entire program run and select the specific time stamps of interest.

The ODB solves this by displaying the entire trace history of every method call made. The "Trace Pane" displays an indented list of each call, showing the method, the object it was called on, the arguments, and the return value. The programmer can then use this display to move through the program's history by selecting whatever method invocation is of interest. In addition, there is an I/O pane showing the lines printed by the program, and a code pane showing the line of code that produced the selected time stamp. A "Stack Pane" shows the current stack, and a "Threads Pane" shows the current thread.
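
The indented call list can be approximated in a few lines. The sketch below is an assumption about how such a display might be built, not a description of the ODB's Trace Pane: the Call record, the four-space indent, and the "->" notation for return values are all invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; the line format is invented, not the ODB's.
final class TracePaneSketch {

    /** One method invocation as it might be recorded for display. */
    static final class Call {
        final int depth;        // nesting depth on the call stack, 0 for the outermost call
        final String receiver;  // print string of the object the method was called on
        final String method;
        final String args;
        final String result;    // return value as text

        Call(int depth, String receiver, String method, String args, String result) {
            this.depth = depth;
            this.receiver = receiver;
            this.method = method;
            this.args = args;
            this.result = result;
        }
    }

    /** Renders one line per call, indented four spaces per stack level. */
    static List<String> render(List<Call> calls) {
        List<String> lines = new ArrayList<>();
        for (Call c : calls) {
            StringBuilder line = new StringBuilder();
            for (int i = 0; i < c.depth; i++) {
                line.append("    ");
            }
            line.append(c.receiver).append('.').append(c.method)
                .append('(').append(c.args).append(") -> ").append(c.result);
            lines.add(line.toString());
        }
        return lines;
    }
}
```

If each rendered line keeps a reference to the time stamp of its call, selecting a line is enough to tell the debugger which moment to display.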

Displaying Objects and Variables

Three panes are used to display values -- one for the method arguments and local variables, a second for the "this" object, and a third to hold whatever objects the programmer wishes to keep track of. Any object in any of the panes may be copied to the "Objects Pane" by double-clicking on it. In the illustration above, we see two objects in this pane, an array of 20 integers, and an instance of a Demo object. The Demo object was copied from the trace pane, and the array was copied out of the Demo object.

Navigation

Debugging is done by "navigating" through time and watching values of interest change. Navigation is done by selecting lines in one of the display panes, or by stepping forwards or backwards using the buttons. For example, pushing the "step over" button on the code pane works like a traditional "step over" command, moving the display (an operation called "reverting") to the next line executed in the current method. Of course there is also a "step over backwards" which will revert to the previous line executed.
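
Since the whole run is already recorded, stepping in either direction reduces to searching the log. The sketch below assumes a log of executed lines in which the list index serves as the time stamp and each entry carries a frameId identifying the method invocation it belongs to. StepSketch, LineEvent, and frameId are hypothetical names, not the ODB's.

```java
import java.util.List;

// Illustrative sketch only; names are hypothetical, not taken from the ODB.
final class StepSketch {

    /** One executed source line; its index in the list is its time stamp. */
    static final class LineEvent {
        final int frameId;  // identifies the method invocation that executed the line
        final int line;     // source line number

        LineEvent(int frameId, int line) {
            this.frameId = frameId;
            this.line = line;
        }
    }

    /** Time stamp of the next line executed in the same invocation, or 'now' if none. */
    static int stepOver(List<LineEvent> events, int now) {
        int frame = events.get(now).frameId;
        for (int t = now + 1; t < events.size(); t++) {
            if (events.get(t).frameId == frame) {
                return t;
            }
        }
        return now;
    }

    /** Time stamp of the previous line executed in the same invocation, or 'now' if none. */
    static int stepOverBackwards(List<LineEvent> events, int now) {
        int frame = events.get(now).frameId;
        for (int t = now - 1; t >= 0; t--) {
            if (events.get(t).frameId == frame) {
                return t;
            }
        }
        return now;
    }
}
```

The two methods are mirror images, which is the point: going backwards costs no more than going forwards. A fuller version would probably fall through to the caller's frame when the current invocation has no further lines, rather than staying put.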

As the programmer reverts the display to different time stamps, all panes are updated to reflect the state of the program at that time. In the figure we can see that we are at the first line of average(), that this is its second call, that it occurred in the main thread, and that the previous line output was Starting QuickSort: 20. Because element 16 is selected in the Objects Pane, pushing the "next value" button will revert to the time when that element was next changed.

Another navigation technique is to display all of the values that a variable took on during a run, and select a specific one. So if element 16 of the array int[20]_0 was assigned the values 0 (at creation), then 729, 1725, and finally 1719, the programmer can pop up a list of those values and select one. The display will then revert to the time the variable was set to that value.
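
Both the "next value" button and the pop-up list of values can be served by a per-variable history of assignments. The following is a sketch under that assumption. ValueHistorySketch and its methods are hypothetical names, and the sketch assumes assignments are recorded in increasing time-stamp order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative sketch only; names are hypothetical, not taken from the ODB.
final class ValueHistorySketch {
    // Parallel lists: timeStamps.get(i) is when values.get(i) was assigned.
    private final List<Integer> timeStamps = new ArrayList<>();
    private final List<Object> values = new ArrayList<>();

    /** Records one assignment; callers are assumed to pass increasing time stamps. */
    void recordAssignment(int timeStamp, Object value) {
        timeStamps.add(timeStamp);
        values.add(value);
    }

    /** Every value the variable took on, oldest first (what a pop-up list would show). */
    List<Object> allValues() {
        return Collections.unmodifiableList(values);
    }

    /** Time stamp to revert to when the user picks entry 'index' from the list. */
    int timeStampOf(int index) {
        return timeStamps.get(index);
    }

    /** Time stamp of the first assignment strictly after 'now', or -1 if there is none. */
    int nextChangeAfter(int now) {
        for (int ts : timeStamps) {
            if (ts > now) {
                return ts;
            }
        }
        return -1;
    }
}
```

For the element described above, the history would hold 0, 729, 1725, and 1719 in that order. Choosing 1725 from the list means looking up its time stamp and reverting every pane to it.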



"This is where things get interesting," continues Bil, "I've implemented the most obvious commands (Next/Previous Context Switch, Variable Value, Code Line, etc.), but I'm sure there are all sorts of possibilities I've never even thought of. I wonder what other commands people will invent."

Running Arbitrary Methods

"One of the things I miss most from the glory days of Lisp is the ability to type in any expression I want and have it execute," laments Bil. "So I included that. You can run a program, revert to any time stamp you want, change variable values, and type in an expression. The ODB will build a new "time line" and run it. So you push "Evaluate Expression," type in <Demo_0>.sort(6, 16) and it will run, sorting elements 6 through 16 starting from the current values."


Exceptions, Locks, Filters, etc.

If a method throws an exception, it will not have a return value, nor will any of the methods that propagate that exception. The Trace Pane will indicate that and also show a line for the catch statement that handles it. Locks (synchronized sections) will be displayed along with the instance variables of the relevant object in the Objects Pane, and any thread that is blocked will display the object whose lock it's waiting for. When there are more than about 10,000 lines in the Trace Pane, it becomes difficult to comprehend what's happening. So there are a variety of filters one can use to reduce the clutter, and a search command to find specified method calls.

"Once again," Bil adds, "there are all sorts of additional filters, search methods, value specifiers, etc. that people will want. I suspect that in five years I'll hardly recognize the ODB."

Summary

Redefining the purpose of a debugger from "providing control while running" to "retaining all previous state" gives us enormous power and simplicity. It eliminates much of the guesswork required by traditional debuggers, and makes non-deterministic bugs appear deterministic. Adding simple, intuitive methods of navigation and defining more informative print strings makes debugging a program much easier.

To quote Eric Armstrong of Sun's JavaSoft group, "Basically, I would estimate that a debugger of this type would cut my debugging sessions in half, at a minimum, and may in some cases reduce debugging time by a factor of 10."

Try It Out!

An experimental version of the ODB can be downloaded from the web page www.LambdaCS.com/debugger/debugger.html, which includes an easy-to-run demo program and a full user manual.


Last modified: Mon Aug 12 15:01:24 PDT 2002