I have been thinking about the Rust debugging experience, and I want to try to tease apart what value a debugger provides to me.
Last year, I gave a talk at QCon that promoted the use of rr
as a tool
for exploring Rust code. I wanted to convey an excitement about debugging,
about using this tool to dive into the weeds of your program’s behavior.
Since then, I have continued to talk to developers, promoting rr
(and the
associated service pernos.co). And I heard Rust developers say that they are
happy with their experience using instrumentation such as logging statements or
the tracing infrastructure.
I have to agree: Logging statements are useful, and sometimes they are all you need. Here is a great quote from “The Practice of Programming”, by Kernighan and Pike:
As personal choice, we tend not to use debuggers beyond getting a stack trace or the value of a variable or two. One reason is that it is easy to get lost in details of complicated data structures and control flow; we find stepping through a program less productive than thinking harder and adding output statements and self-checking code at critical places. Clicking over statements takes longer than scanning the output of judiciously-placed displays. It takes less time to decide where to put print statements than to single-step to the critical section of code, even assuming we know where that is. More important, debugging statements stay with the program; debugging sessions are transient.
This quote is apt for many reasons; I will revisit it again before the end of this post.
I do not know the typical way that people use debuggers. But I can imagine it looks much like the image depicted above: you start your program, you use a single-step command to go through it line by line, and you examine raw memory dumps to try to figure out what the machine is doing. After that initial exploration, you might set breakpoints at specific lines of code, and then tell the debugger to continue running until it hits any of those breakpoints.
Advanced debugger users will set conditional breakpoints, which only halt the
program if a specific predicate holds when it hits the breakpoint, or even
scripted breakpoints, which will automatically execute a series of debugger
commands each time they hit the breakpoint. One neat trick is that you often can
end a script with continue
, so that it seems as though the program never
stopped at all, except for the side-effects on the debugger itself (such as
setting other breakpoints).
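In gdb terms, that combination of a condition and a script looks something like the following sketch (the file name, line number, condition, and variable are all placeholders):

```
(gdb) break main.rs:42 if count == 2
(gdb) commands
> silent
> printf "count is now %d\n", count
> continue
> end
```

Because the script ends with continue, the program keeps going after each hit; the only visible effect is the extra line of output.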
But that typical usage pattern doesn’t really provide much value over just adding log statements. If all you want to do is learn what steps the computer took, and what data was in certain variables at those steps, printing to a log is a better way to achieve that.
So, why am I advocating for more people to include a debugger as one of the tools to keep ready on their Rust development utility belt?
There are four main features that a debugger gives me that are not always fulfilled by adding log statements: object code inspection, raw control-flow, memory exploration, and hardware watchpoints.
Of the four, I think hardware watchpoints are the best example of a capability that is both broadly useful and not typically available outside of a debugger. I will delve into each of these in turn below.
Object Code Inspection
By “object code inspection”, I mean that I can attach a debugger to any binary,
without having to recompile anything. When your workflow is to add new print
statements each time you want to inspect something new, that means editing your
code and recompiling it. That edit-recompile-debug cycle has two problems: it can be slow (especially the recompilation step), and, worse still, the edits themselves can cause bugs to mysteriously go away, a.k.a. the “observer effect”.
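For instance, attaching to a process that is already running, or recording a run for later replay, needs nothing beyond the binary you already built (the process name here is a placeholder):

```
$ gdb -p $(pidof my-server)   # attach to the running process
$ rr record ./my-server       # or record a fresh run to replay later
```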
(To be fair, using a debugger can also cause those Heisenbugs to change
behavior. This is where tools like rr
can really shine.)
How valuable is “object code inspection”? I have given two reasons to avoid the edit-recompile steps: reducing latency, and avoiding the observer effect.
Honestly: It is not a frequent need for my current work. Much of my work involves either fixing bugs that reproduce readily in the face of source code changes, or adding features (which inherently requires source code changes).
There is one important exception here: rustc itself has a long bootstrap time, so I have often tried to use a debugger to debug rustc itself, rather than rely on an edit-recompile-debug cycle.
However, in practice this can be very painful: we do not have much support for building an unoptimized rustc (it has a very long bootstrap time), and once you enable optimizations, the debugging experience in Rust today degrades quite a bit.
Raw Control-Flow
By “raw control-flow”, I mean that one can step through the individual assembly code instructions to see exactly what the machine is executing.
For example, at a method invocation, one can step forward and discover exactly which function that invocation dispatches to, as it works through Deref implementations until it finally reaches the target method.
As another example, one can step through the instructions corresponding to the subcomponents of a match pattern. This way, one might discover which parts matched, which failed to match, and in what order they were evaluated.
Here is a concrete example of the latter:
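The original listing is not reproduced here, but a minimal stand-in with the same shape, a match whose arms test individual character subcomponents (the exact patterns below are my own choice), is enough to follow the discussion:

```rust
fn foo(pair: &(char, char)) -> u32 {
    // Each character subpattern becomes its own comparison in the
    // compiled code, which is what we will see in the disassembly.
    match pair {
        ('i', 'y') => 1,
        (_, 'h') => 2,
        _ => 0,
    }
}

fn main() {
    let pair = ('i', 'h');
    println!("foo(&pair) = {}", foo(&pair));
}
```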
After compiling the above with debuginfo enabled and running it under gdb
,
I can investigate the body of foo
while it’s running:
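The session that produced the disassembly amounts to a breakpoint plus gdb’s disassemble command (the crate and function names below are assumptions):

```
$ rustc -g foo.rs
$ gdb ./foo
(gdb) break foo::foo
(gdb) run
(gdb) disassemble
```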
In the disassembly that gdb prints, we can see three comparison instructions: a comparison of 0x14(%rdi) with $0x69 (the character ‘i’), then a comparison of 0x10(%rax) with $0x79 (the character ‘y’). If both are equal, then it jumps to 0x55555555db9d (which is presumably the first arm’s body). If they are not, then it compares 0x10(%rax) with $0x68 (the character ‘h’), and if that is equal, it jumps to 0x55555555dbeb (which is presumably the second arm’s body).
The people I imagine doing this kind of investigation are the ones working out the low-level semantic guarantees when it comes to investigating things like rust#87520. Pretty low-level details, admittedly. Someone new to Rust cannot be expected to do this kind of investigation to gain insight into the language.
In short, this seems like a case where Kernighan and Pike’s quote above is especially apt: we can easily get lost in the details of the control flow as rendered here.
So, how valuable is “raw control-flow”?
Honestly: The more experience I gain in a language or a particular project,
the less I find myself relying on this kind of spelunking. The
evaluation order of match
shouldn’t matter in most cases, so I would not
optimize the debugging UX for that. Seeing how method dispatch resolves is
useful, but in practice logging statements provide enough insight that people
rarely find themselves wanting to step through the individual assembly
instructions.
Insight into the raw control-flow is probably most useful for someone peeking into low-level details of the semantics of the language (or a given project, especially a code base with tricky macros that generate control-flow structures). One might hope that that group generalizes to beginners who are learning the language. But I am not certain that raw control-flow will efficiently provide practical insight. For example, consider the following:
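Something along these lines (a sketch; the type and the messages are placeholders):

```rust
struct D(u32);

impl Drop for D {
    fn drop(&mut self) {
        // Runs automatically when a value of type D goes out of scope.
        println!("dropping D({})", self.0);
    }
}

fn main() {
    let _d = D(1);
    println!("hello from main");
    // `_d` is dropped here, so <D as Drop>::drop prints a second line.
}
```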
A newcomer to Rust who is not familiar with the destructor system might think
that the above code only prints a single line of text. Running the code (even
outside of a debugger) will show that to be incorrect. But: Will stepping
through it in a debugger provide the insight about where the <D as Drop>::drop
call comes from? It is not clear to me that it would.
Memory Exploration
By “memory exploration”, I mean that when my debugger is attached to a paused program, I can inspect values on the stack or heap, walking down the nodes of a binary tree or the entries in a hashtable. A good debugger will do some amount of memory traversal and semantic interpretation on my behalf, using type information from the program’s associated debuginfo to determine how best to interpret the bits it finds and to print them nicely (e.g. showing comma-separated array elements).
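With debuginfo available and the Rust pretty-printers loaded, most of that exploration is just print expressions (the variable names below are hypothetical):

```
(gdb) print node         # rendered using the type information in the debuginfo
(gdb) print node.left    # drill into a field of the structure
(gdb) x/32xb &node       # or fall back to examining the raw bytes
```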
How valuable is “memory exploration”?
Honestly: In Rust, one can add #[derive(Debug)] to type definitions to get methods that print the structure of their values. The debugger printouts were useful when the language didn’t provide that out of the box (as in C and C++), but they provide less value in Rust.
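For example, a couple of lines is all it takes:

```rust
#[derive(Debug)]
struct Point {
    x: i32,
    y: i32,
}

fn main() {
    let p = Point { x: 3, y: 4 };
    // Prints: Point { x: 3, y: 4 }
    println!("{:?}", p);
}
```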
Another counter-argument to this supposed first benefit: In a language like Rust,
anyone can use unsafe
code to transmute the value to raw bits, and subsequently print them. This power is not solely in the debugger’s hands.
One benefit of using a debugger here, rather than just using log statements, is
that a debugger will let you immediately see the concrete representation (such
as concrete hashtable layout) in places where the developer has chosen a more
abstract view for the Debug
trait implementation. But again we hear echoes of
the Kernighan and Pike quote: Seeing that concrete representation may be crucial
in some scenarios, but in many cases it will just be unnecessary detail
distracting you from the problem at hand.
A second benefit with respect to memory exploration is that a debugger will also let you inspect values that were not part of the original log statement. But, one can view that as a special case of bypassing the edit-recompile cycle, which we already discussed above. If the edit-recompile cycle is short, then there is little cost to adding a new log statement that includes the value of interest.
Hardware Watchpoints
So we have come to perhaps my favorite debugger feature: Hardware watchpoints.
This is the main feature that most programming languages do not provide.
The idea is simple: the gdb command watch -location <place>
will
break the next time the memory at the given place is modified.
For example, watch -location array[i]
will evaluate &array[i]
to
its address, and then the next time array[i]
is changed, the debugger
will stop and print out both the old value and the new value.
Consider the following (buggy) code for merging two sorted sequences into a single sorted sequence:
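The original listing is not reproduced here; the sketch below (with names and inputs of my own choosing) has the structure that the rest of the walkthrough relies on: a merge driven by two cursors, a_cur and b_cur, writing into recv, with the buggy cursor advance (line 20 in the original listing) marked BUG:

```rust
fn merge(a: &[i32], b: &[i32], recv: &mut [i32]) {
    let mut a_iter = a.iter();
    let mut b_iter = b.iter();
    let mut a_cur = a_iter.next();
    let mut b_cur = b_iter.next();
    for slot in recv.iter_mut() {
        match (a_cur, b_cur) {
            (Some(x), Some(y)) => {
                if x <= y {
                    *slot = *x;
                    a_cur = a_iter.next();
                } else {
                    *slot = *y;
                    a_cur = a_iter.next(); // BUG: copies *y but advances a_cur
                }
            }
            (Some(x), None) => {
                *slot = *x;
                a_cur = a_iter.next();
            }
            (None, Some(y)) => {
                *slot = *y;
                b_cur = b_iter.next();
            }
            (None, None) => return,
        }
    }
}

fn main() {
    let a = [10, 30];
    let b = [20, 40, 60, 80, 100];
    let mut r = [97; 7]; // 97 is a sentinel value, visible if a slot is never written
    merge(&a, &b, &mut r);
    for elem in &r {
        println!("{}", elem);
    }
}
```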
When I run the code above, I get the output:
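(With the inputs assumed in the sketch above, this is what comes out; each element is printed on its own line.)

```
10
20
20
40
60
80
100
```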
which is not quite right! The third element of the generated array is 20, instead of 30.
This toy example is small enough that you might be able to see the bug from
immediate inspection. Adding more test cases to the run might also make the
problem apparent. But I am going to use it to illustrate how to use a hardware
watchpoint to jump immediately to the point in the control flow where the third
element of the recv
array is overwritten.
After starting up the debugger, we let it evaluate the line that creates the destination array:
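(A sketch of the commands; the binary name and the amount of stepping are illustrative.)

```
$ rust-gdb ./merge
(gdb) start
(gdb) next        # step until the line that initializes `r` has executed
```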
Next, let’s check that we understand what we are looking at:
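(Just gdb’s print command; the value-history number will vary.)

```
(gdb) print r
(gdb) print r[2]
$2 = 97
```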
Yep: We have our destination array r
, and it currently has 97 as its
third element r[2]
.
So we tell the debugger that we want it to break as soon as that location is changed, and then let the program run!
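(The watchpoint command described earlier, followed by continue; when it stops, gdb reports the old and new values of the watched location.)

```
(gdb) watch -location r[2]
(gdb) continue
```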
We are now within the body of fn merge instead of fn main (as you might have expected), immediately after the mutation occurred.
And indeed, if we print out recv at this point, we can see that the array has been “corrupted”; we now see two copies of 20 in the output:
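(Again just print; with the sketch’s inputs, the first three elements are 10, 20, 20 at this point, and the remaining slots are still 97.)

```
(gdb) print recv
```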
I am running with the rust-gdb command here, which has extensions for printing slices nicely. If you were doing this under plain gdb, you would instead see the slice’s raw representation, a data pointer plus a length, rather than its elements.
This kind of break-on-write is most useful for unsafe languages, when you have data being corrupted due to out of bounds accesses. One certainly can encode examples of that in unsafe Rust, but we also do not expect them to arise as frequently.
But honestly (a ha): Is this actually revealing anything about the core problem
here? The logic at this match
-arm is entirely sound: if we have (None,
Some(y))
for our two cursors traversing the slices, then we should pass along
y
and keep traversing down the right cursor. Whatever the bug is, it is
arising somewhere else, before we even get here.
Reversible Debugging
If I were restricting my attention to gdb
/rust-gdb
alone, this would be a
point where I would start talking about re-running the program and stepping
through it carefully, or trying to hypothesize points of interest based on
analyzing the source code and setting breakpoints at those points. But if we
were going to do that, then we might as well invest our time adding some of
those logging statements discussed above, because I think those would show the
problem here just as fast, if not faster!
My goal is to show you an alternative strategy, one that works for code bases that you are not that familiar with.
Let’s continue looking at the merge example, but let us now run it under rr.
First we record a run of interest:
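(A sketch of the recording step; the binary name is illustrative. The program runs to completion, printing its usual buggy output, while rr saves the trace.)

```
$ rr record ./merge
```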
With that recording in place, we can replay the run under rr
, which gives an
interface similar to gdb
. (And we will tell rr
that we want it to use
rust-gdb
as our debugger to get those extensions for rendering Rust
collections.)
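(rr replay accepts the debugger binary to launch via its -d flag.)

```
$ rr replay -d rust-gdb
```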
Our initial interaction with rr
will be much like how we interacted with gdb
above: we’ll start the program, set a hardware watchpoint for the third array
element, and let it run forward.
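(A sketch of those steps at rr’s prompt; the breakpoint name and the amount of stepping are illustrative.)

```
(rr) break merge::main
(rr) continue
(rr) next                    # step until `r` has been initialized
(rr) watch -location r[2]
(rr) continue
```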
But now that we have reached this point in replay, we can rewind time.
We can ask questions like “how did a_cur
(the left-hand side of the tuple) become None
already, without us ever emitting 30 to the recv
array?
Well, the question doesn’t quite look like that. But we can ask "When was a_cur
last written to before this point?”, which ends up effectively the same question.
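(That question, phrased as a second watchpoint.)

```
(rr) watch -location a_cur
```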
Now we tell the system to run backwards.
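(Running backwards is a single command under rr.)

```
(rr) reverse-continue
```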
The gdb output is potentially confusing because the notions of “Old” and “New” are inverted here. To dive deeper into this topic, you might consider watching the films Primer or TENET.
We still have our first watchpoint in place for r[2]
, so it halted almost
immediately. (When we were running forward, a watchpoint makes us stop right
after the write occurs. When we run backwards, a watchpoint makes us stop right
before the write occurs.)
Let us keep going backwards.
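(The same command again.)

```
(rr) reverse-continue
```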
And now we see where a_cur is being assigned None: it is the buggy cursor advance on line 20.
And if you look at that line, you might see the problem: in that arm of the if, we
, we
are copying over the right-hand tuple value (*y
, from b_cur
), but we are
advancing the left-hand cursor, a_cur
.
With that insight, we can make an easy fix: replace line 20 so that it advances the right-hand cursor instead. In the sketch’s terms, that means writing b_cur = b_iter.next() in place of a_cur = a_iter.next().
With that change in place, the code works:
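(Output of the sketch, with its assumed inputs, after the fix.)

```
10
20
30
40
60
80
100
```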
This is a case where the ability to explore memory and add printouts from the debugger is crucial, because now we are exploring the behavior of one specific trace, and we do not always have the opportunity to recreate that trace on a new version of the program.
There are other advantages provided by rr
. For example, record/replay means
that you can investigate a trace repeatedly. Each time you replay the trace,
you’ll observe the same execution; and you can add breakpoints (and print
statements on those breakpoints) on that exact trace as it replays. This is
invaluable for tracking down bugs that are intermittent or that depend on subtle
event orderings.
Closing Thoughts
I started writing this post thinking “I’m excited about debuggers! Why is no one in the Rust community using them?” As I dug into the topic, I kept asking myself “what feature is provided here that people cannot more readily get via their logging instrumentation? What value am I getting from this?”
As you might infer from my “Honestly” notes above, I had to admit that the advantages of a debugger when compared to logging instrumentation may be limited in scope. At least, that’s how I see things with typical debugging workflows.
Tools like rr
open up new debugging workflows, one of which I delved into
above. I plan to write more in the future about other such new workflows, such
as ones provided by services like pernos.co. More generally, I want to spend
some time thinking about the nature of debugging: how it links in with how we
develop features, how we communicate our ideas, and how we validate our models.