Documentation/kdb/bt_x86

   1 Copyright Keith Owens, 2007.
   2
   3 How the KDB backtrace for x86 works, how to diagnose problems and submit a bug
   4 ==============================================================================
   5
   6 Unlike ia64, x86 architectures do not mandate unwind information in the kernel.
   7 gcc will include some unwind information for C functions, but not for assembler
   8 code.  Attempts have been made to add unwind information to the assembler code
   9 by hand, with little success.  Eventually Linus rejected the x86 unwind code
  10 because it was breaking too often and destroying useful debugging data.
  11
  12 Even if the x86 unwinder worked correctly, it would only give an accurate
  13 backtrace, it would not give function arguments. Needless to say, function
  14 arguments are what people really want.  To get function arguments requires much
  15 more support from the compiler than simple unwind data, the compiler has to
  16 track line by line where each argument is held and make that data available to
  17 the debugger.  Compiling with gcc -g will provide that information, but the
  18 resulting kernel is several times larger than normal.
  19
  20 Although the gcc -g data can be stored on another machine, there are constructs
  21 in the kernel that cannot be tracked by this method.  i386 builds with 4K stacks
  22 and all x86_64 builds have multiple kernel stacks.  The compiler knows nothing
  23 about these extra stacks and cannot backtrace through them correctly.  The
  24 assembler code in arch/{i386,x86_64}/kernel/entry.S is a maze of twisty logic
  25 paths, some of which eventually jump to common labels.  Describing this twisty
  26 logic to an unwinder is very difficult, expecially when you try to describe
  27 where arguments and/or registers are held).
  28
  29 KDB gets an accurate x86 backtrace and extracts the arguments by performing code
  30 decomposition and analysis at run time.  This avoids the need to bloat the
  31 running kernel to several times its normal size with gcc -g data.  KDB (unlike
  32 gdb) also knows about the additional kernel stacks and twisty assembler code
  33 paths.
  34
  35 The x86 backtrace code for i386 is very similar to the x86_64 code, with 80%
  36 common code and data.  Most of the differences between the backtrace for the two
  37 architectures is due to the assembler code and stack handling.  To avoid code
  38 duplication between KDB patches, the x86 backtrace code is actually stored in
  39 the kdb common patch, in source kdb/kdba_bt_x86.c.  kdb/Makefile only builds
  40 kdba_bt_x86.o for i386 or x86_64.  Most of the code reads as if the architecture
  41 is x86_64, using register names like rsp and rip.  i386 is treated as a subset
  42 of x86_64, with fewer registers and printing the names as esp and eip.  When
  43 this documentation refers to rsp and rip, read it as esp and eip for i386.  The
  44 20% of code and data that is different in held in two large #ifdef sections,
  45 scan kdba_bt_x86.c for CONFIG_X86_64.  Be careful when changing anything in the
  46 architecture specific sections, you will need to review the other architecture
  47 to see if it needs changes as well.
  48
  49 The idea behind the x86 backtrace is to trace one function at a time, which
  50 gives us the calling function.  Then apply the same algorithm to the calling
  51 function until you unwind to the first function in the process.  The starting
  52 point for tracing any process is to extract the current stack pointer and
  53 current instruction pointer (rsp and rip).  The way that these values are
  54 extracted varies between running tasks and blocked tasks, the method is
  55 described later (Process Starting Point) but ignore it for now, just assume that
  56 we have a starting rsp and rip.
  57
  58 Given the instruction pointer (rip), we identify the start and end of the kernel
  59 or module function it is in, using the kernel symbol table.  This is easy for C
  60 code, it is significantly harder for assembler code because of the twisty code
  61 paths that branch to common labels.  The method of identifying the current
  62 function is described later (Identifying The Current Function) but ignore it for
  63 now, just assumes that we have the start and end address of the function plus
  64 its name.
  65
  66 After the rip has been mapped to a function name with sensible start and end
  67 addresses, the next step is to analyse the code paths in that function.  KDB
  68 already has a built in disassembler (copied with slight modifications from
  69 binutils) which knows how to decode each x86 instruction.  Instead of
  70 duplicating that logic in kdba_bt_x86, it takes advantage of the fact that you
  71 can override the disassembler's print function, sending the output line to a
  72 buffer instead of printing it.  kdba_bt_x86 stills has to decode the buffer but
  73 that is a lot easier than decoding the x86 instruction set.
  74
  75 The code analysis consists of two main passes.  There are example below of the
  76 analysis with basic block (bb) debugging activated (Examples of Basic Block
  77 Debugging Output).
  78
  79 The first pass (bb_pass1) identifies all the basic blocks in the function.  For
  80 our purposes, a basic block has a single entry point and one or more exit
  81 points.  The start of the function is the start of basic block 0, all other
  82 blocks are the target of jump instructions (conditional or unconditional) from
  83 within the rest of the code.  A block ends with an unconditional jump or with a
  84 terminating instruction such as ret, iret, sysexit, sysret or ud2a (BUG).  A
  85 block can also end because the next instruction is the start of a new block
  86 (target of jmp or jcc), in this case there is an implied drop through from one
  87 block to the next.
  88
  89 Although a call instruction also transfers control, it returns to the next
  90 instruction so call is not treated as a transfer.  Instead call is treated as a
  91 normal instruction with side effects, the scratch registers are cleared after a
  92 call.
  93
  94 At the end of the first pass over the function we have a directed graph that
  95 starts at bb[0].  The nodes of the graph (bb_list[]) are the basic blocks with
  96 their start and end address.  The vertices are the jmp or jcc instructions
  97 (bb_jmp_list[]) that transfer control between blocks, plus any implied drop
  98 through transfers between consecutive blocks.  This graph can have cycles, many
  99 functions have loops in them which transfer control back to earlier in the code
 100 body.
 101
 102 The second pass (bb_pass2) runs over the directed graph analysing the effect of
 103 each instruction on the register and memory state.  It is important to
 104 understand that the analysis in this pass is an abstract one, it does not use
 105 actual hex values for the register contents, instead it uses symbolic values.
 106 When the basic block code says that "register rsi contains value rax" it means
 107 that whatever value was in rax on entry to the function has also been copied to
 108 register rsi at this point in the logic flow.
 109
 110 At an abstract level, all C functions start with exactly the same state, each
 111 register contains its own symbolic value (except for the stack pointer, see
 112 later) with no local stack variables defined yet.  Assembler functions tend to
 113 have unusual starting points, with some registers and/or memory contents defined
 114 differently on entry.  For example, ret_from_intr on i386 already has a struct
 115 pt_regs on its stack, ret_from_intr on x86_64 already has a partial struct
 116 pt_regs plus another two words stacked on top of it.  The special starting cases
 117 are listed in the arch specific bb_special_cases[].
 118
 119 Once the input state of bb[0] has been defined (including any special cases),
 120 bb_pass2_do_changed_blocks() runs over all the nodes in bb_list[].  Each
 121 instruction in each block is analysed (Tracking the Effects of Instructions) to
 122 see what effect it has on the abstract register state, the analysis of each
 123 instruction is done in bb_usage().  An instruction can copy one register to
 124 another, it can copy a register to stack, move from stack to a register or it
 125 can invalidate the contents of a register or memory location.  A general rule in
 126 bb_usage() is that any operation whose results cannot be calculated in terms of
 127 an original input value gives an undefined result.  Remember that it is the
 128 abstract value that becomes undefined, moving a constant to a register gives a
 129 defined value for the view of the program but it is undefined as far as the
 130 abstract state is concerned.
 131
 132 References to data on the stack are a little awkward because the stack pointer
 133 frequently changes.  To overcome this, kdba_bt_x86 defines a pseudo register
 134 called the 'original stack pointer' (osp).  This always represents the stack
 135 pointer on entry to the function, so on entry rsp contains osp+0x0.  As rsp is
 136 modified, it still points at osp, but its offset from osp changes.  Copying rsp
 137 to another register (e.g. mov %rsp,%rbp) copies the osp offset as well.  At the
 138 point that this function calls the next function down the stack, kdba_bt_x86
 139 knows the delta from osp to rsp.  Applying that delta to the actual value of the
 140 stack pointer gives the stack pointer value on input to the current function,
 141 that location contains the return address so we can go up one stack frame and
 142 repeat the process.
 143
 144 After doing basic block analysis on the current function, kdba_bt_x86 knows what
 145 the abstract register and memory state is at the point this function was
 146 interrupted or it called the next function down the stack, this is the exit
 147 state.  For an interrupt the actual register values are saved in a struct
 148 pt_regs, for a call we have unwound from the KDB interrupt back to the called
 149 function so we have some idea of what the register values are in the called
 150 function.  The abstract exit state is merged with the known actual register
 151 values to derive the original stack pointer.  That in turn gives us any
 152 registers that were saved on stack.  The original stack pointer gives the return
 153 address from the calling function, go up one stack frame and repeat the
 154 analysis.
 155
 156
 157 Process Starting Point
 158 ======================
 159
 160 All backtrace code needs a starting point which defines at least the stack
 161 pointer and instruction pointer, it may define other registers as well.  The
 162 first part of kdba_bt_stack() extracts the starting point.  Processes can be in
 163 one of three states, running (currently on a cpu), blocked (sleeping or ready to
 164 run but not currently on a cpu) or unknown.
 165
 166 For running processes, the current rsp and rip are dynamic.  Because KDB stops
 167 the entire machine by sending an interrupt to the other cpus, KDB can save the
 168 rsp and rip for each cpu at the point where KDB is entered.  This data is held
 169 in array kdb_running_process and is stored by kdb_save_running() and the arch
 170 specific kdba_save_running() functions.  When backtracing a running process, KDB
 171 uses the data in kdb_running_process as its starting point.
 172
 173 For blocked processes we always have the saved rsp, it is held in the process's
 174 thread_info.  For i386 blocked processes, thread_info also contains the saved
 175 rip.  For x86_64 blocked processes, rip is no longer saved in thread_info, it is
 176 assumed that all blocked processes will resume at assembler label thread_return,
 177 so that rip is used on x86_64.  See arch specific kdba_bt_stack_rip().
 178
 179 Unknown process state only occurs when the user does 'bt <stack_address>'.
 180 Unlike other bt commands, 'bt <stack_address>' does not identify any specific
 181 process, instead it identifies a kernel stack.  <stack_address> must be inside a
 182 valid kernel stack and must point to a saved rip from a call instruction.
 183 kdba_bt_x86.c uses the common kdba_get_stack_info() and arch specific
 184 kdba_get_stack_info_alternate() functions to check that the address falls within
 185 a valid kernel stack.  If the user gives a stack address that does not point to
 186 a saved rip from a call instruction then the backtrace will be garbage.
 187
 188
 189 Identifying The Current Function
 190 ================================
 191
 192 Given a rip value, KDB uses the kallsyms data to find the start of the function
 193 (first address <= rip) and the end of the function (next symbol in kallsyms).
 194 This works for plain C code because gcc only generates one label per function.
 195 It does not work for assembler code or for assembler code embedded in C
 196 functions, because the assembler labels appear as global entries in kallsyms.
 197 For example, arch/i386/kernel/entry.S has function ret_from_exception which
 198 contains three global labels ret_from_intr, check_userspace and
 199 resume_userspace.  If rip points to any of those global labels, KDB wants the
 200 start of the real function, i.e. ret_from_exception.  In addition, if rip points
 201 to ret_from_exception, KDB wants the end of the function to be after the last
 202 global label in that function, i.e. after resume_userspace.
 203
 204 The simplest way to handle these unwanted global labels is to list the spurious
 205 assembler labels, which is done in the arch specific array bb_spurious.  After
 206 mapping rip to the nearest start and end labels from kallsyms, kdb_bb() works
 207 backwards until it finds a non-spurious label then works forwards to the next
 208 non-spurious label.  That gives a real start and end address and a real name for
 209 the current function.
 210
 211 Note that this algorithm only applies in kdb_bb() when it maps rip to a suitable
 212 start and end address.  When disassembling the code, you will still see the
 213 spurious label names, users need to see the extra labels.  ret_from_exception on
 214 i386 disassembles like this (2.6.22) :-
 215
 216 [0]kdb> id ret_from_exception
 217 0xc0102554 ret_from_exception:         cli
 218 0xc0102555 ret_from_intr:         mov    $0xfffff000,%ebp
 219 0xc010255a ret_from_intr+0x5:     and    %esp,%ebp
 220 0xc010255c check_userspace:         mov    0x34(%esp),%eax
 221 0xc0102560 check_userspace+0x4:     mov    0x30(%esp),%al
 222 0xc0102564 check_userspace+0x8:     and    $0x20003,%eax
 223 0xc0102569 check_userspace+0xd:     cmp    $0x3,%eax
 224 0xc010256c check_userspace+0x10:    jb     0xc010258c resume_kernel
 225 0xc0102572 check_userspace+0x16:    mov    %esi,%esi
 226 0xc0102574 resume_userspace:         cli
 227 0xc0102575 resume_userspace+0x1:     mov    0x8(%ebp),%ecx
 228 0xc0102578 resume_userspace+0x4:     and    $0xfe3e,%ecx
 229 0xc010257e resume_userspace+0xa:     jne    0xc01026f4 work_pending
 230 0xc0102584 resume_userspace+0x10:    jmp    0xc01026a7 restore_all
 231 0xc0102589 resume_userspace+0x15:    lea    0x0(%esi),%esi
 232 0xc010258c resume_kernel:         cli
 233
 234 For the purposes of kdba_bt_x86.c, any rip from 0xc0102554 to 0xc0102589 needs
 235 to map to the range 0xc0102554 (start) to 0xc010258c (end) with function name
 236 ret_from_exception.  Therefore ret_from_intr, check_userspace and
 237 resume_userspace are listed in bb_spurious[] for i386 so those symbols are
 238 ignored.  The comments in bb_spurious[] list the function that encloses each
 239 spurious label, those comments are only for humans, they do not affect the code.
 240
 241 Once rip has been mapped to non-spurious labels, the module name, function name,
 242 start and end address are stored in variables bb_mod_name, bb_func_name,
 243 bb_func_start, bb_func_end.  These variables are used throughout kdba_bt_x86.c
 244 for processing each function in turn.
 245
 246 Watch for changes to assembler code, especially in arch/i386/kernel/entry.S,
 247 arch/x86_64/kernel/entry.S and arch/x86_64/ia32/ia32entry.S.  When new labels
 248 are added you may need to adjust bb_spurious[] for that architecture.  Running
 249 bb_all can help identify assembler labels that have been added or deleted.
 250
 251
 252 Tracking the Effects of Instructions
 253 ====================================
 254
 255 bb_pass2_do_changed_blocks() uses the KDB disassembler to decode the x86
 256 instructions to something a human can read.  bb_dis_pass2() is used as a print
 257 routine to store data for a single instruction in a buffer then
 258 bb_parse_buffer() starts the analysis.  Any instruction prefixes like lock or
 259 rep are stripped out.  The opcode string is isolated then up to 3 operands are
 260 extracted (imul can have 3 operands), these are src, dst and dst2.  The operand
 261 is matched against table bb_opcode_usage_all[] which lists all the instructions
 262 that actually appear in i386 and x86_64 kernels.  A lot of the x86 instrcution
 263 set is not used by the kernel so instructions such as SSE do not appear in
 264 bb_opcode_usage_all[].
 265
 266 Each operand is decoded by bb_parse_operand() to see whether it has a segment
 267 prefix, displacement, base, index or scale.  An indirect call or jmp is
 268 identified.  Operands consisting only of a register are classified as 'reg'
 269 type, displacements starting with '$' are immediate values otherwise the operand
 270 refers to a memory location.  Any base or index register name is mapped to the
 271 abstract register name that contains it, this takes care of mapping (say) %ah to
 272 rax.
 273
 274 After decoding the opcode and all its operands, bb_usage() decides what effect
 275 the instruction will have on the abstract state machine.  Some x86 instructions
 276 list all the affected registers in their operands and these can be handled as
 277 generic cases.  Alas many x86 instructions have side effects and change
 278 registers that are not listed in the operands, these have to be handled as
 279 special cases.  enum bb_operand_usage lists the generic and special cases.
 280
 281 bb_usage() is basically one huge switch statement over the special values in
 282 enum bb_operand_usage.  For each special case it tracks the side effects of the
 283 instruction.  Once all the special cases have been handled and converted to
 284 generic cases then bb_usage() handles the generic cases.
 285
 286 bb_usage() detects when a register is copied to another register, a register is
 287 copied to stack or a known stack value is copied to a register and updates the
 288 state data accordingly.  It is particularly important that all stack pointer
 289 updates and copies of the stack pointer are tracked, much of the saved state is
 290 on stack and can be accessed via any register that points to the stack, not just
 291 via rsp.
 292
 293 i386 built with 4K stacks and all x86_64 builds have multiple kernel stacks.
 294 bb_usage() knows which instructions or locations are used to switch stacks and
 295 pretends that these instructions have no effect on the contents of rsp.  The
 296 higher level backtrace code knows how to handle stack switching, it is too
 297 complicated for basic block analysis.
 298
 299
 300 Transfer of Control Outside the Current Function
 301 ================================================
 302
 303 Ignoring call instructions, most C code does not transfer control outside the
 304 current function, IOW there are no jump instructions to instructions outside the
 305 function.  There are a few cases that this can occur for C code, inline
 306 assembler and tail recursion.
 307
 308 Tail recursion occurs when a function ends by returning the value from a second
 309 function and that second function has exactly the same arguments and return
 310 value as the current function.  For example,
 311
 312     int bar(int i, char *p)
 313     {
 314         ... do some work and return an int ...
 315     }
 316
 317     int foo(int i, char *p)
 318     {
 319         return bar(i, p);
 320     }
 321
 322 If tail recursion is active (gcc -foptimize-sibling-calls) then instead of foo
 323 calling bar, bar returning to foo then foo returning to its caller, gcc will end
 324 foo with a direct jmp to bar.  The source code says that something called foo
 325 but the stack trace will show bar is active, with no sign of foo on stack.  When
 326 bar returns it will use the return address from the code that called foo.
 327
 328 bb_transfer() detects an unconditional jmp to code outside the function body and
 329 assumes that this represents tail recursion.  For tail recursion to work
 330 correctly, all the preserved registers must contain their original values,
 331 bb_sanity_check() validates this.  Any deviation from the expected state will
 332 stop basic block analysis and fall back on the old unreliable backtrace code.
 333
 334 Besides tail recursion in C code, assembler code can jump to labels outside the
 335 current function.  Unfortunately this occurs all the time in the twisty
 336 assembler code and, to make things worse, many of these transfers are done with
 337 non-standard register or memory state.  bb_special_case() and the arch specific
 338 bb_special_cases[] handle all the known special cases, including what the
 339 register and/or memory state should be.  Any deviation from the expected state
 340 will stop basic block analysis and fall back on the old unreliable backtrace
 341 code.
 342
 343
 344 Locating Arguments
 345 ==================
 346
 347 Function arguments can be passed in registers or on stack.  The full ABI for
 348 passing arguments is described in
 349
 350   http://www.caldera.com/developers/devspecs/abi386-4.pdf
 351   http://www.x86-64.org/documentation/abi.pdf
 352
 353 The short description, ignoring special cases like passing structures by name
 354 and floating point arguments which tend not to apply to the kernel, is :-
 355
 356 i386.   With -mpregparm=0, all arguments are passed on stack, except for
 357         functions defined as FASTCALL, where the first 3 arguments are passed in
 358         registers.
 359
 360         With -mregparm=3, the first 3 arguments are passed in registers except
 361         for functions defined as asmlinkage or with variable number of
 362         arguments, when arguments are still passed on stack.  -mpregparm=3 used
 363         to be a config option, in recent kernels it is the default.
 364
 365         Arguments defined as long long (64 bit) are passed in two registers or
 366         in two locations on stack.  Being passed in two pieces makes a 64 bit
 367         argument look like two 32 bit arguments to KDB, it will be printed as
 368         two separate arguments.
 369
 370         When compiled with -mregparm=3, if a 64 bit argument is argument number
 371         2 then it will not be split between register and stack, instead it will
 372         all be on stack and the third argument register will not be used.  This
 373         makes it look like there is an extra argument in the list.  There is
 374         nothing that KDB can do to detect these corner cases with 64 bit
 375         arguments on i386, which is a pity because they can confuse users.
 376
 377         The preserved registers are ebx, ebp, esp, esi, edi.  Arguments are
 378         passed in eax, edx, ecx.  The return value is in eax.
 379
 380 x86_64. The first 6 arguments are passed in registers, the 7th and later
 381         arguments are passed on stack.  Except for functions with a variable
 382         number of arguments (e.g. printk) where all arguments are on stack
 383         except for rax which contains the number of SSE arguments (always 0 for
 384         the kernel).
 385
 386         The preserved registers are rbx, rbp, rsp, r12, r13, r14, r15.
 387         Arguments are passed in rdi, rsi, rdx, rcx, r8, r9.  The return value is
 388         in rax.
 389
 390 For both architectures, kdba_bt detects an argument that is passed in a register
 391 by the fact that the function code reads from that argument type register while
 392 it contains its original value.  IOW, testing the value of rax, copying rax to
 393 another register or storing it on stack without first overwriting rax means that
 394 rax contains a useful input value.  Reading from memory which is above the
 395 original stack pointer means that there is a argument at that location on
 396 stack.
 397
 398 There are some functions in the kernel whose definition contains arguments that
 399 are not actually used.  Typically these functions are instantiations of generic
 400 function definitions where some, but not all, instantiations use all the
 401 arguments.  For example, a filesystem function may take flags that are not used
 402 by this particular filesystem, but the function definition has to match the
 403 generic filesystem declarations.  If the unused arguments are at the end of the
 404 list then there is no way of telling from the object code that they exist, the
 405 function that does not use the trailing aguments will have no code that refers
 406 to them.  KDB will print a truncated argument list for this case.
 407
 408 If the unused arguments are not at the end of the list then KDB can detect the
 409 presence of the unused arguments, because there is code that refers to later
 410 arguments.  KDB will print the unused argument, although gcc may have
 411 overwritten the register it is in, in which case KDB prints "invalid".
 412
 413 Originally kdba_bt_x86 would detect that there was no reference to arguments in
 414 registers but there were still references to arguments on stack and assume that
 415 the function had all its arguments on stack.  Unfortunately this did not work
 416 with the large number of 'pass through' functions in the kernel.  A 'pass
 417 through' function is one which calls another function with roughly the same
 418 argument list and makes no other reference to the register arguments.  For
 419 example, ipv4_doint_and_flush_strategy() takes 7 arguments, calls
 420 devinet_conf_sysctl() with those 7 arguments in the same order and has no other
 421 reference to any of its arguments.
 422
 423 Pass through functions do not touch the arguments that are passed in registers
 424 because they are already in the right location for the routine about to be
 425 called, so the pass through function has no code that references the argument
 426 registers.  No code means that kdba_bt_x86 cannot tell if the function has
 427 register arguments or not.  The arguments passed on stack must always be copied
 428 to the new stack frame, even for pass through functions, so the arguments on
 429 stack can always be detected.
 430
 431 kdba_bt_x86 was changed to assume that if there are any arguments on stack then
 432 there are always arguments in registers, except for a list of functions that are
 433 known to be asmlinkage or to have a variable number of arguments.
 434 bb_assume_pass_through() ignores the known special cases, for other functions
 435 which have stack arguments but no register arguments it assumes the function is
 436 pass through and prints a warning about that assumption.
 437
 438 The above heuristics mean that there is one case that kdba_bt_x86 cannot detect:
 439 pass through functions where all the arguments are in registers.  These have no
 440 argument references at all in their code, so they are printed with no arguments.
 441 All they do is call another function so this class of functions never fails, or
 442 if it does fail then it is due to something that is not argument related.  If
 443 the failure is further down the call stack then the arguments are printed at the
 444 next function down the stack, so the user still has the arguments.
 445
 446 This list of limitations on getting the x86 arguments may seem to be a long one,
 447 but kdba_bt_x86 gives sensible results for most functions.  For kernel
 448 debugging, any arguments are far better than none at all.
 449
 450
 451 Kernel Stack Switching
 452 ======================
 453
 454 Understanding the unusual way that x86 kernel stacks are used is very important
 455 when diagnosing backtrace problems.  Every process has its own normal kernel
 456 stack, even processes that run entirely within the kernel such as kthread or the
 457 per cpu migration processes.  The normal stacks are 4K or 8K on i386 (depending
 458 on CONFIG_4KSTACKS) and 8K on x86_64.  The normal stacks are global, they are
 459 not tied to any cpu.
 460
 461 For i386 with 8K stacks there are no other kernel stacks so there is no stack
 462 switching to worry about.
 463
 464 For i386 with 4K process stacks, each cpu also has a 4K soft irq stack and a 4K
 465 hard irq stack.  It is possible for a process to be running on its own process
 466 stack, for the process to be interrupted by a soft irq which is then interrupted
 467 by a hard irq.  At that point the backtrace is split between the hard irq, the
 468 soft irq and the normal normal stacks.
 469
 470 On x86_64, each cpu always has stacks for stackfault, doublefault, nmi, debug,
 471 mce and interrupts.  See Documentation/x86_64/kernel-stacks.
 472
 473 The arch specific kdba_get_stack_info_alternate() function works out which stack
 474 the backtrace starts on, how big the stack is and how to switch to the next
 475 stack.  This information is stored in the kdb_activation_record and used by the
 476 higher level backtrace code to detect a stack switch.
 477
 478 The normal stack has some padding at the end, this reflects the stack pointer
 479 when the process was created in the kernel.  kdba_bt_x86 cannot backtrace
 480 through this padding data, mainly because the code that set the nitial stack
 481 pointer no longer exists after boot.  ARCH_NORMAL_PADDING defines how many words
 482 to ignore at the end of the normal stack.
 483
 484
 485 Debugging KDB
 486 =============
 487
 488 KDB has conditional debugging print statements scattered throughout the code.
 489 If KDB is not behaving as expected, you can turn on debugging and rerun the
 490 command.  Before debugging KDB, set LINES 10000 and capture the output via a
 491 serial console.  If using minicom, turn on word wrap (control-A W) and capture
 492 mode (control-A L).  If you are using a serial console via a serial to Ethernet
 493 interface using ssh or telnet, use the 'script' command to start the session.
 494
 495 The various KDB_DEBUG_FLAG_* flags are listed in include/linux/kdbprivate.h.
 496 You set debug with 'set KDBDEBUG 0xnn' where nn is the or'd value of the desired
 497 flags.  'set KDBDEBUG 0' turns off KDB debugging.  When diagnosing x86 backtrace
 498 problems, the most useful debug flags are
 499
 500   KDB_DEBUG_FLAG_ARA            0x10    Activation record, arch specific
 501   KDB_DEBUG_FLAG_BB_SUMM        0x04    Basic block analysis, summary only
 502   KDB_DEBUG_FLAG_BB             0x20    All basic block analysis
 503
 504 ARA prints information about the different kernel stacks as kdba_bt_x86 unwinds
 505 through the switched kernel stacks.  BB_SUMM prints a summary of the basic block
 506 analysis for each function, including the abstract exit state and the rollback
 507 calculations.  BB prints a huge amount of basic block debugging, you probably
 508 only want to turn this for the full backtrace on as a last resort.
 509
 510 I find 'set KDBDEBUG 0x14' to be best to get an overview of a problem.  It gives
 511 both the kernel stack information plus the abstract state and actual location of
 512 data for each function.
 513
 514 Command 'bb1' does a detailed debug session for a single function, bb1 takes a
 515 single parameter, the address of the exit point from the function, by number,
 516 not by name.  bb1 turns on KDB_DEBUG_FLAG_BB, does basic block analysis for the
 517 function that contains the exit point then resets the debug flags to their
 518 previous value.
 519
 520 Command 'bb_all' runs through every function in the base kernel (not module
 521 functions) and does a basic block analysis of every function.  It also validates
 522 the various tables in kdba_bt_x86 where possible.  bb_all is meant for the KDB
 523 maintainer to check that all the base kernel function pass the sanity checks, it
 524 can also be used by end users when reporting a bug.  bb_all takes no parameters.
 525 It prints a '.' for every 100 functions it has analysed and allows for up to 20
 526 errors before giving up.  The output from bb_all also includes the config
 527 variables that affect basic block analysis plus any assumptions about 'pass
 528 through' functions.
 529
 530
 531 Submitting a Bug Report Against kdba_bt_x86
 532 ===========================================
 533
 534 Capture the KDB output via a serial console.
 535
 536 set LINES 10000
 537 set BTSP 1
 538 set KDBDEBUG 0x14
 539 Reproduce the problem.
 540 set KDBDEBUG 0
 541 bb_all
 542
 543 If you can identify the rip/eip where kdba_bt_x86 gets confused, run bb1 with
 544 that address.
 545
 546 Find each set of output from kdba_get_stack_info in the trace, extract the last
 547 two lines and type those lines into KDB.  That will give a hex and symbolic dump
 548 of the raw kernel stacks.  For example, if the trace data is
 549
 550 kdba_get_stack_info: esp=0xc04fbef8 cpu=0 task=c047b3e0
 551 kdba_get_stack_info: ar->stack
 552     physical_start=0xc04fb000
 553     physical_end=0xc04fc000
 554     logical_start=0xc04fb038
 555     logical_end=0xc04fc000
 556     next=0xc04b4f44
 557     id=hardirq_ctx
 558     set MDCOUNT 1024
 559     mds 0xc04fb000
 560
 561 then type the last two lines into KDB.  Repeat this for each stack listed by
 562 kdba_get_stack_info on the failing backtrace.
 563
 564 Send all the console output to the KDB maintainer.
 565
 566
 567 Examples of Basic Block Debugging Output
 568 ========================================
 569
 570 Example of the basic block analysis of fs/namei::getname() on i386.  Kernel
 571 2.6.22, i386, compiled with frame pointers, gcc 4.1.0.
 572
 573 Basic block debugging is very verbose, so set a high number of output lines.
 574 You really need a reliable serial console to capture this amount of output.
 575
 576     [0]kdb> set LINES 10000
 577
 578 A simple disassemble of getname().  This is not required for debugging purposes
 579 since each instruction is printed as part of basic block debugging, but this can
 580 be easier to read.
 581
 582     [0]kdb> id getname
 583     0xc015cce8 getname:         push   %ebp
 584     0xc015cce9 getname+0x1:     mov    %esp,%ebp
 585     0xc015cceb getname+0x3:     push   %edi
 586     0xc015ccec getname+0x4:     push   %esi
 587     0xc015cced getname+0x5:     push   %ebx
 588     0xc015ccee getname+0x6:     sub    $0x4,%esp
 589     0xc015ccf1 getname+0x9:     mov    %eax,%edi
 590     0xc015ccf3 getname+0xb:     mov    $0xd0,%edx
 591     0xc015ccf8 getname+0x10:    mov    0xc04b2120,%eax
 592     0xc015ccfd getname+0x15:    call   0xc0153009 kmem_cache_alloc
 593     0xc015cd02 getname+0x1a:    mov    %eax,0xfffffff0(%ebp)
 594     0xc015cd05 getname+0x1d:    mov    $0xfffffff4,%eax
 595     0xc015cd0a getname+0x22:    cmpl   $0x0,0xfffffff0(%ebp)
 596     0xc015cd0e getname+0x26:    je     0xc015cd7d getname+0x95
 597     0xc015cd10 getname+0x28:    mov    %esp,%eax
 598     0xc015cd12 getname+0x2a:    and    $0xfffff000,%eax
 599     0xc015cd17 getname+0x2f:    cmpl   $0xffffffff,0x18(%eax)
 600     0xc015cd1b getname+0x33:    je     0xc015cd39 getname+0x51
 601     0xc015cd1d getname+0x35:    mov    $0xfffffff2,%esi
 602     0xc015cd22 getname+0x3a:    cmp    $0xbfffffff,%edi
 603     0xc015cd28 getname+0x40:    ja     0xc015cd60 getname+0x78
 604     0xc015cd2a getname+0x42:    mov    $0xc0000000,%ebx
 605     0xc015cd2f getname+0x47:    sub    %edi,%ebx
 606     0xc015cd31 getname+0x49:    cmp    $0xfff,%ebx
 607     0xc015cd37 getname+0x4f:    jbe    0xc015cd3e getname+0x56
 608     0xc015cd39 getname+0x51:    mov    $0x1000,%ebx
 609     0xc015cd3e getname+0x56:    mov    %ebx,%ecx
 610     0xc015cd40 getname+0x58:    mov    %edi,%edx
 611     0xc015cd42 getname+0x5a:    mov    0xfffffff0(%ebp),%eax
 612     0xc015cd45 getname+0x5d:    call   0xc023dbb4 strncpy_from_user
 613     0xc015cd4a getname+0x62:    cmp    $0x0,%eax
 614     0xc015cd4d getname+0x65:    jle    0xc015cd5a getname+0x72
 615     0xc015cd4f getname+0x67:    mov    $0xffffffdc,%esi
 616     0xc015cd54 getname+0x6c:    cmp    %ebx,%eax
 617     0xc015cd56 getname+0x6e:    jae    0xc015cd60 getname+0x78
 618     0xc015cd58 getname+0x70:    jmp    0xc015cd71 getname+0x89
 619     0xc015cd5a getname+0x72:    je     0xc015cd76 getname+0x8e
 620     0xc015cd5c getname+0x74:    jge    0xc015cd71 getname+0x89
 621     0xc015cd5e getname+0x76:    mov    %eax,%esi
 622     0xc015cd60 getname+0x78:    mov    0xfffffff0(%ebp),%edx
 623     0xc015cd63 getname+0x7b:    mov    0xc04b2120,%eax
 624     0xc015cd68 getname+0x80:    call   0xc01521f1 kmem_cache_free
 625     0xc015cd6d getname+0x85:    mov    %esi,%eax
 626     0xc015cd6f getname+0x87:    jmp    0xc015cd7d getname+0x95
 627     0xc015cd71 getname+0x89:    mov    0xfffffff0(%ebp),%eax
 628     0xc015cd74 getname+0x8c:    jmp    0xc015cd7d getname+0x95
 629     0xc015cd76 getname+0x8e:    mov    $0xfffffffe,%esi
 630     0xc015cd7b getname+0x93:    jmp    0xc015cd60 getname+0x78
 631     0xc015cd7d getname+0x95:    pop    %edx
 632     0xc015cd7e getname+0x96:    pop    %ebx
 633     0xc015cd7f getname+0x97:    pop    %esi
 634     0xc015cd80 getname+0x98:    pop    %edi
 635     0xc015cd81 getname+0x99:    pop    %ebp
 636     0xc015cd82 getname+0x9a:    ret
 637
 638 The bb1 command only one argument which must be an address, not a name.  bb1
 639 turns on full basic block debugging and analyses the function containing the
 640 supplied address.  Give bb1 the address of the exit point from this function,
 641 IOW the return address that is stored on stack due to a call from this function
 642 to the next function down the call stack.  Assume that getname() has called
 643 kmem_cache_free() and something went wrong in kmem_cache_free() or one of the
 644 functions that it calls.  The call to kmem_cache_free is at 0xc015cd68 and the
 645 return address on stack is the instruction after the call, i.e. 0xc015cd6d, so
 646
 647     [0]kdb> bb1 0xc015cd6d
 648     bb_pass1: func_name getname func_start 0xc015cce8 func_end 0xc015cd83
 649
 650 bb_pass1 has identified the function name and its start and end address.  For C
 651 functions these are just the function start address and the next symbol in
 652 kallsyms.  For Assembler code there may be spurious labels so the function name
 653 may not match the label prior to the address given to bb1.  For an example of
 654 that on i386, find the address of resume_userspace then pass that address to the
 655 bb1 KDB command.
 656
 657     bb_pass1: end
 658       bb[0] start 0xc015cce8 end 0xc015cd38 drop_through 1
 659       bb[1] start 0xc015cd39 end 0xc015cd3d drop_through 1
 660       bb[2] start 0xc015cd3e end 0xc015cd58 drop_through 0
 661       bb[3] start 0xc015cd5a end 0xc015cd5f drop_through 1
 662       bb[4] start 0xc015cd60 end 0xc015cd6f drop_through 0
 663       bb[5] start 0xc015cd71 end 0xc015cd74 drop_through 0
 664       bb[6] start 0xc015cd76 end 0xc015cd7b drop_through 0
 665       bb[7] start 0xc015cd7d end 0xc015cd82 drop_through 0
 666       bb_jmp[0] from 0xc015cd0e to 0xc015cd7d drop_through 0
 667       bb_jmp[1] from 0xc015cd1b to 0xc015cd39 drop_through 0
 668       bb_jmp[2] from 0xc015cd28 to 0xc015cd60 drop_through 0
 669       bb_jmp[3] from 0xc015cd37 to 0xc015cd3e drop_through 0
 670       bb_jmp[4] from 0xc015cd4d to 0xc015cd5a drop_through 0
 671       bb_jmp[5] from 0xc015cd56 to 0xc015cd60 drop_through 0
 672       bb_jmp[6] from 0xc015cd58 to 0xc015cd71 drop_through 0
 673       bb_jmp[7] from 0xc015cd5a to 0xc015cd76 drop_through 0
 674       bb_jmp[8] from 0xc015cd5c to 0xc015cd71 drop_through 0
 675       bb_jmp[9] from 0xc015cd6f to 0xc015cd7d drop_through 0
 676       bb_jmp[10] from 0xc015cd74 to 0xc015cd7d drop_through 0
 677       bb_jmp[11] from 0xc015cd7b to 0xc015cd60 drop_through 0
 678       bb_jmp[12] from 0xc015cd38 to 0xc015cd39 drop_through 1
 679       bb_jmp[13] from 0xc015cd3d to 0xc015cd3e drop_through 1
 680       bb_jmp[14] from 0xc015cd5f to 0xc015cd60 drop_through 1
 681
 682 After analysing the logic flow, we can see that getname() consists of 8 basic
 683 blocks (nodes in bb_list[]).  5 of these blocks end in unconditional jumps, the
 684 other 3 drop through to the next block.  There are 15 transfers of control
 685 (vertices in bb_jmp_list[]).  12 of these transfers are explicit jmp or jcc
 686 instructions, the other 3 are implicit transfers when dropping through from one
 687 block to the next.  The node list is sorted by start address, the vertex list is
 688 not sorted.
 689
 690 Basic block 0 starts at the function start (0xc015cce8) and ends at 0xc015cd38.
 691 0xc015cd39 is the target of a jump instruction (0xc015cd1b: je 0xc015cd39) so
 692 0xc015cd39 starts a new block, which means that 0xc015cd38 becomes the end of
 693 the previous block.  Because bb[0] does not end in an explicit jmp instruction,
 694 there is a drop through from the end of bb[0] to the start of bb[1], see
 695 bb_jmp[12].
 696
 697     bb_pass2: start
 698
 699 To get the most accurate results from pass2, try to scan the directed graph by
 700 only looking at nodes whose inputs are all defined.  Initially only process
 701 nodes with no missing inputs.
 702
 703       bb_pass2_do_changed_blocks: allow_missing 0
 704
 705       bb[0]
 706       bb_reg_state c07282e0
 707         rax = rax
 708         rbx = rbx
 709         rcx = rcx
 710         rdx = rdx
 711         rdi = rdi
 712         rsi = rsi
 713         rbp = rbp
 714         rsp = osp+0x0
 715
 716 The initial state for bb[0] is the same for all C functions.  Each register
 717 contains its own abstract value, except for rsp which is defined in terms of the
 718 original stack pointer (osp).
 719
 720      '0xc015cce8 getname:         push   %ebp'
 721
 722 The first instruction of getname() saves the frame pointer.
 723
 724       opcode 'push' matched by 'push', usage 44
 725       src  R: %ebp base_rc 8 (rbp)
 726
 727 bb_usage() reports how the instruction was recognised and how its operands were
 728 decoded.  Although this is i386 (ebp), it is reported as rbp.  Using the x86_64
 729 names for registers throughout makes it easier to create common code for the two
 730 architecures.
 731
 732       rsp osp offset +0x0 -> -0x4
 733
 734 A push instruction decrements rsp by 4 (i386) or 8 (x86_64) bytes.  rsp
 735 originally contained the original stack pointer (osp), now it contains the
 736 original stack pointer - 4.
 737
 738       *(rsp+0x0 osp-0x4) = rbp slot 0
 739
 740 The stack location pointed to by *rsp now contains the original value of rbp.
 741 Since rsp contains (osp-0x4), *(osp-0x4) contains rbp.  It is slot 0 in the
 742 memory array associated with the register state.
 743
 744      '0xc015cce9 getname+0x1:     mov    %esp,%ebp'
 745       opcode 'mov' matched by 'mov', usage 36
 746       src  R: %esp base_rc 9 (rsp)
 747       dst  R: %ebp base_rc 8 (rbp)
 748       rbp = rsp (osp-0x4)
 749
 750 Copy esp (rsp) to ebp (rbp).  rsp contained (osp-0x4) so rbp also contains
 751 (osp-0x4).  Any reference to data via either rbp or rsp will now be tracked as a
 752 stack location.
 753
 754      '0xc015cceb getname+0x3:     push   %edi'
 755       opcode 'push' matched by 'push', usage 44
 756       src  R: %edi base_rc 6 (rdi)
 757       rsp osp offset -0x4 -> -0x8
 758       *(rsp+0x0 osp-0x8) = rdi slot 1
 759      '0xc015ccec getname+0x4:     push   %esi'
 760       opcode 'push' matched by 'push', usage 44
 761       src  R: %esi base_rc 7 (rsi)
 762       rsp osp offset -0x8 -> -0xc
 763       *(rsp+0x0 osp-0xc) = rsi slot 2
 764      '0xc015cced getname+0x5:     push   %ebx'
 765       opcode 'push' matched by 'push', usage 44
 766       src  R: %ebx base_rc 3 (rbx)
 767       rsp osp offset -0xc -> -0x10
 768       *(rsp+0x0 osp-0x10) = rbx slot 3
 769
 770 Push 3 registers to stack.  rsp is adjusted for each push and stack locations
 771 are assigned to contain the values of edi, esi and ebx.  This sequence is very
 772 common in i386 C functions.  edi, esi and ebx are preserved registers on i386,
 773 but gcc wants to use them for scratch space.  The original contents iof these
 774 registers must be saved on stack and restored before returning to our caller.
 775
 776      '0xc015ccee getname+0x6:     sub    $0x4,%esp'
 777       opcode 'sub' matched by 'sub', usage 51
 778       src  I: $0x4
 779       dst  R: %esp base_rc 9 (rsp)
 780       rsp osp offset -0x10 -> -0x14
 781
 782 Subtract 4 bytes from esp.  This defines the local stack variables.  Sorry,
 783 names for local stack variables are not available to KDB.
 784
 785      '0xc015ccf1 getname+0x9:     mov    %eax,%edi'
 786       opcode 'mov' matched by 'mov', usage 36
 787       src  R: %eax base_rc 2 (rax)
 788       dst  R: %edi base_rc 6 (rdi)
 789       rdi = rax (rax)
 790
 791 Having saved edi on stack, gcc now overwrites edi with eax.  At this point rax
 792 still contains its original value, so rdi now contains a copy of rax, as well as
 793 the original value which is still in rax.  This is a common sequence in C code.
 794 rax contains argument 0 but it is also a scratch register.  If the code needs to
 795 use argument 0 later then its value must be saved somewhere before executing any
 796 instruction that changes rax.  edi is a preserved register so its contents will
 797 not be changed by any function that we call, or if it is changed then it will be
 798 restored before returning to this function.
 799
 800 rax is listed in the arch specific bb_param_reg[] list and the code is reading
 801 from rax while it still contains its original value.  The only way that makes
 802 any sense is when rax is an input argument to getname().  We note that fact in
 803 bb_reg_read().
 804
 805      '0xc015ccf3 getname+0xb:     mov    $0xd0,%edx'
 806       opcode 'mov' matched by 'mov', usage 36
 807       src  I: $0xd0
 808       dst  R: %edx base_rc 5 (rdx)
 809       rdx = undefined
 810
 811 Moving an constant value to edx.  Although this is a constant, it does not refer
 812 to any of the original values that were supplied to this function.  Therefore
 813 rdx becomes undefined for the purposes of the code analysis.
 814
 815      '0xc015ccf8 getname+0x10:    mov    0xc04b2120,%eax'
 816       opcode 'mov' matched by 'mov', usage 36
 817       src  M: 0xc04b2120
 818       dst  R: %eax base_rc 2 (rax)
 819       rax = undefined
 820
 821 Moving a constant value to eax makes rax undefined.
 822
 823      '0xc015ccfd getname+0x15:    call   0xc0153009 <kmem_cache_alloc>'
 824       opcode 'call' matched by 'call', usage 17
 825       src  M: 0xc0153009
 826       bb_reg_state c0728658
 827         rax = undefined
 828         rbx = rbx
 829         rcx = rcx
 830         rdx = undefined
 831         rdi = rax
 832         rsi = rsi
 833         rbp = osp-0x4
 834         rsp = osp-0x14
 835         slot 0 offset_address -0x4 rbp
 836         slot 1 offset_address -0x8 rdi
 837         slot 2 offset_address -0xc rsi
 838         slot 3 offset_address -0x10 rbx
 839
 840 Basic block debugging prints the register and memory state when transfering
 841 control between blocks and when issuing call instructions.  The call state is
 842 mainly useful when C code calls assembler routines, especially if you are not
 843 sure what state the assembler code expects.  Not all of our assembler is as well
 844 documented as it could be :(
 845
 846       rax = undefined
 847       rcx = undefined
 848       rdx = undefined
 849
 850 The i386 ABI says that some registers are preserved across calls, see the arch
 851 specific bb_preserved_reg[] list.  Any registers not in that list automatically
 852 become undefined after a call instruction.
 853
 854      '0xc015cd02 getname+0x1a:    mov    %eax,0xfffffff0(%ebp)'
 855       opcode 'mov' matched by 'mov', usage 36
 856       src  R: %eax base_rc 2 (rax)
 857       dst  M: 0xfffffff0(%ebp) base_rc 8 (rbp)
 858
 859 eax is the return value from the call, it is being saved at offset 0xfffffff0
 860 (-0x10) from ebp.  Since rbp contains (osp-0x4) the return value is being stored
 861 at (osp-0x14).  This is a stack location but we have no record of any data being
 862 held at that location, it is part of the local stack variables.
 863
 864      '0xc015cd05 getname+0x1d:    mov    $0xfffffff4,%eax'
 865       opcode 'mov' matched by 'mov', usage 36
 866       src  I: $0xfffffff4
 867       dst  R: %eax base_rc 2 (rax)
 868       rax = undefined
 869      '0xc015cd0a getname+0x22:    cmpl   $0x0,0xfffffff0(%ebp)'
 870       opcode 'cmpl' matched by 'cmp', usage 3
 871       src  I: $0x0
 872       dst  M: 0xfffffff0(%ebp) base_rc 8 (rbp)
 873      '0xc015cd0e getname+0x26:    je     0xc015cd7d <getname+0x95>'
 874       opcode 'je' matched by 'j', usage 28
 875       src  M: 0xc015cd7d
 876       bb_reg_state c0728658
 877         rax = undefined
 878         rbx = rbx
 879         rcx = undefined
 880         rdx = undefined
 881         rdi = rax
 882         rsi = rsi
 883         rbp = osp-0x4
 884         rsp = osp-0x14
 885         slot 0 offset_address -0x4 rbp
 886         slot 1 offset_address -0x8 rdi
 887         slot 2 offset_address -0xc rsi
 888         slot 3 offset_address -0x10 rbx
 889
 890 Transfer of control, print the register and memory state.
 891
 892       matched: from 0xc015cd0e to 0xc015cd7d drop_through 0 bb_jmp[0]
 893
 894 Which bb_jmp_list[] entry matches this transfer.
 895
 896       new state c07286b8
 897
 898 The current abstract register and memory state is cloned at address c07286b8.
 899 This state becomes one of the inputs to the basic block whose start address is
 900 0xc015cd7d.
 901
 902      '0xc015cd10 getname+0x28:    mov    %esp,%eax'
 903       opcode 'mov' matched by 'mov', usage 36
 904       src  R: %esp base_rc 9 (rsp)
 905       dst  R: %eax base_rc 2 (rax)
 906       rax = rsp (osp-0x14)
 907
 908 Copy rsp which contains (osp-0x14) to rax.  rax contains a valid stack pointer.
 909
 910      '0xc015cd12 getname+0x2a:    and    $0xfffff000,%eax'
 911       opcode 'and' matched by 'and', usage 11
 912       src  I: $0xfffff000
 913       dst  R: %eax base_rc 2 (rax)
 914       rax = undefined
 915
 916 But not for long.
 917
 918      '0xc015cd17 getname+0x2f:    cmpl   $0xffffffff,0x18(%eax)'
 919       opcode 'cmpl' matched by 'cmp', usage 3
 920       src  I: $0xffffffff
 921       dst  M: 0x18(%eax) base_rc 2 (rax)
 922      '0xc015cd1b getname+0x33:    je     0xc015cd39 <getname+0x51>'
 923       opcode 'je' matched by 'j', usage 28
 924       src  M: 0xc015cd39
 925       bb_reg_state c0728658
 926         rax = undefined
 927         rbx = rbx
 928         rcx = undefined
 929         rdx = undefined
 930         rdi = rax
 931         rsi = rsi
 932         rbp = osp-0x4
 933         rsp = osp-0x14
 934         slot 0 offset_address -0x4 rbp
 935         slot 1 offset_address -0x8 rdi
 936         slot 2 offset_address -0xc rsi
 937         slot 3 offset_address -0x10 rbx
 938
 939 Another transfer of control, print the state.
 940
 941       matched: from 0xc015cd1b to 0xc015cd39 drop_through 0 bb_jmp[1]
 942
 943 Which bb_jmp_list[] entry was used.
 944
 945       reuse bb_jmp[0]
 946
 947 To save space, we only clone the state if it is different.  Otherwise we reuse
 948 the state from another vertex and bump the reference count.
 949
 950      '0xc015cd1d getname+0x35:    mov    $0xfffffff2,%esi'
 951       opcode 'mov' matched by 'mov', usage 36
 952       src  I: $0xfffffff2
 953       dst  R: %esi base_rc 7 (rsi)
 954       rsi = undefined
 955
 956 Using esi as a scratch register, even though the i386 ABi says that esi is a
 957 preserved register.  Not to worry, the original value of rsi was saved on stack
 958 on entry and it will be restored before exit.
 959
 960      '0xc015cd22 getname+0x3a:    cmp    $0xbfffffff,%edi'
 961       opcode 'cmp' matched by 'cmp', usage 3
 962       src  I: $0xbfffffff
 963       dst  R: %edi base_rc 6 (rdi)
 964      '0xc015cd28 getname+0x40:    ja     0xc015cd60 <getname+0x78>'
 965       opcode 'ja' matched by 'j', usage 28
 966       src  M: 0xc015cd60
 967       bb_reg_state c0728658
 968         rax = undefined
 969         rbx = rbx
 970         rcx = undefined
 971         rdx = undefined
 972         rdi = rax
 973         rsi = undefined
 974         rbp = osp-0x4
 975         rsp = osp-0x14
 976         slot 0 offset_address -0x4 rbp
 977         slot 1 offset_address -0x8 rdi
 978         slot 2 offset_address -0xc rsi
 979         slot 3 offset_address -0x10 rbx
 980       matched: from 0xc015cd28 to 0xc015cd60 drop_through 0 bb_jmp[2]
 981       new state c0728710
 982
 983 This state is different from any states already saved, clone to a new entry.
 984
 985      '0xc015cd2a getname+0x42:    mov    $0xc0000000,%ebx'
 986       opcode 'mov' matched by 'mov', usage 36
 987       src  I: $0xc0000000
 988       dst  R: %ebx base_rc 3 (rbx)
 989       rbx = undefined
 990      '0xc015cd2f getname+0x47:    sub    %edi,%ebx'
 991       opcode 'sub' matched by 'sub', usage 51
 992       src  R: %edi base_rc 6 (rdi)
 993       dst  R: %ebx base_rc 3 (rbx)
 994       rbx = undefined
 995      '0xc015cd31 getname+0x49:    cmp    $0xfff,%ebx'
 996       opcode 'cmp' matched by 'cmp', usage 3
 997       src  I: $0xfff
 998       dst  R: %ebx base_rc 3 (rbx)
 999      '0xc015cd37 getname+0x4f:    jbe    0xc015cd3e <getname+0x56>'
1000       opcode 'jbe' matched by 'j', usage 28
1001       src  M: 0xc015cd3e
1002       bb_reg_state c0728658
1003         rax = undefined
1004         rbx = undefined
1005         rcx = undefined
1006         rdx = undefined
1007         rdi = rax
1008         rsi = undefined
1009         rbp = osp-0x4
1010         rsp = osp-0x14
1011         slot 0 offset_address -0x4 rbp
1012         slot 1 offset_address -0x8 rdi
1013         slot 2 offset_address -0xc rsi
1014         slot 3 offset_address -0x10 rbx
1015       matched: from 0xc015cd37 to 0xc015cd3e drop_through 0 bb_jmp[3]
1016       new state c0728768
1017
1018 This state is different from any states already saved, clone to a new entry.
1019
1020       bb_reg_state c0728658
1021         rax = undefined
1022         rbx = undefined
1023         rcx = undefined
1024         rdx = undefined
1025         rdi = rax
1026         rsi = undefined
1027         rbp = osp-0x4
1028         rsp = osp-0x14
1029         slot 0 offset_address -0x4 rbp
1030         slot 1 offset_address -0x8 rdi
1031         slot 2 offset_address -0xc rsi
1032         slot 3 offset_address -0x10 rbx
1033       matched: from 0xc015cd38 to 0xc015cd39 drop_through 1 bb_jmp[12]
1034       reuse bb_jmp[3]
1035
1036 Basic block 0 drops through to basic block 1, treat it as an implicit transfer
1037 of control.  The state is the same as the previous jump instruction so reuse it
1038 and bump the reference count.
1039
1040 That ends basic block 0, now pick the next block in the list that (a) needs to
1041 be scanned and (b) has all its input states defined.  In this case bb[1].
1042
1043       bb[1]
1044
1045 bb[1] starts at 0xc015cd39 and has two paths that transfer control to it.
1046 bb_jmp[1] from an explicit jump at 0xc015cd1b and a drop through at bb_jmp[12].
1047 Where there is more than one input state we have to merge them and reconcile the
1048 final value.
1049
1050       first state c07286b8
1051
1052 The first input state is stored at c07286b8.  Looking back through the trace we
1053 find that entry associated with bb_jmp[0], not bb_jmp[1] as expected.  However
1054 bb_jmp[1] reused the state that was stored for bb_jmp[0] so all is well.
1055
1056       bb_reg_state c0728658
1057         rax = undefined
1058         rbx = rbx
1059         rcx = undefined
1060         rdx = undefined
1061         rdi = rax
1062         rsi = rsi
1063         rbp = osp-0x4
1064         rsp = osp-0x14
1065         slot 0 offset_address -0x4 rbp
1066         slot 1 offset_address -0x8 rdi
1067         slot 2 offset_address -0xc rsi
1068         slot 3 offset_address -0x10 rbx
1069
1070 The first state for bb[1].
1071
1072       merging state c0728768
1073
1074 Now merge the second state, which is held at c0728768.
1075
1076       rbx = undefined
1077       rsi = undefined
1078
1079 The two states disagree on the values being tracked in rbx and rsi.  Compiler
1080 theory 101 says that if two or more paths to a basic block have different values
1081 for a register then that register cannot be relied on at the start of the block,
1082 so make it undefined.  The same logic applies to memory locations.
1083
1084       final state
1085       bb_reg_state c0728658
1086         rax = undefined
1087         rbx = undefined
1088         rcx = undefined
1089         rdx = undefined
1090         rdi = rax
1091         rsi = undefined
1092         rbp = osp-0x4
1093         rsp = osp-0x14
1094         slot 0 offset_address -0x4 rbp
1095         slot 1 offset_address -0x8 rdi
1096         slot 2 offset_address -0xc rsi
1097         slot 3 offset_address -0x10 rbx
1098
1099 After merging all the input states, this is the final starting state for bb[1].
1100 Now track what bb[1] does to the state.
1101
1102      '0xc015cd39 getname+0x51:    mov    $0x1000,%ebx'
1103       opcode 'mov' matched by 'mov', usage 36
1104       src  I: $0x1000
1105       dst  R: %ebx base_rc 3 (rbx)
1106       rbx = undefined
1107       bb_reg_state c0728658
1108         rax = undefined
1109         rbx = undefined
1110         rcx = undefined
1111         rdx = undefined
1112         rdi = rax
1113         rsi = undefined
1114         rbp = osp-0x4
1115         rsp = osp-0x14
1116         slot 0 offset_address -0x4 rbp
1117         slot 1 offset_address -0x8 rdi
1118         slot 2 offset_address -0xc rsi
1119         slot 3 offset_address -0x10 rbx
1120       matched: from 0xc015cd3d to 0xc015cd3e drop_through 1 bb_jmp[13]
1121       reuse bb_jmp[3]
1122
1123 bb[1] is a single instruction which drops through to bb[2].
1124
1125       bb[2]
1126       first state c0728768
1127       bb_reg_state c0728658
1128         rax = undefined
1129         rbx = undefined
1130         rcx = undefined
1131         rdx = undefined
1132         rdi = rax
1133         rsi = undefined
1134         rbp = osp-0x4
1135         rsp = osp-0x14
1136         slot 0 offset_address -0x4 rbp
1137         slot 1 offset_address -0x8 rdi
1138         slot 2 offset_address -0xc rsi
1139         slot 3 offset_address -0x10 rbx
1140       merging state c0728768
1141
1142 bb[2] has two inputs, both vertices are pointing to input state c0728768.
1143 Merging an entry with itself has no effect.
1144
1145      '0xc015cd3e getname+0x56:    mov    %ebx,%ecx'
1146       opcode 'mov' matched by 'mov', usage 36
1147       src  R: %ebx base_rc 3 (rbx)
1148       dst  R: %ecx base_rc 4 (rcx)
1149       rcx = rbx (undefined)
1150      '0xc015cd40 getname+0x58:    mov    %edi,%edx'
1151       opcode 'mov' matched by 'mov', usage 36
1152       src  R: %edi base_rc 6 (rdi)
1153       dst  R: %edx base_rc 5 (rdx)
1154       rdx = rdi (rax)
1155      '0xc015cd42 getname+0x5a:    mov    0xfffffff0(%ebp),%eax'
1156       opcode 'mov' matched by 'mov', usage 36
1157       src  M: 0xfffffff0(%ebp) base_rc 8 (rbp)
1158       dst  R: %eax base_rc 2 (rax)
1159       rax = *(rbp-0x10) (osp-0x14)  rax = undefined
1160      '0xc015cd45 getname+0x5d:    call   0xc023dbb4 <strncpy_from_user>'
1161       opcode 'call' matched by 'call', usage 17
1162       src  M: 0xc023dbb4
1163       bb_reg_state c0728658
1164         rax = undefined
1165         rbx = undefined
1166         rcx = undefined
1167         rdx = rax
1168         rdi = rax
1169         rsi = undefined
1170         rbp = osp-0x4
1171         rsp = osp-0x14
1172         slot 0 offset_address -0x4 rbp
1173         slot 1 offset_address -0x8 rdi
1174         slot 2 offset_address -0xc rsi
1175         slot 3 offset_address -0x10 rbx
1176       rax = undefined
1177       rcx = undefined
1178       rdx = undefined
1179      '0xc015cd4a getname+0x62:    cmp    $0x0,%eax'
1180       opcode 'cmp' matched by 'cmp', usage 3
1181       src  I: $0x0
1182       dst  R: %eax base_rc 2 (rax)
1183      '0xc015cd4d getname+0x65:    jle    0xc015cd5a <getname+0x72>'
1184       opcode 'jle' matched by 'j', usage 28
1185       src  M: 0xc015cd5a
1186       bb_reg_state c0728658
1187         rax = undefined
1188         rbx = undefined
1189         rcx = undefined
1190         rdx = undefined
1191         rdi = rax
1192         rsi = undefined
1193         rbp = osp-0x4
1194         rsp = osp-0x14
1195         slot 0 offset_address -0x4 rbp
1196         slot 1 offset_address -0x8 rdi
1197         slot 2 offset_address -0xc rsi
1198         slot 3 offset_address -0x10 rbx
1199       matched: from 0xc015cd4d to 0xc015cd5a drop_through 0 bb_jmp[4]
1200       reuse bb_jmp[3]
1201      '0xc015cd4f getname+0x67:    mov    $0xffffffdc,%esi'
1202       opcode 'mov' matched by 'mov', usage 36
1203       src  I: $0xffffffdc
1204       dst  R: %esi base_rc 7 (rsi)
1205       rsi = undefined
1206      '0xc015cd54 getname+0x6c:    cmp    %ebx,%eax'
1207       opcode 'cmp' matched by 'cmp', usage 3
1208       src  R: %ebx base_rc 3 (rbx)
1209       dst  R: %eax base_rc 2 (rax)
1210      '0xc015cd56 getname+0x6e:    jae    0xc015cd60 <getname+0x78>'
1211       opcode 'jae' matched by 'j', usage 28
1212       src  M: 0xc015cd60
1213       bb_reg_state c0728658
1214         rax = undefined
1215         rbx = undefined
1216         rcx = undefined
1217         rdx = undefined
1218         rdi = rax
1219         rsi = undefined
1220         rbp = osp-0x4
1221         rsp = osp-0x14
1222         slot 0 offset_address -0x4 rbp
1223         slot 1 offset_address -0x8 rdi
1224         slot 2 offset_address -0xc rsi
1225         slot 3 offset_address -0x10 rbx
1226       matched: from 0xc015cd56 to 0xc015cd60 drop_through 0 bb_jmp[5]
1227       reuse bb_jmp[3]
1228      '0xc015cd58 getname+0x70:    jmp    0xc015cd71 <getname+0x89>'
1229       opcode 'jmp' matched by 'j', usage 28
1230       src  M: 0xc015cd71
1231       bb_reg_state c0728658
1232         rax = undefined
1233         rbx = undefined
1234         rcx = undefined
1235         rdx = undefined
1236         rdi = rax
1237         rsi = undefined
1238         rbp = osp-0x4
1239         rsp = osp-0x14
1240         slot 0 offset_address -0x4 rbp
1241         slot 1 offset_address -0x8 rdi
1242         slot 2 offset_address -0xc rsi
1243         slot 3 offset_address -0x10 rbx
1244       matched: from 0xc015cd58 to 0xc015cd71 drop_through 0 bb_jmp[6]
1245       reuse bb_jmp[3]
1246
1247       bb[3]
1248       first state c0728768
1249       bb_reg_state c0728658
1250         rax = undefined
1251         rbx = undefined
1252         rcx = undefined
1253         rdx = undefined
1254         rdi = rax
1255         rsi = undefined
1256         rbp = osp-0x4
1257         rsp = osp-0x14
1258         slot 0 offset_address -0x4 rbp
1259         slot 1 offset_address -0x8 rdi
1260         slot 2 offset_address -0xc rsi
1261         slot 3 offset_address -0x10 rbx
1262
1263 bb[3] only has one input, nothing to merge.
1264
1265      '0xc015cd5a getname+0x72:    je     0xc015cd76 <getname+0x8e>'
1266       opcode 'je' matched by 'j', usage 28
1267       src  M: 0xc015cd76
1268       bb_reg_state c0728658
1269         rax = undefined
1270         rbx = undefined
1271         rcx = undefined
1272         rdx = undefined
1273         rdi = rax
1274         rsi = undefined
1275         rbp = osp-0x4
1276         rsp = osp-0x14
1277         slot 0 offset_address -0x4 rbp
1278         slot 1 offset_address -0x8 rdi
1279         slot 2 offset_address -0xc rsi
1280         slot 3 offset_address -0x10 rbx
1281       matched: from 0xc015cd5a to 0xc015cd76 drop_through 0 bb_jmp[7]
1282       reuse bb_jmp[3]
1283      '0xc015cd5c getname+0x74:    jge    0xc015cd71 <getname+0x89>'
1284       opcode 'jge' matched by 'j', usage 28
1285       src  M: 0xc015cd71
1286       bb_reg_state c0728658
1287         rax = undefined
1288         rbx = undefined
1289         rcx = undefined
1290         rdx = undefined
1291         rdi = rax
1292         rsi = undefined
1293         rbp = osp-0x4
1294         rsp = osp-0x14
1295         slot 0 offset_address -0x4 rbp
1296         slot 1 offset_address -0x8 rdi
1297         slot 2 offset_address -0xc rsi
1298         slot 3 offset_address -0x10 rbx
1299       matched: from 0xc015cd5c to 0xc015cd71 drop_through 0 bb_jmp[8]
1300       reuse bb_jmp[3]
1301      '0xc015cd5e getname+0x76:    mov    %eax,%esi'
1302       opcode 'mov' matched by 'mov', usage 36
1303       src  R: %eax base_rc 2 (rax)
1304       dst  R: %esi base_rc 7 (rsi)
1305       rsi = rax (undefined)
1306       bb_reg_state c0728658
1307         rax = undefined
1308         rbx = undefined
1309         rcx = undefined
1310         rdx = undefined
1311         rdi = rax
1312         rsi = undefined
1313         rbp = osp-0x4
1314         rsp = osp-0x14
1315         slot 0 offset_address -0x4 rbp
1316         slot 1 offset_address -0x8 rdi
1317         slot 2 offset_address -0xc rsi
1318         slot 3 offset_address -0x10 rbx
1319       matched: from 0xc015cd5f to 0xc015cd60 drop_through 1 bb_jmp[14]
1320       reuse bb_jmp[3]
1321
1322       bb[5]
1323       first state c0728768
1324       bb_reg_state c0728658
1325         rax = undefined
1326         rbx = undefined
1327         rcx = undefined
1328         rdx = undefined
1329         rdi = rax
1330         rsi = undefined
1331         rbp = osp-0x4
1332         rsp = osp-0x14
1333         slot 0 offset_address -0x4 rbp
1334         slot 1 offset_address -0x8 rdi
1335         slot 2 offset_address -0xc rsi
1336         slot 3 offset_address -0x10 rbx
1337       merging state c0728768
1338      '0xc015cd71 getname+0x89:    mov    0xfffffff0(%ebp),%eax'
1339       opcode 'mov' matched by 'mov', usage 36
1340       src  M: 0xfffffff0(%ebp) base_rc 8 (rbp)
1341       dst  R: %eax base_rc 2 (rax)
1342       rax = *(rbp-0x10) (osp-0x14)  rax = undefined
1343      '0xc015cd74 getname+0x8c:    jmp    0xc015cd7d <getname+0x95>'
1344       opcode 'jmp' matched by 'j', usage 28
1345       src  M: 0xc015cd7d
1346       bb_reg_state c0728658
1347         rax = undefined
1348         rbx = undefined
1349         rcx = undefined
1350         rdx = undefined
1351         rdi = rax
1352         rsi = undefined
1353         rbp = osp-0x4
1354         rsp = osp-0x14
1355         slot 0 offset_address -0x4 rbp
1356         slot 1 offset_address -0x8 rdi
1357         slot 2 offset_address -0xc rsi
1358         slot 3 offset_address -0x10 rbx
1359       matched: from 0xc015cd74 to 0xc015cd7d drop_through 0 bb_jmp[10]
1360       reuse bb_jmp[3]
1361
1362       bb[6]
1363       first state c0728768
1364       bb_reg_state c0728658
1365         rax = undefined
1366         rbx = undefined
1367         rcx = undefined
1368         rdx = undefined
1369         rdi = rax
1370         rsi = undefined
1371         rbp = osp-0x4
1372         rsp = osp-0x14
1373         slot 0 offset_address -0x4 rbp
1374         slot 1 offset_address -0x8 rdi
1375         slot 2 offset_address -0xc rsi
1376         slot 3 offset_address -0x10 rbx
1377      '0xc015cd76 getname+0x8e:    mov    $0xfffffffe,%esi'
1378       opcode 'mov' matched by 'mov', usage 36
1379       src  I: $0xfffffffe
1380       dst  R: %esi base_rc 7 (rsi)
1381       rsi = undefined
1382      '0xc015cd7b getname+0x93:    jmp    0xc015cd60 <getname+0x78>'
1383       opcode 'jmp' matched by 'j', usage 28
1384       src  M: 0xc015cd60
1385       bb_reg_state c0728658
1386         rax = undefined
1387         rbx = undefined
1388         rcx = undefined
1389         rdx = undefined
1390         rdi = rax
1391         rsi = undefined
1392         rbp = osp-0x4
1393         rsp = osp-0x14
1394         slot 0 offset_address -0x4 rbp
1395         slot 1 offset_address -0x8 rdi
1396         slot 2 offset_address -0xc rsi
1397         slot 3 offset_address -0x10 rbx
1398       matched: from 0xc015cd7b to 0xc015cd60 drop_through 0 bb_jmp[11]
1399       reuse bb_jmp[3]
1400
1401       bb[4]
1402       first state c0728710
1403       bb_reg_state c0728658
1404         rax = undefined
1405         rbx = rbx
1406         rcx = undefined
1407         rdx = undefined
1408         rdi = rax
1409         rsi = undefined
1410         rbp = osp-0x4
1411         rsp = osp-0x14
1412         slot 0 offset_address -0x4 rbp
1413         slot 1 offset_address -0x8 rdi
1414         slot 2 offset_address -0xc rsi
1415         slot 3 offset_address -0x10 rbx
1416       merging state c0728768
1417       rbx = undefined
1418       merging state c0728768
1419       merging state c0728768
1420
1421 bb[4] has 4 inputs, 3 of which have the same state.  One one path (state
1422 c0728710) rbx is defined, on the others (c0728768) rbx is undefined so the final
1423 state has rbx as undefined.
1424
1425       final state
1426       bb_reg_state c0728658
1427         rax = undefined
1428         rbx = undefined
1429         rcx = undefined
1430         rdx = undefined
1431         rdi = rax
1432         rsi = undefined
1433         rbp = osp-0x4
1434         rsp = osp-0x14
1435         slot 0 offset_address -0x4 rbp
1436         slot 1 offset_address -0x8 rdi
1437         slot 2 offset_address -0xc rsi
1438         slot 3 offset_address -0x10 rbx
1439      '0xc015cd60 getname+0x78:    mov    0xfffffff0(%ebp),%edx'
1440       opcode 'mov' matched by 'mov', usage 36
1441       src  M: 0xfffffff0(%ebp) base_rc 8 (rbp)
1442       dst  R: %edx base_rc 5 (rdx)
1443       rdx = *(rbp-0x10) (osp-0x14)  rdx = undefined
1444      '0xc015cd63 getname+0x7b:    mov    0xc04b2120,%eax'
1445       opcode 'mov' matched by 'mov', usage 36
1446       src  M: 0xc04b2120
1447       dst  R: %eax base_rc 2 (rax)
1448       rax = undefined
1449      '0xc015cd68 getname+0x80:    call   0xc01521f1 <kmem_cache_free>'
1450       opcode 'call' matched by 'call', usage 17
1451       src  M: 0xc01521f1
1452       bb_reg_state c0728658
1453         rax = undefined
1454         rbx = undefined
1455         rcx = undefined
1456         rdx = undefined
1457         rdi = rax
1458         rsi = undefined
1459         rbp = osp-0x4
1460         rsp = osp-0x14
1461         slot 0 offset_address -0x4 rbp
1462         slot 1 offset_address -0x8 rdi
1463         slot 2 offset_address -0xc rsi
1464         slot 3 offset_address -0x10 rbx
1465       rax = undefined
1466       rcx = undefined
1467       rdx = undefined
1468       bb_reg_state c0728658
1469         rax = undefined
1470         rbx = undefined
1471         rcx = undefined
1472         rdx = undefined
1473         rdi = rax
1474         rsi = undefined
1475         rbp = osp-0x4
1476         rsp = osp-0x14
1477         slot 0 offset_address -0x4 rbp
1478         slot 1 offset_address -0x8 rdi
1479         slot 2 offset_address -0xc rsi
1480         slot 3 offset_address -0x10 rbx
1481      '0xc015cd6d getname+0x85:    mov    %esi,%eax'
1482       opcode 'mov' matched by 'mov', usage 36
1483       src  R: %esi base_rc 7 (rsi)
1484       dst  R: %eax base_rc 2 (rax)
1485       rax = rsi (undefined)
1486      '0xc015cd6f getname+0x87:    jmp    0xc015cd7d <getname+0x95>'
1487       opcode 'jmp' matched by 'j', usage 28
1488       src  M: 0xc015cd7d
1489       bb_reg_state c0728658
1490         rax = undefined
1491         rbx = undefined
1492         rcx = undefined
1493         rdx = undefined
1494         rdi = rax
1495         rsi = undefined
1496         rbp = osp-0x4
1497         rsp = osp-0x14
1498         slot 0 offset_address -0x4 rbp
1499         slot 1 offset_address -0x8 rdi
1500         slot 2 offset_address -0xc rsi
1501         slot 3 offset_address -0x10 rbx
1502       matched: from 0xc015cd6f to 0xc015cd7d drop_through 0 bb_jmp[9]
1503       reuse bb_jmp[3]
1504
1505       bb[7]
1506       first state c07286b8
1507       bb_reg_state c0728658
1508         rax = undefined
1509         rbx = rbx
1510         rcx = undefined
1511         rdx = undefined
1512         rdi = rax
1513         rsi = rsi
1514         rbp = osp-0x4
1515         rsp = osp-0x14
1516         slot 0 offset_address -0x4 rbp
1517         slot 1 offset_address -0x8 rdi
1518         slot 2 offset_address -0xc rsi
1519         slot 3 offset_address -0x10 rbx
1520       merging state c0728768
1521       rbx = undefined
1522       rsi = undefined
1523       merging state c0728768
1524       final state
1525       bb_reg_state c0728658
1526         rax = undefined
1527         rbx = undefined
1528         rcx = undefined
1529         rdx = undefined
1530         rdi = rax
1531         rsi = undefined
1532         rbp = osp-0x4
1533         rsp = osp-0x14
1534         slot 0 offset_address -0x4 rbp
1535         slot 1 offset_address -0x8 rdi
1536         slot 2 offset_address -0xc rsi
1537         slot 3 offset_address -0x10 rbx
1538      '0xc015cd7d getname+0x95:    pop    %edx'
1539       opcode 'pop' matched by 'pop', usage 42
1540       src  R: %edx base_rc 5 (rdx)
1541       rdx = *(rsp+0x0) (osp-0x14)  rdx = undefined
1542       rsp osp offset -0x14 -> -0x10
1543
1544 This instruction is a bit misleading.  It looks like gcc is restoring a value
1545 from the stack *(osp-0x14) to edx, but we have no record of any useful data
1546 being stored at osp-0x14.  In fact gcc is just reducing the stack pointer by 4
1547 bytes to reverse the effect of 0xc015ccee: sub $0x4,%esp, the value popped into
1548 edx contains nothing useful.  Why gcc does pop instead of add $0x4,%esp is a
1549 puzzle, probably some micro optimization.
1550
1551      '0xc015cd7e getname+0x96:    pop    %ebx'
1552       opcode 'pop' matched by 'pop', usage 42
1553       src  R: %ebx base_rc 3 (rbx)
1554       rbx = *(rsp+0x0) (osp-0x10) value rbx
1555       rsp osp offset -0x10 -> -0xc
1556       delete rbx from osp-0x10 slot 3
1557
1558 This pop is doing something useful.  It is restoring the original value of the
1559 preserved register ebx from stack, reversing 0xc015cced: push %ebx.  Note that
1560 incrementing rsp from osp-0x10 to osp-0xc invalidates the data held in memory at
1561 osp-0x10, so we delete our record of it.
1562
1563      '0xc015cd7f getname+0x97:    pop    %esi'
1564       opcode 'pop' matched by 'pop', usage 42
1565       src  R: %esi base_rc 7 (rsi)
1566       rsi = *(rsp+0x0) (osp-0xc) value rsi
1567       rsp osp offset -0xc -> -0x8
1568       delete rsi from osp-0xc slot 2
1569      '0xc015cd80 getname+0x98:    pop    %edi'
1570       opcode 'pop' matched by 'pop', usage 42
1571       src  R: %edi base_rc 6 (rdi)
1572       rdi = *(rsp+0x0) (osp-0x8) value rdi
1573       rsp osp offset -0x8 -> -0x4
1574       delete rdi from osp-0x8 slot 1
1575
1576 Pop the other preserved registers, in reverse order to the push sequence at the
1577 start.
1578
1579      '0xc015cd81 getname+0x99:    pop    %ebp'
1580       opcode 'pop' matched by 'pop', usage 42
1581       src  R: %ebp base_rc 8 (rbp)
1582       rbp = *(rsp+0x0) (osp-0x4) value rbp
1583       rsp osp offset -0x4 -> +0x0
1584       delete rbp from osp-0x4 slot 0
1585
1586 Pop the previous frame pointer.
1587
1588      '0xc015cd82 getname+0x9a:    ret    '
1589       opcode 'ret' matched by 'ret', usage 48
1590
1591 When a ret instruction is executed, all the preserved registers must be back to
1592 their original value and the stack pointer must contain osp+0.
1593 bb_sanity_check() will complain and abort the backtrace if this is not true.  No
1594 complaints here.
1595
1596     bb_pass2: end bb_reg_params 1 bb_memory_params 0
1597
1598 We identified one argument passed in a register (the read of rax at 0xc015ccf1)
1599 and no reference to memory locations above the stack frame.  So we have one
1600 argument being passed in a register and no arguments being passed on stack.
1601 This matches
1602
1603   char * getname(const char __user * filename)
1604
1605     bb_pass2: bb_exit_state at 0xc015cd6d
1606       bb_reg_state c07287c0
1607         rax = undefined
1608         rbx = undefined
1609         rcx = undefined
1610         rdx = undefined
1611         rdi = rax
1612         rsi = undefined
1613         rbp = osp-0x4
1614         rsp = osp-0x14
1615         slot 0 offset_address -0x4 rbp
1616         slot 1 offset_address -0x8 rdi
1617         slot 2 offset_address -0xc rsi
1618         slot 3 offset_address -0x10 rbx
1619
1620 We told bb1 that the exit address from this function is 0xc015cd6d.  The
1621 abstract state at this exit point was saved, it defines how we rollback the
1622 actual register values from the next function down the stack (kmem_cache_free)
1623 to get the actual register values on entry to this function (getname).  See
1624 bb_actual_rollback() which updates bb_actual[].
1625
1626 Looking at the exit state above, we see that rsp contains the abstracte value
1627 osp-0x14.  It is a given that we have the actual value of rsp after the call
1628 from getname() to kmem_cache_free(), otherwise we would not have found the
1629 return address on stack and we would not be analysing getname().  Adding 0x14
1630 (the delta from osp to rsp) to our current actual rsp gives us the actual value
1631 of osp on entry to getname().
1632
1633 The main aim of doing all this work is to track the function arguments so we can
1634 print them if possible.  getname() only has one argument which was passed in
1635 eax.  According to the abstract exit state, the original value of rax is
1636 currently in rdi, so by looking at the actual value of rdi from the next stack
1637 frame down we are able to get the argument to getname().
1638
1639 It is not always possible to get register arguments, gcc will only preserve
1640 input arguments as long as it needs them so there may be no saved copy of
1641 arguments that are passed in register.  In this case, bt_print_one() prints
1642 "invalid".
1643
1644 If basic block analysis detected any arguments were passed on stack, their
1645 contents can now be extracted based on the known value of the stack pointer.
1646 bt_print_one() prints the arguments, if BT_ARGS is non-zero then any argument
1647 that might be a kernel address is printed as a symbol.
1648
1649 Once rsp has been rolled back to osp, we can calculate that actual address of
1650 the stack locations that contain useful data.  The previous values of rbp, rdi,
1651 rsi and rbx are then copied from those locations into bb_actual[].  That gives
1652 the values for those registers at the exit point from the function that called
1653 getname().  Go up one level and repeat the analysis.
1654
1655 There are two references to rdi in the exit state, which can be confusing.
1656
1657         rdi = rax
1658         slot 1 offset_address -0x8 rdi
1659
1660 The first reference says that "register rdi contains the original value of rax",
1661 the second reference says that "*(osp-0x8) contains the original value of rdi".
1662 Do not confuse the two, one is by name, the other is by value.
1663
1664 getname() is a fairly simple function, it has no loops.  __follow_mount is more
1665 complicated, it has loops as well as BUG() statements.
1666
1667     [0]kdb> id __follow_mount
1668     0xc015be76 __follow_mount:         push   %ebp
1669     0xc015be77 __follow_mount+0x1:     mov    %esp,%ebp
1670     0xc015be79 __follow_mount+0x3:     push   %edi
1671     0xc015be7a __follow_mount+0x4:     push   %esi
1672     0xc015be7b __follow_mount+0x5:     push   %ebx
1673     0xc015be7c __follow_mount+0x6:     mov    %eax,%esi
1674     0xc015be7e __follow_mount+0x8:     xor    %edi,%edi
1675     0xc015be80 __follow_mount+0xa:     jmp    0xc015beca __follow_mount+0x54
1676     0xc015be82 __follow_mount+0xc:     mov    (%esi),%eax
1677     0xc015be84 __follow_mount+0xe:     call   0xc0169664 lookup_mnt
1678     0xc015be89 __follow_mount+0x13:    mov    %eax,%ebx
1679     0xc015be8b __follow_mount+0x15:    test   %eax,%eax
1680     0xc015be8d __follow_mount+0x17:    je     0xc015bed3 __follow_mount+0x5d
1681     0xc015be8f __follow_mount+0x19:    mov    0x4(%esi),%eax
1682     0xc015be92 __follow_mount+0x1c:    call   0xc0163de2 dput
1683     0xc015be97 __follow_mount+0x21:    test   %edi,%edi
1684     0xc015be99 __follow_mount+0x23:    je     0xc015bead __follow_mount+0x37
1685     0xc015be9b __follow_mount+0x25:    mov    (%esi),%eax
1686     0xc015be9d __follow_mount+0x27:    test   %eax,%eax
1687     0xc015be9f __follow_mount+0x29:    je     0xc015bead __follow_mount+0x37
1688     0xc015bea1 __follow_mount+0x2b:    movl   $0x0,0x64(%eax)
1689     0xc015bea8 __follow_mount+0x32:    call   0xc016835b mntput_no_expire
1690     0xc015bead __follow_mount+0x37:    mov    %ebx,(%esi)
1691     0xc015beaf __follow_mount+0x39:    mov    0x10(%ebx),%eax
1692     0xc015beb2 __follow_mount+0x3c:    test   %eax,%eax
1693     0xc015beb4 __follow_mount+0x3e:    je     0xc015bec2 __follow_mount+0x4c
1694     0xc015beb6 __follow_mount+0x40:    cmpl   $0x0,(%eax)
1695     0xc015beb9 __follow_mount+0x43:    jne    0xc015bebf __follow_mount+0x49
1696     0xc015bebb __follow_mount+0x45:    ud2a
1697     0xc015bebd __follow_mount+0x47:    jmp    0xc015bebd __follow_mount+0x47
1698     0xc015bebf __follow_mount+0x49:    lock incl (%eax)
1699     0xc015bec2 __follow_mount+0x4c:    mov    %eax,0x4(%esi)
1700     0xc015bec5 __follow_mount+0x4f:    mov    $0x1,%edi
1701     0xc015beca __follow_mount+0x54:    mov    0x4(%esi),%edx
1702     0xc015becd __follow_mount+0x57:    cmpl   $0x0,0x74(%edx)
1703     0xc015bed1 __follow_mount+0x5b:    jne    0xc015be82 __follow_mount+0xc
1704     0xc015bed3 __follow_mount+0x5d:    mov    %edi,%eax
1705     0xc015bed5 __follow_mount+0x5f:    pop    %ebx
1706     0xc015bed6 __follow_mount+0x60:    pop    %esi
1707     0xc015bed7 __follow_mount+0x61:    pop    %edi
1708     0xc015bed8 __follow_mount+0x62:    pop    %ebp
1709     0xc015bed9 __follow_mount+0x63:    ret
1710
1711     [0]kdb> bb1 0xc015bed9
1712     bb_pass1: func_name __follow_mount func_start 0xc015be76 func_end 0xc015beda
1713     bb_pass1: end
1714       bb[0] start 0xc015be76 end 0xc015be80 drop_through 0
1715       bb[1] start 0xc015be82 end 0xc015beac drop_through 1
1716       bb[2] start 0xc015bead end 0xc015bebb drop_through 0
1717
1718 Note that the ud2a (BUG) instruction at 0xc015bebb ends bb[2].
1719
1720       bb[3] start 0xc015bebd end 0xc015bebd drop_through 0
1721
1722 bb[3] is peculiar, it is a jmp to itself, nothing else refers to 0xc015bebd and
1723 you cannot drop through from the previous instruction because ud2a kills the
1724 kernel.  The i386 and x86_64 BUG() macros contain for(;;) after ud2a, for no
1725 good reason that I can see (is there old hardware that does not abort on ud2a?).
1726 ia64 and the generic versions of BUG() do not contain for(;;).  for(;;) after
1727 ud2a generates a branch to itself than can never be executed.
1728
1729       bb[4] start 0xc015bebf end 0xc015bec1 drop_through 1
1730       bb[5] start 0xc015bec2 end 0xc015bec9 drop_through 1
1731       bb[6] start 0xc015beca end 0xc015bed2 drop_through 1
1732       bb[7] start 0xc015bed3 end 0xc015bed9 drop_through 0
1733       bb_jmp[0] from 0xc015be80 to 0xc015beca drop_through 0
1734       bb_jmp[1] from 0xc015be8d to 0xc015bed3 drop_through 0
1735       bb_jmp[2] from 0xc015be99 to 0xc015bead drop_through 0
1736       bb_jmp[3] from 0xc015be9f to 0xc015bead drop_through 0
1737       bb_jmp[4] from 0xc015beb4 to 0xc015bec2 drop_through 0
1738       bb_jmp[5] from 0xc015beb9 to 0xc015bebf drop_through 0
1739       bb_jmp[6] from 0xc015bebd to 0xc015bebd drop_through 0
1740       bb_jmp[7] from 0xc015bed1 to 0xc015be82 drop_through 0
1741       bb_jmp[8] from 0xc015beac to 0xc015bead drop_through 1
1742       bb_jmp[9] from 0xc015bec1 to 0xc015bec2 drop_through 1
1743       bb_jmp[10] from 0xc015bec9 to 0xc015beca drop_through 1
1744       bb_jmp[11] from 0xc015bed2 to 0xc015bed3 drop_through 1
1745
1746 Apart from bb[0] and the special case bb[3], all the other blocks are part of a
1747 cycle.  That cycle goes bb[0] -> bb[6].  bb[6] -> {bb[1], bb[7]}.  bb[1] ->
1748 {bb[2], bb[7]}.  bb[2] -> {bb[4], bb[5]}.  bb[4] -> bb[5].  bb[5] -> bb[6] and
1749 back to the start.  bb[7] ends with 'ret', it does not feed into other blocks.
1750
1751     bb_pass2: start
1752
1753       bb_pass2_do_changed_blocks: allow_missing 0
1754
1755       bb[0]
1756       [ ... detail snipped ... ]
1757       matched: from 0xc015be80 to 0xc015beca drop_through 0 bb_jmp[0]
1758       new state c07286d8
1759
1760       bb_pass2_do_changed_blocks: allow_missing 1
1761
1762 Because of the cycle, only bb[0] can be processed with 0 missing inputs, all the
1763 other blocks have at least one missing input.  Call bb_pass2_do_changed_blocks()
1764 again, this time allowing one missing input per blocks.
1765
1766       bb[6]
1767       first state c07286d8
1768       [ ... detail snipped ... ]
1769       matched: from 0xc015bed2 to 0xc015bed3 drop_through 1 bb_jmp[11]
1770       reuse bb_jmp[7]
1771
1772       bb[7]
1773       first state c0728730
1774       [ ... detail snipped ... ]
1775
1776       bb[1]
1777       first state c0728730
1778       [ ... detail snipped ... ]
1779       matched: from 0xc015beac to 0xc015bead drop_through 1 bb_jmp[8]
1780       reuse bb_jmp[1]
1781
1782       bb[2]
1783       first state c0728788
1784       [ ... detail snipped ... ]
1785       merging state c0728788
1786       merging state c0728788
1787       [ ... detail snipped ... ]
1788       matched: from 0xc015beb9 to 0xc015bebf drop_through 0 bb_jmp[5]
1789       reuse bb_jmp[1]
1790
1791       bb[4]
1792       first state c0728788
1793       [ ... detail snipped ... ]
1794       matched: from 0xc015bec1 to 0xc015bec2 drop_through 1 bb_jmp[9]
1795       reuse bb_jmp[1]
1796
1797       bb[5]
1798       first state c0728788
1799       [ ... detail snipped ... ]
1800       merging state c0728788
1801       [ ... detail snipped ... ]
1802       matched: from 0xc015bec9 to 0xc015beca drop_through 1 bb_jmp[10]
1803       reuse bb_jmp[1]
1804
1805       bb[6]
1806       first state c07286d8
1807       [ ... detail snipped ... ]
1808       merging state c0728788
1809       matched: from 0xc015bed2 to 0xc015bed3 drop_through 1 bb_jmp[11]
1810       reuse bb_jmp[1]
1811
1812 Note the rescan of bb[6].  The first scan only had one input from bb[0].  After
1813 traversing the cycle and getting back from bb[5] to bb[6], bb[6] now has more
1814 inputs so we need to rescan it.  With the additional input, the output state
1815 from bb[6] has changed since the first scan, which means that every block it
1816 feeds has to be rescanned.  bb[6] feeds bb[1] and bb[7].
1817
1818       bb[7]
1819       first state c0728788
1820       [ ... detail snipped ... ]
1821       merging state c0728788
1822       [ ... detail snipped ... ]
1823
1824 bb[7] being rescanned, this time it has data for both its inputs.
1825
1826       bb[1]
1827       first state c0728788
1828       [ ... detail snipped ... ]
1829       matched: from 0xc015beac to 0xc015bead drop_through 1 bb_jmp[8]
1830       no state change
1831
1832 bb[1] is being rescanned because the input from bb[6] has changed, however the
1833 rescan of bb[1] reports 'no state change', the changed input from bb[6] did not
1834 affect the final output state from bb[1].  Because the output state from bb[1]
1835 has not changed since the previous scan, there is no need to rescan bb[2], bb[7]
1836 or bb[4].  Since bb[4] is not being rescanned, there is no need to rescan bb[5]
1837 or bb[6] and the cycle is closed.