How to get a stack trace at runtime
While debugging it's often useful to print stack traces at runtime. This document describes how to do so.
Typography conventions
Label | Paths, files, and commands |
---|---|
(shell) | outside the chroot and SDK shell on your workstation |
(sdk) | inside the chrome-sdk SDK shell on your workstation |
(device) | in your VM or ChromeOS device |
(cros) | inside the cros_sdk ChromeOS chroot |
Chromium
How to build and deploy a Chrome that can print runtime stack traces
Make sure is_official_build is false.
Unless you explicitly specify --official
to chrome-sdk, this should be
false. It seems that recently (2022), enabling is_official_build stops
runtime stack traces from working in all scenarios.
Making sure exclude_unwind_tables
is false or frame pointer unwinding works
NOTE: It seems that forcing exclude_unwind_tables=false
no longer
works if is_official_build is true, as of 2022-ish. For now,
make sure is_official_build is false (see the above section).
You can skip this section if you are not building an official build, i.e.
not passing either --internal
or --official
to chrome-sdk, and not
explicitly setting is_official_build=true
in your gn arguments.
You may also skip this section if frame pointer unwinding works on your platform. Currently all platforms support frame pointer unwinding, except ARM devices that use thumb instructions. See compiler.gni for more information.
However, if you are building an official build then exclude_unwind_tables
will be true by default. To add flags to gn args, you can either add them via
--gn-extra-args
:
(shell) cros chrome-sdk --board=eve --gn-extra-args="exclude_unwind_tables=false"
Or, you can run the following command, which will let you edit
the arguments directly. N.B. your path to your build directory may be
different, and re-running cros chrome-sdk
will overwrite these.
(sdk) gn args out_${SDK_BOARD}/Release
Then append exclude_unwind_tables=false
.
Deploying Chrome
After rebuilding (see example command below), make sure to pass the
--nostrip
or --strip-flags=-S
flags to deploy_chrome
. The --nostrip
flag will usually result in a much bigger binary, so if you only need stack
traces at runtime, prefer --strip-flags=-S
.
# Rebuild example
(sdk) autoninja -C out_${SDK_BOARD}/Release chrome nacl_helper
# Deploy chrome example
(sdk) deploy_chrome --build-dir=out_${SDK_BOARD}/Release --device=DUT --mount --strip-flags=-S
Note: The effects of the --mount
option will not survive a reboot. If you
reboot, re-run deploy_chrome.
Note: --mount
may be optional if the rootfs of your device is large enough for your build
configuration.
How to use base::StackTrace
Next, include the following header in the file from which you want runtime stack traces:
#include "base/debug/stack_trace.h"
Then, add this code to where you want the stack trace.
LOG(ERROR) << base::debug::StackTrace();
This code will output to the standard Chrome log, which is accessible via:
(device) tail -F /var/log/chrome/chrome
See the chrome os logging document for further info.
It's also possible to log the stack trace to stderr with the following code:
base::debug::StackTrace().Print();
However, this prints to stderr rather than the standard log, so it won't
show up in /var/log/chrome/chrome
.
Be aware that taking a runtime stack trace is an expensive operation, so if you put it in a hot codepath it can make Chrome's performance very bad. In timing dependent code (e.g. race conditions, graphics related stuff), it can even change the result.
Alternatively, you can save the stack trace (before printing it) and print it out at a later time:
auto stack = base::debug::StackTrace();
LOG(ERROR) << stack;
For reference, getting the stack trace itself takes on the order of milliseconds. Symbolizing the stack trace takes on the order of hundreds of milliseconds.
Printing stack traces from the GPU process
Printing a stack trace at runtime requires access to /proc/self/maps among
other files. The GPU process does not have access to this, so stack traces
will show unsymbolized. To symbolize them, add the following flag to your
/etc/chrome_dev.conf
:
--disable-gpu-sandbox
To know if you are in the GPU process or not, add a LOG(ERROR)
to the code
you are inspecting, and get the process ID. For example, it's 8689 in the below
example:
[8689:8689:0612/144957.826867:ERROR:window_state.cc(866)]
Then get the chrome invocation for that PID:
(device) cat /proc/8689/cmdline
This should give something like this:
/opt/google/chrome/chrome --type=gpu-process <...>
In particular, --type=gpu-process
tells you it is the GPU process.
Printing stack traces from a renderer process
Similar to the GPU process, it requires permissions. You can identify the
process by its invocation having --type=renderer
. To get runtime stack traces
while in a renderer process by adding the following flag to your
/etc/chrome_dev.conf
:
--no-sandbox
Debugging runtime stack traces not being symbolized
Sometimes stack traces are printed without being symbolized ("
1. Check exclude_unwind_tables
is false or frame-pointer unwinding works.
2. Check the process you are printing from isn't sandboxed.
Add the sandbox disabling options mentioned in the document to rule this out.
Processes need to be able to access files in /proc/self/
to inspect their
memory maps.
3. Check that the binary on the DUT (device-under-test) hasn't been stripped.
Make sure deploy_chrome is run with --nostrip
or --strip-flags=-S
. You can
compare the size between the binary on your machine and the DUT to check
further.
4. Check that you are running the correct chrome binary on the DUT.
Re-running deploy_chrome
with the --mount
option can rule this out.
5. Check that the appropriate symbol_level gn argument is set.
Not setting symbol_level
should be sufficient to avoid this issue.
However, you can rule this being a problem out by explicitly specifying
symbol_level=1
.
See the below table to see which configurations will let you print stack traces if you are unsure. However, building with symbol_level=-1 (default) and exclude_unwind_tables=false should let you print runtime stack traces.
Stripping with --strip-flags=-S
vs --nostrip
produces a much smaller
binary if you are building with symbol_level=1 or symbol_level=2. For stack
trace purposes, symbol_level=1 should be enough.
If frame pointer unwinding is supported, you may be able to print an unsymbolized stack trace. Unfortunately it is non-trivial to symbolize the trace afterwards (with only the trace information). It would require the memory map information, at least.
symbol_level | exclude_unwind_tables | deploy_chrome | size | runtime traces? |
---|---|---|---|---|
2 | false | nostrip | 5.3 GB | yes |
2 | false | strip-flags=-S | 282 MB | yes |
2 | false | 197 MB | no | |
2 | true | nostrip | 5.3 GB | yes |
2 | true | strip-flags=-S | 262 MB | yes |
2 | true | 176 MB | no | |
0 | true | nostrip | 263 MB | x86/x64 only |
0 | true | strip-flags=-S | 262 MB | x86/x64 only |
0 | true | 176 MB | no |
Data collected on hatch, 2021/05, ~M92 with GN flags like:
% gn gen out_hatch/Release --args="is_debug=false is_chrome_branded=true symbol_level=2 exclude_unwind_tables=false"
x86/x64 symbols available with symbol_level=0 (source).
From this table, we can tell the unwind tables take up ~21MB (197MB - 176MB), and the symbol tables take up ~86MB (262MB - 176MB).
Example stack trace output
#0 0x5788e671fe19 base::debug::CollectStackTrace()
#1 0x5788e66f2eb3 base::debug::StackTrace::StackTrace()
#2 0x5788e62bb264 content::ContentMainRunnerImpl::RunServiceManager()
#3 0x5788e62baca1 content::ContentMainRunnerImpl::Run()
#4 0x5788e62dc7b1 service_manager::Main()
#5 0x5788e62b916e content::ContentMain()
#6 0x5788e4127b65 ChromeMain
#7 0x7c58b7c8dad4 __libc_start_main
#8 0x5788e41279fa _start
Symbolizing minidumps with tast symbolize
When Chrome or other programs, such as system daemons, crash on ChromeOS, they
save their memory in minidump format. This is where tast symbolize
comes in
handy. You can obtain a stack trace from a minidump, without having to
reproduce the crash. It works for release builds and locally built binaries.
Crashes generated by Chrome and CrOS CQs are not supported. To symbolize
a minidump file use the following command inside the ChromeOS chroot:
(cros) tast symbolize <crash.dmp>
Check ChromiumOS Crash Reporting to see where you can find the minidumps
on your test device. tast symbolize
downloads symbols if the crashed binary
was produced by ChromeOS builders or generates them for local builds done
in ChromeOS chroot (this is different to simple chrome workflow discussed
above). Should you encounter any problems, try using
tast --verbose symbolize <crash.dmp>
to see diagnostic information.
How it works
This section describes how tast symbolize works. If you just want to get a stack trace, you can stop reading here.
tast symbolize
is typically run without any extra arguments and obtains all
necessary information from the minidump itself. The details depend on the origin
of the minidump:
- Chrome crashes, which are handled by Crashpad, generate minidumps
containing
chromeos-board
andchromeos-builder-path
annotations.tast symbolize
uses these annotations to download or generate symbols. - Other binaries are handled by a similar tool, Breakpad, which copies
/etc/lsb-release
into the minidump.tast symbolize
extracts board name and builder path from this file.
You can inspect contents of a minidump using minidump_dump
tool from Breakpad,
which is available in ChromeOS chroot. Googlers can also install
prebuilt binaries to use it outside chroot.
Under the hood, tast symbolize
runs minidump_stackwalk twice.
- First, it runs minidump_stackwalk to collect names and module IDs of the binaries referenced by a minidump. If everything can be symbolized, for example when cached symbols are available, then this is the final output.
- Next, it obtains missing symbols. If the builder path is not empty, then it
downloads them from
gs://chromeos-image-archive
. Otherwise, it uses dump_syms to generate symbols files from split debug files in build root (in the local chroot). Symbols are cached in/tmp/breakpad_symbols
. - Finally, it runs minidump_stackwalk the second time and prints the output.
Other resources
Googler only:
- go/cros-stack-traces - information about getting stack traces on DUTs