When optimizing a program, you also need a way to determine which parts of the program are “hot” (executed frequently enough to affect runtime) and worth modifying. This is best done via profiling.
There are many different profilers available, each with its own strengths and weaknesses. The following profilers have been used successfully on Rust programs.
- perf is a general-purpose profiler that uses hardware performance counters. It works on Linux. Hotspot and Firefox Profiler are good for viewing data recorded by perf.
- AMD μProf is a general-purpose profiler. It works on Windows and Linux.
- flamegraph is a Cargo command that uses perf/DTrace to profile your code and then displays the results in a flame graph. It works on Linux and all platforms that support DTrace (macOS, FreeBSD, NetBSD, and possibly Windows).
- Cachegrind & Callgrind give global, per-function, and per-source-line instruction counts and simulated cache and branch prediction data. They work on Linux and some other Unixes.
- DHAT is good for finding which parts of the code are causing a lot of allocations, and for giving insight into peak memory usage. It can also be used to identify hot calls to memcpy. It works on Linux and some other Unixes. dhat-rs is an experimental alternative that is a little less powerful and requires minor changes to your Rust program, but works on all platforms.
- heaptrack is another heap profiling tool. It works on Linux.
- counts supports ad hoc profiling, which combines the use of eprintln! statements with frequency-based post-processing, and is good for getting domain-specific insights into parts of your code. It works on all platforms.
- Coz performs causal profiling to measure optimization potential, and has Rust support via coz-rs. It works on Linux.
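The ad hoc profiling approach mentioned for counts above can be sketched in a few lines of Rust. This is only an illustration of the idea, not how counts itself is implemented: the classify function and the tally helper are hypothetical names invented for this example. The program emits one eprintln! line per event of interest, then tallies identical lines by frequency, which is the kind of post-processing a tool like counts would normally do on the captured stderr output.

```rust
use std::collections::HashMap;

/// Hypothetical hot function: each call logs which branch was taken.
fn classify(n: u32) -> &'static str {
    let kind = if n % 15 == 0 {
        "both"
    } else if n % 3 == 0 {
        "three"
    } else if n % 5 == 0 {
        "five"
    } else {
        "neither"
    };
    // The ad hoc profiling statement: one stderr line per event.
    eprintln!("classify: {}", kind);
    kind
}

/// Hypothetical helper: counts identical lines and returns them sorted by
/// descending frequency, with ties broken alphabetically.
fn tally<'a>(lines: impl IntoIterator<Item = &'a str>) -> Vec<(String, usize)> {
    let mut freq: HashMap<&'a str, usize> = HashMap::new();
    for line in lines {
        *freq.entry(line).or_insert(0) += 1;
    }
    let mut sorted: Vec<(String, usize)> = freq
        .into_iter()
        .map(|(line, count)| (line.to_string(), count))
        .collect();
    sorted.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    sorted
}

fn main() {
    // Collect the same lines that were written to stderr, then post-process.
    let log: Vec<String> = (1..=30)
        .map(|n| format!("classify: {}", classify(n)))
        .collect();
    for (line, count) in tally(log.iter().map(String::as_str)) {
        println!("{:>4} {}", count, line);
    }
}
```

In practice you would leave the tallying to an external tool and keep only the eprintln! statements in your program, removing them once you have the insight you need.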
To profile a release build effectively you might need to enable source line
debug info. To do this, add the following lines to your Cargo.toml file:
[profile.release]
debug = 1
See the Cargo documentation for more details about the debug setting.
Unfortunately, even after doing the above step you won’t get detailed profiling
information for standard library code. This is because shipped versions of the
Rust standard library are not built with debug info. To remedy this, you can
build your own version of the compiler and standard library, following these
instructions, and adding the following lines to the build configuration file:
[rust]
debuginfo-level = 1
This is a hassle, but may be worth the effort in some cases.
Rust uses a mangling scheme to encode function names in compiled code. If a
profiler is unaware of this scheme, its output may contain mangled symbol
names, which typically begin with _ZN or _R and are hard to read. Names like
these can be manually demangled using a tool such as rustfilt.
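To give a feel for what demangling involves, here is a minimal sketch that decodes only the simplest case of Rust's legacy mangling: a symbol of the form _ZN, followed by length-prefixed path components, followed by E. The function name demangle_legacy is hypothetical, and the sketch makes several simplifying assumptions: it does not decode $...$ escape sequences, does not strip the trailing hash component, and does not handle the newer v0 scheme (names beginning with _R). Real tools handle all of these cases and should be preferred.

```rust
/// Hypothetical helper: decodes the path components of a legacy-mangled
/// symbol such as `_ZN3foo3barE`. Returns `None` if the input does not have
/// the `_ZN<len><ident>...E` shape.
fn demangle_legacy(symbol: &str) -> Option<String> {
    let mut rest = symbol.strip_prefix("_ZN")?.strip_suffix('E')?;
    let mut parts = Vec::new();
    while !rest.is_empty() {
        // Each component is a decimal length followed by that many bytes.
        let digits = rest.bytes().take_while(u8::is_ascii_digit).count();
        let len: usize = rest.get(..digits)?.parse().ok()?;
        let end = digits.checked_add(len)?;
        parts.push(rest.get(digits..end)?);
        rest = &rest[end..];
    }
    if parts.is_empty() {
        None
    } else {
        Some(parts.join("::"))
    }
}

fn main() {
    // Simplest case: two path components.
    println!("{:?}", demangle_legacy("_ZN3foo3barE")); // prints Some("foo::bar")
    // Not a legacy-mangled name, so nothing is decoded.
    println!("{:?}", demangle_legacy("main")); // prints None
}
```

Because the sketch keeps every component, a symbol that ends with a hash component will show that hash as the last path segment rather than hiding it.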