Profiling

When optimizing a program, you also need a way to determine which parts of the program are “hot” (executed frequently enough to affect runtime) and worth modifying. This is best done via profiling.

Profilers

There are many different profilers available, each with its own strengths and weaknesses. The following profilers have been used successfully on Rust programs.

  • perf is a general-purpose profiler that uses hardware performance counters. It works on Linux. Hotspot and Firefox Profiler are good for viewing data recorded by perf.
  • AMD μProf is a general-purpose profiler. It works on Windows and Linux.
  • flamegraph is a Cargo command that uses perf/DTrace to profile your code and then displays the results in a flame graph. It works on Linux and all platforms that support DTrace (macOS, FreeBSD, NetBSD, and possibly Windows).
  • Cachegrind & Callgrind give global, per-function, and per-source-line instruction counts and simulated cache and branch prediction data. They work on Linux and some other Unixes.
  • DHAT is good for finding which parts of the code are causing a lot of allocations, and for giving insight into peak memory usage. It can also be used to identify hot calls to memcpy. It works on Linux and some other Unixes. dhat-rs is an experimental alternative that is a little less powerful and requires minor changes to your Rust program, but works on all platforms.
  • heaptrack is another heap profiling tool. It works on Linux.
  • counts supports ad hoc profiling, which combines eprintln! statements with frequency-based post-processing. This is good for getting domain-specific insights into parts of your code. It works on all platforms.
  • Coz performs causal profiling to measure optimization potential, and has Rust support via coz-rs. It works on Linux.
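The ad hoc profiling approach mentioned above can be sketched in a few lines. This is a minimal illustration, not how counts itself is implemented: the function names (record_lookups, tally) and the "lookup_len" label are hypothetical, and tally is only a stand-in for the frequency-based post-processing step that a tool such as counts would perform on the captured stderr output.

```rust
use std::collections::HashMap;

/// Hypothetical function under investigation. It prints one line to stderr
/// per event of interest, so identical lines can be counted afterwards.
fn record_lookups(keys: &[&str]) -> usize {
    let mut total_len = 0;
    for key in keys {
        // One event per line; the text after the label is the value to tally.
        eprintln!("lookup_len: {}", key.len());
        total_len += key.len();
    }
    total_len
}

/// Stand-in for the post-processing step: counts identical lines and returns
/// them sorted by descending frequency (ties broken alphabetically).
fn tally<'a>(lines: impl IntoIterator<Item = &'a str>) -> Vec<(usize, &'a str)> {
    let mut freq: HashMap<&'a str, usize> = HashMap::new();
    for line in lines {
        *freq.entry(line).or_insert(0) += 1;
    }
    let mut out: Vec<(usize, &'a str)> = freq.into_iter().map(|(l, n)| (n, l)).collect();
    out.sort_by(|a, b| b.0.cmp(&a.0).then(a.1.cmp(b.1)));
    out
}
```

In practice you would keep only the eprintln! calls in your program, capture stderr from a run, and let the post-processing tool do the tallying. The most frequent lines then show which values or code paths dominate.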

Debug Info

To profile a release build effectively you might need to enable source line debug info. To do this, add the following lines to your Cargo.toml file:

[profile.release]
debug = 1

See the Cargo documentation for more details about the debug setting.

Unfortunately, even after doing the above step you won’t get detailed profiling information for standard library code. This is because shipped versions of the Rust standard library are not built with debug info. To remedy this, you can build your own version of the compiler and standard library, following these instructions, and adding the following lines to the config.toml file:

[rust]
debuginfo-level = 1

This is a hassle, but may be worth the effort in some cases.

Symbol Demangling

Rust uses a mangling scheme to encode function names in compiled code. If a profiler is unaware of this scheme, its output may contain symbol names like _ZN3foo3barE or _ZN28_$u7b$$u7b$closure$u7d$$u7d$E or _ZN88_$LT$core..result..Result$LT$$u21$$C$$u20$E$GT$$u20$as$u20$std..process..Termination$GT$6report17hfc41d0da4a40b3e8E. Names like these can be manually demangled using rustfilt.
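To make the structure of these names more concrete, here is a simplified sketch of how a legacy-style mangled name can be decoded. It is not rustfilt's implementation, and the function names (decode_component, demangle_legacy) are hypothetical. It assumes the symbol has the form _ZN, followed by length-prefixed path components, followed by E, and it handles only a small, illustrative subset of escape sequences. It does not handle the newer mangling scheme, whose symbols start with _R, and it leaves any trailing hash component in place.

```rust
/// Decodes a handful of escape sequences found in legacy-mangled components.
/// The table is intentionally incomplete; it covers only common cases.
fn decode_component(raw: &str) -> String {
    // A component that begins with an escape carries an extra leading '_'.
    let raw = match raw.strip_prefix('_') {
        Some(rest) if rest.starts_with('$') => rest,
        _ => raw,
    };
    const ESCAPES: [(&str, &str); 8] = [
        ("$LT$", "<"), ("$GT$", ">"), ("$C$", ","), ("$u20$", " "),
        ("$u21$", "!"), ("$u7b$", "{"), ("$u7d$", "}"), ("..", "::"),
    ];
    let mut out = raw.to_string();
    for (from, to) in ESCAPES {
        out = out.replace(from, to);
    }
    out
}

/// Demangles a symbol of the form `_ZN<len><name>...E`.
/// Returns `None` if the input does not follow that shape.
fn demangle_legacy(symbol: &str) -> Option<String> {
    let mut rest = symbol.strip_prefix("_ZN")?.strip_suffix('E')?;
    let mut parts = Vec::new();
    while !rest.is_empty() {
        // Each component starts with its byte length in decimal digits.
        let digits = rest.bytes().take_while(u8::is_ascii_digit).count();
        if digits == 0 {
            return None;
        }
        let len: usize = rest[..digits].parse().ok()?;
        let end = digits.checked_add(len)?;
        let name = rest.get(digits..end)?;
        parts.push(decode_component(name));
        rest = &rest[end..];
    }
    if parts.is_empty() {
        return None;
    }
    Some(parts.join("::"))
}
```

For real profiling output, prefer a maintained demangler such as rustfilt, which covers the full set of escapes and both mangling schemes.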