Hashing

HashSet and HashMap are two widely-used types. The default hashing algorithm is not specified, but at the time of writing the default is an algorithm called SipHash 1-3. This algorithm is high quality—it provides high protection against collisions—but is relatively slow, particularly for short keys such as integers.

If profiling shows that hashing is hot, and HashDoS attacks are not a concern for your application, the use of hash tables with faster hash algorithms can provide large speed wins.

  • rustc-hash provides FxHashSet and FxHashMap types that are drop-in replacements for HashSet and HashMap. Its hashing algorithm is low-quality but very fast, especially for integer keys, and has been found to out-perform all other hash algorithms within rustc. (fxhash is an older, less well maintained implementation of the same algorithm and types.)
  • fnv provides FnvHashSet and FnvHashMap types. Its hashing algorithm is higher quality than rustc-hash’s but a little slower.
  • ahash provides AHashSet and AHashMap. Its hashing algorithm can take advantage of AES instruction support that is available on some processors.

If hashing performance is important in your program, it is worth trying more than one of these alternatives. For example, the following results were seen in rustc.

If you decide to universally use one of the alternatives, such as FxHashSet/FxHashMap, it is easy to accidentally use HashSet/HashMap in some places. You can use clippy to avoid this problem.

Some types don’t need hashing. For example, you might have a newtype that wraps an integer and the integer values are random, or close to random. For such a type, the distribution of the hashed values won’t be that different to the distribution of the values themselves. In this case the nohash_hasher crate can be useful.

Hash function design is a complex topic and is beyond the scope of this book. The ahash documentation has a good discussion.