A benchmark on mapping few keys
Recently, I’ve worked on optimizing my transition-based parser and came across such situation:
I need an associated, or key-value, structure to store the cached scored for each transition action at a certain state. In practice, the number of transition actions is relatively small (less than 100), so I became curious about the best associated data structure for small set of keys.
Previous posts on such performance benchmark mainly focus on the comparsion of different hash maps on large scale, like this one. This post differs from theirs by setting the number of keys to a small number. Time consumption on random insertion, ordered insertion, random retrieve and ordered retrieve is evaluated. I didn’t give a try on the delete operation cause there is no such operation in my problems.
Three types of mapping facilities are adopted in this posts. They are:
- Ordered List:
std::map is a several-time loser in previous benchmark because of the O(log n) time complexity.
I choose the
std::map and hope it can win a round in this special situation.
boost::container::flat_map is suggested by this StackOverflow question.
It seems the key are ordered and stored in a list in
boost::container::flat_map, rather than a tree like the
Expectingly, this data structure should have some nice memory allocation feature compared with the
The Hashtable group is actually my old friends and I used
std::unordered_map a lot.
All the experiments are conducted on a Xeon(R) CPU E5-2620 @ 2.00GHz server.
My GCC is a new-version one, the 4.8.2.
The code used in this benchmark can be found at this gist.
-O3 optimizing level is configured in compling the code.
The benchmark results are shown below.
Generally speaking, the
google::dense_hash_map is still the best choice if you are aggressively purchasing the speed performance.
Experimental result confirms that the
dense_hash_map achieves both the fastest insertion and fastest querying.
Also, we see that on the condition that the number of keys is relatively small, the sorted group (
boost::container::flat_map) performs better than the hash group.
One thing we need to note that, after the number of key increase to about 100, the sorted group starts to perform poor compared to the hash group.
By comparing the
boost::container::flat_map, their performances are almost identical.
I am a little disappointed by this fact because I always considering the
boost::container::flat_map as a silver bullet for the few key mapping problems.
For practice, there are several suggestion:
- For the mapping problem with less than 100 keys,
std::mapis still a solution. It’s especially true if ordered keys is in your concern.
google::dense_hash_mapis still number one choice, if memory and the annoying
set_empty_key()are both not the problems.