IntrusiveHashMap
IntrusiveHashMap
provides a “hashed” or “unordered” set, using intrusive links. It provides a container for elements, each of which has a key. A hash function is applied to a key to generate a hash id which is used to group the elements in to buckets for fast lookup. This container is a mix of std::unordered_set
and std::unordered_map
. There is no separation between elements and keys, but each element can contain non-key data.
Iteration over elements is provided and is constant time.
In order to optimize lookup, the container can increase the number of buckets used. This is called “expansion” and when it occurs is control by the “expansion policy” of the container. The policy can be to be automatic or only done explicitly.
Usage
To use an IntrusiveHashMap
the element must provide support for the container. This is done through an associated descriptor class which provides the operations needed to manipulate the elements in the container.
Examples
Details
template<typename H
>
class IntrusiveHashMap
Template Parameters
H – Element operations.
An unordered map using a hash function. The properties of the map are determined by types and operations provided by the descriptor type H. The following types are derived from H and defined in the container type.
type
value_type
The type of elements in the container, deduced from the return types of the link accessor methods in H.
type
key_type
The type of the key used for hash computations. Deduced from the return type of the key accessor. An instance of this type is never default constructed nor modified, therefore it can be a reference if the key type is expensive to copy.
type
hash_id
The type of the hash of a
key_type
. Deduced from the return type of the hash function. This must be a numeric type.H
This describes the hash map, primarily via the operations required for the map. The related types are deduced from the function return types. This is designed to be compatible with
IntrusiveDList
.static key_type
key_of
(value_type *v)Key accessor - return the key of the element v.
static hash_id
hash_of
(key_type key)Hash function - compute the hash value of the key.
static bool
equal
(key_type lhs, key_type rhs)Key comparison - two keys are equal if this function returns
true
.static IntrusiveHashMap::value_type *&
next_ptr
(IntrusiveHashMap::value_type *v)Return a reference to the next element pointer embedded in the element v.
static IntrusiveHashMap::value_type *&
prev_ptr
(IntrusiveHashMap::value_type *v)Return a reference to the previous element pointer embedded in the element v.
type
iterator
An STL compliant iterator over elements in the container.
type
range
An STL compliant half open range of elements, represented by a pair of iterators.
IntrusiveHashMap &
insert
(value_type *v)Insert the element v into the container. If there are already elements with the same key, v is inserted after these elements.
There is no
emplace
because v is put in the container, not a copy of v. For the same reason v must be constructed before calling this method, the container will never create an element instance.iterator
begin
()Return an iterator to the first element in the container.
iterator
end
()Return an iterator to past the last element in the container.
iterator
find
(value_type *v)Search for v in the container. If found, return an iterator referring to v. If not return the end iterator. This validates v is in the container.
range
equal_range
(key_type key)Find all elements with a key that is equal to key. The returned value is a half open range starting with the first matching element to one past the last matching element. All element in the range will have a key that is equal to key. If no element has a matching key the range will be empty.
IntrusiveHashMap &
erase
(iterator spot)Remove the element referred to by spot from the container.
iterator
iterator_for
(value_type *v)Return an iterator for v. This is very fast, faster than
IntrusiveHashMap::find()
but less safe because no validation done on v. If it not in the container (either in no container or a different one) further iteration on the returned iterator will go badly. It is useful inside rangefor
loops when it is guaranteed the element is in the container.template<typename
F
>
IntrusiveHashMap &apply
(F &&f)Template Parameters
F – A functional type with the signature
void (value_type*)
.
This applies the function f to every element in the container in such a way that modification of the element does not interfere with the iteration. The most common use is to
delete
the elements during cleanup. The common idiomfor ( auto & elt : container) delete &elt;
is problematic because the iteration links are in the deleted element causing the computation of the next element to be a use after free. Using
IntrusiveHashMap::apply()
enables safe cleanup.container.apply([](value_type & v) { delete & v; });
Because the links are intrusive it is possible for other classes or the element class to modify them. In such cases this method provides a safe way to invoke such mechanisms.
Design Notes
This is a refresh of an previously existing class, TSHahTable
. The switch to C++ 11 and then C++ 17 made it possible to do much better in terms of the internal implementation and API. The overall functionality is the roughly the same but with an easier API, compatibility with IntrusiveDList
, improved compliance with STL container standards, and better internal implementation.
The biggest change is that elements are stored in a single global list rather than per hash bucket. The buckets serve only as entry points in to the global list and to count the number of elements per bucket. This simplifies the implementation of iteration, so that the old Location
nested class can be removed. Elements with equal keys can be handled in the same way as with STL containers, via iterator ranges, instead of a custom pseudo-iterator class.
Notes on IntrusiveHashMap::apply()
This was added after some experience with use of the container. Initially it was added to make cleaning up the container easier. Without it, cleanup looks like
for ( auto spot = map.begin(), limit = map.end() ; spot != limit ; delete &( * spot++)) {
; // empty
}
Instead one can do
map.apply([](value_type& v) { delete &v; });
The post increment operator guarantees that spot has been updated before the current element is destroyed. However, it turns out to be handy in other map modifying operations. In the unit tests there is this code
// Erase all the non-"dup" and see if the range is still correct.
map.apply([&map](Thing *thing) {
if (thing->_payload != "dup"sv) {
This removes all elements that do not have the payload “dup”. As another design note, IntrusiveHashMap::iterator_for()
here serves to bypass validation checking on the target for IntrusiveHashMap::erase()
, which is proper because IntrusiveHashMap::apply()
guarantees thing is in the map.
Without apply
this is needed
auto idx = map.begin();
while (idx != map.end()) {
auto x{idx++};
if ("dup"sv != x->_payload) {
map.erase(x);
}
}
The latter is more verbose and more importantly less obvious, depending on a subtle interaction with post increment.