(Start at the beginning of the series – and all the source can be found in my github repo)
Now that I was used to writing C++11-style constexpr
, I decided to try some compile-time string hashing. It turns out that FNV1 is very easy to express in a move-down-the-string recursive style:
namespace detail
{
constexpr uint64_t fnv1(uint64_t h, const char* s)
{
return (*s == 0) ? h :
fnv1((h * 1099511628211ull) ^
static_cast(*s), s+1);
}
}
constexpr uint64_t fnv1(const char* s)
{
return true ?
detail::fnv1(14695981039346656037ull, s) :
throw err::fnv1_runtime_error;
}
That wasn’t difficult enough
So far so good. How about some other hash functions? On a previous project I had used MurmurHash, so I decided to give that a go.
(Coincidentally, a blog post has recently surfaced implementing 32-bit Murmur3 using C++14 constexpr
. But I was sticking to C++11 constexpr
.)
Implementing such a big function from cold in C++11 constexpr
was bound to be difficult, so I scaffolded. I used the freely-available runtime implementation for testing, and initially I converted it to C++14 constexpr
, which was fairly easy. After doing that, and with Wikipedia as a pseudocode guide I was ready to start breaking down (or building up) the functionality, C++11-style. At the base of 32-bit Murmur3 is the hash round function which is done for each 4-byte chunk of the key:
constexpr uint32_t murmur3_32_k(uint32_t k)
{
return (((k * 0xcc9e2d51) << 15)
| ((k * 0xcc9e2d51) >> 17)) * 0x1b873593;
}
constexpr uint32_t murmur3_32_hashround(
uint32_t k, uint32_t hash)
{
return (((hash^k) << 13)
| ((hash^k) >> 19)) * 5 + 0xe6546b64;
}
And this can easily be put into a loop, with a helper function to constitute 4-byte chunks (the somewhat strange formulation of word32le
here is because it is used differently in some other code):
constexpr uint32_t word32le(const char* s, int len)
{
return
(len > 0 ? static_cast(s[0]) : 0)
+ (len > 1 ? (static_cast(s[1]) << 8) : 0)
+ (len > 2 ? (static_cast(s[2]) << 16) : 0)
+ (len > 3 ? (static_cast(s[3]) << 24) : 0);
}
constexpr uint32_t word32le(const char* s)
{
return word32le(s, 4);
}
constexpr uint32_t murmur3_32_loop(
const char* key, int len, uint32_t hash)
{
return len == 0 ? hash :
murmur3_32_loop(
key + 4,
len - 1,
murmur3_32_hashround(
murmur3_32_k(word32le(key)), hash));
}
So this is the first part of Murmur3 that will process all the 4-byte chunks of key. What remains is to deal with the trailing bytes (up to 3 of them) and then finalize the hash. To deal with the trailing bytes, I chain the end functions (_end0
through _end3
) and "jump in" at the right place. There is probably a more elegant way to do this, but...
constexpr uint32_t murmur3_32_end0(uint32_t k)
{
return (((k*0xcc9e2d51) << 15)
| ((k*0xcc9e2d51) >> 17)) * 0x1b873593;
}
constexpr uint32_t murmur3_32_end1(uint32_t k,
const char* key)
{
return murmur3_32_end0(
k ^ static_cast(key[0]));
}
constexpr uint32_t murmur3_32_end2(uint32_t k,
const char* key)
{
return murmur3_32_end1(
k ^ (static_cast(key[1]) << 8), key);
}
constexpr uint32_t murmur3_32_end3(uint32_t k,
const char* key)
{
return murmur3_32_end2(
k ^ (static_cast(key[2]) << 16), key);
}
constexpr uint32_t murmur3_32_end(uint32_t hash,
const char* key, int rem)
{
return rem == 0 ? hash :
hash ^ (rem == 3 ? murmur3_32_end3(0, key) :
rem == 2 ? murmur3_32_end2(0, key) :
murmur3_32_end1(0, key));
}
Finalizing the hash is a very similar affair, with 3 stages:
constexpr uint32_t murmur3_32_final1(uint32_t hash)
{
return (hash ^ (hash >> 16)) * 0x85ebca6b;
}
constexpr uint32_t murmur3_32_final2(uint32_t hash)
{
return (hash ^ (hash >> 13)) * 0xc2b2ae35;
}
constexpr uint32_t murmur3_32_final3(uint32_t hash)
{
return (hash ^ (hash >> 16));
}
constexpr uint32_t murmur3_32_final(uint32_t hash, int len)
{
return
murmur3_32_final3(
murmur3_32_final2(
murmur3_32_final1(hash ^ static_cast(len))));
}
And that's all there is to it: all that remains is to stitch the pieces together and provide the driver function:
constexpr uint32_t murmur3_32_value(const char* key, int len,
uint32_t seed)
{
return murmur3_32_final(
murmur3_32_end(
murmur3_32_loop(key, len/4, seed),
key+(len/4)*4, len&3),
len);
}
constexpr uint32_t murmur3_32(const char *key, uint32_t seed)
{
return true ?
murmur3_32_value(key, strlen(key), seed) :
throw err::murmur3_32_runtime_error;
}
Doing strlen
non-naively
One more thing: strlen
crept in there. This is of course a constexpr
version of strlen
. The naive way to do this is to step down the string linearly, as seen in the FNV1 algorithm. With a max recursive depth of 512, I thought this fairly limiting... it might not be uncommon to have 1k string literals that I want to hash?
So the way I implemented strlen
was to measure the string by chunks, constraining a max recursion depth of say 256. Instead of a plain linear recursion, it's basically recursing for every 256 bytes of string length, and then recursing on the chunk inside that. So the max depth is 256 + 256 = 512, and we can deal with strings close to 64k in size (depending on how deep we are already when calling strlen
). There's a chance that strlen
might be made constexpr
in the future - I think it's already implemented as an intrinsic in some compilers. So maybe I can throw away that code someday.
Anyway, now we have a complete implementation of 32-bit Murmur3 in C++11 constexpr
style, and the tests go something like this (and can be checked online):
static_assert(murmur3_32("hello, world", 0) == 345750399,
"murmur3 test 1");
static_assert(murmur3_32("hello, world1", 0) == 3714214180,
"murmur3 test 2");
static_assert(murmur3_32("hello, world12", 0) == 83041023,
"murmur3 test 3");
static_assert(murmur3_32("hello, world123", 0) == 209220029,
"murmur3 test 4");
static_assert(murmur3_32("hello, world1234", 0) == 4241062699,
"murmur3 test 5");
static_assert(murmur3_32("hello, world", 1) == 1868346089,
"murmur3 test 6");
OK. That was a lot of code for a blog post. But after the trivial FNV1 and the only slightly harder Murmur3, I wanted more of a challenge. What about MD5? Or SHA-256? That's for the next post.
This is neat! It’s neat to see that it’s possible in C++ 11. Have you noticed how this impacts compile time at all?
I hope MS puts a “compile time profiler” in VS soon. This would be a great excuse to get that in to help with constexpr stuff, as well as in general compile times 😛
Some of the math functions do impact compile time, as you’d expect: they work by iterating to convergence, and that can be slowish sometimes. I haven’t tested the string hashing with large strings, but for the ones I tested there is no humanly-noticeable difference to compile time. Hash functions (at least noncryptographic ones) are generally made to be fast, using cheap operations.
Is this achievable in C99?
Probably it’s can be done only with C-macros.
Any idea?