hashes.h File Reference

Hash function API.

Detailed Description

#include <stddef.h>
#include <inttypes.h>

Functions

uint32_t djb2_hash (const uint8_t *buf, size_t len)
 djb2.

uint32_t sdbm_hash (const uint8_t *buf, size_t len)
 sdbm.

uint32_t kr_hash (const uint8_t *buf, size_t len)
 Kernighan and Ritchie.

uint32_t sax_hash (const uint8_t *buf, size_t len)
 Shift, Add, XOR.

uint32_t dek_hash (const uint8_t *buf, size_t len)
 Donald E. Knuth.

uint32_t fnv_hash (const uint8_t *buf, size_t len)
 Fowler–Noll–Vo.

uint32_t rotating_hash (const uint8_t *buf, size_t len)
 Rotating.

uint32_t one_at_a_time_hash (const uint8_t *buf, size_t len)
 One at a time.
 

Function Documentation

◆ dek_hash()

uint32_t dek_hash ( const uint8_t *  buf,
size_t  len 
)

Knuth

HISTORY Proposed by Donald E. Knuth in The Art of Computer Programming, Vol. 3, "Sorting and Searching", Section 6.4.

Parameters
buf   input buffer to hash
len   length of buffer
Returns
32-bit hash

◆ djb2_hash()

uint32_t djb2_hash ( const uint8_t *  buf,
size_t  len 
)

HISTORY This algorithm (k=33) was first reported by Dan Bernstein many years ago in comp.lang.c. Another version of this algorithm (now favored by Bernstein) uses XOR:

 hash(i) = hash(i - 1) * 33 ^ str[i];

The magic of the number 33 (why it works better than many other constants, prime or not) has never been adequately explained.
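The two variants described above can be sketched as follows. This is a minimal illustration, not the implementation behind djb2_hash(): the function names are hypothetical, and the starting value 5381 is an assumption (it is the seed commonly cited for this algorithm, but it is not stated here).

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: the commonly cited djb2 starting value. */
#define DJB2_SKETCH_SEED 5381u

/* Additive variant: hash(i) = hash(i - 1) * 33 + buf[i] */
static uint32_t djb2_add_sketch(const uint8_t *buf, size_t len)
{
    uint32_t hash = DJB2_SKETCH_SEED;

    for (size_t i = 0; i < len; i++) {
        hash = hash * 33u + buf[i];   /* unsigned arithmetic wraps modulo 2^32 */
    }
    return hash;
}

/* XOR variant: hash(i) = hash(i - 1) * 33 ^ buf[i] */
static uint32_t djb2_xor_sketch(const uint8_t *buf, size_t len)
{
    uint32_t hash = DJB2_SKETCH_SEED;

    for (size_t i = 0; i < len; i++) {
        hash = (hash * 33u) ^ buf[i]; /* multiply first, then XOR in the byte */
    }
    return hash;
}
```

The two sketches differ only in how each input byte is combined with the running value, so they generally produce different results for the same input.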

Parameters
buf   input buffer to hash
len   length of buffer
Returns
32-bit hash

◆ fnv_hash()

uint32_t fnv_hash ( const uint8_t *  buf,
size_t  len 
)

NOTE For a more fully featured and modern version of this hash, see fnv32.c

Parameters
buf   input buffer to hash
len   length of buffer
Returns
32-bit hash

◆ kr_hash()

uint32_t kr_hash ( const uint8_t *  buf,
size_t  len 
)

HISTORY This hash function appeared in K&R (1st ed) but at least the reader was warned:

 "This is not the best possible algorithm, but it has the merit
 of extreme simplicity."

This is an understatement. It is a terrible hashing algorithm, and it could have been much better without sacrificing its "extreme simplicity." [see the second edition!]

Many C programmers use this function without actually testing it, or checking something like Knuth's Sorting and Searching, so it stuck. It is now found mixed with otherwise respectable code, e.g. cnews. Sigh. [see also: tpop]
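To make the criticism concrete, the sketch below assumes that the first-edition function simply sums the input bytes and that the second edition multiplies the running value by 31 before adding each byte, as both are commonly cited. The function names are hypothetical, and neither function is claimed to match what kr_hash() actually computes.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed first-edition style: plain sum of the bytes. */
static uint32_t additive_hash_sketch(const uint8_t *buf, size_t len)
{
    uint32_t hash = 0;

    for (size_t i = 0; i < len; i++) {
        hash += buf[i];               /* byte order has no effect on the sum */
    }
    return hash;
}

/* Assumed second-edition style: hash = buf[i] + 31 * hash. */
static uint32_t mul31_hash_sketch(const uint8_t *buf, size_t len)
{
    uint32_t hash = 0;

    for (size_t i = 0; i < len; i++) {
        hash = buf[i] + 31u * hash;   /* position of each byte now matters */
    }
    return hash;
}
```

Under these assumptions, any two inputs containing the same bytes in a different order (for example "abc" and "cba") collide in the additive sketch, while the multiplicative sketch separates them at the cost of one extra multiplication per byte.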

Parameters
buf   input buffer to hash
len   length of buffer
Returns
32-bit hash

◆ one_at_a_time_hash()

uint32_t one_at_a_time_hash ( const uint8_t *  buf,
size_t  len 
)

Found at http://burtleburtle.net/bob/hash/doobs.html

Parameters
buf   input buffer to hash
len   length of buffer
Returns
32-bit hash

◆ rotating_hash()

uint32_t rotating_hash ( const uint8_t *  buf,
size_t  len 
)

Found at http://burtleburtle.net/bob/hash/doobs.html

Parameters
buf   input buffer to hash
len   length of buffer
Returns
32-bit hash

◆ sax_hash()

uint32_t sax_hash ( const uint8_t *  buf,
size_t  len 
)

Parameters
buf   input buffer to hash
len   length of buffer
Returns
32-bit hash

◆ sdbm_hash()

uint32_t sdbm_hash ( const uint8_t *  buf,
size_t  len 
)

HISTORY This algorithm was created for the sdbm (a public-domain reimplementation of ndbm) database library. It was found to do well in scrambling bits, causing better distribution of the keys and fewer splits. It also happens to be a good general hashing function with good distribution.

The actual function is

 hash(i) = hash(i - 1) * 65599 + str[i];

What is included below is the faster version used in gawk. [There is even a faster, Duff's-device version.] The magic constant 65599 was picked out of thin air while experimenting with different constants, and turns out to be a prime. This is one of the algorithms used in Berkeley DB (see Sleepycat) and elsewhere.
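The relationship between the formula above and a shift-based "faster version" can be sketched as follows. Because 65599 = 65536 + 64 - 1, multiplying by the constant is equivalent to (hash << 16) + (hash << 6) - hash in 32-bit unsigned arithmetic. The function names are hypothetical, the starting value 0 is an assumption, and the shift form shown is one plausible formulation rather than a copy of the gawk code or of sdbm_hash().

```c
#include <stddef.h>
#include <stdint.h>

/* Direct form: hash(i) = hash(i - 1) * 65599 + buf[i] */
static uint32_t sdbm_mul_sketch(const uint8_t *buf, size_t len)
{
    uint32_t hash = 0;                /* assumption: start from zero */

    for (size_t i = 0; i < len; i++) {
        hash = hash * 65599u + buf[i];
    }
    return hash;
}

/* Shift form: 65599 * hash == (hash << 16) + (hash << 6) - hash (mod 2^32) */
static uint32_t sdbm_shift_sketch(const uint8_t *buf, size_t len)
{
    uint32_t hash = 0;                /* assumption: start from zero */

    for (size_t i = 0; i < len; i++) {
        hash = buf[i] + (hash << 6) + (hash << 16) - hash;
    }
    return hash;
}
```

Both sketches return the same value for every input, since the unsigned arithmetic wraps modulo 2^32 in each case; the shift form only avoids the multiplication.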

Parameters
buf   input buffer to hash
len   length of buffer
Returns
32-bit hash