Fash64: The Fast Hash Function

Douglas Crockford
2017-02-02

Fash64 is an efficient hashing function. It crunches 64 bits at a time to produce a 64 bit result. It can be used for implementing data structures (hash tables) and checksums.

Fash64 relies on multiplication. In a single instruction, 64 bit multiply can do up to 64 left shifts and 64 additions. On most CPUs, the product of multiplication can be 128 bits, divided over two registers. Fash64 uses the high 64 bits of the product as a 64 bit right shift that can quickly be fed back into the hash. Without that massive right shift, multiply tends to lose information that spills out the left, which would make it unsuitable for hashing. But we get good feedback from that high part, yielding good hashes.

This is an implementation in a mythical language.

def prime := 11111111111111111027   # a large prime number

# The state of the hash function is kept in two variables.

var result: uint64                  # running result
var sum: uint64                     # running sum


# The fash64_begin function initializes the result and sum variables.

def fash64_begin() {
    result := 8888888888888888881
    sum := 3333333333333333271
}


# The fash64_word function hashes one 64 bit word.

def fash64_word(word: uint64) {

# The fash64_word function does the work.
# It takes a 64 bit word, and scrambles it into the hash.

    var high: uint64    # The high part of the product
    var low: uint64     # The low part of the product

# Mix the word with the current state of the hash
# and multiply it with the big prime.

    high ; low := (result xor word) * prime

# Add the high part to the sum. This is to defend against
# result equaling the word, which would cause loss of
# all memory of the previously hashed stuff.

    sum += high

# Mix the low part with the sum.

    result := low xor sum
}


# The fash64_block function hashes an array of words.

def fash64_block(block: array of uint64) {
    block.each(fash64_word)
}


# The fash64_end function returns the result.

def fash64_end() {
    return result
}

Most CPUs know how to do a multiply that produces a 128 bit product, but most programming languages do not, tossing away the valuable high bits. It is tragic that practical languages do not allow a statement like

high ; low := (result xor word) * prime

that deposits the product of the multiplication into the high and low variables.

The sum variable deals with the possibility of result xor word producing zero. If we used

result := low xor high

then result can become zero when result equals word, which loses the influence of everything that was hashed up to this point. We mitigate this with

sum += high
result := low xor sum

The sum variable retains the influence of the earlier material, so the hash will still be good. This borrows an idea from Fletcher’s Checksum. The likelihood that a word will match the current result and cause a reset is 1 in 264. The sum makes a reset even less likely.

Use of Fash64 is pretty simple. First call fash64_begin to initialize the hash function. Call fash64_word to hash each individual word, or fash64_block to hash a block of words. After all of the material has been hashed, call fash64_end to obtain the result.

# Example

fash64_begin()
fash64_block(packet)
fash64_word(session_check_key)
packet_check := fash64_end()