Douglas Crockford

2017-02-02

Fash64 is an efficient hashing function. It crunches 64 bits at a time to produce a 64 bit result. It can be used for implementing data structures (hash tables) and checksums.

Fash64 relies on multiplication. In a single instruction, 64 bit multiply can do up to 64 left shifts and 63 additions. On most CPUs, the product of multiplication can be 128 bits, divided over two registers. Fash64 uses the high 64 bits of the product as a 64 bit right shift that can quickly be fed back into the hash. Without that massive right shift, multiply tends to lose information that spills out the left, which would make it unsuitable for hashing. But we get good feedback from that high part, yielding good hashes.

This is an implementation in a mythical language.

def prime_11 := 11111111111111111027 def prime_8 := 8888888888888888881 def prime_3 := 3333333333333333271 # The state of the hash function is kept in two variables. var result: uint64 # running result var sum: uint64 # running sum # The fash64_begin function initializes the result and sum variables. def fash64_begin() { result := prime_8 sum := prime_3 } # The fash64_word function hashes one 64 bit word. def fash64_word(word: uint64) { # The fash64_word function does the work. # It takes a 64 bit word, and scrambles it into the hash. var high: uint64 # The high part of the product var low: uint64 # The low part of the product # Mix the word with the current state of the hash # and multiply it with the big prime. high ; low := (result xor word) * prime_11 # Add the high part to the sum. This is to defend against # result equaling the word, which would cause loss of # all memory of the previously hashed stuff. sum += high # Mix the low part with the sum. result := low xor sum } # The fash64_block function hashes an array of words. def fash64_block(block: array of uint64) { block.each(fash64_word) } # The fash64_end function returns the result. def fash64_end() { return result }

Most CPUs know how to do a multiply that produces a 128 bit product, but most programming languages do not, tossing away the valuable high bits. It is tragic that practical languages do not allow a statement like

high ; low := (result xor word) * prime

that deposits the product of the multiplication into the `high`

and `low`

variables.

The `sum`

variable deals with the possibility of```
result xor
word
```

producing zero. If we used

result := low xor high

then `result`

can become zero when `result`

equals
`word`

, which loses the influence of everything that was hashed
up to this point. We mitigate this with

sum += high result := low xor sum

The `sum`

variable retains the influence of the earlier material,
so the hash will still be good. This borrows an idea from
Fletcher’s Checksum.
The likelihood that a `word`

will match the current
`result`

and cause a reset is 1 in 2^{64}. The
`sum`

makes a reset even less likely.

Use of Fash64 is pretty simple. First call `fash64_begin`

to
initialize the hash function. Call `fash64_word`

to hash each
individual word, or `fash64_block`

to hash a block of words.
After all of the material has been hashed, call `fash64_end`

to
obtain the result.

# Example fash64_begin() fash64_block(packet) fash64_word(session_check_key) packet_check := fash64_end()