This is an implementation of the Skein hash function, as described in The Skein Hash Function Family, version 1.3.
The implementation targets the ARM Cortex-A8 processor and uses NEON SIMD instructions to perform 64-bit operations, in parallel where possible.
There are also Skein-256 and Skein-512 implementations that do not require NEON support.
The code is based on the optimized C version written by Doug Whiting, with the block functions rewritten in ARM assembly language.
For long messages the implementation reaches the following speeds in cycles per byte when tested on a Cortex-A8 processor:
| Variant | Cycles per byte |
|---|---|
| Skein-256 | 20.3 |
| Skein-512 | 15.4 |
| Skein-1024 | 20.2 |
Without NEON:
| Variant | Cycles per byte |
|---|---|
| Skein-256 | 21.7 |
| Skein-512 | 25.2 |
See performance_test.txt for more detailed test output.
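To relate the cycles-per-byte figures above to throughput, divide the clock frequency by the cost per byte. The sketch below shows that arithmetic. The function name is hypothetical, and the 600 MHz clock in the usage note is only an assumed example, not a measured configuration from this project.

```c
#include <assert.h>

/* Hypothetical helper: convert a cost in cycles per byte into
   throughput in MB/s (1 MB = 10^6 bytes) for a clock given in Hz. */
double throughput_mb_per_s(double cycles_per_byte, double clock_hz)
{
    assert(cycles_per_byte > 0.0);
    return clock_hz / cycles_per_byte / 1e6;
}
```

For example, at an assumed clock of 600 MHz, `throughput_mb_per_s(15.4, 600e6)` gives roughly 39 MB/s for the NEON Skein-512 figure.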
The Skein test program can be compiled using GCC as follows:
gcc *.c skein_block_cortexa8.S -DSKEIN_USE_ASM=256+512+1024
Or without NEON:
gcc *.c skein_block_noneon.S -DSKEIN_USE_ASM=256+512
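The value passed in SKEIN_USE_ASM is a sum of the state sizes whose block functions come from the assembly file. The sketch below assumes the C sources test that value bit by bit (for example with a check such as `#if !(SKEIN_USE_ASM & 256)`) and compile the C block function only when the matching bit is clear. The helper name is hypothetical and is not part of the Skein sources.

```c
#include <assert.h>

/* Stand-in for the value normally given on the command line,
   e.g. -DSKEIN_USE_ASM=256+512 for the build without NEON. */
#ifndef SKEIN_USE_ASM
#define SKEIN_USE_ASM (256 + 512)
#endif

/* Hypothetical helper: returns 1 if the block function for the given
   state size (256, 512 or 1024 bits) is taken from assembly, or 0 if
   the C version would be compiled instead. */
int skein_block_uses_asm(unsigned flags, unsigned state_bits)
{
    assert(state_bits == 256 || state_bits == 512 || state_bits == 1024);
    return (flags & state_bits) != 0;
}
```

Under that assumption, 256+512 leaves the 1024 bit clear, so the non-NEON build would still use the C block function for Skein-1024.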
To give the performance test an accurate timer, you need to enable user-mode access to the ARM performance monitor registers.
One way to do this is to use the kernel module included in the userperf folder.
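As an illustration of why this access is needed, the following sketch reads the cycle counter from user space. It assumes a 32-bit ARM build in which the kernel has already set the enable bit in the user enable register (coprocessor 15, c9, c14, 0) and started the cycle counter. Without that, the `mrc` instruction faults. The function names are hypothetical, the fallback to `clock()` is only there so the code builds on other targets, and the timer code in the performance test and the userperf module may work differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Read a timestamp. On 32-bit ARM this reads the performance monitor
   cycle counter through coprocessor 15 (c9, c13, 0). On other targets
   it falls back to clock(), which is much coarser. */
uint32_t read_cycle_counter(void)
{
#if defined(__arm__)
    uint32_t value;
    __asm__ volatile("mrc p15, 0, %0, c9, c13, 0" : "=r"(value));
    return value;
#else
    return (uint32_t)clock();
#endif
}

/* Elapsed count between two readings. Unsigned subtraction keeps the
   result correct across a single 32-bit wraparound. */
uint32_t cycles_elapsed(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Time one call of fn and return the elapsed counter value. */
uint32_t time_call(void (*fn)(void))
{
    assert(fn != NULL);
    uint32_t start = read_cycle_counter();
    fn();
    return cycles_elapsed(start, read_cycle_counter());
}
```

Dividing the elapsed count by the number of bytes hashed gives a cycles-per-byte figure comparable to the ones listed above.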