Last months I was mainly involved in a project called
JavaDC3 which is a java porting of DC++. In this project I worked on implementation of Tiger Tree Hashing alghorithm (
TTH).
TTH and
THEX is clearly explained
here.
All these technology are based on the definition of the
Merkle Hash Tree. The Merkle Hash Tree, invented by
Ralph Merkle, is a hash construct that exhibits desirable properties for verifying the integrity of files and file subranges in an incremental or out-of-order fashion.
I made my google search and found that in
sf.net a standard implementation. The problem was that standard implementation was not usable in a production environment infact performances were not suitable for a real project and with a big file (1.5 GB) on a standard machine system crashes.
So I needed to re-implement the alghorithm. (Also guys from Limewire did this and maybe better than me...)
Here is what I did:
One of the main problems was that standard implementation uses too much memory because it uses a Vector to hold "hashed" blocks (aka leaves). And it "hashes" all the file in one cycle then it uses other Vectors to hold hashed result of two leaves and so on along the tree until it has only hash the final one.
On contrary, my implementation uses only two
Stacks. I fill the two stacks continuosly: the first one with leaves hashed from file the second with blocks hashed with two leaves and so on consuming all the bits of the file but not the memory.
I'll go more in deep in next posts.
Marco.