In BitTorrent v2 there is a "pieces root" key (a string) that holds the root SHA-256 hash of a file in binary form. The documentation says:

"pieces root" is the root hash of a merkle tree with a branching factor of 2, constructed from 16KiB blocks of the file. The last block may be shorter than 16KiB. The remaining leaf hashes beyond the end of the file required to construct upper layers of the merkle tree are set to zero. As of meta version 2 SHA2-256 is used as digest function for the merkle tree. The hash is stored in its binary form, not as human-readable string.

I need to extract this hash for my torrent tracker, so that the info web page can show users the original hash of each file in a torrent. How do I do that? How can I decode that binary string? I also don't know whether it is a concatenation of all the piece hashes.

PHP or C is preferred, or pointers to some docs. I'm a noob regarding encoding, so please explain thoroughly. Thanks a ton!

I tried the unpack() function, but I'm missing something.

  • What do you mean by "decode"? Can you share sample input, and the expected output? – Nico Haase Jan 26 '23 at 16:44
  • @NicoHaase Sure. In the new BitTorrent v2 torrent info dictionary, every file carries its original hash as a binary string, so you can't read it in a text editor. Here's a picture: https://i.postimg.cc/2y5SC7pd/2023-01-26-085936.png. As you can see it's encoded, and I want to decode it. – greenandgreen Jan 26 '23 at 17:02
  • Please add all clarification to your question by editing it. This should also include the sample input, and the code you've used to resolve your problem – Nico Haase Jan 26 '23 at 18:37
  • Why do you want to show this value to the user? What do you expect them to do with it? – Anon Coward Jan 27 '23 at 02:29
  • @AnonCoward There are many beneficial uses for this: – greenandgreen Jan 27 '23 at 06:49
  • I'm also going to add a search-by-hash option. For example, say you want to download some software that contains a package nobody is seeding, or that only 1-2 people seed at slow speeds, because the torrent was updated and you have the old version. You could then search other swarms for this package by its hash. It doesn't have to be software either; it could be movies, docs, ISOs, dead artifacts, etc. – greenandgreen Jan 27 '23 at 06:55

2 Answers

The hash as stored in the torrent file is not encoded; it's in the native representation that computers deal in: a sequence of bytes. For SHA2-256 that is 32 bytes (256 bits).

If you need it representable in text then you'll have to encode it. There are many ways to do this. Hexadecimal is a common choice, also frequently used to display the infohash of a torrent.
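
As a minimal sketch of the hexadecimal encoding described above (in Python rather than PHP; `to_hex` is a hypothetical helper name, and the input is assumed to be the 32 raw bytes already extracted from the torrent file):

```python
def to_hex(raw: bytes) -> str:
    """Encode a raw 32-byte SHA2-256 digest as 64 lowercase hex characters."""
    if len(raw) != 32:
        raise ValueError("expected 32 bytes, got %d" % len(raw))
    # bytes.hex() plays the same role as PHP's bin2hex()
    return raw.hex()


def from_hex(text: str) -> bytes:
    """Reverse the encoding, e.g. to compare against bytes from another torrent."""
    return bytes.fromhex(text)
```

Each byte becomes two hex digits, so the text form is always 64 characters long and can be converted back to the original bytes without loss.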

"I don't know if those are concatenation of all piece hashes."

As the BEP says, the pieces root is the root hash of a merkle tree; it can't be obtained by concatenating the individual block hashes.

It can only be computed from the torrent contents themselves. So if you don't have the data you can't recompute it; you can only extract it from the torrent file. But since it uses a fixed construction (independent of the piece size), the pieces root is always the same for files of equal content.
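
To illustrate that construction, here is a rough Python sketch based only on the documentation text quoted in the question. The function name `pieces_root` is made up for this example, the whole file is assumed to fit in memory, and padding the leaf layer up to a power of two is my reading of "remaining leaf hashes ... are set to zero", so treat it as a sketch rather than a reference implementation:

```python
import hashlib

BLOCK_SIZE = 16 * 1024   # 16KiB leaf blocks, as stated in the quoted text
ZERO_HASH = bytes(32)    # a "zero" leaf hash used for padding


def pieces_root(data: bytes) -> bytes:
    """Merkle root (branching factor 2, SHA2-256) over 16KiB blocks of data."""
    if not data:
        # Assumption: an empty file has no blocks, so there is nothing to hash.
        raise ValueError("cannot compute a pieces root for empty data")
    # Leaf layer: one hash per block; the last block may be shorter.
    layer = [
        hashlib.sha256(data[i:i + BLOCK_SIZE]).digest()
        for i in range(0, len(data), BLOCK_SIZE)
    ]
    # Pad with zero hashes so the number of leaves is a power of two.
    width = 1
    while width < len(layer):
        width *= 2
    layer += [ZERO_HASH] * (width - len(layer))
    # Hash adjacent pairs until a single root remains.
    while len(layer) > 1:
        layer = [
            hashlib.sha256(layer[i] + layer[i + 1]).digest()
            for i in range(0, len(layer), 2)
        ]
    return layer[0]
```

Under this sketch, a file that fits in a single block has a root equal to the plain SHA2-256 of its contents, while any larger file gets a different value from its ordinary file hash.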

the8472
  • Damn, this is too hard for me :( I only need the SHA2 root hash of every file to show it on a web page. As far as I understand, I need to encode the "pieces root" key as hexadecimal? I would be grateful if you could give me some code to do so. It's not that I'm unfamiliar with the torrent structure, it's just that I'm weak at encoding. – greenandgreen Jan 27 '23 at 16:52
  • Well, since you mention php, [bin2hex](https://www.php.net/manual/en/function.bin2hex.php) should do the job after extracting the bytes from the torrent file. – the8472 Jan 27 '23 at 17:06
  • I will try it and reply back, thanks. Will this produce the same value as the original sha256 of the file, or something different? – greenandgreen Jan 27 '23 at 18:08
  • I did it with a bencode PHP library just to be sure, but unfortunately the hash I get is not exactly the same. Strange. Could you be of further assistance on this, please? Here's the code, root hash, and bin2hex output in one picture: https://i.postimg.cc/yNqW3nxV/2023-01-28-071853.png – greenandgreen Jan 28 '23 at 15:27
  • As I said, the root hash of a merkle tree is not the same as simply hashing a file. If it were, we wouldn't need the merkle tree construction. You wanted the pieces root hash, which you got. That it doesn't fulfill your needs seems like a different question, an [XY problem](https://en.wikipedia.org/wiki/XY_problem). – the8472 Jan 29 '23 at 16:07
  • Oh I see. Actually I was expecting this, which is why two comments back I asked: "same as original sha256 of file or something different?". My only question now is how to calculate the same kind of merkle tree for individual files, that is, a "merkle tree with a branching factor of 2, constructed from 16KiB blocks of the file....". I know I've been a bother, but I assure you I'm grateful, and I have already accepted your answer. – greenandgreen Jan 29 '23 at 17:42
  • Actually there's no need to answer, I found the answer, thank you again! – greenandgreen Jan 29 '23 at 18:11
  • Holy cow ChatGPT made me working code for this – greenandgreen Jan 30 '23 at 07:20

I wrote a Windows command-line tool for extracting and calculating Merkle root hashes.

It can be used, for example, to search trackers that support BitTorrent v2 for a desired file and find seeds for reviving dead torrents.

Usage:

Open a command prompt and do the following:

tmrr.exe e example.torrent # For extracting root hashes from a torrent

tmrr.exe d torrent1.torrent torrent2.torrent # For finding duplicates

tmrr.exe c your_file # For calculating Merkle root hash for a file

The tool will output all root hashes of files with their names and sizes. Feel free to give feedback.

I looped through the "file tree" dictionary, concatenating the directory and file names, extracted the file hashes by passing each "pieces root" value to the bin2hex() function, and compiled this code for Windows.
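
The traversal described above might look roughly like the following Python sketch. This is not the tool's actual code: `iter_files` is a hypothetical name, and it assumes the torrent has already been bdecoded by some library into nested dicts with bytes keys, where each file entry is a dict whose empty-string key holds "length" and "pieces root".

```python
def iter_files(tree, prefix=()):
    """Yield (path, length, hex_root) for every file in a decoded "file tree".

    Assumes bytes keys, as a typical bdecoder would return them.
    """
    for name, node in tree.items():
        if name == b"":
            # The empty key marks a file; node holds its attributes.
            root = node.get(b"pieces root")
            # Assumption: empty files may carry no "pieces root" at all.
            hex_root = root.hex() if root else None
            yield "/".join(prefix), node[b"length"], hex_root
        else:
            # Any other key is a directory or file name; descend into it.
            part = name.decode("utf-8", errors="replace")
            yield from iter_files(node, prefix + (part,))


# Hypothetical sample shaped like info[b"file tree"] after decoding
sample_tree = {
    b"docs": {
        b"readme.txt": {b"": {b"length": 5, b"pieces root": bytes(32)}},
    },
    b"empty.bin": {b"": {b"length": 0}},
}

for path, length, hex_root in iter_files(sample_tree):
    print(path, length, hex_root)
```

The hex strings produced this way are what a tracker page could display or index for a search-by-hash feature.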