Update 10-30-2019:
=> Please see the following discussion for the feature request to IPFS: git-diff feature: Improve efficiency of IPFS for sharing updated file. Decrease file/block duplication
=> Please see the following discussion for additional information: Does IPFS provide block-level file copying feature?
For example, userA adds a 1 GB file with ipfs add file.txt, and userB gets that file into his storage through IPFS. Later userA realizes there is a mistake, changes only a single character in the file, and wants to share the updated version with userB.
So userA adds the same file, with that small change, to IPFS again via ipfs add file, and userB has to fetch the whole 1 GB file instead of just the single changed character. Is there a better approach, where only the changed part is pulled by userB, like how Git works when we do git pull?
Git has a much better approach; please see (https://stackoverflow.com/a/8198276/2402577). Does IPFS use delta compression for storage (https://gist.github.com/matthewmccullough/2695758) like Git, or a similar approach?
Further investigation:
I did a small experiment. First I added a 1 GB file to IPFS. Later, I changed a single line in the file, which was already shared via IPFS. I observed that userA pushes the complete 1 GB file all over again, instead of pushing only the block that contains the changed data. That is very expensive and time consuming in my opinion. I then shared the hash of the updated file, and again the complete file was downloaded via IPFS on userB, instead of only the block that contains the changed character.
- Step 1:
userA
$ fallocate -l 1G gentoo_root.img
$ ipfs add gentoo_root.img
920.75 MB / 1024.00 MB [========================================>----] 89.92%
added QmdiETTY5fiwTkJeERbWAbPKtzcyjzMEJTJJosrqo2qKNm gentoo_root.img
userB
$ ipfs get QmdiETTY5fiwTkJeERbWAbPKtzcyjzMEJTJJosrqo2qKNm
Saving file(s) to QmdiETTY5fiwTkJeERbWAbPKtzcyjzMEJTJJosrqo2qKNm
1.00 GB / 1.00 GB [==================================] 100.00% 49s
- Step 2:
userA
$ echo 'hello' >> gentoo_root.img
$ ipfs add gentoo_root.img # HERE the node pushes the 1 GB file into IPFS again. It took 1 hour for me to push it, instead of only updating the changed block.
32.75 MB / 1.00 GB [=>---------------------------------------] 3.20% 1h3m34s
added Qmew8yVjNzs2r54Ti6R64W9psxYFd16X3yNY28gZS4YeM3 gentoo_root.img
userB
# HERE the complete 1 GB file is downloaded all over again.
ipfs get Qmew8yVjNzs2r54Ti6R64W9psxYFd16X3yNY28gZS4YeM3
[sudo] password for alper:
Saving file(s) to Qmew8yVjNzs2r54Ti6R64W9psxYFd16X3yNY28gZS4YeM3
1.00 GB / 1.00 GB [=========================] 100.00% 45s
[Q] At this point, what is the best way to share the updated file via IPFS so that only the updated blocks of the file are shared, without re-sharing the whole updated file?
In addition, on the same node, whenever I run ipfs cat <hash>, it keeps downloading the same hash all over again.
$ ipfs cat Qmew8yVjNzs2r54Ti6R64W9psxYFd16X3yNY28gZS4YeM3
212.46 MB / 1.00 GB [===========>---------------------------------------------] 20.75% 1m48s
$ ipfs cat Qmew8yVjNzs2r54Ti6R64W9psxYFd16X3yNY28gZS4YeM3
212.46 MB / 1.00 GB [===========>---------------------------------------------] 20.75% 1m48s
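One way to reason about how many blocks the two versions from Step 2 could share, without involving the network, is to imitate fixed-size chunking locally. The script below is only a sketch under stated assumptions: it assumes a 256 KiB fixed-size chunker, uses a plain SHA-256 per chunk as a stand-in for a real IPFS block hash, relies on GNU coreutils (split, sha256sum, comm), and uses a 4 MiB random file instead of the 1 GB image. The function name chunk_hashes and all file names are made up for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

CHUNK=262144  # 256 KiB; ASSUMPTION: matches the default fixed-size chunker

# Print one SHA-256 digest per fixed-size chunk of the file given as $1.
# (Hypothetical helper; a plain digest is NOT a real IPFS block hash.)
chunk_hashes() {
    local dir
    dir=$(mktemp -d)
    split -b "$CHUNK" "$1" "$dir/chunk."
    sha256sum "$dir"/chunk.* | awk '{print $1}'
    rm -rf "$dir"
}

work=$(mktemp -d)

# Small stand-in for the 1 GB image: 4 MiB of random data, so that
# every chunk is distinct (an all-zero file would collapse to one chunk).
head -c $((4 * 1024 * 1024)) /dev/urandom > "$work/v1.img"

# Same kind of edit as in Step 2: append a short line to a copy.
cp "$work/v1.img" "$work/v2.img"
echo 'hello' >> "$work/v2.img"

chunk_hashes "$work/v1.img" | sort -u > "$work/v1.hashes"
chunk_hashes "$work/v2.img" | sort -u > "$work/v2.hashes"

total_v2=$(wc -l < "$work/v2.hashes")
shared=$(comm -12 "$work/v1.hashes" "$work/v2.hashes" | wc -l)  # in both
new=$(comm -13 "$work/v1.hashes" "$work/v2.hashes" | wc -l)     # only in v2

echo "v2 chunks: $total_v2, shared with v1: $shared, new: $new"
```

In this simulation an append leaves all original chunks untouched and produces a single new chunk at the end, whereas inserting a byte near the start would shift every later chunk boundary. Whether a real node behaves the same way could be checked by comparing the output of ipfs refs -r for both hashes.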
Analysis:
Both files (updated and original) cause the same increase in the repo size:
First I created a 100 MB file (file.txt). The repo stats before adding it:
NumObjects: 5303
RepoSize: 181351841
StorageMax: 10000000000
RepoPath: /home/alper/.ipfs
Version: fs-repo@6
$ ipfs add file.txt
added QmZ33LSByGsKQS8YRW4yKjXLUam2cPP2V2g4PVPVwymY16 file.txt
$ ipfs pin add QmZ33LSByGsKQS8YRW4yKjXLUam2cPP2V2g4PVPVwymY16
Here the number of objects increased by 4, and the repo size changed by 37983 bytes.
$ ipfs repo stat
NumObjects: 5307
RepoSize: 181389824
StorageMax: 10000000000
RepoPath: /home/alper/.ipfs
Version: fs-repo@6
Then I ran echo 'a' >> file.txt and then ipfs add file.txt.
Here I observe that the number of objects increased by 4 more, so it added the complete file; the repo size changed by 38823 bytes.
$ ipfs repo stat
NumObjects: 5311
RepoSize: 181428647
StorageMax: 10000000000
RepoPath: /home/alper/.ipfs
Version: fs-repo@6
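As a quick arithmetic check of the figures above, the deltas can be recomputed directly from the three ipfs repo stat outputs. The variable names are made up for illustration; the numbers are copied from the outputs as posted.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Figures copied from the three `ipfs repo stat` outputs above.
size_before=181351841
size_after_first_add=181389824
size_after_second_add=181428647
objs_before=5303
objs_after_first_add=5307
objs_after_second_add=5311

# Growth caused by each `ipfs add` (RepoSize is reported in bytes).
first_delta=$((size_after_first_add - size_before))
second_delta=$((size_after_second_add - size_after_first_add))
first_objs=$((objs_after_first_add - objs_before))
second_objs=$((objs_after_second_add - objs_after_first_add))

echo "first add:  +$first_objs objects, +$first_delta bytes"
echo "second add: +$second_objs objects, +$second_delta bytes"
```

Both adds grow the repo by 4 objects and by roughly 38 KB each, which is far less than the 100 MB size of file.txt itself.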