7

I have written a backup tool for Windows that can back up both files and volume images. To detect which files have changed, I use the Windows Change Journal. I already use the shadow copy functionality to get a consistent copy of both the files and the volume images.

To detect which blocks have changed, I currently use hashes. This means the whole volume has to be read for every image backup, because the hashes of all blocks have to be calculated to see which blocks have changed. The backup integrated into Windows 7 is able to create incremental volume images without checking all blocks. I wasn't able to find an API for a kind of block-level change journal.
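For reference, the hash-based comparison described above can be sketched roughly as follows. This is only an illustration, not the tool's actual code: the 4096-byte block size, the choice of SHA-256, and the function names `block_hashes` and `changed_blocks` are all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, chosen only for this sketch


def block_hashes(stream, block_size=BLOCK_SIZE):
    """Read the whole stream and return one SHA-256 digest per block."""
    hashes = []
    while True:
        block = stream.read(block_size)
        if not block:
            break
        hashes.append(hashlib.sha256(block).digest())
    return hashes


def changed_blocks(old_hashes, new_hashes):
    """Return the indices of blocks that differ from the previous run.

    Blocks beyond the end of the old hash list count as changed.
    """
    changed = []
    for index, digest in enumerate(new_hashes):
        if index >= len(old_hashes) or old_hashes[index] != digest:
            changed.append(index)
    return changed
```

Note that `block_hashes` still has to read every block, which is exactly the cost the question is trying to avoid.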

Does anybody know how to access this information? (I'm willing to dive deep into NTFS internals - even reading and parsing special files)

UrOni

2 Answers

3

I don't think block-level change information is available anywhere. Most probably, the Windows 7 integrated backup installs a file system filter driver, as some backup products and anti-virus software do. A filter driver can intercept all file system calls and in this way know which blocks have changed. If you do this, you can basically build your own change journal that works at block level, but only for the files that you are interested in.
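As a rough illustration of the bookkeeping such a self-built change journal would need, the sketch below maps intercepted writes (byte offset and length) to a set of dirty block indices. It is not driver code and does not intercept anything itself; the class name `DirtyBlockTracker`, its methods, and the 4096-byte block size are hypothetical and exist only for this example.

```python
class DirtyBlockTracker:
    """Tracks which fixed-size blocks were touched by recorded writes."""

    def __init__(self, block_size=4096):  # assumed block size
        self.block_size = block_size
        self.dirty = set()

    def record_write(self, offset, length):
        """Mark every block overlapped by a write of `length` bytes at `offset`."""
        if length <= 0:
            return
        first = offset // self.block_size
        last = (offset + length - 1) // self.block_size
        self.dirty.update(range(first, last + 1))

    def drain(self):
        """Return the sorted dirty block indices and reset the tracker."""
        blocks = sorted(self.dirty)
        self.dirty.clear()
        return blocks
```

An incremental backup would call something like `drain()` once per run and copy only the returned blocks. A real implementation would also have to persist this state across reboots, otherwise a full hash pass is needed again.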

I would really like to know a better answer myself here.

Hannes de Jager
  • Creating a (filter) driver would be just fine. The problem is that it is an open source project and I cannot afford (or better: do not want to afford) the $500 yearly fee for a (64-bit) Windows driver certificate. The only option left that I see is creating a shadow copy and then somehow finding out what Windows saves into this shadow copy. This has two disadvantages: the shadow copy cannot be deleted (which costs space), and there is no documentation available about the on-disk format of shadow copies. One would have to reverse engineer the whole thing. – UrOni Feb 21 '11 at 21:26
  • Had no idea there was a fee involved. The other problem with the shadow copy approach is that the implementation may vary depending on the VSS provider that is used. – Hannes de Jager Feb 22 '11 at 06:48
  • (reboot) A bit late to this discussion, but how much is gained by backing up at the block level rather than the file level, versus the cost of maintaining/calculating hashes? Just curious. – Peter Krnjevic Jun 15 '13 at 03:11
  • @PeterKrnjevic The software (UrBackup) actually does both file and image backups and does not look at unchanged files for incremental file backups (using the Windows Change Journal). For image backups you get the hashes basically for free (once you have read the data into memory, hash operations are not that expensive), and often not everything on a volume changes between image backups, so it is definitely worth it. – UrOni Nov 08 '13 at 20:42
  • @UrOni: Are you saying image backups read everything into memory? Doesn't this take quite a while these days with multi-TB drives? – Peter Krnjevic Dec 04 '13 at 02:31
0

When you say Windows Change Journal, I take it you are referring to the NTFS USN journal? It looks very much like the Windows 7 backup uses a combination of VSC and the NTFS USN journal to detect changes and create incremental images, much like you are already doing.

Jackobyte