
I have files with different extensions: some are text files, others are zipped files or images. How can I programmatically add a checksum to the files?

For example, my idea was to add a checksum somewhere in the metadata of the files. I tried doing it with PowerShell, but the properties of the files are read-only. I don't want to create a separate file that contains the checksum of the files. I want the checksum itself to be included somewhere in the file itself or in its metadata.

Edelweiss
  • What problem are you trying to solve? Setting the expected checksum on the file itself in case the file is corrupted? How would you guarantee the expected checksum metadata isn't itself corrupted, or worse, *maliciously-changed*? – codewario Apr 05 '22 at 20:50
  • Metadata is not going to be available on all file systems. It's rare to see it used for anything. – Mark Ransom Apr 05 '22 at 20:51
  • @MarkRansom That is an excellent point. Filesystems aside, OP, you need to publish checksums in a well-known location and compare your downloaded file checksum to the known checksum. You can't rely on any checksum provided with the program or downloaded software itself, because literally anyone else could change the expected checksum to modify those values and you'd be none the wiser. Checksums provided with a download are convenient but rely on a checksum stored elsewhere when used with automation. – codewario Apr 05 '22 at 20:54
  • I think this is more or less answered by this question: https://stackoverflow.com/questions/64597009/use-powershell-to-edit-a-files-metadata-details-tab-of-a-file-in-windows-file – Darin Apr 05 '22 at 21:02
  • @Bender the Greatest If the checksum in the metadata gets corrupted or the file itself gets corrupted, when calculating the file data and comparing it with the checksum in the metadata it's *very* unlikely that the check will pass. Regarding your second point - I'm not interested in authentication or signing, I just want a checksum to ensure a file was correctly downloaded, extracted, etc. – Edelweiss Apr 06 '22 at 12:14
  • Checksum is also used to determine the validity of a file. If a downloaded file is corrupted, sure, the checksum won't match, and it's unlikely to match the attribute value either. But an attacker can easily modify the binary and then modify the expected attribute to use the new value for your checksum. This is an insecure solution even though you aren't approaching it as a security centric use case. If the checksum is there, others besides you will expect to use the checksum as checksums are expected to be used. – codewario Apr 06 '22 at 13:50

1 Answer

On Windows, with an NTFS filesystem, you can use Alternate Data Streams (ADS).

They behave much like ordinary files, but they are hidden and attached to the main file. They are lost as soon as the file is copied to a non-NTFS partition.

Otherwise, you can't add a checksum to a file (even a short CRC32) without consequences. How would you be sure that the last N bytes are your checksum and not the file's data? You would need to add a header (so even more bytes), and the extra bytes can break loading of the file: think of a plain text file with N bytes of binary data appended at the end.
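
As a rough illustration of the Alternate Data Streams approach described above, here is a minimal Python sketch. It assumes Windows with an NTFS volume, where a stream can be opened with the `filename:streamname` path syntax. The stream name `sha256` and the helper names `file_sha256`, `stream_path`, `store_checksum` and `verify_checksum` are hypothetical choices for this example, not part of any existing tool.

```python
import hashlib

# Hypothetical stream name chosen for this sketch.
STREAM_NAME = "sha256"


def file_sha256(path):
    """Return the SHA-256 hex digest of the file's main data stream."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large archives or images are not loaded at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def stream_path(path, stream=STREAM_NAME):
    """Build the 'file:stream' path that NTFS uses to address a named stream."""
    return f"{path}:{stream}"


def store_checksum(path):
    """Compute the checksum and write it to the named stream (assumes NTFS)."""
    digest = file_sha256(path)
    with open(stream_path(path), "w", encoding="ascii") as s:
        s.write(digest)
    return digest


def verify_checksum(path):
    """Compare the stored checksum with a freshly computed one."""
    with open(stream_path(path), "r", encoding="ascii") as s:
        expected = s.read().strip()
    return expected == file_sha256(path)
```

Calling `store_checksum(r"C:\data\archive.zip")` (an example path) once and `verify_checksum` later leaves the main file content untouched, so text files, archives and images still open normally. The stored value only survives as long as the file stays on NTFS, and it detects accidental corruption only, not deliberate tampering.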

Wisblade
  • Wow! It has been years since I last saw anything related to alternate data streams! Completely forgot such a thing even exists! Here is a link with examples: http://powershellcookbook.com/recipe/XilI/interact-with-alternate-data-streams – Darin Apr 05 '22 at 21:16
  • @Darin Thanks. I just f****g love this feature, I use it often to cache data. Example: source is an XML file, with potential errors => an ADS for corrected XML. Then, another ADS for translated file [project-specific format]. And so on. When you inject new data, you cache all preprocessing into file itself, add a CRC/authentication ADS on top of that (to check modified data in particular), and you'll save great time on next runs WITHOUT putting work files everywhere. Very, very underrated NTFS feature, like compression/encryption of individual files. Thanks for the link. – Wisblade Apr 05 '22 at 21:21
  • thanks for sharing the idea. I haven't tried the code in that link yet, just found it in a web search. But I can really see how that could simplify my work flow. – Darin Apr 05 '22 at 21:41
  • @Darin You're welcome. I also use ADS to store "resources" for file types that don't handle that (it's fine for a DLL, but for a `.docx` file...), or for alternate versions (e.g. MS Office documents, with and without macros), or to attach the Excel file to the Word file used for presentation, etc. – Wisblade Apr 05 '22 at 22:54