According to the b2_list_file_names documentation, "This call returns at most 1000 file names per transaction", and you can use the nextFileName field to get the next 1000 files, and so on. It doesn't say in which order these files are returned, but the documentation for the similar b2_list_file_versions says "in alphabetical order by file name", so one might suspect it's the same for b2_list_file_names.

This would imply that new files uploaded between calls to b2_list_file_names will not necessarily appear at the end of the list; they end up wherever their names sort alphabetically.

Is it possible to either choose chronological order for b2_list_file_names or in any other way get the latest uploaded files?

In order to avoid an XY Problem situation, this is what I really want to do:

Set legal_hold to "on" for all files, and do it for all new files once a week. Since I have ~400k files it costs money (I think) to list all files every time.

So another way to solve my problem would be to list all files with legal_hold not set to "on". Is that possible?

Peter Jaric
  • Turns out I hadn't read the billing information correctly. Listing 400 000 files converts to 400 class C operations (assuming you specify 1000 files per request), and you get 2 500 class C operations for free each day. So no problem for me! I'll leave the question, though, since this could be a problem for someone else. – Peter Jaric Oct 09 '21 at 20:22
  • Hi @Peter - could you turn your comment into an answer so that people can more easily see the solution? – metadaddy Jan 12 '22 at 17:55
  • Actually I think my comment is not an answer to the question as I have stated it ("How do I list only new files via the Backblaze B2 API"). The comment is more along the lines of "I don't need to know the answer since I have few enough files right now that I can list all files without running into the limit". – Peter Jaric Jan 13 '22 at 09:59
  • Fair enough - I'll write an answer that addresses your original question. I didn't want to do so before asking. – metadaddy Jan 13 '22 at 16:34

1 Answer

The way to do this with B2 (or indeed S3) is to use the file name. A common approach is to use a date prefix, naming your files with a convention such as yyyy-MM-dd/fileName. You can then query for all the files for a given day using the prefix parameter.

Since you're working on a weekly basis, you can optimize this by using yyyy-ww/fileName where ww is the week in the year.
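As a minimal sketch of the weekly convention, assuming ISO week numbering and zero-padded weeks (neither is required by B2), the prefix and the request body for b2_list_file_names could be built like this. The helper names `week_prefix` and `list_request_body` are made up for illustration; `bucketId`, `prefix` and `maxFileCount` are the request parameters of b2_list_file_names.

```python
from datetime import date


def week_prefix(day: date) -> str:
    """Return a 'yyyy-ww/' prefix for the ISO week that contains `day`.

    Assumption: ISO week numbering. The ISO year is used rather than the
    calendar year, so the first days of January can belong to the last
    week of the previous year.
    """
    iso_year, iso_week, _ = day.isocalendar()
    return f"{iso_year:04d}-{iso_week:02d}/"


def list_request_body(bucket_id: str, day: date, max_file_count: int = 1000) -> dict:
    """Build a JSON body for b2_list_file_names limited to one week's files."""
    return {
        "bucketId": bucket_id,
        "prefix": week_prefix(day),
        "maxFileCount": max_file_count,
    }
```

Files then need to be uploaded under the same prefix, for example `week_prefix(date.today()) + "report.pdf"`, so that the weekly job only lists the files for the week it cares about.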

If, on the other hand, you wanted a chronological ordering, you could use yyyy-MM-dd HH:mm:ss/fileName and then use the startFileName parameter to get all files since a given point in time.
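This works because zero-padded timestamps sort alphabetically in the same order as they sort chronologically. A rough sketch of the paging loop follows, assuming timestamps are all written in the same time zone (UTC is a sensible choice). `list_page` is a hypothetical callable that sends one b2_list_file_names request with the given body and returns the parsed JSON response; it is not part of any B2 library, so plug in whatever HTTP client or SDK call you already use.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def timestamped_name(uploaded_at: datetime, file_name: str) -> str:
    """Name a file as 'yyyy-MM-dd HH:mm:ss/fileName' so names sort by time."""
    return f"{uploaded_at.strftime(TIMESTAMP_FORMAT)}/{file_name}"


def list_files_since(list_page, bucket_id: str, since: datetime, page_size: int = 1000) -> list:
    """Collect the names of all files whose timestamp prefix is at or after `since`.

    `list_page` is a hypothetical helper: it takes a request body dict and
    returns the decoded b2_list_file_names response, which contains
    'files' and 'nextFileName'.
    """
    start = since.strftime(TIMESTAMP_FORMAT)
    names = []
    while start is not None:
        page = list_page({
            "bucketId": bucket_id,
            "startFileName": start,
            "maxFileCount": page_size,
        })
        names.extend(f["fileName"] for f in page["files"])
        # nextFileName is null (None) once there are no more files to list.
        start = page.get("nextFileName")
    return names
```

Remembering the timestamp of the previous run and passing it as `since` means each weekly run only pages through the files uploaded after it.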

metadaddy