What makes RecordIO attractive

Asked May 25 '20 at 19:36

Active May 25 '20 at 19:36

Viewed 463 times

I have been reading about RecordIO and looking at different implementations on GitHub. I'm simply trying to wrap my head around the pros of such a file format.

The pros I see are the following:

Block compression. Reading only a few records will be faster because there is less to decompress.
Because of the somewhat indexed structure, you could look up a specific record in acceptable time (assuming keys are sorted). This can be useful for quickly locating a record in an ad hoc fashion.
I can also imagine that such a file format allows finer sharding strategies. Instead of sharding per file, you can shard per block.
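
To make the three points above concrete, here is a minimal sketch of how I picture a block-compressed, indexed record file. It is a toy layout and not the on-disk format of any actual RecordIO implementation: the function names (`write_blocks`, `read_block`, `lookup`), the length-prefixed record framing, the use of zlib, and the in-memory index of `(first_key, offset, length)` entries are all assumptions made for illustration.

```python
import bisect
import io
import struct
import zlib


def write_blocks(records, block_size=2):
    """Pack key-sorted (key, value) byte pairs into independently compressed blocks.

    Returns (data, index), where index holds one (first_key, offset, length)
    entry per block. This layout is hypothetical, not real RecordIO.
    """
    buf = io.BytesIO()
    index = []
    for i in range(0, len(records), block_size):
        chunk = records[i:i + block_size]
        # Frame each record as: key length, value length, key bytes, value bytes.
        raw = b"".join(
            struct.pack("<II", len(k), len(v)) + k + v for k, v in chunk
        )
        compressed = zlib.compress(raw)
        index.append((chunk[0][0], buf.tell(), len(compressed)))
        buf.write(compressed)
    return buf.getvalue(), index


def read_block(data, offset, length):
    """Decompress a single block and return its (key, value) pairs."""
    raw = zlib.decompress(data[offset:offset + length])
    pos = 0
    out = []
    while pos < len(raw):
        klen, vlen = struct.unpack_from("<II", raw, pos)
        pos += 8
        key = raw[pos:pos + klen]
        pos += klen
        value = raw[pos:pos + vlen]
        pos += vlen
        out.append((key, value))
    return out


def lookup(data, index, key):
    """Binary-search the index, then decompress only the one candidate block."""
    first_keys = [entry[0] for entry in index]
    i = bisect.bisect_right(first_keys, key) - 1
    if i < 0:
        return None  # key sorts before the first block
    _, offset, length = index[i]
    for k, v in read_block(data, offset, length):
        if k == key:
            return v
    return None


records = [(b"a", b"1"), (b"b", b"2"), (b"c", b"3"), (b"d", b"4"), (b"e", b"5")]
data, index = write_blocks(records, block_size=2)
value = lookup(data, index, b"d")  # touches only the block starting at b"c"
```

In this sketch a lookup decompresses one block rather than the whole file, and each index entry is a self-contained byte range, so blocks could be handed to different workers as shards.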

But I fail to see how such a file format is faster to read than plain protobuf with compression.

Essentially I fail to see a big pro in this format.

jeremie
