I have some byte data(millions of rows), and currently, I am converting to base64 first, and storing it as TEXT. The data is indexed on a row that contains base64 data. I assume Postgres does the conversion to base64 itself.
Will it be faster if I store using BYTEA data type instead?
How will the indexed queries be affected on two data types?
Asked
Active
Viewed 1.0k times
7

Shubham Chaudhary
- 1,322
- 2
- 18
- 38
1 Answers
18
Converting bytes to text using Base64 will consume 33% more space than bytes. Even this would be faster, you will use quite more space on disk. Loading and storing data should be slower as well. I see no advantage in doing that.
Postgres supports indices on BYTEA
columns. Since the bytes are shorter than the text, byte columns with indexes should be faster than text columns with indices as well.
-
bytea is slower than text, http://engineering.pivotal.io/post/bytea_versus_text_in_postgresql/ – AuthorProxy May 23 '18 at 10:06
-
@AuthorProxy: Unfortunately, the test is not very meaningful because it does not select according to the column. The test also includes implicit encoding and decoding operations that can produce completely different timings with other programming languages or adapters. – clemens May 23 '18 at 11:06
-
1@ciemens, Incorrect info of how many times more space base64 encoding uses. It's 1.33%: https://stackoverflow.com/a/13378842/2821632 – Ahmad Ferdous May 25 '18 at 16:32
-
@AhmadFerdous: Thanks, you‘re right. I‘ve updated my post. – clemens May 25 '18 at 17:02
-
@nick-t why did you change 1.33% back to 33% in `@clemens` answer? – uberrebu Jul 17 '22 at 09:52
-
1@uberrebu "1.33 more space" to me implies 2.33x total space. Could also phrase it "1.33 times the original space" – Nick T Jul 21 '22 at 13:20
-
read @AhmadFerdous comment above and `@clemens` comment – uberrebu Jul 21 '22 at 15:39