1

I have a SQL Server table with a column X of type varbinary(max). It is storing the output of a GZIP of an XML string. I can export the column records to a Windows Native format BCP file like this:

bcp "select top 5 X from mytable" -N foo.bcp

I have a very large number of records. PYODBC is 100X slower than BCP, so rather than query the binaries directly into Python, which I can do, I am trying to BCP them into a flat file first.

I know there is a DECOMPRESS function in SQL Server 2016+, but I am unfortunately limited to SQL Server 2014 which does not have this function. So DECOMPRESS on the server side before BCP is not an option for me. I have to get the data out of the native format file into Python and then decompress.

Question: How to I extract the 5 rows from the file separately as type binary objects B in Python 3, so that I can decompress them into Unicode with

gzip.decompress(B).decode('utf-8')

?

Note: I am not trying to BCP out and then BCP back in again. I am trying to extract the varbinary records from the native-encoded BCP file using Python.

Lars Ericson
  • 1,952
  • 4
  • 32
  • 45
  • As I recall (from tests run long ago) `bcp` is not a particularly sophisticated format as it essentially just dumps row metadata, followed by the columns in SQL's native binary format (which are a mix of fixed-size data for fixed-size columns, and length-prefixed data for variable sized columns). For a single column of a known type you can ignore the metadata and get to the first row. You could simply try it on small test data and analyze the resulting file with a hex editor. GZIP has a distinctive header (`1f 8b 08`) that will allow you to separate BCP metadata from column data. – Jeroen Mostert Aug 31 '20 at 19:38
  • That's a great clue, I will try it. Definitely helps to know the header code. – Lars Ericson Aug 31 '20 at 23:50
  • Why not just export in a non proprietary format like tab delimited? What is the big picture of what you're trying to do? – Nick.Mc Sep 01 '20 at 00:18
  • The field is a GZIP compressed XML. It is varbinary data. How do I export the binary data to tab delimited ASCII form of binary data, like a hex code stream, using BCP and a SQL Server 2014 SELECT statement.? NOTE: SQL Server 2016 has a DECOMPRESS command in which case I'd be done, but 2014 does not, so I'm stuck. – Lars Ericson Sep 01 '20 at 02:20
  • To preserve binary in a textual representation, you _uuencode_ it. So you can uuencode your binary data (in a view), extract the view to a text file using BCP, then uudecode it on the other side to convert back to binary. Sounds convoluted but uuencode is the standard way of doing this exercise. Here's some sample T-SQL code to do that https://stackoverflow.com/questions/5082345/base64-encoding-in-sql-server-2005-t-sql – Nick.Mc Sep 15 '20 at 12:55

0 Answers0