I am trying to build a process in Python that writes data to a .csv file which can then be BCP'd into an MSSQL database.
The basic to_csv command I am using is:
df.to_csv(csv_path, sep="«", header=False, index=False, line_terminator="[~~]")
The one issue I have been seeing is how to_csv handles double quotes within text fields: if a field contains double quotes, pandas wraps the whole field in quotes and doubles each embedded quote.
So a row which looks like:
1, 7, Executed Job: "abcdf" Complete
gets written into the .csv as:
1, 7, "Executed Job: ""abcdf"" Complete"
This is frustrating because it means that if I want to use the data post-BCP, I have to clean it up further on the SQL side. Since my ETL runs over hundreds of tables, implementing that would be a huge headache.
I am wondering if there is a way to handle quotes in the data BEFORE calling to_csv, via the pandas DataFrame.replace() method. Is there anything I can feed replace() that would turn quotes in the DataFrame into something to_csv would write out as literal double quotes, without triggering the escaping behavior?
For example, something along the lines of:

df = df.replace('"', '\\"')
df.to_csv(csv_path, sep="«", header=False, index=False, line_terminator="[~~]")
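As far as I can tell, plain replace() only matches whole cell values, so a substring version would need regex=True. Something like the line below (untested) does put a backslash in front of each embedded quote, but under the default quoting settings to_csv still wraps and doubles the field afterwards, so on its own it does not get me there:

df = df.replace('"', r'\"', regex=True)  # substring replace; without regex=True only exact whole-cell matches are touched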
Edit: For reference, I have tried passing doublequote=False and setting quoting=csv.QUOTE_NONE, but both lead to "need to escape, but no escapechar is set" errors.
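Concretely, this is the variant that raises the error, since under QUOTE_NONE the embedded quote character has to be escaped and no escapechar is set:

import csv

# Raises csv.Error: need to escape, but no escapechar is set
df.to_csv(csv_path, sep="«", header=False, index=False,
          line_terminator="[~~]", quoting=csv.QUOTE_NONE)

One direction I have come across but have not verified against BCP: keep the default minimal quoting, but point quotechar at a character that never occurs in the data ("\x07" below is an arbitrary pick), so a literal double quote is no longer special and passes through unchanged:

df.to_csv(csv_path, sep="«", header=False, index=False,
          line_terminator="[~~]", quotechar="\x07")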