0

CSV File:

No Headers

8=Fix1.1<SOH>9=70<SOH>35=AE<SOH>10=237 ----This entire message is FIXData value
8=Fix1.1<SOH>9=71<SOH>35=AE<SOH>10=238
8=Fix1.1<SOH>9=72<SOH>35=AE<SOH>10=239
8=Fix1.1<SOH>9=73<SOH>35=AE<SOH>10=240

Code Snippet of how I get the data into a CSV file from querying MS DB.

cnxn = pyodbc.connect('DRIVER={ODBC Driver 17 for SQL Server};SERVER=server;DATABASE=abc;UID=user;PWD=password')

sql = """select FIXData from table WHERE MessageTimestamp BETWEEN '2021-09-20 07:00:00' AND '2021-09-20 07:01:00';"""

 

rows = cursor.execute(sql)

 

with open(r'HOME/sqlresults.csv', 'w', newline='') as csvfile:

    writer = csv.writer(csvfile)

#    writer.writerow([x[0] for x in cursor.description]) --- dont want headers

    for row in rows:

        writer.writerow(row)

Issue:

  • As can be seen from the CSV file, the entire FIX message is squished into 1 cell (thinking Excel here).
  • The FIX message is separated by , and basically I want to separate the tags into separate cells.

Desired Output:

  • CSV file format (ideally separated by ';') so that when I open the csv in excel, the tag values will be in there own cell and not compressed in one.
8=Fix1.1;9=70;35=AE;10=237
8=Fix1.1;9=71;35=AE;10=238
8=Fix1.1;9=72;35=AE;10=239
8=Fix1.1;9=73;35=AE;10=240 

Bonus Desired Output:

  • This is optional, but if can be done within the above would be helpful.
  • Remove the first part of tag

CSV File

Fix1.1;70;AE;237
Fix1.1;71;AE;238
Fix1.1;72;AE;239
Fix1.1;73;AE;240

Note:

  • Please note I cannot use third party modules (Pandas dataframe not to be used etc)
test1 1
  • 47
  • 7
  • I would urge you to fix your data structure. You are violating 1NF which is why this such a challenge. If that isn't feasible then I would suggest doing this in your query. It is fairly easy with a decent split function. There is a built in splitter in the newer versions but it somewhat limited because they failed to provide the rank in the output which is borderline criminal. – Sean Lange Sep 23 '21 at 21:54
  • Agree on the 1NF is that means why they couldn't dump the message into separate rows/columns. Unfortunately, not much I can argue, given Networks team dumps messages into the DB as so. Would you be able to elaborate on the split function please? I'm given a read only account so cannot modify table etc. – test1 1 Sep 24 '21 at 11:55

1 Answers1

0

In the end, I figured I will just use the Python replace method to convert the SOH character to , which gives me the above desired output.

I managed to get a solution using the thread, How to search and replace text in a file?

# Read in the file
with open('file.txt', 'r') as file :
  filedata = file.read()

# Replace the target string
filedata = filedata.replace('x01', ',') ---- Using RHEL based system so found '\u001' didnt work for me. Dont know the details of why....

# Write the file out again
with open('file.txt', 'w') as file:
  file.write(filedata)
Jeremy Caney
  • 7,102
  • 69
  • 48
  • 77
test1 1
  • 47
  • 7