
I need to create a file with each line like:

ABC11225_2023__Sacramento________John_________White_________05/13/1990

from an SQL server query that selects that data out of a database. Each column of data must have a fixed width, using the above example the widths would be:

id: 9, year: 6, city: 20, firstname: 15, lastname: 12, birthdate: 10

Any space not taken up by data from the query has to be filled with whitespace (in the example I filled it with `_`, since I am not good at Markdown), and any data exceeding the given column width has to be trimmed.
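That pad-and-trim rule can be sketched in one small helper (the `fit` name is mine, not from the question): left-justify to the column width, then slice so oversized values are cut to exactly that width.

```python
def fit(value, width):
    # pad short values with spaces on the right, truncate long ones
    return str(value).ljust(width)[:width]

print(repr(fit("Sacramento", 20)))                  # 'Sacramento          '
print(repr(fit("a-very-long-city-name-here", 20)))  # 'a-very-long-city-nam'
```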

There are 188 columns in total.

I have all the column widths stored in a CSV file. First I read those into a list, then I use pyodbc to run a stored procedure on SQL Server and iterate over the returned rows.

import csv
import pyodbc

# read the column widths (one per row of the CSV) into a list of lists
with open('columnWidths.csv', newline='') as file:
    data = list(csv.reader(file, delimiter=","))

cnxn = pyodbc.connect('DRIVER={ODBC Driver 17 for SQL Server};SERVER=' +
                      server + ';Database=' + database + ';Trusted_Connection=yes;')
cursor = cnxn.cursor()
cursor.execute("{call [dbo].[GetApplicationData]}")
applications = cursor.fetchall()

The issue I don't know how to efficiently solve is that each row may be missing data, or may need to be transformed. For example, a boolean value in the database for column 10 may be Y, but I have to write it out as 1 for Y and 0 for N. It may also be NULL in the database, which shows up as None in the pyodbc Row objects returned by fetchall().
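One way to express those per-column rules is a lookup of transform functions keyed by column index, with a single fallback for None and dates. The `transforms` dict, `yn_to_bit`, and `to_text` names below are illustrative, not from the question:

```python
from datetime import date

def yn_to_bit(v):
    # Y/N flag in the database -> 1/0 in the output file
    return '1' if v == 'Y' else '0'

# only columns that need a special rule get an entry here
transforms = {
    9: yn_to_bit,  # column 10 (0-based index 9)
}

def to_text(value, index):
    if value is None:                 # NULL in the database
        return ''
    if index in transforms:           # column-specific rule
        return transforms[index](value)
    if isinstance(value, date):
        return value.strftime("%m/%d/%Y")
    return str(value)
```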

I can write a loop that pads and trims each entry appropriately, like so:

for app in applications:
    a = ''
    # note: this raises AttributeError if any value is None (NULL in the DB)
    a += app[0].ljust(int(data[0][0]))[0:int(data[0][0])]
    a += app[1].ljust(int(data[1][0]))[0:int(data[1][0])]
    a += app[2].ljust(int(data[2][0]))[0:int(data[2][0])]

but I am worried about performance if I have to run a match/case statement on every column to catch NULLs from the database. Something like:

from datetime import date  # needed for the date() pattern below

for app in applications:
    a = ''  # one output line per row
    for i in range(len(data)):  # range(0, len(data)-1) would skip the last column
        columnWidth = int(data[i][0])
        r = app[i]
        match r:
            case str():
                a += r.ljust(columnWidth)[:columnWidth]
            case int():
                a += str(r).ljust(columnWidth)[:columnWidth]
            case date():
                a += r.strftime("%m/%d/%Y").ljust(columnWidth)[:columnWidth]
            case None:
                a += ' ' * columnWidth

Is there a better/smarter way to go about generating a fixed length file with these requirements?

  • The old-fashioned printf formatting can do columns (`print("%-10s" % "abcd")`), but it does not LIMIT larger strings. If it were me, I'd create a function that accepts a value and a column width, and returns a correctly aligned and padded result. – Tim Roberts Jun 09 '23 at 22:10
  • i'd probably do the padding on the sql server side. cast every column to CHAR(X) and you get the spaces automatically. Watch out for truncation – siggemannen Jun 09 '23 at 23:27
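A sketch of the first comment's suggestion: one formatter that handles NULLs, dates, and truncation, applied with `zip()` over the precomputed widths. The widths and sample row below come from the example at the top, not the real 188-column schema:

```python
from datetime import date

def format_field(value, width):
    """Render a value as text, pad with spaces, and truncate to width."""
    if value is None:                      # NULL from the database
        text = ''
    elif isinstance(value, date):
        text = value.strftime("%m/%d/%Y")
    else:
        text = str(value)
    return text.ljust(width)[:width]

widths = [9, 6, 20, 15, 12, 10]            # from the example above
row = ('ABC11225', 2023, 'Sacramento', 'John', 'White', date(1990, 5, 13))
line = ''.join(format_field(v, w) for v, w in zip(row, widths))
print(len(line))  # 72 == sum(widths)
```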
