I am trying to import some csv files in a sorted manner. Here are a few filenames.

market_rate_352_1.csv, market_rate_352_5.csv, market_rate_352_10.csv, market_rate_352_25.csv, market_rate_352_100.csv, market_rate_352_200.csv

While importing I use sorted(glob.glob('market_rate_*.csv')), but it sorts the files this way:

market_rate_352_100.csv, market_rate_352_10.csv, market_rate_352_1.csv, market_rate_352_200.csv, market_rate_352_25.csv, market_rate_352_5.csv

But I want it as:

market_rate_352_1.csv, market_rate_352_5.csv, market_rate_352_10.csv, market_rate_352_25.csv, market_rate_352_100.csv, market_rate_352_200.csv

I tried splitting the names and then sorting using files.sort(key=lambda x: int(x.strip('market_rate_'))), where files is the list of filenames. But it throws an error: AttributeError: 'str' object has no attribute 'sort'. I guess this is because my filename still contains an underscore and the .csv extension. I am not sure how to split twice and then sort the files, or whether there is a much easier way to do it.

Emotional Damage

1 Answer

Your sort is doing an alphabetic sort but you want it to be (partly) numeric.
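To make the difference concrete, here is a small sketch comparing the two orderings on the numeric suffixes from the question. The variable names are only for illustration.

```python
# Suffixes taken from the filenames in the question.
suffixes = ["1", "5", "10", "25", "100", "200"]

# Strings are compared character by character, so "10" sorts before "5".
as_text = sorted(suffixes)

# Converting to int compares by numeric value instead.
as_numbers = sorted(suffixes, key=int)

print(as_text)
print(as_numbers)
```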

The easiest way to write the key= function is to write it as a named function rather than to cram it as a one-liner into a lambda.

def split_numeric(s):
    # Take the stem of the filename, no extension, and split on underscores
    elements = s.partition(".")[0].split("_")
    # Convert the numeric elements into integers for numeric sorting
    return [int(e) if e.isdigit() else e for e in elements]

filenames = ['market_rate_352_10.csv', 'market_rate_352_5.csv', 'market_rate_352_1.csv']

filenames.sort(key=split_numeric)

Result:

['market_rate_352_1.csv', 'market_rate_352_5.csv', 'market_rate_352_10.csv']
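
If the filenames might include a directory part or dots elsewhere in the path, a variant that splits on runs of digits may be more forgiving. This is only a sketch under that assumption; natural_key is a hypothetical helper name, not part of the answer's code or of any library.

```python
import os
import re

def natural_key(path):
    # Hypothetical helper: work on the base name without its extension.
    name = os.path.splitext(os.path.basename(path))[0]
    # re.split with a capturing group alternates text and digit runs,
    # always starting with a (possibly empty) text chunk.
    parts = re.split(r"(\d+)", name)
    # Odd positions hold the digit runs; convert those to int.
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

names = [
    "market_rate_352_100.csv", "market_rate_352_10.csv",
    "market_rate_352_1.csv", "market_rate_352_200.csv",
    "market_rate_352_25.csv", "market_rate_352_5.csv",
]
ordered = sorted(names, key=natural_key)
print(ordered)
```

The same key should work directly with the call from the question, as in sorted(glob.glob('market_rate_*.csv'), key=natural_key). As for the original attempt: the AttributeError means files was bound to a single string rather than a list when .sort was called, and str.strip removes a set of characters from both ends rather than a prefix, so it would not have isolated the number anyway.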
BoarGules