Yes, you can: sort by the last 3 digits of each `Test` substring:
# The string to be sorted by digits
s = "Test DATA_g004, Test DATA_g003, Test DATA_g001, Test DATA_g002"
# Split at commas, then sort the pieces using the last 3 characters of each one, converted to `int`, as the key.
l = sorted(s.split(','), key = lambda x: int(x[-3:]))
print l
# [' Test DATA_g001', ' Test DATA_g002', ' Test DATA_g003', 'Test DATA_g004']
You'll want to strip the whitespace from the elements of `l` if that's important to you, but this will work for all `Test`s that end in 3 digits.
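A minimal sketch of that trimming step, assuming the stray whitespace comes only from the space after each comma. The helper name `sort_stripped` is made up for illustration, and `print(...)` is written with parentheses so the snippet also runs under Python 3:

```python
def sort_stripped(s):
    """Hypothetical helper: split at commas, strip whitespace, sort by the last 3 digits."""
    return sorted((x.strip() for x in s.split(',')), key=lambda x: int(x[-3:]))

s = "Test DATA_g004, Test DATA_g003, Test DATA_g001, Test DATA_g002"
l = sort_stripped(s)
print(l)
# ['Test DATA_g001', 'Test DATA_g002', 'Test DATA_g003', 'Test DATA_g004']
```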
If you don't want the `Test DATA_` prefix, you can do this:
# The string to be sorted by digits
s = "Test DATA_g004, Test DATA_g003, Test DATA_g001, Test DATA_g002"
# Keep only the last 4 characters of each piece (e.g. 'g004'), then sort using the last 3 characters, converted to `int`, as the key.
l = sorted((x[-4:] for x in s.split(',')), key = lambda x: int(x[-3:]))
print l
# ['g001', 'g002', 'g003', 'g004']
If your data is well-formed (i.e., `g` followed by 3 digits), this will work quite well. Otherwise, use a regex from any of the other posted answers.
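For data that is not so regular, a regex-based key could look roughly like this. It is not the regex from the other answers, just one possible sketch: `trailing_number` is a hypothetical helper, and the input string is made up to show suffixes of different lengths.

```python
import re

def trailing_number(item):
    """Hypothetical helper: return the trailing run of digits in item as an int."""
    match = re.search(r'(\d+)\s*$', item)
    if match is None:
        raise ValueError("no trailing digits in %r" % item)
    return int(match.group(1))

# Made-up input with suffixes of varying width
s = "Test DATA_g4, Test DATA_g003, Test DATA_g0001, Test DATA_g12"
l = sorted((x.strip() for x in s.split(',')), key=trailing_number)
print(l)
# ['Test DATA_g0001', 'Test DATA_g003', 'Test DATA_g4', 'Test DATA_g12']
```

Because the key is the parsed integer, the number of digits no longer matters; an element with no trailing digits raises a `ValueError` instead of being sorted silently into the wrong place.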
Another alternative is to push strings into a `PriorityQueue` as you read them:
test.py
from Queue import PriorityQueue
q = PriorityQueue()
with open("example.txt") as f:
    # For each line in the file
    for line in f:
        # Create a list from the stripped, split-at-comma string
        for s in line.strip().split(','):
            # Push the last four characters of each element in the list into the pq
            q.put(s[-4:])

# Pop until the queue is empty; items come out smallest first
while not q.empty():
    print q.get()
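The benefit of using a PQ is that it hands the items back in sorted order, which takes the burden off of you. It is not linear time, though: each `put` and `get` costs O(log n), so pushing and draining n items is O(n log n) overall, the same as `sorted`. Also note that the strings are compared lexicographically here, which matches numeric order only because the digits are zero-padded to a fixed width.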
example.txt
Test DATA_g004, Test DATA_g003, Test DATA_g001, Test DATA_g002
And the output:
13:25 $ python test.py
g001
g002
g003
g004
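If you don't need the thread safety of `PriorityQueue`, the same idea can be sketched with the `heapq` module, which is the heap implementation `PriorityQueue` is built on. The function name `sorted_suffixes` is made up for illustration, and the input line is passed in directly rather than read from `example.txt`:

```python
import heapq

def sorted_suffixes(line):
    """Hypothetical helper: push the last 4 characters of each item onto a heap, then pop them in order."""
    heap = []
    for s in line.strip().split(','):
        heapq.heappush(heap, s[-4:])
    # Popping always returns the smallest remaining item
    return [heapq.heappop(heap) for _ in range(len(heap))]

line = "Test DATA_g004, Test DATA_g003, Test DATA_g001, Test DATA_g002\n"
for item in sorted_suffixes(line):
    print(item)
```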