Here's a working example using regular expressions, via the standard-library re module:
>>> import re
>>> line = "-rw-r--r-- 1 jttoivon hyad-all 25399 Nov 2 21:25 exception_hierarchy.pdf"
>>> pattern = r"(\d+)\s+([A-Za-z]+)\s+(\d{1,2})\s+(\d{1,2}):(\d{1,2})\s+(.+)$"
>>> output_tuple = re.findall(pattern, line)[0]
>>> print(output_tuple)
('25399', 'Nov', '2', '21', '25', 'exception_hierarchy.pdf')
>>> size, month, day, hour, minute, filename = output_tuple
Most of the logic is encoded in the raw-string pattern variable. It's easy to follow if you read it piece by piece. Here it is split across lines, with a comment on each piece:
(\d+) # one or more digits (size)
\s+ # one or more whitespace characters
([A-Za-z]+) # one or more ASCII letters (month); the range [A-z] would also match [ \ ] ^ _ and `
\s+ # one or more whitespace characters
(\d{1,2}) # one or two digits (day)
\s+ # one or more whitespace characters
(\d{1,2}) # one or two digits (hour)
: # a literal ':'
(\d{1,2}) # one or two digits (minute)
\s+ # one or more whitespace characters
(.+) # one or more of any character (filename)
$ # up to the end of the string
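The piece-by-piece breakdown above can also be written directly in code: with the re.VERBOSE flag, Python ignores whitespace and # comments inside the pattern. Here is a minimal sketch of that idea. The name LS_PATTERN and the group names (size, month, and so on) are my own choices for illustration, and I've written the month class as [A-Za-z] so it matches letters only.

```python
import re

# re.VERBOSE lets the pattern span several lines with inline comments.
# LS_PATTERN and the group names are illustrative choices, not a fixed API.
LS_PATTERN = re.compile(r"""
    (?P<size>\d+)          \s+   # file size: one or more digits
    (?P<month>[A-Za-z]+)   \s+   # month name: one or more ASCII letters
    (?P<day>\d{1,2})       \s+   # day of month: one or two digits
    (?P<hour>\d{1,2}) :          # hour, followed by a literal ':'
    (?P<minute>\d{1,2})    \s+   # minute: one or two digits
    (?P<filename>.+) $           # the rest of the line is the file name
""", re.VERBOSE)

line = "-rw-r--r-- 1 jttoivon hyad-all 25399 Nov 2 21:25 exception_hierarchy.pdf"
match = LS_PATTERN.search(line)
# search() returns None when nothing matches, so guard before reading groups.
fields = match.groupdict() if match else {}
print(fields)
```

Named groups mean you can read fields["size"] or fields["filename"] instead of relying on the position of each item in a tuple.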
By the way, here's a working example that doesn't use re (it isn't actually required here):
>>> line = "-rw-r--r-- 1 jttoivon hyad-all 25399 Nov 2 21:25 exception_hierarchy.pdf"
>>> size, month, day, hour_minute, filename = line.split("hyad-all")[1].strip().split()
>>> hour, minute = hour_minute.split(":")
>>> print(size, month, day, hour, minute, filename)
25399 Nov 2 21 25 exception_hierarchy.pdf
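Note that the split-based version above depends on the literal group name hyad-all appearing in the line. A possible variation, sketched below, splits on whitespace by position instead. It assumes every line has the same nine-field layout as the sample (permissions, link count, owner, group, size, month, day, HH:MM time, file name); parse_ls_line is a hypothetical helper name, not something from a library.

```python
def parse_ls_line(line):
    """Split one `ls -l`-style line into its fields (hypothetical helper).

    Assumes the nine-field layout of the sample line, with the time
    written as HH:MM.
    """
    # maxsplit=8 stops after eight splits, so a file name that itself
    # contains spaces stays in one piece as the ninth field.
    (_perms, _links, _owner, _group,
     size, month, day, hour_minute, filename) = line.split(maxsplit=8)
    hour, minute = hour_minute.split(":")
    return size, month, day, hour, minute, filename


line = "-rw-r--r-- 1 jttoivon hyad-all 25399 Nov 2 21:25 exception_hierarchy.pdf"
print(parse_ls_line(line))
```

If a line has fewer than nine fields, or the time column is not in HH:MM form, the unpacking raises ValueError, so real input may need a try/except around the call.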