2

I can search through a folder to find all the version log lines, but I don't know how to select the newest version from the resulting list, because its elements contain both letters and numbers.

Below is my code for finding the lines that state the version number in each log and collecting them in a list called matched_lines. I want to find the newest version in this list and compare it with the actual latest version, which comes from outside the log. For example, a generated list might contain:

['Version 2.13.1.1', 'Version 2.12.1.0', 'Version 2.10.1.4']

In this example, I would want to select "Version 2.13.1.1" and compare it with the latest version number from outside the log, for example "Version 2.14.1.0".

    import re

    # Compile once, outside the loops; the dots are escaped so they match literally
    version_pattern = re.compile(r'^Version \d+\.\d+\.\d+\.\d+', re.IGNORECASE)

    for filename in files:
        matched_lines = []
        try:
            with open(filename, 'r', encoding='utf-8') as f:
                lines = f.readlines()
        except UnicodeDecodeError:
            # Fall back to the platform's default encoding
            with open(filename, 'r') as f:
                lines = f.readlines()

        for line in lines:
            # Strip \x00 from the content, in case the file is encoded differently
            line = line.replace('\x00', '')

            # Collect every line that starts with a version string
            if version_pattern.search(line):
                matched_lines.append(line)

After finding the newest version in this list, I want to compare it with the actual latest version number, which may come from outside the folder.

D. Wu
  • Your code wasn't syntactically valid; I've assumed you have a list of strings but [edit] to clarify otherwise. Also please show what you've done and what the specific problem is, and don't revert legitimate edits. – jonrsharpe Jun 22 '18 at 22:08
  • You should post your code, hence the downvote. Also, I haven't worked with log files, but I'm pretty sure they must contain some form of timestamp. – pyeR_biz Jun 22 '18 at 22:10
  • *"sample regular expression to find version number"*?! – jonrsharpe Jun 22 '18 at 22:34

4 Answers

7

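This can be done with packaging.version.parse, which implements PEP 440. Note that the output below comes from an older release: packaging 22.0 removed LegacyVersion, so on current versions a string such as 'Version 2.13.1.1' raises InvalidVersion unless the 'Version ' prefix is stripped first.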

>>> from packaging import version
>>>
>>> vers = ['Version 2.13.1.1', 'Version 2.12.1.0', 'Version 2.10.1.4']
>>>
>>> for n, i in enumerate(vers):
...     vers[n] = version.parse(i)
...
>>> max(vers)
<LegacyVersion('Version 2.13.1.1')>
>>>
0x48piraj
6

First you need to capture the version number from the string and turn it into a tuple of int of the form (major, minor, micro). Using this as key for the max function will return the latest version.

Code

import re

def major_minor_micro(version):
    major, minor, micro = re.search(r'(\d+)\.(\d+)\.(\d+)', version).groups()

    return int(major), int(minor), int(micro)

Example

versions = ['Version 2.13.1.1', 'Version 2.12.1.0', 'Version 2.10.1.4']
latest = max(versions, key=major_minor_micro)

print(latest) # 'Version 2.13.1.1'
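
Note that the key above only looks at the first three components, so two versions that differ only in the fourth (say 2.13.1.0 and 2.13.1.1) would compare as equal. Here is a minimal sketch that uses every numeric component and then does the comparison the question asks for. The names `version_key` and `latest_known` are hypothetical, and the value 'Version 2.14.1.0' is simply taken from the question's example:

```python
import re

def version_key(line):
    """Return every dot-separated number of the first version found, as a tuple of ints."""
    match = re.search(r'\d+(?:\.\d+)*', line)
    if match is None:
        raise ValueError(f'no version number in {line!r}')
    return tuple(int(part) for part in match.group().split('.'))

versions = ['Version 2.13.1.1', 'Version 2.12.1.0', 'Version 2.10.1.4']
newest_in_log = max(versions, key=version_key)

# Assumed value for illustration, taken from the question's example
latest_known = 'Version 2.14.1.0'
is_up_to_date = version_key(newest_in_log) >= version_key(latest_known)

print(newest_in_log)  # Version 2.13.1.1
print(is_up_to_date)  # False
```

Tuples of ints compare element by element, so (2, 13, 1, 1) < (2, 14, 1, 0) without any string comparison being involved.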
Olivier Melançon
0

Building on the answer from @Olivier, if you don't require that all versions have the three major, minor, micro groups, then you should change the function to:

import re

def major_minor_micro(version):
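    # Require at least one digit; a pattern made only of optional parts matches the empty string at position 0
    major, minor, micro = re.search(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", version).groups()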
    
    return int(major or 0), int(minor or 0), int(micro or 0)
-1

You can sort the list and then take the largest (last) item. For this you need a natural sort, so that, e.g., 'Version 2.4.1.1' < 'Version 2.13.1.1'.

I found a function that does this in "Does Python have a built in function for string natural sort?". Here's an example of how to use it:

import re

def sorted_nicely(an_iterable):
    """ Sorts the given iterable in the way that is expected.

    Required arguments:
    an_iterable -- The iterable to be sorted.

    """
    convert = lambda text: int(text) if text.isdigit() else text
    alphanum_key = lambda key: [convert(c) for c in re.split('([0-9]+)', key)]
    return sorted(an_iterable, key = alphanum_key)

version_list = ['Version 2.13.1.1', 'Version 2.123.1.0', 'Version 2.4.1.4']

print(sorted_nicely(version_list)[-1])  # "Version 2.123.1.0"
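
If only the newest entry is needed, sorting the whole list isn't strictly necessary, since the same kind of key can be passed to max. A small sketch, where `natural_key` is a hypothetical name for the alphanum_key lambda above pulled out into its own function:

```python
import re

def natural_key(text):
    """Split text into digit and non-digit chunks, converting digit chunks to ints."""
    return [int(chunk) if chunk.isdigit() else chunk
            for chunk in re.split(r'([0-9]+)', text)]

version_list = ['Version 2.13.1.1', 'Version 2.123.1.0', 'Version 2.4.1.4']

# max with a key scans the list once instead of sorting it
newest = max(version_list, key=natural_key)
print(newest)  # Version 2.123.1.0
```

This assumes every string has the same shape ("Version " followed by dotted numbers); with mixed shapes, an int chunk may end up being compared with a str chunk, which raises TypeError in Python 3.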
figbeam