
I am new to Python. I have the following lines in a text file:

05/01/2023 05:39:35 Exit with status 0 but no images found
05/01/2023 05:39:35 server12349 is considered prod.  Total environments: "production"
05/01/2023 05:39:35 Platform is wnv and datatype is os
05/01/2023 05:39:35 Alertdatatype os  is Windows OS
05/01/2023 05:39:35 Windows OS backup of server12349 (srv.lab.ch.os.wnv.server12349.xyz) succeeded
-

05/01/2023 05:39:35 Exit with status 0 but no images found
05/01/2023 05:39:35 server7329 is considered prod.  Total environments: "production"
05/01/2023 05:39:35 Platform is wnv and datatype is os
05/01/2023 05:39:35 Alertdatatype os  is Windows OS
05/01/2023 05:39:35 Windows OS backup of server7329 (srv.lab.ch.os.wnv.server7329.xyz) succeeded

I want to capture the following using a regex:

Exit with status 0 but no images found
backup of server12349 (srv.lab.ch.os.wnv.server12349.xyz)

The pattern below matches either "Exit with status 0 but no images found" or "backup of server12349 (srv.lab.ch.os.wnv.server12349.xyz)", but I want a pattern that finds both in the text file. Any help would be much appreciated.

import re

pattern =re.compile(r'(Exit(.*)\sfound) |backup of\s(\w+)\s\((.*?)\)',re.MULTILINE)
with open('c:\\tmp\\Notext.txt', 'r') as myfile:
    
    for i in myfile:
        if pattern.search(i) != None:
            res=re.findall(pattern,i)
            #print(res)
            st=list(res[0])
            print(st[0],st[1])
Alexander
jada
  • You can just use `re.findall()` on the whole file. – Alexander Feb 21 '23 at 06:36
  • Does this answer your question? [Regular expression matching a multiline block of text](https://stackoverflow.com/questions/587345/regular-expression-matching-a-multiline-block-of-text) – SajanGohil Feb 21 '23 at 06:42
  • @Alexander I am new to Python. Could you show how I can use re.findall to match the text in the file? That would help. – jada Feb 21 '23 at 06:57

2 Answers


You may have to tweak it to fit your exact output requirements, but you can use a non-capturing group in the regex. I flattened one of the log blocks into a single string to make it easier to test, but as someone else mentioned, you can call findall on the whole file.

import re
string = "05/01/2023 05:39:35 Exit with status 0 but no images found 05/01/2023 05:39:35 server12349 is considered prod.  Total environments: \"production\" 05/01/2023 05:39:35 Platform is wnv and datatype is os 05/01/2023 05:39:35 Alertdatatype os  is Windows OS 05/01/2023 05:39:35 Windows OS backup of server12349 (srv.lab.ch.os.wnv.server12349.xyz) succeeded"

matches = re.findall(re.compile(r"(Exit(.*)\sfound) (?:.*)backup of\s(\w+)\s\((.*?)\)"), string)
print(matches)

# output of matches
[('Exit with status 0 but no images found', ' with status 0 but no images', 'server12349', 'srv.lab.ch.os.wnv.server12349.xyz')]

The only change from your regex is that the | is replaced by (?:.*), which matches any characters up to "backup of .." without capturing them. It may not be the exact output you're looking for, but it should get you headed in the right direction.
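When the log is read from a file instead of being flattened into one string, `.` stops at each newline by default, so a filler such as `.*` cannot reach the "backup of" line. Here is a minimal sketch of one way around that, assuming the whole file fits in memory: pass `re.DOTALL` and make the filler lazy so each "Exit" message pairs with the nearest following "backup of" text. The `sample` string stands in for the result of `myfile.read()`, and the names `sample`, `pattern` and `pairs` are illustrative only.

```python
import re

# Stand-in for myfile.read(); the two log blocks from the question.
sample = """05/01/2023 05:39:35 Exit with status 0 but no images found
05/01/2023 05:39:35 server12349 is considered prod.  Total environments: "production"
05/01/2023 05:39:35 Platform is wnv and datatype is os
05/01/2023 05:39:35 Alertdatatype os  is Windows OS
05/01/2023 05:39:35 Windows OS backup of server12349 (srv.lab.ch.os.wnv.server12349.xyz) succeeded

05/01/2023 05:39:35 Exit with status 0 but no images found
05/01/2023 05:39:35 server7329 is considered prod.  Total environments: "production"
05/01/2023 05:39:35 Platform is wnv and datatype is os
05/01/2023 05:39:35 Alertdatatype os  is Windows OS
05/01/2023 05:39:35 Windows OS backup of server7329 (srv.lab.ch.os.wnv.server7329.xyz) succeeded
"""

# re.DOTALL lets "." cross newlines; ".*?" is lazy, so each match stops at the
# first "backup of" after an "Exit ... found" instead of running to the last one.
pattern = re.compile(
    r"(Exit[^\n]*found)"             # group 1: the status message, kept to one line
    r".*?"                           # skip the lines in between
    r"(backup of \w+ \([^)\n]*\))",  # group 2: "backup of <host> (<name>)"
    re.DOTALL,
)

pairs = pattern.findall(sample)      # list of (status, backup) tuples
for status, backup in pairs:
    print(status)
    print(backup)
```

With two capture groups, `findall` returns one tuple per log block, so the two pieces of text stay paired.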

Shorn
  • @Shorn, I tried your pattern in place of mine in the code above, but it gives no output: pattern =re.compile(r'(Exit(.*)\sfound) (?:.*) backup of\s(\w+)\s\((.*?)\)',re.MULTILINE) – jada Feb 21 '23 at 07:10
  • `re.MULTILINE` isn't doing what you think it does. It only [changes how `^` and `$` are interpreted](https://docs.python.org/3/library/re.html#re.MULTILINE), so that they also match at the start and end of each line. If you call `myfile.read()`, it returns the entire file as a single string, which you can then apply the regex to. – Shorn Feb 21 '23 at 07:20
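
To illustrate the flag behaviour described in the comment above, here is a small sketch on a made-up three-line string (the `text` value and the variable names are assumptions for the demo, not taken from the log file). It contrasts `re.MULTILINE`, which only affects `^` and `$`, with `re.DOTALL`, which is the flag that lets `.` cross newlines.

```python
import re

text = "first line\nExit with status 0 but no images found\nlast line"

# Without a flag, "^" matches only at the very start of the string,
# so a pattern anchored to a later line finds nothing.
no_flag = re.findall(r"^Exit.*found$", text)

# re.MULTILINE lets "^" and "$" match at the start and end of every line.
multiline = re.findall(r"^Exit.*found$", text, re.MULTILINE)

# "." still stops at "\n" unless re.DOTALL is given.
no_dotall = re.findall(r"first.*last", text)
dotall = re.findall(r"first.*last", text, re.DOTALL)

print(no_flag)    # []
print(multiline)  # ['Exit with status 0 but no images found']
print(no_dotall)  # []
print(dotall)     # ['first line\nExit with status 0 but no images found\nlast']
```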

Assuming you want to print every line from the file that contains either of the patterns:

import re

pattern = re.compile(r'(^.*Exit.*found$)|(^.*backup of \w+ \(.*\).*$)')

with open('c:\\tmp\\Notext.txt') as txt:
    for line in map(str.strip, txt):
        for m in pattern.findall(line):
            for t in m:
                if t:
                    print(t)

Output:

05/01/2023 05:39:35 Exit with status 0 but no images found
05/01/2023 05:39:35 Windows OS backup of server12349 (srv.lab.ch.os.wnv.server12349.xyz) succeeded
05/01/2023 05:39:35 Exit with status 0 but no images found
05/01/2023 05:39:35 Windows OS backup of server7329 (srv.lab.ch.os.wnv.server7329.xyz) succeeded
DarkKnight
  • I am looking for a pattern that matches both pieces of text, not either one. If you have a suggestion, please help me out. – jada Feb 22 '23 at 05:42
  • I tried what you suggested, but I get only the output below and not the other line I asked for: "05/01/2023 05:39:35 Windows OS backup of server7329 (srv.lab.ch.os.wnv.server7329.xyz) succeeded" – jada Feb 22 '23 at 07:10
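
If the goal is to report both pieces of text together while still reading the file line by line, one possible approach is to remember the most recent "Exit" message and emit it once the matching "backup of" line appears. This is only a sketch under that assumption: `paired_matches`, `exit_re` and `backup_re` are hypothetical names, and `io.StringIO` stands in for the opened file.

```python
import io
import re

exit_re = re.compile(r"Exit.*found")
backup_re = re.compile(r"backup of \w+ \(.*?\)")

def paired_matches(lines):
    """Yield (exit_text, backup_text) for each Exit line followed by a backup line."""
    pending = None  # most recent "Exit ..." text still waiting for its backup line
    for line in lines:
        exit_match = exit_re.search(line)
        if exit_match:
            pending = exit_match.group(0)
            continue
        backup_match = backup_re.search(line)
        if backup_match and pending is not None:
            yield pending, backup_match.group(0)
            pending = None  # this block is done; wait for the next "Exit" line

# io.StringIO stands in for the opened log file; both iterate line by line.
log = io.StringIO(
    "05/01/2023 05:39:35 Exit with status 0 but no images found\n"
    "05/01/2023 05:39:35 Platform is wnv and datatype is os\n"
    "05/01/2023 05:39:35 Windows OS backup of server7329 (srv.lab.ch.os.wnv.server7329.xyz) succeeded\n"
)

results = list(paired_matches(log))
for exit_text, backup_text in results:
    print(exit_text)
    print(backup_text)
```

Because nothing is emitted until both lines of a block have been seen, an "Exit" line with no later "backup of" line is silently skipped; whether that is the desired behaviour depends on the log.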