2

I have a file, example.txt, which contains hexadecimal data like this:

09 06 07 04 00 00 01 00 1d 03 4b 2c a1 2a 02 01   
b7 09 01 47 30 12 a0 0a 80 08 33 04 03 92 22 14   
07 f0 a1 0b 80 00 81 00 84 01 00 86 00 85 00 83   
07 91 94 71 06 00 07 19

09 06 07 04 r0 00 01 00 1d 03 4b 2c a1 2a 02 01   
b7 09 01 47 30 1s a0 0a 80 08 33 04 03 92 22 14   
07 f0 a1 0b 80 00 81 0d 84 01 00 86 00 85 00 83   
07 91 94 71 06 

09 06 07 04 r0 00 01 00 1d 03 4b 2c a1 2a 02 01   
b7 09 01 47 30 1s a0 0a 80 08 33 04 03 92 22 14   
07 f0 a1 0b 80 00 81 0d 84 01 00 86 00 85 00 83   
b7 09 01 47 30 1s a0 0a 80 08 33 04 03 92 22 14

b7 09 01 47 30 1s a0 0a 80 08 33 04 03 92 22 14   
07 f0 a1 0b 80 00 81 0d 84 01 00 86 00 85 

What I want to do is look for a specific string and, if it exists, continue from that point looking for another string, and so on. If the whole pattern exists, I want to remove one file; if the pattern doesn't exist, I want to remove another.

My code is this:

import os

with open('example.txt') as file:
    if '12' in file.read():
        if ('80' or '25' or 'a6' or '1b') in file.read():
            if '04' in file.read():
                if '07' in file.read():
                    command1 = 'rm -f file2.json'
                    os.system(command1)
    else:
        command2 = 'rm -f file1.json'
        os.system(command2)
Saul
  • Welcome to Stack Overflow. Questions on the site need to be specific. Asking contributors to design a program for you is not the purpose of Stack Overflow. Can you reframe your question as a specific problem? Take a look at https://stackoverflow.com/help/how-to-ask for guidance on how to ask a good question for Stack Overflow. – Chris Feb 02 '21 at 09:43
  • @Chris this seems to me like a pretty straightforward question, including the code he already tried. – Semmel Feb 02 '21 at 09:49
  • Yes, on reflection, agreed. – Chris Feb 02 '21 at 10:00
  • I don't see a question here... Is there something wrong with the presented code? If so - please give details about it. If not - this is better suited for [codereview.se] – Tomerikoo Feb 02 '21 at 11:48
  • It was a new version to show what I was doing to a person who asked me. It's solved now – Saul Feb 02 '21 at 11:49
  • Also please see: [Why can't I call read() twice on an open file?](https://stackoverflow.com/questions/3906137/why-cant-i-call-read-twice-on-an-open-file) and [Can Python test the membership of multiple values in a list?](https://stackoverflow.com/questions/6159313/can-python-test-the-membership-of-multiple-values-in-a-list) – Tomerikoo Feb 02 '21 at 11:51
  • Then please see [What should I do when someone answers my question?](https://stackoverflow.com/help/someone-answers) – Tomerikoo Feb 02 '21 at 11:51
  • Also see [How to delete a file or folder?](https://stackoverflow.com/questions/6996603/how-to-delete-a-file-or-folder). There is no reason to call external shell commands. You can call Python functions to do that – Tomerikoo Feb 02 '21 at 11:53
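The three points raised in the comments above (a file can only be read once per open, `('80' or '25' or ...)` evaluates to just `'80'`, and files can be deleted without a shell) can be sketched as follows. This is a minimal sketch, not the asker's final code: `choose_file_to_delete`, `delete_if_present` and `process` are hypothetical helper names, and the check only tests that the values are present, not the order in which they appear.

```python
import os

def choose_file_to_delete(text):
    """Return the file name to delete, based on which values occur in text."""
    # Membership must be tested per candidate; ('80' or '25' or ...) is just '80'.
    has_alternative = any(token in text for token in ("80", "25", "a6", "1b"))
    if "12" in text and has_alternative and "04" in text and "07" in text:
        return "file2.json"
    return "file1.json"

def delete_if_present(path):
    """Delete path with os.remove; ignore a missing file, as 'rm -f' would."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass

def process(data_path="example.txt"):
    with open(data_path) as file:
        text = file.read()  # read once and reuse; a second read() returns ''
    delete_if_present(choose_file_to_delete(text))
```

Calling `process()` reads example.txt a single time and then removes either file2.json or file1.json.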

1 Answer

1

You can use regular expressions (regex) to find all groups with this structure in the file.

import os
import re

file_path = "example.txt"
delete_file_path = "delete_me"
delete_file_ending = ".txt"
# 12, then one of 80/25/a6/1b, then 04, then 07, in that order.
# re.DOTALL lets ".*?" cross line breaks inside a paragraph.
pattern = re.compile(r"12.*?(?:80|25|a6|1b).*?04.*?07", re.DOTALL)

with open(file_path) as file:
    text = file.read()
# paragraphs are separated by blank lines (possibly containing spaces)
paragraphs = re.split(r"\n\s*\n", text.strip())
paragraph_tokens = [re.findall(pattern, paragraph) for paragraph in paragraphs]

# delete the numbered file that belongs to each paragraph containing the pattern
for i in range(len(paragraph_tokens)):
    if paragraph_tokens[i]:
        os.remove(delete_file_path + str(i) + delete_file_ending)

You could also use re.search if you only want to know whether the pattern occurs at all (re.match only matches at the start of the string). It returns a match object, or None when nothing is found, so the if condition tests that object instead of a list.
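For comparison, the "find one value, then continue from that point" idea from the question can also be written without regex. This is a minimal sketch, assuming the bytes are whitespace-separated two-character tokens as in example.txt; `has_sequence` and `STEPS` are hypothetical names, not part of any library.

```python
def has_sequence(paragraph, steps):
    """Check that one token from each step appears, in order, left to right.

    `steps` is a list of sets; a token from each set must occur after the
    token that satisfied the previous step.
    """
    tokens = paragraph.split()  # splits on spaces and line breaks alike
    position = 0
    for alternatives in steps:
        # advance until a token from the current step is found
        while position < len(tokens) and tokens[position] not in alternatives:
            position += 1
        if position == len(tokens):
            return False  # ran out of tokens before this step matched
        position += 1  # continue searching after the matched token
    return True

# The sequence from the question: 12, then 80/25/a6/1b, then 04, then 07.
STEPS = [{"12"}, {"80", "25", "a6", "1b"}, {"04"}, {"07"}]
```

Calling `has_sequence(paragraph, STEPS)` on each paragraph gives the same yes/no answer as the regex check, and the list of steps is easy to extend.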

Semmel
  • This looks great, but how could I add several strings in re.compile()? Could I do this? `re.compile("12" and ("80" or "25"))` – Saul Feb 02 '21 at 10:05
  • Because my idea is that once "12" exists, I start searching for "80" or "25". – Saul Feb 02 '21 at 10:06
  • This is a tutorial for regex: https://docs.python.org/3/howto/regex.html In your case you either capture all tokens with something like "(12|80|25)" and do the logic later (which probably does not help much in your case), or you capture a group using the look-ahead or look-behind features described here: https://stackoverflow.com/questions/6109882/regex-match-all-characters-between-two-strings The standard re module supports these look-arounds; the third-party regex library is an alternative if you need more features. – Semmel Feb 02 '21 at 10:27
  • https://python.readthedocs.io/en/stable/howto/regex.html is the correct resource to learn about lookahead in regex in this case. – Semmel Feb 02 '21 at 10:40
  • It gives me this error, and I don't know why, because I'm working with strings: `unsupported operand type(s) for |: 'str' and 'str'`. I'm using your initial idea with re. – Saul Feb 02 '21 at 10:47
  • You can see it in the post. – Saul Feb 02 '21 at 11:19
  • pattern = re.compile("12.*(?=(?:80|25|a6|1b)).*(?=04).*(?=07)") – Semmel Feb 02 '21 at 11:39
  • That is for regex? – Saul Feb 02 '21 at 11:41
  • yes :) explanation: 12 then anything then either 80 or 25 or a6 or 1b, then anything, then 04, then anything, then 07. – Semmel Feb 02 '21 at 11:43
  • That's great. If you update your answer, I will mark it as accepted. Thanks! – Saul Feb 02 '21 at 11:48
  • I forgot to mention that I want to stop looking for that sequence when I reach a line break. I mean, I have a huge text file which is divided into paragraphs, so I want to search for that pattern in each paragraph, and when the paragraph is finished, start the search again from the beginning of the pattern. – Saul Feb 02 '21 at 12:55
  • 1
    I think the way to do that is to establish a break point whenever it encounters a blank line, but I don't know how to implement it. Any idea? @Semmel – Saul Feb 02 '21 at 14:10
  • 1
    I added this feature. If anything else comes up (like reading line by line if the file is bigger than the available RAM ...), please ask a separate question for it, since the original question has hopefully been properly answered by now :) – Semmel Feb 02 '21 at 17:31
  • That's great. Thanks a lot! – Saul Feb 02 '21 at 18:57
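
The per-paragraph behaviour discussed in the last comments can be checked on a small in-memory sample. This is a sketch only: the sample text is made up for illustration from byte values in the question, and `matching_paragraphs` is a hypothetical helper. It assumes paragraphs are separated by blank lines.

```python
import re

# 12, then one of 80/25/a6/1b, then 04, then 07; DOTALL lets ".*?" span
# line breaks, but each paragraph is searched separately.
PATTERN = re.compile(r"12.*?(?:80|25|a6|1b).*?04.*?07", re.DOTALL)

def matching_paragraphs(text):
    """Return indices of blank-line-separated paragraphs containing the sequence."""
    paragraphs = re.split(r"\n\s*\n", text.strip())
    return [i for i, p in enumerate(paragraphs) if PATTERN.search(p)]

sample = (
    "b7 09 01 47 30 12 a0 0a\n"
    "80 08 33 04 03 92 07 f0\n"
    "\n"
    "07 04 80 12 00 00\n"   # right values, wrong order
    "\n"
    "12 00 25\n"            # sequence starts here ...
    "\n"
    "04 07\n"               # ... but must not continue across the blank line
)
print(matching_paragraphs(sample))  # prints [0]
```

Only the first paragraph matches: the second has the values in the wrong order, and the last two show that a sequence started in one paragraph is not completed by the next.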