I have an auto-generated file that may contain duplicate data, which causes my parser to crash. How can I check it line by line in bash and remove the unwanted lines based on a character sequence they contain? For example:

    with open('file.txt') as f:
        for line in f:
            # skip lines carrying a duplicate marker such as (1), (2) or (3)
            if '(1)' in line or '(2)' in line or '(3)' in line:
                continue
            # every other line is kept
            print(line, end='')

Sample Input

Hello my name is john
Hello my name is eric
Hello my name is jonh(2)
Hello my name is ray
Hello my name is john (1)
Hello my name is eric (3)

Sample Output

Hello my name is john
Hello my name is eric
Hello my name is ray
Crankdat
  • Please add sample input and your desired output for that sample input to your question. – Cyrus Dec 02 '15 at 19:03
  • Have you tried `grep -v`? – chrisaycock Dec 02 '15 at 19:04

1 Answer

To exclude lines that contain an opening parenthesis, any single character, and a closing parenthesis, you could do:

grep -v '(.)' file

If you want the character between the parentheses to be a digit:

grep -v '([0-9])' file
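
As a quick check, here is a minimal sketch that runs the digit pattern against the sample input from the question. The file name `sample.txt` is only an assumption for illustration.

```shell
# Recreate the sample input from the question (sample.txt is a hypothetical name).
cat > sample.txt <<'EOF'
Hello my name is john
Hello my name is eric
Hello my name is jonh(2)
Hello my name is ray
Hello my name is john (1)
Hello my name is eric (3)
EOF

# -v inverts the match: print only the lines that do NOT contain "(digit)".
# In grep's default basic regex syntax, unescaped parentheses match literally.
grep -v '([0-9])' sample.txt
```

This should print only the three lines without a marker (john, eric, ray). Note that with `grep -E` the parentheses would become a grouping operator, so they would need to be escaped as `\(` and `\)`.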

If you want to exclude a single specific number:

grep -v '(1)' file

If you want to exclude multiple specific numbers:

grep -v '([123])' file

If you want to exclude multiple different patterns:

grep -v -e pattern1 -e pattern2 -e pattern3 file
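
For the markers in the question, the multi-pattern form could look like the sketch below. It also writes the result back to the file, since `grep` itself only prints to standard output. The file name `data.txt` and its contents are assumptions made up for this example.

```shell
# Hypothetical input file; the name and contents are assumptions for this sketch.
printf '%s\n' 'keep me' 'drop me (1)' 'drop me too (2)' 'keep me as well' > data.txt

# -F treats each -e pattern as a fixed string, so no regex escaping is needed.
# grep cannot edit in place, so write to a temporary file first.
grep -v -F -e '(1)' -e '(2)' -e '(3)' data.txt > data.txt.tmp
status=$?

# grep exits 0 if it printed lines, 1 if it printed none, and >1 on error.
# Replace the original only when grep did not fail.
if [ "$status" -le 1 ]; then
    mv data.txt.tmp data.txt
fi

cat data.txt
```

Checking the exit status rather than chaining with `&&` keeps the case where every line is filtered out (exit status 1) from being treated as a failure.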
janos
  • I wanted to give an example with multiple patterns using multiple `-e` flags. But you're right, that point was not clear enough. I updated the answer to make that clearer, thanks for pointing it out. – janos Dec 02 '15 at 20:17
  • Thank you very much, that was helpful. – Crankdat Dec 02 '15 at 20:18
  • Apologies for the formatting. Please use the code below: `import re fle = open('text', 'r') dat = fle.readlines() for i in dat: if not re.findall(r'\([0-9]{1,3}\)', str(i)): print(i.rstrip())` – rickydj Dec 03 '15 at 07:17