How to delete all lines containing letters and characters from a text file

Question

I have a text file containing words, numbers, and other characters. I want to delete all lines that contain words or other characters and keep only the lines that contain numbers. I noticed that all the lines with words and characters contain the letter "r", so I wrote my code to look for that letter.

For example, the text file contains these lines:

-- for example
-- 7 Febraury 2022
5 7 1 5 3.0 2
3*2 3 5 7.0 3

and I want to keep these two lines:

5 7 1 5 3.0 2
3*2 3 5 7.0 3

This is the code I wrote:

textfile = open('test.txt', 'r')
A = textfile.readlines()

L = []
for index,name in enumerate(A):
    if 'r' in name:
        L.append(index)

for idx in sorted(L, reverse = True):
    del A[idx]

I know this is not a good way to do it. Is there a better approach?

Same question was asked here: https://stackoverflow.com/questions/11968998/remove-lines-that-contain-certain-string — Fareed Khan, Feb 07 '22 at 06:33
Hi, one possible solution could be using regular expressions using the python "re" module. For each line in the file you could search for a \w+. If true, skip that line. — M. Villanueva, Feb 07 '22 at 06:52

Tal Folkman · Accepted Answer · 2022-02-07T07:45:41.800

1

You can filter the lines using a regex:

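import re

# Read all lines first, because the same file is reopened for writing below
with open(r'text_file.txt', 'r') as f:
    data = f.readlines()

with open(r'text_file.txt', 'w') as f:
    for line in data:
        # The pattern matches any non-empty line that is not made up of digits alone
        if re.findall(r"(?!^\d+$)^.+$", line):
            f.write(line)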
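
A minimal sketch of an alternative pattern, assuming a line should be kept only when it consists entirely of digits, spaces, '.' and '*' (the characters that appear in the two sample lines of the question). The names NUMERIC_LINE and keep_numeric_lines are hypothetical, and the character set is an assumption that may need adjusting for the real file:

```python
import re

# Assumption: a "numeric" line holds only digits, spaces, '.' and '*',
# as in the sample lines from the question.
NUMERIC_LINE = re.compile(r"[0-9.* ]+")


def keep_numeric_lines(lines):
    """Return the lines that consist solely of the allowed characters."""
    # strip() removes the trailing newline before matching;
    # fullmatch() requires the whole stripped line to fit the pattern.
    return [line for line in lines if NUMERIC_LINE.fullmatch(line.strip())]


sample = [
    "-- for example\n",
    "-- 7 Febraury 2022\n",
    "5 7 1 5 3.0 2\n",
    "3*2 3 5 7.0 3\n",
]
print(keep_numeric_lines(sample))
```

Because the pattern is an allow-list, further characters (for example a minus sign) can be permitted by adding them to the character class. Blank lines are dropped, since the pattern requires at least one character.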

edited Feb 07 '22 at 07:45

answered Feb 07 '22 at 06:38

Tal Folkman


Thanks for your reply. I didn't know about regex. Can we exclude some letters or characters from that? For example, if a line contains an *, it keeps the line, but for other characters it deletes the line? – pymn Feb 07 '22 at 07:00
Of course! You can add whatever you want to the regex. If you need help doing this, just say ;) @PeymanBahrami – Tal Folkman Feb 07 '22 at 07:05
Actually yes, I need a little help with this. I am reading the documentation about regex. It says to use a hyphen to negate a character or letter, but I don't understand where to use it. In the same line? – pymn Feb 07 '22 at 07:18
You need to use it in the regex. I recommend looking at this site, https://regexr.com/, to try some regexes. If you need more help, tell me the problem. – Tal Folkman Feb 07 '22 at 07:22
I updated the question with more details. Sure, thank you very much. I think I need to read more about it. – pymn Feb 07 '22 at 07:29
This regex is so complex. Your code now returns an empty list. I looked up the parts of your pattern: (?!...) is a negative lookahead assertion and (^$) finds the patterns, but I don't understand them when they are combined. – pymn Feb 07 '22 at 08:40

DarkKnight · Answer 2 · 2022-02-07T14:13:35.043

1

If you want to do this without importing anything (e.g., re) then you could do this:

keep_these = []

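def is_valid(t):
    # A token is valid if it parses as a float once each '*' is replaced by a digit
    try:
        float(t.replace('*', '0'))
        return True
    except ValueError:
        return False

with open('test.txt', encoding='utf-8') as infile:
    for line in infile:
        # Keep the line only if every whitespace-separated token is valid
        if all(is_valid(t) for t in line.strip().split()):
            keep_these.append(line)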

print(keep_these)

The keep_these list will then contain the lines you want to keep, which you could use, for example, to rewrite the file.
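
Rewriting the file from such a list could look like the sketch below. The helper name rewrite_file is hypothetical, and the demo writes to a temporary directory rather than to test.txt so that it can be run without touching real data:

```python
import os
import tempfile


def rewrite_file(path, lines):
    """Overwrite `path` with `lines`; each line is expected to end in a newline."""
    with open(path, "w", encoding="utf-8") as outfile:
        outfile.writelines(lines)


# Stand-in for the list built by the filtering loop
keep_these = ["5 7 1 5 3.0 2\n", "3*2 3 5 7.0 3\n"]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.txt")
    rewrite_file(path, keep_these)
    # Read the file back to check what was written
    with open(path, encoding="utf-8") as infile:
        result = infile.read()

print(result)
```

Opening the original file in 'w' mode truncates it, so the lines must be collected in full before the file is reopened for writing.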

edited Feb 07 '22 at 14:13

answered Feb 07 '22 at 06:59

DarkKnight


Thank you for your reply. In my text file there are some lines that contain both numbers and words. I want to delete them as well. The problem with this code is that it keeps those lines. – pymn Feb 07 '22 at 07:10
That is **NOT** what you asked for in your question. I quote: "I want to delete all lines with the characters and words, and keep the lines with numbers" – DarkKnight Feb 07 '22 at 07:21
Thank you Olvin, your code works very well on my example. If you don't mind, may I ask for your help with another part? My text file is very big, so I couldn't post it here. Some lines contain numbers like 3.0. Your code does not handle that format and treats those lines as ones to be deleted. – pymn Feb 07 '22 at 08:48
Answer edited to allow for new information about the input/output requirements – DarkKnight Feb 07 '22 at 14:14

score 0 · Answer 3 · answered Feb 07 '22 at 06:38

0

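You can use the regex library re. One way to do that is to loop through the lines and keep a line only if re.search("[^0-9 ]", line.strip()) is None, that is, if the line contains nothing other than digits and spaces. Use re.search rather than re.match here, because re.match only tests the start of the line.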
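
A minimal sketch of that loop, assuming re.search is used so that the whole line is scanned for a disallowed character. The function name filter_lines and its allowed parameter are hypothetical; allowed is the body of a regex character class, so it can be widened to cover '.' and '*' from the question's sample:

```python
import re


def filter_lines(lines, allowed="0-9 "):
    """Keep lines that contain no character outside the `allowed` class."""
    # `allowed` is inserted into a negated character class, e.g. [^0-9 ]
    disallowed = re.compile(f"[^{allowed}]")
    kept = []
    for line in lines:
        text = line.strip()  # drop the trailing newline before searching
        # Skip blank lines; keep the line if no disallowed character is found
        if text and disallowed.search(text) is None:
            kept.append(line)
    return kept


sample = [
    "-- for example\n",
    "-- 7 Febraury 2022\n",
    "5 7 1 5 3.0 2\n",
    "3*2 3 5 7.0 3\n",
]
print(filter_lines(sample, allowed="0-9 .*"))
```

With the default class, a line such as "3.0 2" is rejected because of the dot, which is why the call above passes a wider class.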

answered Feb 07 '22 at 06:38

vjh

