0

I need a program to find a string (S) in a file (P), and return the number of thimes it appears in the file, to do this i decided tocreate a function:

def file_reading(P, S):
  file1= open(P, 'r')
  pattern = S
  match1 = "re.findall(pattern, P)"
    if match1 != None:
      print (pattern)

I know it doesn't look very good, but for some reason it's not outputing anything, let alone the right answer.

sco1
  • 12,154
  • 5
  • 26
  • 48
j.evans
  • 9
  • 5
  • 5
    Why are you encapsulating your `re.findall()` call in quotation marks? And not using your input file? – sco1 Feb 12 '18 at 15:01
  • Fix your indentation. Badly indented Python code is nonsense. – khelwood Feb 12 '18 at 15:02
  • 1
    You never actually read the file. – pault Feb 12 '18 at 15:02
  • That's the way i was taught to do it, Is that not right? – j.evans Feb 12 '18 at 15:03
  • 1
    No. Your function isn't going to get called if it's a string. You are misremembering what you were taught. – sco1 Feb 12 '18 at 15:06
  • Please, give us more info i.e. sample of file content and pattern that you are using. – Ilija Feb 12 '18 at 15:09
  • 1
    @j.evans please do not approve edits that drastically change how the code in your question performs. If you desire to make such changes please add them as an [edit] to the question so the necessary context for existing answers/comments is not lost. – sco1 Feb 12 '18 at 15:44

3 Answers3

0

There are multiple problems with your code.

First of all, calling open() returns a file object. It does not read the contents of the file. For that you need to use read() or iterate through the file object.

Secondly, if your goal is to count the number of matches of a string, you don't need regular expressions. You can use the string function count(). Even still, it doesn't make sense to put the regular expression call in quotes.

match1 = "re.findall(pattern, file1.read())"

Assigns the string "re.findall(pattern, file1.read())" to the variable match1.

Here is a version that should work for you:

def file_reading(file_name, search_string):
    # this will put the contents of the file into a string
    file1 = open(file_name, 'r')
    file_contents = file1.read()
    file1.close()  # close the file

    # return the number of times the string was found
    return file_contents.count(search_string)
pault
  • 41,343
  • 15
  • 107
  • 149
  • for GodOrWhoeverIsInChargeOrNot's sake, __close that file__ ! – bruno desthuilliers Feb 12 '18 at 15:18
  • @brunodesthuilliers I made an update based on your suggestion- I agree it's better to be explicit, but my understanding was that the file object would close when it goes out of scope: https://stackoverflow.com/questions/2404430/does-filehandle-get-closed-automatically-in-python-after-it-goes-out-of-scope **EDIT**: I take that back, I just read the second answer which explains the case when exceptions are raised. – pault Feb 12 '18 at 15:23
0

There are a few errors; let's go through them one by one:

  1. Anything in quotes is a string. Putting "re.findall(pattern, file1.read())" in quotes just makes a string. If you actually want to call the re.findall function, no quotes are needed :)
  2. You check whether match1 is None or not, which is really great, but then you should return that matches, not the initial pattern.
  3. The if-statement should not be indented.

Also:

  • Always a close a file once you have opened it! Since most people forget to do this, it is better to use the with open(filename, action) syntax.

So, taken together, it would look like this (I've changed some variable names for clarity):

def file_reading(input_file, pattern):
    with open(input_file, 'r') as text_file:
        data = text_file.read()
        matches = re.findall(pattern, data)

        if matches:
            print(matches)  # prints a list of all strings found
Marlo
  • 490
  • 3
  • 7
  • 21
  • The pythonic way to test against `None` is using identity testing, ie `if matches is not None:` (`None` is garanteed to be a singleton). But since `re.findall()` returns an empty list if nothing matches, the correct test here is a plain `if matches:` – bruno desthuilliers Feb 13 '18 at 09:42
  • @brunodesthuilliers thanks for pointing that out! This is my first time editing an answer based on feedback. Should I edit the code or add a note at the end? – Marlo Feb 13 '18 at 15:23
0

You can read line by line instead of reading the entire file and find the nunber of time the pattern is repeated and add it to the total count c

def file_reading(file_name, pattern):
  c = 0
  with open(file_name, 'r') as f:
    for line in f:
      c + = line.count(pattern)
  if c: print c 
Keerthana Prabhakaran
  • 3,766
  • 1
  • 13
  • 23