This may seem like a basic question, but why am I unable to perform two separate 'functions' on a file after I open it? To clarify: in the code below I iterate through the .json files in a directory, and for each one I want to perform two separate 'functions' (I know they aren't actually functions): 1) find out how many times "text" appears in it, and 2) get rid of all list items that contain "total comments".

import os
import json
import re

for filename in os.listdir():
    y = 0
    if ".json" in filename:
        with open(filename, 'r', encoding='utf8') as f:
            print(filename)
            #~~~~~~~~~~~~~~~~~~~~~~~#
            x = f.read()
            u = re.findall("text", x)
            print(len(u))
            y+=len(u)
            #~~~~~~~~~~~~~~~~~~~~~~~#
            comments = json.load(f)
            for item in comments:
                if 'total comments' in item:
                    comments.remove(item)
            print(comments)

However, the first 'function' is the only one that actually works: as written, the code raises a `JSONDecodeError`, and if the order is switched, the regex 'function' finds no results. The only explanation I can think of is that the variable `f` changes its type when I call `read()` or `json.load()` on it, but that seems counterintuitive. Can anyone shed some light on this?
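The behaviour can be reproduced without any files on disk. The following is a small sketch that uses `io.StringIO` as a stand-in for the open file (an assumption made for illustration; the sample JSON is made up). It suggests that the type of `f` stays the same and that what changes is the read position:

```python
import io
import json

# Stand-in for an opened .json file (hypothetical sample data).
f = io.StringIO('[{"text": "hello"}, {"total comments": 3}]')

first = f.read()        # consumes the whole stream
type_after = type(f)    # still io.StringIO, the object itself is unchanged
second = f.read()       # nothing left to read, so this is an empty string

# json.load() reads from the current position too, so it sees an empty string.
try:
    json.load(f)
    error = None
except json.JSONDecodeError as exc:
    error = exc

# Moving the position back to the start makes the content readable again.
f.seek(0)
data = json.load(f)
```

Switching the order gives the mirror image: `json.load(f)` consumes the stream, and a following `f.read()` returns an empty string, so the regex has nothing to match.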

  • I understand why calling `read()` twice wouldn't work, but I am still unsure why that would apply to `json.load()`. Is the underlying mechanism the same, in that the read pointer is at the end of the file? – Charlie Armstead Mar 10 '21 at 17:14
  • That's what I was hinting at, so did you try seeking to the beginning before doing the `json.load()`? – Random Davis Mar 10 '21 at 17:26
  • I have added `seek(0)` between them and it has fixed the problem so I suppose it must be! – Charlie Armstead Mar 10 '21 at 17:26
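
For reference, the fix confirmed in the comments above could look roughly like the sketch below. The function names (`count_text`, `drop_total_comments`, `process_directory`) and the `directory` parameter are hypothetical and not part of the original code. The sketch also builds a new list instead of calling `remove()` inside the loop, since removing items from a list while iterating over it can skip elements.

```python
import json
import os
import re


def count_text(raw):
    """Count how many times the substring "text" appears in the raw contents."""
    return len(re.findall("text", raw))


def drop_total_comments(items):
    """Return a new list without the items that contain 'total comments'."""
    return [item for item in items if "total comments" not in item]


def process_directory(directory="."):
    """Map each .json filename to (count of "text", cleaned list of items)."""
    results = {}
    for filename in os.listdir(directory):
        if not filename.endswith(".json"):
            continue
        path = os.path.join(directory, filename)
        with open(path, "r", encoding="utf8") as f:
            raw = f.read()           # first pass: leaves the position at the end
            f.seek(0)                # rewind so json.load() sees the whole file
            comments = json.load(f)  # second pass over the same file object
        results[filename] = (count_text(raw), drop_total_comments(comments))
    return results
```

Since the whole file is already in `raw`, parsing it with `json.loads(raw)` would avoid the second read and the `seek(0)` altogether.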
