Reading text file and matching against a value threshold

Question

I have a large number of .txt files (N > 1000) that have data of interest, and I wish to identify files whose "mean" value exceeds a given threshold (say, 0.5), and print the name of the file in which that is the case. The data in each file are organized like this:

[
    {
      "parameter": {
          "max": 0.6640571758027143,
          "mean": 0.13404294175225137,
          "min": 0.0,
          "std": 0.09435715828616785
      },
      {
        "intensity": [
            {
                "max": [
                    3.1719575216784217
                ],
                "mean": [
                    -3.552713678800501e-17
                ],
                "min": [
                    -2.707115982837323
                ],
                "std": [
                    1.0000000000000004
                ]
                ...

To make matters slightly more complicated, I only wish to read the "mean" value for the "parameter" and not for "intensity".

I had the idea that I should read this file in using a for loop, roughly containing the following code:

subjects = [allmyfilenames]
for subj in subjects:
    file = open('C:/%s.txt' %subj, 'r')
    for line in file.readlines(): print line

From there, I am a bit lost. How might I identify the correct line to use in matching against my threshold (0.5)?

If that is a valid JSON file, which it looks like, [this](http://stackoverflow.com/a/2835672/322909) answer might be of some use to you. — John, Nov 06 '12 at 03:18
Your input file is ill-formed -- all the brackets and braces don't come in matched sets. — martineau, Nov 06 '12 at 03:30
Yes, sorry. I extracted the input from a larger set for simplicity, but apparently I did not match the brackets and braces properly for this example. — user1801867, Nov 06 '12 at 03:33
@anijhaw: Your edits did not fix the input file -- it's not quite that simple...so I did a rollback to the OP's original version. — martineau, Nov 06 '12 at 11:24

score 0 · Accepted Answer · answered Nov 06 '12 at 03:59

Try something like this. I wasn't entirely sure of your data format, but it might work for the format shown above. Not tested.

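import json

subjects = [allmyfilenames]  # placeholder for your list of file names
for subj in subjects:
    with open('C:/%s.txt' % subj, 'r') as datafile:
        data = json.load(datafile)
    # Check only the top-level "parameter" mean, not the "intensity" one
    if data[0]['parameter']['mean'] > 0.5:
        print(subj)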
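
Since the sample in the question turned out to be ill-formed, it may be worth guarding against files that fail to parse. The following is only a sketch under some assumptions: that each file is valid JSON whose first list element holds the "parameter" block, that all files sit in one folder, and that Python 3 is available. The function name `files_above_threshold` and its arguments are made up for illustration.

```python
import json
from pathlib import Path


def files_above_threshold(folder, threshold=0.5):
    """Return the names (without .txt) of files whose "parameter" mean exceeds threshold."""
    matches = []
    for path in sorted(Path(folder).glob("*.txt")):
        try:
            with path.open("r") as datafile:
                data = json.load(datafile)
            # Read only the "parameter" block; "intensity" is never touched.
            mean = data[0]["parameter"]["mean"]
        except (ValueError, KeyError, IndexError, TypeError):
            # Skip files that are not valid JSON or lack the assumed layout.
            continue
        if mean > threshold:
            matches.append(path.stem)
    return matches
```

Calling `files_above_threshold('C:/')` would return a list of matching names that can then be printed one per line. Files that cannot be parsed are skipped silently here; logging them instead might be preferable when checking more than 1000 files.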

answered Nov 06 '12 at 03:59 by anijhaw

Thanks - I wasn't familiar with the JSON file format, but this was exactly what I needed! — user1801867, Nov 06 '12 at 04:10
