I have a large number of .txt files (N > 1000) containing data of interest. I want to identify the files whose "mean" value exceeds a given threshold (say, 0.5) and print the name of each such file. The data in each file are organized like this:
[
{
"parameter": {
"max": 0.6640571758027143,
"mean": 0.13404294175225137,
"min": 0.0,
"std": 0.09435715828616785
},
{
"intensity": [
{
"max": [
3.1719575216784217
],
"mean": [
-3.552713678800501e-17
],
"min": [
-2.707115982837323
],
"std": [
1.0000000000000004
]
...
To make matters slightly more complicated, I only wish to read the "mean" value for the "parameter" and not for "intensity".
I had the idea that I should read each file in a for loop, roughly like this:
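subjects = [allmyfilenames]  # placeholder: list of file names without the .txt extension
for subj in subjects:
    # 'with' closes the file automatically; 'f' avoids shadowing the built-in name 'file'
    with open('C:/%s.txt' % subj, 'r') as f:
        for line in f:
            print(line)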
From there, I am a bit lost. How might I identify the correct line to use in matching against my threshold (0.5)?
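One possible direction, given as a minimal sketch rather than a definitive solution: instead of going line by line, read the whole file as text and use a regular expression that is anchored on the "parameter" key, so the "intensity" means are never considered. This assumes every file follows the layout shown above, with the "parameter" mean written as a bare number. The names `PARAM_MEAN`, `parameter_mean`, and `files_over_threshold` are hypothetical helpers invented for this illustration.

```python
import re

# Matches the first "mean" number inside the "parameter" block only.
# [^{}]* cannot cross a brace, so the match stays inside that one block
# and never reaches the "intensity" means (which are also wrapped in [ ]).
PARAM_MEAN = re.compile(
    r'"parameter"\s*:\s*\{[^{}]*?"mean"\s*:\s*'
    r'([-+]?[0-9.]+(?:[eE][-+]?[0-9]+)?)'
)


def parameter_mean(text):
    """Return the "parameter" mean as a float, or None if it is not found."""
    match = PARAM_MEAN.search(text)
    return float(match.group(1)) if match else None


def files_over_threshold(paths, threshold=0.5):
    """Return the paths whose "parameter" mean exceeds the threshold."""
    hits = []
    for path in paths:
        with open(path, 'r') as f:
            mean = parameter_mean(f.read())
        if mean is not None and mean > threshold:
            hits.append(path)
    return hits
```

With the loop above, the paths could be built as `['C:/%s.txt' % subj for subj in subjects]` and each entry of the returned list printed. If the complete files turn out to be valid JSON, parsing them with the standard `json` module would probably be more robust than a regular expression, but the truncated sample does not make that certain.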