1

I'm using Python's glob.glob("*.json"). The script returns a list of JSON files, but after some processing it creates a new JSON file. If I run the same script again, this new file is included in the list.

glob.glob("*.json")

Output:

['men_pro_desc_zalora.json',
 'man_pro_desc_Zalando.json',
 'man_pro_desc_nordstrom.json']

End of code:

with open("merged_file.json", "w") as outfile:
      json.dump(result, outfile)

After merged_file.json has been created, running glob.glob("*.json") again returns:

['men_pro_desc_zalora.json',
 'man_pro_desc_Zalando.json',
 'man_pro_desc_nordstrom.json',
 'merged_file.json']
xavirubio
Mobin Al Hassan

2 Answers

2

As some comments mention, you can make the pattern more restrictive with something like glob.glob('*_*_*_*.json'), which only matches names containing at least three underscores and so skips merged_file.json. More details can be found here: https://docs.python.org/3.5/library/glob.html#glob.glob.
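To check what the narrower pattern would match without touching the file system, here is a minimal sketch using fnmatch, the module glob relies on for its shell-style matching. The file names are the ones from the question; the variable names are only illustrative.

```python
from fnmatch import fnmatch

# File names taken from the question.
names = [
    "men_pro_desc_zalora.json",
    "man_pro_desc_Zalando.json",
    "man_pro_desc_nordstrom.json",
    "merged_file.json",
]

# This pattern requires at least three underscores before ".json".
pattern = "*_*_*_*.json"

# Keep only the names the pattern accepts.
matched = [name for name in names if fnmatch(name, pattern)]
print(matched)
```

merged_file.json contains a single underscore, so the pattern rejects it. This only works as long as the generated file keeps a name that the pattern does not cover.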

This is never very clean, and because glob patterns are not regular expressions they are not very expressive. Since ordering doesn't seem to matter here, you could do something like

import glob

excludedFiles = ['merged_file.json']
includedFiles = glob.glob('*.json')

# other code here

# Set difference drops the excluded names (ordering is not preserved).
print(list(set(includedFiles) - set(excludedFiles)))
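If ordering does matter, a list comprehension can do the same filtering while keeping a stable order. This is a minimal sketch; the helper name list_inputs and its parameters are hypothetical, and sorting the glob result is an assumption made here to get deterministic output.

```python
import glob
import os


def list_inputs(directory=".", excluded=("merged_file.json",)):
    """Return the JSON files in `directory`, sorted, minus the excluded names."""
    paths = sorted(glob.glob(os.path.join(directory, "*.json")))
    # Compare on the base name so the directory prefix does not matter.
    return [p for p in paths if os.path.basename(p) not in excluded]


print(list_inputs())
```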

That answers your question; however, I think a better solution is to separate your raw data and generated files into different directories. That's generally good practice when you're doing ad hoc work with data.
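A rough sketch of the separate-directories idea follows. The folder names raw and generated, the function merge_json, and the way result is built (one list entry per input file) are all assumptions for illustration, since the question does not show how result is produced.

```python
import glob
import json
import os

RAW_DIR = "raw"        # hypothetical folder holding the source JSON files
OUT_DIR = "generated"  # hypothetical folder for files the script writes


def merge_json(raw_dir=RAW_DIR, out_dir=OUT_DIR, out_name="merged_file.json"):
    """Merge every JSON file in raw_dir into a single file inside out_dir."""
    result = []
    # Only the raw directory is globbed, so earlier output is never picked up.
    for path in sorted(glob.glob(os.path.join(raw_dir, "*.json"))):
        with open(path) as infile:
            result.append(json.load(infile))

    os.makedirs(out_dir, exist_ok=True)
    out_path = os.path.join(out_dir, out_name)
    with open(out_path, "w") as outfile:
        json.dump(result, outfile)
    return out_path
```

Calling merge_json() repeatedly should give the same merged file each time, because the output never lands in the directory that is being globbed.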

William Hammond
0

If you only want to remove the most recently added file, you can try this code.

import glob
import os

# Collect every JSON file in the current directory.
jsonFiles = glob.glob('*.json')
print(jsonFiles)

# Pick the newest file by ctime (on Unix this is the last metadata change, not creation).
latestFile = max(jsonFiles, key=os.path.getctime)
print(latestFile)

# Drop it from the list before processing the rest.
jsonFiles.remove(latestFile)
print(jsonFiles)

Output:

['man_pro_desc_nordstrom.json', 'man_pro_desc_Zalando.json', 'men_pro_desc_zalora.json', 'merged_file.json']
merged_file.json
['man_pro_desc_nordstrom.json', 'man_pro_desc_Zalando.json', 'men_pro_desc_zalora.json']