I made a mistake while manipulating about 100 JSON files. I'm not sure what happened, but most of them now have a random number of their last characters repeated at the end (as per the image below). Is there a way to clean a JSON file by deleting characters, starting from the last one, until the file is valid JSON again?
- The short answer is `no`. It looks like you've overwritten part of the file with a shorter version, leaving the original. You are supposed to parse the file with the `json` library, make changes to the resulting object and rewrite the file as text. – quamrana Mar 30 '23 at 08:10
- Please include your code and the output as text and not as images. – ewokx Mar 30 '23 at 08:11
- @ewokx there is no code and no output to show unfortunately. That's what I'm actually looking for... – LBedo Mar 30 '23 at 08:13
1 Answer
You can use regular expressions. An alternative would be string manipulation, but in this case regex is quicker to write, especially for one-time-use code.
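import re

files = ['a.json','b.json',...]  # populate as needed
for filename in files:
    with open(filename, 'r') as file:
        content = file.read()
    # Keep everything up to the last "}]}}" that is followed by at least one more character.
    match = re.match(r'([\s\S]+\}\]\}\})[\s\S]+?', content)
    if match is None:
        continue  # pattern not found (or nothing follows the final "}]}}"); leave the file unchanged
    with open(filename, 'w') as file:
        file.write(match.group(1))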
This regex has several parts. `[\s\S]` matches any character, including newlines (whereas `.` does not match newlines by default). The greedy `[\s\S]+` matches as much as possible, and the lazy `[\s\S]+?` matches as little as possible (in this case, the start of the trailing text we don't want). We then parenthesise the part we do want to keep, `([\s\S]+\}\]\}\})`, extract it using `.group(1)`, and write it back to the file. Note that this assumes the valid JSON ends with `}]}}`.
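To see how the greedy and lazy parts interact, here is a small sketch run on made-up strings rather than real files. The helper name `strip_tail` and the sample data are my own, for illustration only. It also shows two limits of the pattern: it finds nothing when no character follows the final `}]}}`, and it keeps the garbage if the duplicated tail itself contains `}]}}`.

```python
import re

PATTERN = r'([\s\S]+\}\]\}\})[\s\S]+?'

def strip_tail(text):
    """Return the captured group, or None if the pattern does not match."""
    match = re.match(PATTERN, text)
    return match.group(1) if match else None

# Hypothetical sample: a valid document that ends in "}]}}".
valid = '{"a": {"b": [{"c": 1}]}}'

# Tail without "}]}}": the trailing text is dropped.
print(strip_tail(valid + '\n"c": 1'))

# Nothing after the final "}]}}": the lazy part needs one character, so no match.
print(strip_tail(valid))

# Tail that itself contains "}]}}": the greedy part keeps it, so the result is still broken.
print(strip_tail(valid + '1}]}}\n'))
```

So the regex is fine for a quick one-off, provided the repeated tail is short enough not to include `}]}}` again.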
For more information, see "Reference - What does this regex mean?". In future, I would suggest manipulating JSON using the built-in `json` library.
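As a sketch of what the `json` route could look like for this clean-up: `json.JSONDecoder().raw_decode` parses one JSON value from the start of a string and returns the index where it ended, so everything after that index can be dropped. This assumes each file begins with one complete, valid JSON document and that the corruption is only extra text appended after it. The function names `truncate_to_valid_json` and `clean_file` are hypothetical.

```python
import json

def truncate_to_valid_json(text):
    """Return the leading JSON document in text, dropping anything after it."""
    stripped = text.lstrip()  # raw_decode does not skip leading whitespace itself
    # raw_decode returns (parsed_object, end_index) and raises
    # json.JSONDecodeError if the text does not start with valid JSON.
    _, end = json.JSONDecoder().raw_decode(stripped)
    return stripped[:end]

def clean_file(path):
    """Rewrite the file at path so it contains only its leading JSON document."""
    with open(path, 'r') as file:
        content = file.read()
    cleaned = truncate_to_valid_json(content)
    with open(path, 'w') as file:
        file.write(cleaned)

sample = '{"a": {"b": [{"c": 1}]}}'
print(truncate_to_valid_json(sample + '1}]}}\n'))
```

Unlike the regex, this does not depend on how the document ends, and it still works when the repeated tail contains `}]}}`. It is worth trying on copies of the files first.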

Mous