
I am trying to open and parse a JSON file with a Python script and write its content to another JSON file after reformatting it. My source JSON file contains the character sequence \" which I want to replace with an empty string. Parsing the source and creating the new file both work; the only problem is that this character is not being replaced. How do I do it? I have done the same task before, but at that time the document did not contain this character.

Here is my code

doubleQuote = "\""


try:

    destination = open("TodaysHtScrapedItemsOutput.json","w") # open JSON file for    output
except IOError:
    pass

with open('TodaysHtScrapedItems.json') as f: #load json file
    data = json.load(f)
print "file successfully loaded"
for dataobj in data:
    for news in data[cnt]["body"]:
        news = news.encode("utf-8")
        if(news.find(doubleQuote) != -1): # if doublequotes found in first body tag
        #   print "found double quote"
            news.replace(doubleQuote,"")
        if(news !=""):
            my_news = my_news +" "+ news

    destination.write("{\"body\":"+ "\""+my_news+"\"}"+"\n")
    my_news = ""
    cnt= cnt + 1

Here is how the file looks and the quotes near the red marked text should disappear

Yogesh D

2 Answers

1

Some things to try:

You should read and write the JSON files in binary mode: change "w" to "wb" for the output file, and open the input file with "rb".

You can define your search string as unicode, with:

doubleQuote = u'"'

You can look up the integer value of the character with this command:

ord(u'"')

I get 34 as a response. The reverse function is chr(34). Is the double quote you are searching for the same character as the one the JSON file contains? See here for details.
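As a rough sketch of that check, the snippet below lists the code point and Unicode name of each character in a string, which makes a straight quote (34) easy to tell apart from typographic quotes. The helper name `describe_chars` and the sample string are made up for illustration; they are not from the question.

```python
import unicodedata

def describe_chars(text):
    """Return (character, code point, Unicode name) for each character."""
    return [(ch, ord(ch), unicodedata.name(ch, "UNKNOWN")) for ch in text]

# Hypothetical sample: a straight quote followed by left and right curly quotes.
sample = u'"\u201c\u201d'
for ch, code, name in describe_chars(sample):
    print(u"{0} {1}".format(code, name))
```

If the character in the file reports a code point other than 34, a search for `u'"'` will never match it.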

You don't need the if statement to check whether news contains the '"' character. Calling replace on news is enough.
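A minimal illustration of why the check is redundant: when the search string is absent, `replace` simply returns the text unchanged. The function name `strip_quotes` is hypothetical and only wraps the call for the example.

```python
def strip_quotes(text, quote='"'):
    # replace() returns a new string; text without the quote comes back unchanged
    return text.replace(quote, "")

print(strip_quotes('He said "hi"'))
print(strip_quotes('no quotes here'))
```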

Try these steps and let me know if it still doesn't work.

philshem
  • I tried this... `doubleQuote = u'"'` `news.replace(doubleQuote,"")` but still the same and even changed the mode to "wb" – Yogesh D Mar 06 '14 at 08:38
  • without trying to replace, please read the '"' character and print the integer value, as described above. Also, please read the file as binary, too ('rb') – philshem Mar 06 '14 at 08:48
  • yes it prints 34 and even changed read mode to 'rb' but still the file contains that character...I think my code finds out the simple double-quotes from the file and replaces them but the ones that are preceded with '\'(like:\") are not replaced... – Yogesh D Mar 06 '14 at 09:02
0

str.replace doesn't change the original string; it returns a new one. So you need to assign the result back to news.

    if(news.find(doubleQuote) != -1): # if doublequotes found in first body tag
    #   print "found double quote"
        news = news.replace(doubleQuote,"")
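Putting this together, here is one possible sketch of the whole loop, not the asker's exact script. It assumes the input is a list of objects whose "body" is a list of strings, as the question's code suggests. The function name `flatten_bodies` and the inline sample data are invented for the example, and it builds each output line with `json.dumps` instead of concatenating strings by hand, which is a substitution on my part.

```python
import json

def flatten_bodies(items, quote=u'"'):
    """Join each item's "body" strings into one string with quotes removed."""
    lines = []
    for item in items:
        parts = []
        for news in item["body"]:
            # Reassign the result: replace() returns a new string.
            news = news.replace(quote, u"")
            if news:
                parts.append(news)
        # json.dumps takes care of escaping, so each line is valid JSON.
        lines.append(json.dumps({"body": u" ".join(parts)}))
    return lines

# In the file, \" is only the JSON escape for a plain double quote;
# after parsing, the string holds an ordinary " character.
data = json.loads(u'[{"body": ["He said \\"hello\\"", "", "bye"]}]')
for line in flatten_bodies(data):
    print(line)
```

Because the parser has already turned `\"` into a plain `"`, a single `replace` on the parsed string covers both the escaped and unescaped forms seen in the file.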
gfreezy
  • and that solved my problem but previously in other program I just used to call string.replace and it used to work...now I am confused totally....no matter that solved my problem...thnx – Yogesh D Mar 06 '14 at 09:07