
I want to convert a text file to JSON. The problem is that all the lines from the text file end up in one very long line in the JSON. Please help.

import re
import os
import json

# list of files to be sent to Json file
loadfile = open("aivc-tests/baseline/clf_risk_start.txt", "r")
print("Loading file...")
# takes away the extension from the name of the file
name_of_file = "clf_risk_start"
print("Removing unnecessary items...")
if loadfile.mode == 'r':
    contents = loadfile.read()
    #remove all unnecessary items
    remove_dashes = re.sub("-","", contents)
    remove_hashes =re.sub("##", "", remove_dashes)
    remove_intent =re.sub("intent", "", remove_hashes)
    remove_colan =re.sub(":", "", remove_intent)
    remove_generic =re.sub("Generic", "", remove_colan)
    remove_critical =re.sub("critical", "", remove_generic)
    remove_line_one=re.sub("<! Generated using Chatette v1.6.2 >", "", remove_critical)
    new_line_removed =remove_line_one.strip().replace('\n', ',')
    edited_contents = new_line_removed  
    # print(edited_contents)
    #return(edited_contents)
print("Formatting...")
data1 = {}
data1['clf_test_utterances'] = []
data1['clf_test_utterances'].append({
    name_of_file: edited_contents
 })
data2 = {}
data2['testing Suit'] = []
data2['testing Suit'].append({
    'Name': 'test',
    'Description': 'this is just a test',
    'Test Dialogue': '',
    "format_version": 5,
    "clf_test_utterances": data1
})
print("Exporting to json")
# this will write to json file
with open('test_suit_single.json', 'w') as outfile:
    json.dump(data2, outfile)
print('successfully converted')

The output I get is below:

{
   "testing Suit":[
      {
         "Name":"test",
         "Description":"this is just a test",
         "Test Dialogue":"",
         "format_version":5,
         "clf_test_utterances":{
            "clf_test_utterances":[
               {
                  "clf_risk_start":"I am experiencing signs of covid 19. How can I get tested to understand if I i am well as i think?, I am experiencing symptoms of coronvirus. How can I be checked out to verify if I have it or not?, I am feeling signs of corona. How can I be tested to be sure if I I should self isolate?, I am feeling signs of coronvirus. How can I test to learn if I i am healthy?, I am ms of covid 19. How can I "
               }
            ]
         }
      }
   ]
}

How can I have them all on separate lines?

Jonathan Leffler

1 Answer


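That's just the way JSON works. It doesn't wrap your text based on width, and a literal newline inside a JSON string is illegal, so newlines are always escaped as \n. In your case the value is one long string regardless, because your code replaces every newline with a comma before writing it.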
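A minimal sketch of that escaping behavior, using only the standard `json` module. The sample text is made up for illustration and is not taken from your file:

```python
import json

# Hypothetical sample text with a real newline between two utterances.
text = "first utterance\nsecond utterance"

# Serialize a dict whose value contains the newline.
encoded = json.dumps({"clf_risk_start": text})
print(encoded)

# The newline is written as the two characters backslash + n,
# so the encoded JSON contains no literal line break.
has_literal_newline = "\n" in encoded
has_escaped_newline = "\\n" in encoded

# Decoding restores the original newline.
decoded = json.loads(encoded)["clf_risk_start"]
```

So even if you kept the newlines instead of replacing them with commas, the string would still appear on a single line in the file.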

You should just turn on word wrapping in your text editor. If you really want to split the lines into multiple rows, you can use an array.

Replace:

new_line_removed =remove_line_one.strip().replace('\n', ',')

With:

new_line_removed =remove_line_one.strip().split('\n')
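To illustrate the array approach end to end, here is a rough sketch. The sample contents are hypothetical, and the blank-line filtering and the `indent=4` argument are my additions rather than part of your script. Without `indent`, `json.dump` still writes the whole array on one physical line.

```python
import json

# Hypothetical stand-in for the cleaned-up file contents.
remove_line_one = "\n first utterance\n second utterance\n\n third utterance\n"

# Split into a list, trimming whitespace and dropping empty lines
# (an extra step, assumed useful after the substitutions leave gaps).
utterances = [line.strip() for line in remove_line_one.splitlines() if line.strip()]

data = {"clf_test_utterances": [{"clf_risk_start": utterances}]}

# indent=4 pretty-prints the JSON, putting each array element on its own line.
output = json.dumps(data, indent=4)
print(output)
```

The same `indent` argument works with `json.dump(data2, outfile, indent=4)` when writing to a file.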

Božo Stojković