6

I have an array of objects. I need to convert it to .jsonl format and send it as the response from a Node.js Lambda function. I have tried converting it to a string and adding '\n' to start each new line, but it didn't work.

  • 2
    could you post sample data and expected format? – richytong May 30 '20 at 18:22
  • 1
    Please clarify. You can put an array of objects in JSON format all on one line and it's a valid single-record JSONL file. If you want each item of the array to be a separate line/record, just convert each element to a string individually and join the resulting strings together with newlines. – Mark Reed May 30 '20 at 18:25
  • 1
    https://www.npmjs.com/package/jsonlines – user120242 May 30 '20 at 18:26
  • 1
    You should probably post that as an answer instead of a comment, @user120242. – Mark Reed May 30 '20 at 18:28
  • the package posted above is 6 years old, don't use it. same for https://www.npmjs.com/package/json-to-jsonl - no updates for 5 years. use the top answer from below, it is that simple. – Oleg Abrazhaev May 23 '23 at 09:05

2 Answers

15

Simple code to generate jsonlines. jsonlines is really just a bunch of one-line JSON objects stringified and concatenated with newlines between them. That's it.
The other issue you will need to deal with is Unicode: JSON.stringify leaves non-ASCII characters unescaped, so when you write to a file, you must use UTF-8 encoding.

repl.it demo using jsonlines npm library: https://repl.it/repls/AngelicGratefulMoto

Simple plain JS demo:

const data = [{ jsonlines: 'is awesome' },{ jsonlines: 'is awesome' },{ jsonlines: 'is awesome' },{ jsonlines: 'is awesome' },{ jsonlines: 'is awesome' }]

// Stringify each object on its own, then join the lines with newlines
console.log(
  data.map(x => JSON.stringify(x)).join('\n')
)
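Since the question is about returning the result from a Lambda function, here is a minimal sketch of how the same idea might be wired into a handler. The names toJsonl and handler, the sample records, the API Gateway proxy-style response shape, and the application/x-ndjson content type are all assumptions for illustration and are not taken from the question.

```javascript
// Convert an array of JSON-serializable values into a JSON Lines string.
// Each record becomes one line terminated by '\n'.
function toJsonl(records) {
  return records.map((record) => JSON.stringify(record) + '\n').join('');
}

// Hypothetical Lambda handler; the response shape assumes an
// API Gateway proxy integration, which the question does not confirm.
const handler = async () => {
  const data = [{ id: 1, name: 'a' }, { id: 2, name: 'b' }];
  return {
    statusCode: 200,
    // Assumed content type; JSON Lines has no single agreed media type.
    headers: { 'Content-Type': 'application/x-ndjson' },
    body: toJsonl(data),
  };
};

// In a real Lambda module you would export it, e.g.:
// exports.handler = handler;
```

Newlines inside string values are escaped by JSON.stringify as \n, so each record still occupies exactly one physical line.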
user120242
  • 1
    the link https://replit.com/repls/AngelicGratefulMoto which is repl.it demo using jsonlines npm library is dead – imsheth May 26 '21 at 06:11
-1

Approaches I tried for converting larger amounts of data from .json to .jsonl:

  1. A naive string-replacement attempt, tried before implementing @user120242's answer, failed because the data itself contained {, }, [ and ]

    const sampleData = [{ jsonlines: 'is awesome' },{ jsonlines: 'is awesome' },{ jsonlines: 'is awesome' },{ jsonlines: 'is awesome' },{ jsonlines: 'is awesome' }]
    
    console.log(JSON.stringify(sampleData).replace('[', '').replace(']', '').replaceAll('},{', '}\n{'));
  2. @user120242's answer works for smaller data and is indeed a clean solution (I wanted to avoid external libraries or packages as far as possible). It worked for me for arrays of objects up to ~100 MB; beyond that it fails. My setup was node.js v14.1.0 executed by Docker version 20.10.5, build 55c4c88, using DockerOperator in airflow v2.0.1. For arrays of objects in the range of ~750 MB it failed with this issue - JSON.stringify throws RangeError: Invalid string length for huge objects

  3. A trial of a solution similar to https://dev.to/madhunimmo/json-stringify-rangeerror-invalid-string-length-3977 for converting .json to .jsonl failed with the same issue as above - JSON.stringify throws RangeError: Invalid string length for huge objects

  4. Implementing for...of from @Bergi's answer - Using async/await with a forEach loop worked with great performance. In node.js v14.1.0 executed by Docker version 20.10.5, build 55c4c88, using DockerOperator in airflow v2.0.1, it handled arrays of objects up to ~750 MB

const fsPromises = require('fs').promises;

const writeToFile = async () => {
    const dataArray = [{ jsonlines: 'is awesome' },{ jsonlines: 'is awesome' },{ jsonlines: 'is awesome' },{ jsonlines: 'is awesome' },{ jsonlines: 'is awesome' }];
    // Stringify one object at a time so no single huge string is ever built
    for (const dataObject of dataArray) {
        await fsPromises.appendFile("out.jsonl", JSON.stringify(dataObject) + "\n");
    }
};

// Run it and report any write error
writeToFile().catch(console.error);

P.S.: With larger data (typically >100 MB) you'll face "Node JS Process out of memory" if you haven't already given node.js v14.1.0 more memory than the default. The following worked for usage inside a Dockerfile (replace 6144 with the amount of memory in MB that you want to allocate):

CMD node --max-old-space-size=6144 app.js
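As a possible variation on approach 4, the sketch below keeps one write stream open instead of calling appendFile once per object, and waits for 'drain' when the stream's buffer fills up. This is a swapped-in technique, not what the answer above measured; the function name writeJsonl and its parameters are hypothetical. It still assumes the whole array is already in memory, so it only avoids building one huge string and does not reduce the memory needed to hold the data.

```javascript
const fs = require('fs');
const { once } = require('events');

// Write each record as one JSON line through a single open stream.
async function writeJsonl(path, records) {
  const out = fs.createWriteStream(path, { encoding: 'utf8' });
  for (const record of records) {
    // write() returns false when the internal buffer is full;
    // pause until 'drain' so pending data does not pile up in memory.
    if (!out.write(JSON.stringify(record) + '\n')) {
      await once(out, 'drain');
    }
  }
  out.end();
  // Resolve only after everything has been flushed to the file.
  await once(out, 'finish');
}
```

It could be called as writeJsonl('out.jsonl', dataArray).catch(console.error), mirroring the appendFile version above.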
imsheth