
I'm looking for a more efficient way (or library) to parse CSV/TSV files of around 5,000 to 10,000 rows (I'm using the result to render a table with CDK virtual scroll for previewing the file). My current implementation is far too slow for that many rows.

this.httpClient.get(this.dataSrc, {
  headers: {
    Accept: 'text/plain'
  },
  responseType: 'text'
}).subscribe(res => {
  // handles tsv or csv content
  const lines = res.split(/\r|\n/);
  const separator = lines[0].indexOf('\t') !== -1 ? '\t' : ',';
  this.parsedCSV = lines.map(l => l.split(separator));
});
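
For context, the preview component looks roughly like this (a simplified sketch; ScrollingModule from @angular/cdk/scrolling is assumed to be imported, and the selector, sizing, and styling are illustrative):

import { Component } from '@angular/core';

@Component({
  selector: 'app-csv-preview',
  template: `
    <cdk-virtual-scroll-viewport itemSize="40" style="height: 400px">
      <div *cdkVirtualFor="let row of parsedCSV">
        <span *ngFor="let cell of row">{{ cell }}</span>
      </div>
    </cdk-virtual-scroll-viewport>
  `,
})
export class CsvPreviewComponent {
  parsedCSV: string[][] = [];
}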
  • [fast-csv](https://www.npmjs.com/package/fast-csv) is a pretty lightweight package for that. – ahsan Nov 18 '21 at 10:48
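
A minimal sketch of the fast-csv suggestion above, assuming a Node-style environment (fast-csv is built on Node streams, so using it in the browser would need bundler stream shims); with no headers option, rows arrive as string arrays:

import { parseString } from '@fast-csv/parse';

const rows: string[][] = [];
parseString(res, { delimiter: '\t' }) // use ',' for CSV input
  .on('error', err => console.error(err))
  .on('data', (row: string[]) => rows.push(row))
  .on('end', (rowCount: number) => console.log(`parsed ${rowCount} rows`));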

1 Answer


It looks good, but parsing a large amount of data in one go will freeze the main thread. You should sleep and await periodically so the parser yields control and the UI won't freeze.

Here is how I would do it; have a look.

const sleep = (msTime: number): Promise<void> => {
  return new Promise(resolve => {
    setTimeout(resolve, msTime);
  });
};

const parse = async (
  csv: string,
  onProgress?: (percent: number) => void | Promise<void>
): Promise<string[][]> => {
  // Handle \r\n, \r and \n line endings without producing empty rows.
  const lines = csv.split(/\r\n|\r|\n/);
  // Detect the separator once, from the first line.
  const separator = lines[0].indexOf('\t') !== -1 ? '\t' : ',';
  const result: string[][] = [];
  for (let index = 0; index < lines.length; index++) {
    result.push(lines[index].split(separator));
    // Yield to the event loop every 30 lines so the thread won't freeze.
    if (index % 30 === 0) {
      await sleep(10);
    }
    if (onProgress) {
      await onProgress((index / lines.length) * 100);
    }
  }
  return result;
};
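
To call it from the original request handler (a sketch based on the question's code; the onProgress callback could drive a progress bar instead of logging):

this.httpClient.get(this.dataSrc, {
  headers: {
    Accept: 'text/plain'
  },
  responseType: 'text'
}).subscribe(async res => {
  // Await the chunked parser instead of splitting everything at once.
  this.parsedCSV = await parse(res, percent => {
    console.log(`parsed ${percent.toFixed(0)}%`);
  });
});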
  • Hi there Alen. I wonder, would you be able to use an English spelling checker on your work? It would help volunteer editors out enormously. – halfer Dec 11 '21 at 11:04