1

I'm having a mapping issue when trying to convert a client supplied CSV into JSON data using JS. One of the columns in the CSV contains address data which will contain commas. I've tried changing the delimiter but for some reason when i read the CSV JS ignores the set delimiter and converts them to commas. This means the mapping when converting the data to JSON is incorrect. code and outputs below :-

JS :-

    $(document).ready(function() {
    $.ajax({
        type: "GET",
        url: "result.csv",
        dataType: "text",
        success: function(data) { $("body").append(csvJSON(data));}
     });
});

function csvJSON(csv){

    var lines=csv.split("\n");

    var result = [];

    var headers=lines[0].split(",");

    for(var i=1;i<lines.length;i++){

        var obj = {};
        var currentline=lines[i].split(",");


        for(var j=0;j<headers.length;j++){

            obj[headers[j]] = currentline[j];
        }

        result.push(obj);

    }

    return JSON.stringify(result); 
  }

CSV DATA :-

"type","geometry__type","geometry__coordinates__001","geometry__coordinates__002","properties__name","properties__address","properties__address2","properties__city","properties__state","properties__postal","properties__country","properties__phone","properties__phoneFormatted","properties__email","properties__web"
"Stockist","Point",-110.788,43.4705,"Whitechapel Ltd","Box 11719, 1135 Maple Way","Jackson,","Wyoming","WY",83002,"US",13077399478,"+1 307 739-9478","whitechapel@wyoming.com","www.whitechapel-ltd.com"
"Stockist","Point",103.82705,1.30637,"Thrive Design & Trading","19, Tanglin Road, #03-35","Tanglin Shopping Centre","Singapore",,247909,"Singapore","65-67357333","65-67357333","francis@thrive-products.com.sg",

CURRENT RESULT :-

   {  
      "type":"Stockist",
      "geometry__type":"Point",
      "geometry__coordinates__001":"-110.788",
      "geometry__coordinates__002":"43.4705",
      "properties__name":"Whitechapel Ltd",
      "properties__address":"\"Box 11719",
      "properties__address2":" 1135 Maple Way\"",
      "properties__city":"\"Jackson",
      "properties__state":"\"",
      "properties__postal":"Wyoming",
      "properties__country":"WY",
      "properties__phone":"83002",
      "properties__phoneFormatted":"US",
      "properties__email":"13077399478",
      "properties__web\r":"+1 307 739-9478"
   },

DESIRED RESULT :-

  {  
      "type":"Stockist",
      "geometry__type":"Point",
      "geometry__coordinates__001":"-110.788",
      "geometry__coordinates__002":"43.4705",
      "properties__name":"Whitechapel Ltd",
      "properties__address":"Box 11719, 1135 Maple Way",
      "properties__address2":"Jackson,",
      "properties__city":"Wyoming",
      "properties__state":"WY",
      "properties__postal":"83002",
      "properties__country":"US",
      "properties__phone":"13077399478",
      "properties__phoneFormatted":"+1 307 739-9478",
      "properties__email":"whitechapel@wyoming.com",
      "properties__web":"www.whitechapel-ltd.com"
   },
M T
  • 136
  • 1
  • 8
  • 1
    in which field you want to remove comma – Negi Rox Nov 15 '18 at 12:53
  • " I've tried changing the delimiter but for some reason when i read the CSV JS ignores the set delimiter and converts them to commas."...the code above uses commas throughout. Can you show us an example of what you did, exactly? Your statement doesn't entirely make sense...code doesn't "ignore" things, it just does what it's told, so perhaps you made some small mistake. Also please show the source CSV data which would translate to the JSON above. That will make it a lot clearer. Normally a CSV puts double-quotes around the fields so you can tell which bits is a delimiter and which is content – ADyson Nov 15 '18 at 12:54
  • I dont want to remove the commas, as you can see im splitting the CSV data by comma, this is causing data to not match up to the correct header. Changing the delimiter doesn't seem to work, JS always reads the file with comma as the delimiter. – M T Nov 15 '18 at 12:55
  • what is content of result.csv – Kamuran Sönecek Nov 15 '18 at 12:55
  • can you please post a simple fiddle with sample data (csv) ? – fanjabi Nov 15 '18 at 12:56
  • Anyway this problem has been solved before, e.g. https://www.papaparse.com/ and probably others, if you search. So unless you're writing this as an exercise to learn the techniques, you're probably better off just using one of the existing tools – ADyson Nov 15 '18 at 12:58
  • because some fields like `"properties__address2":"Jackson,"` has comma and `javascript` cannot sense it is part of string – RGhanbari Nov 15 '18 at 12:58
  • P.S. your picture is unreadable, and anyway CSV data is not an image, it's text, just like your JSON. Copy and paste, please. Then people can see it properly and also use it to work with your code. – ADyson Nov 15 '18 at 12:58
  • Ive added the CSV text to the question, it looks messy to me even though it appears correctly in excel/libre – M T Nov 15 '18 at 13:07

1 Answers1

6

Since all your values are wrapped in quotes, you should change your split parameter to a regular expression that only splits when a comma is outside of a pair of quotes.

Because of the way your data is formatted you should keep in mind that you'll have a lot of unnecessary escaped quotation strings that you'll need to clean up. But you have to make sure you don't accidentally clean up any quotes that were already escaped in the csv to begin with. But I suppose these edge cases are why people use a prebuilt library.

EDIT

In regards to my previous statement. You might be able to get away with simply removing only the quotes at the beginning and end of the string. But I'm just gonna throw in a disclaimer and say your mileage with this result may vary depending on your data.

const csv=`\"type","geometry__type","geometry__coordinates__001","geometry__coordinates__002","properties__name","properties__address","properties__address2","properties__city","properties__state","properties__postal","properties__country","properties__phone","properties__phoneFormatted","properties__email","properties__web"
"Stockist","Point",-110.788,43.4705,"Dummy address","Box 11719, 1135 Maple Way","Jackson,","Wyoming","WY",83002,"US",12313213213213,"+111111111","dummy ","dummy web address"`

console.log(csvJSON(csv))

function csvJSON(csv){

    var lines=csv.split("\n");

    var result = [];
    
    var commaRegex = /,(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)/g
    
    var quotesRegex = /^"(.*)"$/g

    var headers = lines[0].split(commaRegex).map(h => h.replace(quotesRegex, "$1"));

    for(var i=1;i<lines.length;i++){

        var obj = {};
        var currentline=lines[i].split(commaRegex);


        for(var j=0;j<headers.length;j++){

            obj[headers[j]] = currentline[j].replace(quotesRegex, "$1");
        }

        result.push(obj);

    }
    return result;
    //return JSON.stringify(result); 
  }
M T
  • 136
  • 1
  • 8
Khauri
  • 3,753
  • 1
  • 11
  • 19
  • I edited the CSV data in your answer to include all of my data not just the first two rows, notice the error, I'm having a bit of trouble finding the undefined value, i'm a bit nooby when it comes to chromes debugging tools, any ideas? – M T Nov 15 '18 at 14:08
  • I saw. The error comes from a trailing newline at the end of the input. Remove it and it works. Note that I used a template literal string, so any newline in the code will also be in the string. But this is another edge-case reason to use a pre-built parser. You don't have to deal with empty space at the beginning or end of the file. – Khauri Nov 15 '18 at 14:10
  • Whitespace is actually a small problem for your homebrew parser. But If you want a quick work around you can just use [trim](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/trim) on csv string before you start parsing. And even potentially on each line too. But that's up to you. – Khauri Nov 15 '18 at 14:12
  • sorted, you're solution works perfectly. note to self: stop trying to stay away from regex – M T Nov 15 '18 at 14:23