I am trying to split the strings I receive on commas (.split(',')), but I have run into an issue with some of them.

Sometimes, the strings I'm provided have commas that are part of a single name. For example: "John, Smith".

My strings usually look like this: "Emily, Sasha Flora, Camille-O'neal", which results in 3 objects, as I need. But sometimes I'm provided with a string like "Emily, Sasha Flora, Camille-O'neal, John, Smith", and I get 5 objects when there are actually 4 names. How can I make the code work with strings that contain commas inside a name and still get the result I need?

My code is this:

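// myValue must be a string: arrays have no .split() method
var myValue = "Emily, Sasha Flora, John, Smith, Camille-O'neal";
// Split on every comma, trim whitespace, and wrap each piece in an object
var names = myValue.split(',').map(item => ({ name: item.trim() }));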

I've tried getting help from ChatGPT but it didn't help.

I'm expecting to get the results like so:

Emily
Sasha Flora
John, Smith
Camille-O'neal
Barmar
  • Does this answer your question? [How can I parse a CSV string with JavaScript, which contains comma in data?](https://stackoverflow.com/questions/8493195/how-can-i-parse-a-csv-string-with-javascript-which-contains-comma-in-data) Edit: A lot of the answers on that actually don't bother trying, due to the string not being "valid" (conforming to a standard) to begin with. Maybe changing the source, as suggested below is the play here. – Tim Lewis Aug 09 '23 at 19:52
  • 2
    Do you have any control over the incoming string? Can you request a different delimiter? – mykaf Aug 09 '23 at 19:52
  • 3
    I would go back to the source and see if that can be improved. – imvain2 Aug 09 '23 at 19:53
  • 5
    How do you realistically expect the code to figure out which commas are delimiters and which are part of a name? I can't even figure it out myself. How do you know? – Barmar Aug 09 '23 at 19:59
  • 2
    @TimLewis That question is different because there are quotes around the field that shouldn't be split. Here there aren't even any quotes. – Barmar Aug 09 '23 at 20:02
  • You cannot. There is no solution to the problem as stated. – bhspencer Aug 09 '23 at 20:02
  • @Barmar Remember that to _normal people_, computers are magic. That said, I wonder if this would actually be an appropriate use-case for an LLM to classify and extract names from an unstructured string? – Dai Aug 09 '23 at 20:03
  • 1
    @Dai I'm not even sure a LLM could do this. Normally we write "Surname, Firstname". But the one he wants to keep is "Firstname, Surname". How is it supposed to figure that out if it's not in training data? – Barmar Aug 09 '23 at 20:04
  • @Barmar Ah whoops; I didn't even see those... You're correct; it's not a 1:1 duplicate. I'll leave it as a duplicate flag since there's still some valid information there (and the question was correctly closed anyway). Cheers! – Tim Lewis Aug 09 '23 at 20:06
  • 1
    This is a problem that needs to be handled at the source of the data. Without any additional delimiters, like quotes or different delimiters between the names, there really isn't a way to program the ability for the browser to distinguish a name separated by commas or a list of names. – imvain2 Aug 09 '23 at 20:10
  • Most comments already say this is a problem that cannot be solved under given circumstances, and some advise good workarounds. Why is this question closed? I think the question is legit and clear. Being unsolvable doesn't make it unclear. – özüm Aug 10 '23 at 06:38
  • @özüm Stack Overflow doesn't actually have a close reason for an "unanswerable" question. For instance, I voted to close as a duplicate, while the other 2 close-voters used the "Needs Details or Clarity" reason. It's maybe not the perfect fit, but the other option would be a Custom one, and those are rarely used. That all being said, it _should_ be closed, as a closed question cannot be answered (the system prevents answers from being added), which seems appropriate for a question that cannot be answered. Additionally, you said "under given circumstances", which "more details" could help with. – Tim Lewis Aug 11 '23 at 18:13
  • @TimLewis, thanks for the clarification. Your vote is legit because duplication is a valid close reason. However, for other voters, I still have the same stand because the question is unsolvable "under given circumstances", but sometimes given circumstances could not be changed. So we should accept the question's circumstances as it is if OP does not want to add additional details. (I'm not just talking about this question but about general principles). Sometimes, technology evolves and unsolvable problems become solvable (Obviously not this one :)) ) – özüm Aug 11 '23 at 18:37
  • @özüm No problem, and that's totally fair! I would still say that even if we reopened the question, not much would change, unless the OP decided to edit the question in a way to make it answerable (i.e. adjusting the syntax to something similar to the linked duplicate). The downside to reopening is that we would likely see more "It's not possible" answers, like the one below, which is redundant after the first instance. I agree the close verbiage isn't great, but I would say that this is better off closed, unless OP can provide a valid reason to reopen it. Cheers – Tim Lewis Aug 11 '23 at 18:43

1 Answer

You cannot solve the problem as written. How will you know if the correct answer is:

"Emily", "Sasha Flora", "John, Smith", "Camille-O'neal"
"Emily", "Sasha Flora", "John", "Smith, Camille-O'neal"
"Emily, Sasha Flora", "John", "Smith, Camille-O'neal"
etc.
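
To make the ambiguity concrete, here is a minimal sketch that enumerates every way the comma-separated pieces could be regrouped into names. The helper name `groupings` is hypothetical and only for illustration. With 5 pieces there are 4 commas, each of which could be either a separator or part of a name, so there are 16 candidate readings and nothing in the string says which one is intended.

```javascript
// Split on every comma first; these are the smallest possible pieces.
const pieces = "Emily, Sasha Flora, John, Smith, Camille-O'neal"
  .split(',')
  .map(s => s.trim());

// Hypothetical helper: treat each comma as either a separator (bit 0)
// or part of a name (bit 1), and build every resulting list of names.
function groupings(parts) {
  const results = [];
  const gaps = parts.length - 1;
  for (let mask = 0; mask < (1 << gaps); mask++) {
    const names = [parts[0]];
    for (let i = 0; i < gaps; i++) {
      if (mask & (1 << i)) {
        // This comma belongs to the name: glue the next piece on.
        names[names.length - 1] += ', ' + parts[i + 1];
      } else {
        // This comma is a separator: start a new name.
        names.push(parts[i + 1]);
      }
    }
    results.push(names);
  }
  return results;
}

const candidates = groupings(pieces);
console.log(candidates.length); // 16
```

The reading the question asks for is one of the 16 candidates, but so are all the others.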

"Garbage in, garbage out" as they say. You need a data format that disambiguates.
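
As one possible illustration of a disambiguating format, the sketch below assumes the data source can be changed to separate names with semicolons, so that a comma can only ever be part of a name. The semicolon-delimited input is an assumption, not something the question's source currently provides.

```javascript
// Hypothetical input: the source is assumed to emit semicolons
// between names, leaving commas free to appear inside a name.
const myValue = "Emily; Sasha Flora; John, Smith; Camille-O'neal";

// Split on the unambiguous delimiter, then trim and wrap each name.
const names = myValue
  .split(';')
  .map(item => ({ name: item.trim() }));

console.log(names.map(n => n.name));
```

Quoting the fields that contain commas, as in the CSV question linked in the comments, would work just as well. The point is only that the delimiter must never appear unescaped inside a value.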

Phrogz