I am trying to read the CSV data below and split it into its fields:

"1000";" ";"";;0;0;0;0
"1001";"tit;le1";"desc";0;0;0;0
"1002";"title2";"desc2";0;0;0;0
"1003";"title3";"desc123 desc23 desc2 de
sc34 dfd desc45 desc454;,

dfd desc desc";0;0;0;0
"1004";"tilte4";"desc5";0;0;0;0

I am using `[^;"][^"]+`, which returns the strings between the `"`s in the snippet above, but only when they are not empty.

I need the empty fields as well. Can you please help me correct the regex `[^;"][^"]+`?

Andrew Morton

2 Answers

You want to get the content between the `"`s? And there are no `"` characters other than those used as delimiters? Then `"(.*?)"` should work.
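As a rough illustration, here is how that expression behaves in Python on a shortened copy of the sample (the shortened data and the names `SAMPLE`, `QUOTED` and `quoted_values` are assumptions made for this sketch). It is compiled with `re.DOTALL` so that `.` can cross the line break inside a quoted field.

```python
import re

# Shortened stand-in for the sample in the question (assumption for brevity).
SAMPLE = '"1000";" ";"";;0;0\n"1001";"tit;le1";"de\nsc";0;0\n'

# re.DOTALL lets `.` match a newline, so a quoted field may span lines.
QUOTED = re.compile(r'"(.*?)"', re.DOTALL)


def quoted_values(text):
    """Return the content of every "..." pair, empty strings included."""
    return QUOTED.findall(text)


print(quoted_values(SAMPLE))
# ['1000', ' ', '', '1001', 'tit;le1', 'de\nsc']
```

The empty quoted string is returned, but the unquoted values (the `0`s and the empty field between `;;`) are not, because the pattern only looks at quoted text.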

ByteEater
  • The same without a reluctant quantifier: `"([^"]*)"` or `"([^"]*).`. – ByteEater Nov 17 '20 at 18:19
  • None of the 3 expressions worked for me. I am trying them at https://regex101.com/r/mw03oC/2 – Usha Ramani Nov 18 '20 at 10:58
  • Did you enter them including the "s? – ByteEater Nov 18 '20 at 13:38
  • Oh, and if I understand correctly, there's a multiline string to be matched. In that case you should use the `s` flag with the expression. Thus: `/"(.*?)"/gs`. – ByteEater Nov 18 '20 at 13:42
  • Tried this, but it doesn't capture the 0s and the empty fields, e.g. `;;0;0;0;0` in the first line, and similarly all the 0s in the other lines. – Usha Ramani Nov 18 '20 at 17:13
  • Tried adding `"(.*?)"|[^;"][^"]+`; except for the null/empty field between `;;`, everything now seems to be captured. I can split `0;0;0;0` further, but how do I capture something between `;;`? – Usha Ramani Nov 18 '20 at 17:19
  • Then I didn't understand from your question what you wanted. So the ;s are separators, except when between "s, which always come in pairs and delimit strings – is that the complete specification? If so, see my new answer. – ByteEater Nov 18 '20 at 18:20
  • Thank you ByteEater. I achieved it with the regex `"(.*?)"|;;|[^;"][^"]+`. The ;s are separators, but I need whatever is between `;;`, as it could be an empty field that needs to be captured as a result. – Usha Ramani Nov 19 '20 at 10:03

With the spec from my latest comment, the right expression is `/(("?).*?\2)(;|$)/gsm`.
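A minimal Python sketch of this expression, assuming a single made-up record (`RECORD`) and a hypothetical helper `fields`; `re.DOTALL` and `re.MULTILINE` stand in for the `s` and `m` flags. The quote-stripping variant mentioned in the comments is included as `BARE`, with its value in group 2.

```python
import re

# One made-up record with a `;` and a line break inside quoted fields.
RECORD = '"1003";"tit;le3";"de\nsc;,";;0;0'

# Python spelling of /(("?).*?\2)(;|$)/gsm; the field is capturing group 1.
FIELD = re.compile(r'(("?).*?\2)(;|$)', re.DOTALL | re.MULTILINE)
# Python spelling of /("?)(.*?)\1(;|$)/gs; the unquoted value is group 2.
BARE = re.compile(r'("?)(.*?)\1(;|$)', re.DOTALL)


def fields(text, pattern=FIELD, group=1):
    """Collect the chosen capturing group from every match."""
    return [m.group(group) for m in pattern.finditer(text)]


print(fields(RECORD))
# ['"1003"', '"tit;le3"', '"de\nsc;,"', '', '0', '0', '']
print(fields(RECORD, BARE, 2))
# ['1003', 'tit;le3', 'de\nsc;,', '', '0', '0', '']
```

The last `''` in each list is a zero-width match at the end of the input (where `$` matches again with nothing left to consume), not a real field, so it may need to be dropped. Regex engines differ in how they continue after zero-width matches, so results on multi-row input are worth checking in the engine actually used.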

ByteEater
  • I included parentheses for grouping in the previous answer, assuming that you don't want the "s. Now I think you do, but the parentheses are needed for a different reason: your target with this expression is the 1st capturing group, not the whole match. – ByteEater Nov 18 '20 at 18:29
  • For completeness, a version stripping the "s from values: `/("?)(.*?)\1(;|$)/gs`, with capturing group 2. – ByteEater Nov 18 '20 at 18:36
  • This worked as well. I am actually trying to split the data into rows, but because I couldn't do that, I am now thinking of splitting it into individual strings and building each row item later. I think you might be able to help me with my actual problem, because splitting directly into rows would save me a lot of work. I posted another question about it: https://stackoverflow.com/questions/64807334/regular-expression-to-split-strings-in-csv?noredirect=1#comment114583387_64807334. It would be great if you could help me there as well. – Usha Ramani Nov 19 '20 at 10:23
  • I've just noticed that there are no ;s at line ends, yet you want separation there (unless inside a string). So I edited the answer adding the `m` flag, so that e.g. `0` and `"1001"` aren't lumped together. – ByteEater Nov 20 '20 at 09:13
  • The question you linked is closed. Please edit it and apply for reopening or post yet another one which hopefully won't be closed. But specify your objective clearly, and preferably include the desired list of matches under the sample text. – ByteEater Nov 20 '20 at 09:15
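For the row-splitting goal raised in the comments, a dedicated CSV parser may be simpler than a regex. The sketch below is not part of either answer: it swaps the regex for Python's standard `csv` module, assuming `;` as the delimiter and `"` as the quote character, and uses a shortened copy of the sample (`SAMPLE` and `read_rows` are names chosen for illustration).

```python
import csv
import io

# Shortened stand-in for the sample in the question (assumption for brevity).
SAMPLE = (
    '"1000";" ";"";;0;0;0;0\n'
    '"1001";"tit;le1";"desc";0;0;0;0\n'
    '"1003";"title3";"de\nsc;,\n\ndfd";0;0;0;0\n'
)


def read_rows(text):
    """Parse semicolon-separated text into a list of rows (lists of strings)."""
    # newline='' leaves line endings untouched so the csv module itself
    # decides which line breaks belong to a quoted field.
    source = io.StringIO(text, newline='')
    return list(csv.reader(source, delimiter=';', quotechar='"'))


for row in read_rows(SAMPLE):
    print(row)
```

Each row comes back as a list of strings with the quotes removed, empty fields kept as `''`, and the line breaks inside the quoted description preserved.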