I am trying to split a string into groups using a regex in NiFi. Could anyone help me split the string below?
I have a string like this:
"abc","-9223371901096288826","/home/test/20170614","abc.com","Hello,Test","7462200","4622012","1296614","1029293","893529","a:ce:o:5:l:p:MMM dd HH:mm:ss","Logs","UTF8","<111>Jun 14 12:43:20 logs: Info: 1497462198.717 13073 1.22.333.44 TCP/200 168 TCP_CONNECT 1.22.33.44:443 ""GO\ABC.COM"" DIRECT/img.abc.com - test_abc_7-DefaultGroup-DefaultGroup-NONE-NONE-NONE-DefaultGroup <IW_adv,3.9,-,""-"",-,-,-,-,""-"",-,-,-,""-"",-,-,""-"",""-"",-,-,IW_adv,-,""-"",""-"",""Unknown"",""Unknown"",""-"",""-"",0.10,0,-,""-"",""-"",-,""-"",-,-,""-"",""-"",-,-,""-""> - -"
I want to split on commas but ignore commas inside quotes. I want a result something like this:
group 1 - abc
group 2 - -9223371901096288826
group 3 - /home/test/20170614
group 4 - abc.com
group 5 - Hello,Test
group 6 - 7462200
group 7 - 4622012
group 8 - 1296614
group 9 - 1029293
group 10 - 893529
group 11 - a:ce:o:5:l:p:MMM dd HH:mm:ss
group 12 - Logs
group 13 - UTF8
group 14 - <111>Jun 14 12:43:20 logs: Info: 1497462198.717 13073 1.22.333.44 TCP/200 168 TCP_CONNECT 1.22.33.44:443 ""GO\ABC.COM"" DIRECT/img.abc.com - test_abc_7-DefaultGroup-DefaultGroup-NONE-NONE-NONE-DefaultGroup <IW_adv,3.9,-,""-"",-,-,-,-,""-"",-,-,-,""-"",-,-,""-"",""-"",-,-,IW_adv,-,""-"",""-"",""Unknown"",""Unknown"",""-"",""-"",0.10,0,-,""-"",""-"",-,""-"",-,-,""-"",""-"",-,-,""-""> - -
I have tried many regexes but have been unable to get the proper result.
I tried ,(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)
which is a regex I found from this link.
The above regex works well with Java's split() function, but I don't want to use Java.
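For reference, here is a minimal sketch of what that lookahead pattern does when it is used for splitting. It is written in Python rather than Java purely for illustration; the pattern uses only a lookahead, non-capturing groups and character classes, which Java's regex engine also supports. The `\"` escapes are dropped because `\"` and `"` match the same character. The helper name `split_fields` and the shortened sample string are made up for this example and are not part of the original data.

```python
import re

# A comma that is followed by an even number of double quotes up to the
# end of the input, i.e. a comma that sits outside any quoted field.
SPLIT_OUTSIDE_QUOTES = re.compile(r',(?=(?:[^"]*"[^"]*")*[^"]*$)')


def split_fields(line):
    """Split on unquoted commas, then drop each field's enclosing quotes."""
    parts = SPLIT_OUTSIDE_QUOTES.split(line)
    return [
        p[1:-1] if len(p) >= 2 and p[0] == '"' and p[-1] == '"' else p
        for p in parts
    ]


# Shortened, hypothetical sample in the same shape as the string above.
sample = '"abc","-9223371901096288826","Hello,Test","say ""hi"", ok"'
fields = split_fields(sample)

for number, field in enumerate(fields, start=1):
    print(f"group {number} - {field}")
```

The doubled quotes (`""`) inside a field keep the quote count even, so the lookahead still treats the commas around them as quoted. The sketch only shows the splitting behaviour; it does not address how to get the same result without a split() call.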
I also tried (?<=\")([^,]*)(?=\")
which splits the string into groups by commas, but it also splits on commas inside double quotes.
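Part of the problem with that second pattern is that `[^,]*` can never span a comma, so a field such as "Hello,Test" cannot be captured whole. One direction that might work is to match each quoted field instead of splitting on commas. The sketch below is only an illustration in Python, under the assumption that every field is wrapped in double quotes and that a literal quote inside a field is written as `""`, as in the sample data. The names `QUOTED_FIELD` and `extract_fields` and the shortened sample are hypothetical, and whether NiFi can consume the pattern this way is not verified here.

```python
import re

# One quoted field: an opening quote, then any mix of non-quote characters
# and doubled quotes (""), then the closing quote. Group 1 is the content.
QUOTED_FIELD = re.compile(r'"((?:[^"]|"")*)"')


def extract_fields(line):
    """Return the content of every quoted field, in order of appearance."""
    return QUOTED_FIELD.findall(line)


# Shortened, hypothetical sample in the same shape as the string above.
sample = r'"abc","Hello,Test","x ""GO\ABC.COM"" y","<IW_adv,3.9,-,""-"">"'

for number, field in enumerate(extract_fields(sample), start=1):
    print(f"group {number} - {field}")
```

Because the pattern consumes a whole field per match, commas inside the quotes never act as separators, and the doubled quotes are left in the captured text unchanged.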
Could anyone help me? Thanks in advance.