1

I have a QString, extracted from a CSV file, in a format similar to `QString word = "123,12,1,"This is, a test";`. I would like to split it at each comma, excluding any commas inside the quoted string in the last cell. The resulting list would be similar to {"123", "12", "1", "\"This is, a test\""}.

The format is a number with at most 3 digits, then a number with at most 2 digits, then a number with at most 1 digit, followed by a string that can include commas. There should always be 4 QStrings in the list. Here is what I'm trying:

QString word = "123,12,1,\"This is, a test\"";
QStringList list = word.split( QRegExp( "(\\d+)," ) );

I got the code for this from here. It only saves the 4th QString in the list; the first 3 are blank. Could someone help me out?

Jonathan Hall
TFischer
  • Since your requirements are a bit complicated, I suggest you simply parse the string in a loop. That will be much more flexible and readable. – Abhishek Bansal Nov 30 '13 at 18:42
  • Added a test below in my post. Also, how does this `QString word = "123,12,1,"This is, a test";` even parse in C++ ? –  Dec 01 '13 at 01:11

3 Answers

1

Will this work?

Edit - Here is a little test.

If you don't get the expected result, then you can't use Qt's split.
Instead, parse the line in a while loop to populate the list.

 QString str      = "123,12,1,\"This is, a test\"";
 QStringList list =
      str.split(
         QRegExp("(\"[^\"\\\\]*(?:\\\\[\\S\\s][^\"\\\\]*)*\"|\\d+)|,"),
         QString::SkipEmptyParts
      );
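
The loop-based parse suggested above could look roughly like the sketch below. It is only an illustration under a few assumptions: it uses `std::string` and `std::vector` instead of `QString` and `QStringList` so that it builds without Qt, the helper name `splitCsvLine` is hypothetical, and it treats every double quote as a toggle, so escaped quotes inside a cell are not handled.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from Qt): splits one CSV line on commas that lie
// outside double quotes. The quotes stay in the field, matching the list
// the question asks for.
std::vector<std::string> splitCsvLine(const std::string& line) {
    std::vector<std::string> fields;
    std::string current;
    bool inQuotes = false;

    for (char c : line) {
        if (c == '"') {
            inQuotes = !inQuotes;        // assumption: no escaped quotes
            current += c;
        } else if (c == ',' && !inQuotes) {
            fields.push_back(current);   // a comma outside quotes ends the cell
            current.clear();
        } else {
            current += c;
        }
    }
    fields.push_back(current);           // the last cell has no trailing comma
    return fields;
}
```

Calling `splitCsvLine("123,12,1,\"This is, a test\"")` yields four fields, the last one being the quoted string with its embedded comma intact. The same loop should carry over to `QString` by iterating over its characters and appending to a `QStringList`.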
  • The line I tested it on was this (the first line of the CSV): `1,1,9,"This is a sentence`. `list[0]` is `1`, `list[1]` is `9,"This is a sentence`. So it is skipping either the first or second cell, and putting everything else in list[1] – TFischer Nov 30 '13 at 20:17
  • @maj0rtom - I've added a test for you. If it works, great. If not, you can't use Qt's split like this. –  Dec 01 '13 at 01:09
  • It works, but it only saves the string (What should be in `cell[3]`) in `cell[0]`. – TFischer Dec 01 '13 at 23:11
  • 1
    @maj0rtom - Sounds like it won't work. The regex defines 3 separators, but only 1 (the comma) is a real separator. The other 2 are in an alternation inside a capture group. Most languages' split() functions return _captured_ separators as separate list items; uncaptured separators are left out of the list. So `123,12,1,"This is, a test"` using `(""|\d+)|,` would normally split as `['123','12','1','"This is, a test"']`. There is no other way to split out a quoted string that may have embedded separators (short of post-processing). If you didn't get a list like that, Qt's split won't work on this. –  Dec 02 '13 at 00:31
0

That regular expression only works with digits. You should use something like:

QRegExp( "[A-Za-z\\d]+," )
HAL9000
0

Using the regexp sample, it's easy to work out the appropriate pattern.

[Screenshot not preserved]

Place the contents of "Escaped Pattern" in your QRegExp constructor.

CapelliC
  • So I tried `QStringList list = word.split( QRegExp( "(\\d+),(\\d+),(\\d+),(\\\".*\\\")" ) );`, but this puts everything into list[0] – TFischer Nov 30 '13 at 18:18
  • I think you have in list[1],list[2] etc all captured patterns. In list[0] there is the entire match. – CapelliC Nov 30 '13 at 18:26
  • `list[1]`, `list[2]`, `list[3]` are all out of range. I also tried the capturedText( ) way, and that gives me errors. I also tried to use capturedTexts similar to the example here, without success: http://qt-project.org/doc/qt-4.8/qregexp.html#capturedTexts – TFischer Nov 30 '13 at 18:29
  • Well, capturedTexts works for me. I have never used QString.split with a regex, so my comment above could be misleading. Sorry, I deleted the previous one where I hinted at capturedTexts. – CapelliC Nov 30 '13 at 18:30
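
The capture-group idea discussed in these comments can be sketched as follows. This is an illustration only: it swaps in `std::regex` for `QRegExp` so that it builds without Qt, and the helper name `captureCells` is hypothetical. The point is that the pattern is matched against the whole line rather than used as a split separator, and the cells are read from the capture groups. With `QRegExp`, the analogous calls would be `exactMatch()` followed by `capturedTexts()`, whose first entry is the entire match.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the four cells, or an empty vector when the
// line does not have the fixed shape digits,digits,digits,"quoted text".
std::vector<std::string> captureCells(const std::string& line) {
    // Same pattern as tried in the comments above.
    static const std::regex pattern("(\\d+),(\\d+),(\\d+),(\".*\")");

    std::vector<std::string> cells;
    std::smatch match;
    if (std::regex_match(line, match, pattern)) {
        // match[0] is the whole line; the cells are groups 1 to 4.
        for (std::size_t i = 1; i < match.size(); ++i)
            cells.push_back(match[i].str());
    }
    return cells;
}
```

Because the first three groups only accept digits, a comma inside the quoted cell cannot be mistaken for a separator. A line that does not fit the pattern gives an empty result, which the caller would need to check.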