4

I am trying to parse a string into an array and am looking for a concise approach.

string line = "[1, 2, 3]";
string[] input = line.Substring(1, line.Length - 2).Split();
int[] num = input.Skip(2)
                 .Select(y => int.Parse(y))
                 .ToArray();

I tried removing Skip(2), but then I cannot get the array because of a non-int string. My question is: what is the execution order of those LINQ functions? How many times is Skip called here?

Thanks in advance.

Yuval Itzchakov
David
  • 1
    What do you mean "it does not work anymore" after you removed the `Skip()`? What was it doing with it that it is not doing without it? – krillgar Mar 11 '15 at 13:58
  • 1
    Instead of using substring which can fail you should use `.Trim('['],)`. The `Split` also doesn't work anymore if you remove the spaces, for example: `1,2,3` – Tim Schmelter Mar 11 '15 at 13:58
  • @TimSchmelter I think you meant `('[', ']')` – cbr Mar 11 '15 at 14:00
  • 1
    You need to split with `,` not just Split. – Sriram Sakthivel Mar 11 '15 at 14:00
  • Skip(2) is called, then the object returned executes the Select() statement, and then the object returned from the Select is converted to an array. – Caio César S. Leonardi Mar 11 '15 at 14:01
  • Why aren't you parsing this with a JsonSerializer? – Yuval Itzchakov Mar 11 '15 at 14:01
  • "what is the execution order of those LINQ function. How many times is Skip called here?" Skip is called on the whole collection once, afterwards a select, that's it. – MakePeaceGreatAgain Mar 11 '15 at 14:01
  • Have you tried stepping through this at all? [Simply writing them out](https://ideone.com/oHsQ7a) shows that skipping 2 does work, but the first 2 elements won't parse as ints – Sayse Mar 11 '15 at 14:04
  • If you want to use the comma as a separator then you need to do `line.Split(',')`; if you then trim for "[ ]" you will get just the numbers – MikeT Mar 11 '15 at 14:04
  • @GrawCube: yes of course. I was working on [my answer](http://stackoverflow.com/a/28988762/284240) which shows what i meant. So i couldnt fix my comment. – Tim Schmelter Mar 11 '15 at 14:08

4 Answers

6

The order is the order that you specify. So input.Skip(2) skips the first two strings in the array, and only the last one remains, which is "3". That can be parsed to an int. If you remove the Skip(2), you are trying to parse all of them. That doesn't work because the commas are still there: you have split on whitespace but not removed the commas.
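To make the execution order concrete, here is a minimal sketch that logs what happens when the question's pipeline runs. The class name `ExecutionOrderDemo` and the `Trace` method are made up for illustration. It assumes the question's input, `"[1, 2, 3]"`: `Skip` is called once and only builds a lazy query, and parsing happens when `ToArray` enumerates it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class, not part of the original question or answers.
public static class ExecutionOrderDemo
{
    public static List<string> Trace(string line)
    {
        var log = new List<string>();

        // Split() with no arguments splits on whitespace only,
        // so "[1, 2, 3]" becomes the tokens "1,", "2," and "3".
        string[] input = line.Substring(1, line.Length - 2).Split();

        // Skip and Select are each called once here. They only build
        // a lazy query; no element has been parsed yet.
        IEnumerable<int> query = input
            .Skip(2)
            .Select(y => { log.Add("parse " + y); return int.Parse(y); });

        log.Add("query built");

        // ToArray enumerates the query: each element that survives Skip
        // is handed to the Select lambda in turn.
        int[] num = query.ToArray();
        log.Add("result " + string.Join(",", num));
        return log;
    }
}
```

Calling `ExecutionOrderDemo.Trace("[1, 2, 3]")` logs "query built" before "parse 3", which shows that only the third token ever reaches `int.Parse`.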

You could use line.Trim('[', ']').Split(','); and int.TryParse:

string line = "[1, 2, 3]";
string[] input = line.Trim('[', ']').Split(',');
int i = 0;
int[] num = input.Where(s => int.TryParse(s, out i)) // you could use s.Trim but the spaces don't hurt
                 .Select(s => i)
                 .ToArray(); 

Just to clarify, I have used int.TryParse only to make sure that you don't get an exception if the input contains invalid data. It doesn't fix anything. It would also work with int.Parse.

Update: as Eric Lippert has shown in the comment section, using int.TryParse with a shared variable in a LINQ query can be harmful. So it's better to use a helper method that encapsulates int.TryParse and returns a Nullable<int>, for example an extension like this:

public static int? TryGetInt32(this string item)
{
    int i;
    bool success = int.TryParse(item, out i);
    return success ? (int?)i : (int?)null;
}

Now you can use it in a LINQ query in this way:

string line = "[1, 2, 3]";
string[] input = line.Trim('[', ']').Split(',');
int[] num = input.Select(s => s.TryGetInt32())
                 .Where(n => n.HasValue)
                 .Select(n=> n.Value)
                 .ToArray();
Tim Schmelter
  • 1
    Very concise and yet still detailed answer, very well answered – MikeT Mar 11 '15 at 14:24
  • 2
    I strongly recommend against using out parameters in LINQ queries like this. Though this particular usage happens to be safe, you can easily make minor changes that get you into situations where the variable is mutated multiple times and later parts of the query are looking at the *current* value of the variable, not at the value you want. – Eric Lippert Mar 11 '15 at 16:44
  • The preferred technique is to write a helper method, say, `static int? MyTryParse(this string str)` which calls `TryParse` internally and returns null if the string cannot be parsed. Then your query becomes `Select(s => s.MyTryParse()).Where(i => i != null).Select(i => i.Value).ToArray()` -- no variable mutation required. – Eric Lippert Mar 11 '15 at 16:47
  • @EricLippert: i also used such a method earlier. But then i could not think of such a situation. Do you have an example? If i select the local variable right after the `Where`, is it possible that it's the wrong value? – Tim Schmelter Mar 11 '15 at 16:49
  • 1
    Sure. Suppose we have a sequence `var seq = new List<string> { "1", "blah", "3" };` We wish to extract a sequence of the numbers, and if one of them is not a number, use zero instead. Easily done: `var nums = from item in seq let success = int.TryParse(item, out tmp) select success ? tmp : 0;` Looks fine, right? Now what happens if we make a small edit and say "and I wish to order by the item". Predict the result of executing the query `from item in seq let success = int.TryParse(item, out tmp) orderby item select success ? tmp : 0;` -- does your prediction match reality? – Eric Lippert Mar 11 '15 at 16:54
  • Do you see why the `orderby` clause introduces the problem? – Eric Lippert Mar 11 '15 at 16:56
  • Im currently not at a computer, so i will check it later. But i think that it will order by item which is the string, then it will select either 0 or the parsed value in that order. So 1,3,0. If thats true i find it predictable. – Tim Schmelter Mar 11 '15 at 17:01
  • 1
    Your plausible but incorrect belief is exactly why this pattern is dangerous. This query actually means "produce a series of pairs and side effects. The first pair is {"1", true} and the side effect is tmp = 1. The second pair is {"blah", false} and the side effect is tmp = 0. The third pair is {"3", true} and the side effect is tmp = 3. Now order those pairs by the first item. Now for each of those pairs in the ordered sequence, check the second item in the pair, and if it is false, produce zero, and if it is true, produce the current value of tmp" . The current value of tmp is 3. – Eric Lippert Mar 11 '15 at 17:07
  • Would you care to revise your prediction now that you know what the query actually means? – Eric Lippert Mar 11 '15 at 17:09
  • Now, I emphasize that in your original answer you did the right thing -- you immediately get the mutated variable out of the variable and into a projected sequence. But it is far better to simply not go there in the first place; just don't mutate a variable inside a query. It is far, far too easy to write something that works but is very brittle in the face of simple edits, like adding an order-by in the wrong place. – Eric Lippert Mar 11 '15 at 17:12
  • Thanks for the interesting example and for reminding me that the helper method which returns a nullable is to be preferred. But why is there no such method in the framework? I know it's trivial, but it would help to prevent such things because many people think that int.TryParse in a LINQ query is not harmful. But actually it can be a time bomb. – Tim Schmelter Mar 11 '15 at 17:43
  • Had there been nullable types in v1 of the .NET framework then surely `TryParse` would return a nullable int. But there were not, so it takes an out parameter. There's little incentive for Microsoft to spend the time and money on adding such a method to the framework -- there are no inexpensive features, only less expensive features -- when you can do so yourself in just a couple lines of code. – Eric Lippert Mar 12 '15 at 15:55
  • Although i can imagine that it costs time and money if you modify an existing method because that could break existing code, i don't understand why it needs time to implement such a trivial _new_ method that should be safe. It could sit in the `System.Convert` class. However, i've edited my answer to point to the problem with `int.TryParse` in a LINQ query. – Tim Schmelter Mar 12 '15 at 16:11
  • 3
    @TimSchmelter: The developer time to implement the method is five minutes. But the method needs a specification -- sure, it's a short one, but testing and documentation is going to need the spec to work from. Then you write the tests, and the documentation, and then you translate the documentation into Japanese, and ... and it all adds up into a lot of work. All of that is work that is *not* being spent on more expensive features that have a much greater return on investment. Like I said, there are no inexpensive features, only less expensive features. – Eric Lippert Mar 12 '15 at 19:47
  • 1
    C# has improved the use of `out` to allow inline declarations of out variables. This means you no longer have to share the variable between LINQ iterations. The new syntax is `out int i`, so you can now do `int.TryParse(s, out int i) ? (int?)i : null` – MikeT Jun 14 '18 at 14:44
  • @MikeT yes you're right. I'll update this answer when I've time – Tim Schmelter Jun 14 '18 at 15:06
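The pitfall discussed in the comments above can be reproduced with a short sketch. The class and method names are hypothetical. The first method follows the `orderby` query from the comments, with `tmp` initialized to 0 so that it compiles as a captured local. The second uses a nullable-returning helper with an inline `out` declaration, which assumes C# 7 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class illustrating the comment thread above.
public static class OutParameterPitfall
{
    // Wraps int.TryParse so that no variable is shared between elements.
    public static int? TryGetInt32(string item)
    {
        return int.TryParse(item, out int value) ? value : (int?)null;
    }

    // Brittle: orderby buffers the whole sequence first, so every TryParse
    // call has already run (and overwritten tmp) before the final select
    // reads tmp. Every successful item yields the last value written.
    public static int[] SharedVariableWithOrderBy(IEnumerable<string> seq)
    {
        int tmp = 0;
        var query = from item in seq
                    let success = int.TryParse(item, out tmp)
                    orderby item
                    select success ? tmp : 0;
        return query.ToArray();
    }

    // Robust: the parsed value travels with each item through the query.
    public static int[] HelperWithOrderBy(IEnumerable<string> seq)
    {
        var query = from item in seq
                    let parsed = TryGetInt32(item)
                    orderby item
                    select parsed ?? 0;
        return query.ToArray();
    }
}
```

For the sequence `"1", "blah", "3"`, the shared-variable version yields 3, 3, 0, while the helper version yields 1, 3, 0.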
3

The reason it does not work unless you skip the first two entries is that those entries have commas after the ints. Your input looks like this:

"1," "2," "3"

Only the last entry can be parsed as an int; the initial two will produce an exception.

Passing comma and space as separators to Split will fix the problem:

string[] input = line
    .Substring(1, line.Length - 2)
    .Split(new[] {',', ' '}, StringSplitOptions.RemoveEmptyEntries);

Note the use of StringSplitOptions.RemoveEmptyEntries to remove empty strings caused by both comma and space being used between entries.
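Putting the corrected Split together with the rest of the question's pipeline might look like the following sketch. The class and method names are hypothetical, and it assumes the input always starts with `[` and ends with `]`, as in the question.

```csharp
using System;
using System.Linq;

// Hypothetical helper combining the corrected Split with the original pipeline.
public static class SplitFix
{
    public static int[] ParseBracketedInts(string line)
    {
        return line
            .Substring(1, line.Length - 2)   // drop the surrounding brackets
            .Split(new[] { ',', ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(s => int.Parse(s))       // every token is now a bare number
            .ToArray();
    }
}
```

With both separators in place, `Skip(2)` is no longer needed, and the same method also handles input without spaces such as `"[1,2,3]"`.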

Sergey Kalinichenko
0

I think it would be better to do it this way:

// JsonConvert comes from the Newtonsoft.Json (Json.NET) package
List<int> num = JsonConvert.DeserializeObject<List<int>>(line);
Paulo Lima
0

You might try:

    string line = "[1,2,3]";
    IEnumerable<int> intValues = from i in line.Split(',')
                                 select Convert.ToInt32(i.Trim('[', ' ', ']'));
MikeT