
I am trying to write the following function in R: it should build a vector of trimmed parts from a string of comma-separated values.

# parse string into vector of trimmed comma-separated parts
parseLine<-function(str)
{
    inputVector = strsplit(str, ",")
    outputVector = c()
    for(s in inputVector)
    {
        s = gsub("^\\s+|\\s+$", "", s)
        if(nchar(s) > 0)
            outputVector = c(outputVector, s)
    }
    return(outputVector)
}

The function definition parses successfully. But when I call it like this:

parseLine("a,   b, c, d")

I get the expected result, but also a strange warning:

[1] "a" "b" "c" "d"
Warning message:
In if (nchar(s) > 0) outputVector = c(outputVector, s) :
  the condition has length > 1 and only the first element will be used

And my questions are:

  • What does it mean?
  • What can I do to get rid of it?
ivan.ukr
  • Why did you mark this question as a duplicate? (Probably because I'm a newbie to R) I can't see what exactly is duplicated between my question and the one you pointed to. Please explain in detail, or unblock this question, because I have found a correct solution and want to post my new code as an answer here. – ivan.ukr Oct 13 '15 at 22:43
    I don't think it should be a duplicate of the listed question. The issue really is that strsplit gives a list as its output (It has to since you can input a vector of strings for it to split and each string could split into different lengths). Your for loop is iterating over that list so really the "s" that you have is a vector. So you're passing a vector into the if statement which is why you get the message that you do. It works out ok in this case since you don't have any empty strings and gsub works on vectors just fine. – Dason Oct 13 '15 at 22:47
  • Okay, I removed it. Sorry about that – Rich Scriven Oct 13 '15 at 22:49
  • Yes, I've found this and I also don't think this is a duplicate of the listed question. So is there some moderator to resolve this? – ivan.ukr Oct 13 '15 at 22:50
  • @Richard Scriven, Okay, thanks for unlocking the question. – ivan.ukr Oct 13 '15 at 22:54
    @ivan.ukr It partially is a duplicate but since the reason is that you misunderstood what the output from strsplit was (a list instead of a vector) it amounted to you thinking you were iterating over the elements of the vector and not the elements of the list which caused the issue. It's still the same issue as the duplicate question though - you were passing a vector into an if statement which is a no-no. – Dason Oct 13 '15 at 22:54
  • @Dason OK, now it is finally completely clear to me what happened. Thank you for this clarification. – ivan.ukr Oct 13 '15 at 23:33
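
To make the point from the comments concrete, here is a minimal sketch of what strsplit() returns for the input in the question. The variable name `parts` is only illustrative and is not part of the original code. It shows that the loop body runs once and receives the whole character vector, which is why `if` sees a condition of length 4:

```r
# strsplit() returns a list with one element per input string
parts <- strsplit("a,   b, c, d", ",")
class(parts)     # "list"
length(parts)    # 1

# Looping over the list hands the whole character vector to the body at once
for (s in parts) {
    print(length(s))       # 4
    print(nchar(s) > 0)    # a logical vector of length 4, not a single TRUE/FALSE
}
```

Note that from R 4.2.0 onward a condition of length greater than one in `if` is an error rather than a warning, so on newer versions the original function would stop instead of returning a result.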

1 Answer


Update: I have found the correct solution. The issue is that strsplit() returns a list, so the first element has to be extracted with [[1]] before looping.

# parse string into vector of trimmed comma-separated parts
parseLine<-function(str)
{
    inputVector = strsplit(str, ",", fixed = TRUE)[[1]] # <<< here was the list
    outputVector = c()
    for(s in inputVector)
    {
        s = gsub("^\\s+|\\s+$", "", s)
        if(nchar(s) > 0)
            outputVector = c(outputVector, s)
    }
    return(outputVector)
}
ivan.ukr
  • I wouldn't necessarily call this the perfect solution unless you can always guarantee that parseLine will only be called with a single string input. If you pass a vector in as your input you won't get the desired results. – Dason Oct 13 '15 at 22:53
  • @Dason Yes, this code assumes that the input is a single line; that's what I need in my case. But if you can, you are welcome to provide a better solution that also handles the additional case you've mentioned. – ivan.ukr Oct 13 '15 at 22:57
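
For the vector-input case raised in the comments, one possible approach is sketched below. The function name `parseLines` is hypothetical and not from the original post, and returning a list with one character vector per input string is an assumption about what the "desired results" would be. It reuses the same trimming pattern as the answer, but applies it to each element of the list that strsplit() returns instead of looping with `if`:

```r
# Hypothetical helper (not from the original post): returns a list holding one
# character vector of trimmed, non-empty parts for each input string
parseLines <- function(strs)
{
    lapply(strsplit(strs, ",", fixed = TRUE), function(parts) {
        parts <- gsub("^\\s+|\\s+$", "", parts)  # trim leading/trailing whitespace
        parts[nzchar(parts)]                     # drop empty parts
    })
}

result <- parseLines(c("a,   b, c, d", "x, , y"))
result[[1]]   # "a" "b" "c" "d"
result[[2]]   # "x" "y"
```

Because gsub() and nzchar() are vectorized, no `if` on a multi-element condition is needed. For a single string, `parseLines(str)[[1]]` should give the same result as the parseLine function above, except that an input with no non-empty parts yields `character(0)` rather than `NULL`.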