
I am doing an R exercise:

Write a function that splits the string into letters and returns the min & max letters according to alphabetical order.

Here's the vector:

cities <- c("New York", "Paris", "London", "Tokyo", "Rio de Janeiro", "Cape Town")

Here's the code I have written:

first_and_last <- function(name){
  name <- gsub (" ", "", name)
  letters <- strsplit(name, split = "")
  c(first = min(letters), last = max(letters)) 
  }

However, I get an error when I run it:

first_and_last(cities)
#Error in min(letters) (from #4) : invalid 'type' (list) of argument

Kindly let me know what's missing in the code? Thanks!

the_skua
  • Can you tell us what the expected output is? –  Jul 21 '16 at 15:48
  • `letters` is a built-in object in R, so I would call your vector of letters something else. Although it *shouldn't* affect your environment because the change is made only within the function, it's generally not a great idea to replace built-in objects. – Phil Jul 21 '16 at 15:49

2 Answers


Your function was nearly correct. I've added vapply() calls to apply min() and max() element-wise, and the function now returns a data frame of the results. As @Zheyuan Li points out, sapply() also works here, but I prefer to avoid sapply() when writing functions (see Why is `vapply` safer than `sapply`?). Both get you the answer :-)

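return_first_and_last <- function(name) {
  name <- gsub(" ", "", name)          # drop spaces so they are not treated as letters
  name <- strsplit(name, split = "")   # list with one character vector per city

  # vapply() checks that every result is a single character string
  first <- vapply(name, min, character(1))
  last  <- vapply(name, max, character(1))

  data.frame(
    first = first,
    last  = last
  )
}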

return_first_and_last(cities)
#       first last
# 1     e    Y
# 2     a    s
# 3     d    o
# 4     k    y
# 5     a    R
# 6     a    w
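To illustrate the preference for vapply() over sapply() inside functions, here is a minimal sketch of how the two behave on empty input. The variable name `empty` is only for illustration; the point is that sapply() changes its return type, while vapply() keeps the type promised by its third argument.

```r
# An empty character vector splits into an empty list
empty <- strsplit(character(0), split = "")

# sapply() has nothing to simplify, so it falls back to an empty list
s <- sapply(empty, min)
class(s)   # "list"

# vapply() still returns the declared type: a character vector of length 0
v <- vapply(empty, min, character(1))
class(v)   # "character"
```

Code further down a function that expects a character vector keeps working with the vapply() result, whereas the list from sapply() may only fail later and in a less obvious place.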

Some notes:

  • It's good practice to name a function with a verb, so I've suggested 'return_first_and_last()'.
  • letters is a built-in object in R, and it's generally a bad idea to reassign these, even in local function environments. I've simply overwritten name instead, as we don't need it outside the function.
  • Capitalisation matters, and the ordering depends on the locale's collation rules (this is what I see on Linux). If the same letter appears in both upper and lower case, min() returns the lower-case version and max() returns the upper-case version (e.g. taking all the cities' letters together, min is a and max is Y, even though there is a lower-case y as well).
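If upper- and lower-case letters should be treated alike, one option is to lower-case the letters before comparing them. This is a minimal sketch that assumes a single string as input; `first_last_ignore_case` is a hypothetical helper name, not part of the original exercise.

```r
# Hypothetical helper: first and last letter of one string, ignoring case
first_last_ignore_case <- function(name) {
  chars <- strsplit(gsub(" ", "", name), split = "")[[1]]  # character vector of letters
  lower <- tolower(chars)                                   # compare on lower case only
  c(first = min(lower), last = max(lower))
}

res <- first_last_ignore_case("New York")
res[["first"]]  # "e"
res[["last"]]   # "y"
```

Note that the result is reported in lower case, so the original capitalisation of the letter is lost.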
Phil
  • Thanks Phil. The code above doesn't return the expected output. The output should display the first (min) and last (max) letter of each string (in this case, the name of each city). For example, if we apply the function to "New York", the output should show first = "e", last = "Y". Thanks – user6210276 Jul 21 '16 at 16:23
  • @user6210276 I've edited the function to return the results element-wise as a data frame. – Phil Jul 21 '16 at 16:40
  • @ Phil, Thank you very much – user6210276 Jul 23 '16 at 07:45

I am assuming you want element-wise operation, i.e., for each element of cities, extract the first and last letter in alphabetic order. This is what you need:

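first_and_last <- function(name){
  stripped <- gsub(" ", "", name)            # drop spaces, but keep `name` for the row labels
  myName <- strsplit(stripped, split = "")   # list with one character vector per city
  result <- t(sapply(myName, range))  ## use function `range`
  rownames(result) <- name                   # original names, spaces included
  colnames(result) <- c("first", "last")
  return(result)
}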

first_and_last(cities)

#                first last
# New York       "e"   "Y" 
# Paris          "a"   "s" 
# London         "d"   "o" 
# Tokyo          "k"   "y" 
# Rio de Janeiro "a"   "R" 
# Cape Town      "a"   "w" 

I have used function range(). This will return min and max. It is R's built-in implementation for function(x) c(min(x), max(x)).
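As a quick check of that equivalence, here is a small sketch on a single word (the variable name `x` is only for illustration):

```r
# Letters of one word as a character vector
x <- strsplit("Bath", split = "")[[1]]

# range() gives the same two values as min() and max() combined
same <- identical(range(x), c(min(x), max(x)))
same  # TRUE
```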


Follow-up

Thanks, problem solved. I'm taking an online course in R. In their solution, they used the following line of code. If possible could you please explain, what does this line of code mean. Especially, the double bracket part "[[1]]": letters <- strsplit(name, split = "")[[1]]

strsplit returns a list. Let's try:

strsplit("Bath", split = "")
#[[1]]
#[1] "B" "a" "t" "h"

If you want to access the character vector, you need [[1]]:

strsplit("Bath", split = "")[[1]]
#[1] "B" "a" "t" "h"

You can only take min / max of a vector, not of a list. For example:

min(strsplit("Bath",split=""))
#Error in min(strsplit("Bath", split = "")) : 
#  invalid 'type' (list) of argument

min(strsplit("Bath",split="")[[1]])
#[1] "a"
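The difference between single and double brackets can be seen by checking the class of each result. A minimal sketch, with `parts` as an illustrative variable name:

```r
parts <- strsplit("Bath", split = "")

# Single brackets keep the list wrapper: a list of length 1
one <- parts[1]
class(one)   # "list"

# Double brackets extract the element itself: a character vector of length 4
two <- parts[[1]]
class(two)   # "character"
length(two)  # 4
```

This is why min() fails on `parts` or `parts[1]` but works on `parts[[1]]`.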

I believe the online example you saw only takes a single character string. If you have a vector input like:

strsplit(c("Bath", "Bristol", "Cambridge"), split = "")
#[[1]]
#[1] "B" "a" "t" "h"

#[[2]]
#[1] "B" "r" "i" "s" "t" "o" "l"

#[[3]]
#[1] "C" "a" "m" "b" "r" "i" "d" "g" "e"

and you want to apply range for each list element, sapply will be handy:

sapply(strsplit(c("Bath", "Bristol", "Cambridge"), split = ""), range)
#     [,1] [,2] [,3]
#[1,] "a"  "B"  "a" 
#[2,] "t"  "t"  "r" 

My function first_and_last above is based on sapply. Yet for nice presentation, I have transposed the result and given row / column names.
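The transposing and naming steps can be sketched on the three-word example as follows. The variable names `x` and `res` are only for illustration, and the first-letter column depends on your locale's collation, so only a locale-independent value is shown.

```r
x <- c("Bath", "Bristol", "Cambridge")

# sapply() returns a 2 x 3 matrix: one column per word
res <- sapply(strsplit(x, split = ""), range)

# Transpose to one row per word, then label rows and columns
res <- t(res)
dimnames(res) <- list(x, c("first", "last"))

dim(res)             # 3 2
res["Bath", "last"]  # "t"
```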


Gosh, I just realized you already asked a question on [[]] 2 days ago: Double Bracket [[]] within a Function. So why are you still asking me for an explanation?

Zheyuan Li
  • I'm not sure what the OP wanted to return now! Either way though, I disagree with the use of `sapply()` within a function because it doesn't always return the same type of object, so your function can break easily. – Phil Jul 21 '16 at 16:10
  • @Phil it would be great if you share with us your knowledge about breaking **sapply()** I am so interested in hearing this. –  Jul 21 '16 at 16:25
  • @ Zheyuan Li, your code works. Can you please explain what the argument name is doing in the function. Thanks – user6210276 Jul 21 '16 at 16:34
  • @Learner @Zheyuan Li I prefer not to use `sapply()` because you get different results depending on the input, and it can silently pass an incorrect (empty) object further down the function making it harder to diagnose problems. See for example the answers to http://stackoverflow.com/questions/12339650/why-is-vapply-safer-than-sapply and 'Vector output: `sapply` and `vapply`' section of http://adv-r.had.co.nz/Functionals.html – Phil Jul 21 '16 at 16:44
  • @Learner @Zheyuan Li There's no problem using `sapply()` interactively but I prefer not to use it inside functions. – Phil Jul 21 '16 at 16:45
  • @Phil cool, so vapply is better than sapply !!! Thanks for message and this information phil –  Jul 21 '16 at 16:48
  • @Learner @Zheyuan Li `sapply()` is perfectly adequate for interactive use because you notice if something has gone wrong (usually!). `vapply()` can be just a little bit easier to debug within functions if something goes wrong :-) – Phil Jul 21 '16 at 16:51
  • @ Zheyuan Li, thanks, problem solved. I'm taking an online course in R. In their solution, they used the following line of code. If possible, could you please explain what this line of code means, especially the double bracket part "[[1]]": letters <- strsplit(name, split = "")[[1]] – user6210276 Jul 22 '16 at 06:46