My question is similar to this one, but for strings.

I have a dataframe in which each column contains strings of different lengths. How can I find the maximum string length per column?

Then, how can I select the columns where the maximum length is > 1, using sapply or similar?

A typical column of the dataframe looks like this:

clmn <- c("XDX", "GUV", "FQ", "ACUE", "HIT", "AYX", "NFD", "AHBW", "GKQ", "PYF")

Thanks

Kalin Stoyanov

2 Answers

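We can use nchar(), which returns the number of characters in each element of a character vector. For a single column: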

max(nchar(clmn))

To find the maximum string length for each column of a data frame (here called df1):

lapply(df1, function(x) max(nchar(x)))

To keep only the columns whose maximum string length is greater than 1:

df1[sapply(df1, function(x) max(nchar(x))) > 1]

Or

Filter(function(x) max(nchar(x)) > 1, df1)
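
As a rough end-to-end sketch of the steps above: the data frame below is made up for illustration (the question does not show df1), and the as.character() and na.rm = TRUE guards are additions that assume the columns might be factors or contain missing values. vapply() is used in place of sapply() only so that the result is guaranteed to be an integer vector.

```r
# Hypothetical example data; df1 is not defined in the original post.
df1 <- data.frame(
  a = c("XDX", "GUV", "FQ", "ACUE"),
  b = c("A", "B", "C", "D"),
  c = c("HIT", NA, "AYX", "NFD"),
  stringsAsFactors = FALSE
)

# Maximum string length per column, as a named integer vector.
# as.character() guards against factor columns (nchar() rejects factors);
# na.rm = TRUE skips missing values, for which nchar() returns NA.
max_len <- vapply(
  df1,
  function(x) max(nchar(as.character(x)), na.rm = TRUE),
  integer(1)
)
max_len

# Keep only the columns whose longest string has more than one character.
df1[, max_len > 1, drop = FALSE]
```

With this data, column b is dropped because every value in it is a single character. A column that is entirely NA would need separate handling, since max() of an empty set is -Inf.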
akrun

Here's a purrr::map() and stringr::str_length() solution.

dat <- structure(list(name = c("Luke Skywalker", "C-3PO", "R2-D2",
                               "Darth Vader", "Leia Organa", "Owen Lars"),
                      skin_color = c("fair", "gold", "white, blue", 
                                     "white", "light", "light"),
                      eye_color = c("blue", "yellow", "red", 
                                    "yellow", "brown", "blue"),
                      len_1 = c("A", "A", "A", "A", "A", "A")),
                 row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))

purrr::map(dat, ~max(stringr::str_length(.x)))
dat[names(which(purrr::map(dat, ~max(stringr::str_length(.x))) > 1))]
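
A possible variant, assuming purrr and stringr are installed: purrr::map() returns a list, whereas purrr::map_int() returns a named integer vector that can be compared and subset directly. The small data frame dat2 here is a hypothetical stand-in, not the data from the answer above.

```r
library(purrr)
library(stringr)

# Hypothetical two-column example.
dat2 <- data.frame(
  name  = c("Luke Skywalker", "C-3PO"),
  len_1 = c("A", "A"),
  stringsAsFactors = FALSE
)

# Named integer vector of the maximum string length in each column.
lens <- map_int(dat2, ~ max(str_length(.x)))

# Names of the columns whose longest string exceeds one character.
keep <- names(lens)[lens > 1]

# Select those columns by name.
dat2[keep]
```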
Patrick