Remove numbers with fewer than 4 digits from a list in a data frame in R

Question

I have a data frame like this, and I need to remove the values with fewer than 4 digits from the item column:

department  item
xyz009   c("1","676547","2","434567","3","567369","4","987654","6","54546676732")

Output

department  item

xyz009      676547,434567,567369,987654,54546676732

Thank you for your help

Can you make *less than 4 digits or less than 9999* clearer? — Rui Barradas, Feb 17 '20 at 11:54
In your example, you don't have a list, but a vector. You can do what you need with a combination of `xyz009[xyz009>9999]` and `nchar(xyz009)`. For example, `xyz009[xyz009>999 & nchar(xyz009)>4]` — fra, Feb 17 '20 at 11:55
@RuiBarradas I need to remove the values in the item column if the numbers have fewer than 4 digits — SriKoy, Feb 17 '20 at 11:57
Can you make the language more apparent, in the title if possible, in order to attract better answers? — Michael Nelles, Feb 17 '20 at 12:46

ThomasIsCoding · Answer 1 · 2020-02-17T12:14:55.930

2

Maybe you can try `nchar` + `subset`:

> subset(v,nchar(v)>4)
[1] "676547"      "434567"      "567369"     
[4] "987654"      "54546676732"

DATA

v <- c("1","676547","2","434567","3","567369","4","987654","6","54546676732")
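A note on the threshold: `nchar(v) > 4` also drops values with exactly 4 digits, which makes no difference for the sample data but may matter elsewhere. The following is a minimal sketch, assuming the goal is to remove only values with fewer than 4 digits; the vector `v2` is hypothetical and is not part of the question.

```r
# Hypothetical vector with a boundary case: "1234" has exactly 4 digits
v2 <- c("1", "999", "1234", "12345", "676547")

# nchar(v2) > 4 drops the 4-digit value as well
strict <- subset(v2, nchar(v2) > 4)

# nchar(v2) >= 4 removes only values with fewer than 4 digits
at_least_four <- subset(v2, nchar(v2) >= 4)

print(strict)
print(at_least_four)
```

Which comparison is right depends on whether 4-digit values should be kept.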

edited Feb 17 '20 at 12:14

answered Feb 17 '20 at 11:57

ThomasIsCoding

score 1 · Answer 2 · answered Feb 17 '20 at 11:57

xyz009 <- c("1","676547","2","434567","3","567369","4","987654","6","54546676732")

2. Suggested solution using base R:

The vector xyz009 is of type character

typeof(xyz009)

[1] "character"

In order to do maths with it (i.e. use >) we have to cast it to numeric using as.numeric

num_xyz <- as.numeric(xyz009)

Now we can use an index to 'filter' values where an expression evaluates to TRUE:

test_result <- num_xyz > 999  # TRUE for values with at least 4 digits

The vector test_result consists of booleans

test_result

[1] FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE

We can use these booleans as an 'index' (R keeps only values where the index is TRUE):

num_xyz[test_result]

This returns:

[1]      676547      434567      567369      987654 54546676732
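The steps above can probably be condensed into one line. This is a sketch, not the answerer's code: it assumes "fewer than 4 digits" means "below 1000", and it indexes the original character vector so that the result stays a character vector rather than becoming numeric. The name `kept` is introduced here for illustration.

```r
xyz009 <- c("1","676547","2","434567","3","567369","4","987654","6","54546676732")

# Use the numeric comparison only as a logical filter;
# index the original character vector so the strings are preserved
kept <- xyz009[as.numeric(xyz009) >= 1000]

print(kept)
```

Keeping the values as strings avoids changing how the numbers print and matches the form of the data in the question.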

score 0 · Accepted Answer · answered Feb 17 '20 at 12:04


Using base R, you can use unlist and lapply:

xyz009 <- c("1","676547","2","434567","3","567369","4","987654","6","54546676732")
unlist(lapply(xyz009,function(x) x[nchar(x)>3]))

The result is:

[1] "676547"      "434567"      "567369"      "987654"      "54546676732"
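Since the question is about a data frame, the same idea might be applied to a list column as sketched below. The data frame `df` is a hypothetical reconstruction of the asker's data (the real structure is not shown in the question), and `joined` is a name introduced here to mirror the comma-separated output the question asks for.

```r
# Hypothetical data frame with a list column, modelled on the question
df <- data.frame(department = "xyz009", stringsAsFactors = FALSE)
df$item <- list(c("1","676547","2","434567","3","567369","4","987654","6","54546676732"))

# Filter each list element, keeping strings with at least 4 characters;
# no unlist() is needed because the column should remain a list
df$item <- lapply(df$item, function(x) x[nchar(x) >= 4])

# Optional: collapse each element into one comma-separated string
joined <- sapply(df$item, paste, collapse = ",")

print(df$item[[1]])
print(joined)
```

With lapply applied per row, each department keeps its own filtered vector, which is why unlist can be skipped in the data frame case.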

SSD93

Thanks @SSD93. I don't need unlist here; I applied it up to lapply and it worked! – SriKoy Feb 17 '20 at 12:38