I have a data frame like the one below, and I need to remove the values with fewer than 4 digits from the item column:

department  item
xyz009   c("1","676547","2","434567","3","567369","4","987654","6","54546676732")

Desired output:

department  item
xyz009      676547,434567,567369,987654,54546676732

Thank you for your help

SriKoy

3 Answers

You can combine nchar and subset to keep only the values with at least 4 characters:

> subset(v, nchar(v) >= 4)
[1] "676547"      "434567"      "567369"     
[4] "987654"      "54546676732"

DATA

v <- c("1","676547","2","434567","3","567369","4","987654","6","54546676732")
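The question mentions a data frame, so here is a minimal sketch of the same nchar filter applied per row. It assumes that item is a list column holding one character vector per row; the data frame df and the column item_string are hypothetical names used only for illustration.

```r
# Assumption: `item` is a list column with one character vector per row.
df <- data.frame(department = "xyz009", stringsAsFactors = FALSE)
df$item <- list(c("1", "676547", "2", "434567", "3", "567369",
                  "4", "987654", "6", "54546676732"))

# Keep only the values with at least 4 characters in every row.
df$item <- lapply(df$item, function(v) v[nchar(v) >= 4])

# Optionally collapse each row into one comma-separated string,
# which matches the layout of the desired output in the question.
df$item_string <- vapply(df$item, paste, character(1), collapse = ",")
df$item_string
```

If item is stored differently (for example as a single string per row), the values would need to be split first, so treat this as a starting point only.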
ThomasIsCoding

1. Create a minimal reproducible example

xyz009 <- c("1","676547","2","434567","3","567369","4","987654","6","54546676732")

2. Suggested solution using base R:

The vector xyz009 is of type character:

typeof(xyz009)

[1] "character"

To compare the values as numbers (i.e. with >), we first have to convert the vector to numeric using as.numeric, because on character vectors > compares strings rather than numbers:

num_xyz <- as.numeric(xyz009)

Now we can build a logical index that is TRUE wherever a value has at least 4 digits:

test_result <- num_xyz >= 1000

The vector test_result consists of logical values:

test_result

[1] FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE

We can use these logical values as an index (R keeps only the values where the index is TRUE):

num_xyz[test_result]

This returns:

[1]      676547      434567      567369      987654 54546676732
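The steps above return a numeric vector. If the original strings should be preserved (for example, to avoid losing leading zeros), one possible variation is to use the numeric comparison only as an index into the character vector. This is a sketch that assumes "at least 4 digits" means a value of 1000 or more; keep is a hypothetical helper name.

```r
xyz009 <- c("1", "676547", "2", "434567", "3", "567369",
            "4", "987654", "6", "54546676732")

# Build the logical index from the numeric values ...
keep <- as.numeric(xyz009) >= 1000

# ... but subset the original character vector, so the type is unchanged.
filtered <- xyz009[keep]
filtered
typeof(filtered)
```

Note that as.numeric returns NA (with a warning) for strings that are not numbers, so this variation only suits columns that contain purely numeric strings.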
dario

In base R, you can combine unlist and lapply:

xyz009 <- c("1","676547","2","434567","3","567369","4","987654","6","54546676732")
unlist(lapply(xyz009, function(x) x[nchar(x) > 3]))

The result is:

[1] "676547"      "434567"      "567369"      "987654"      "54546676732"
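Because nchar is vectorized, the same result can probably be obtained without lapply for a plain character vector like this one. The following sketch compares the two forms; via_lapply and via_index are hypothetical names used only for the comparison.

```r
xyz009 <- c("1", "676547", "2", "434567", "3", "567369",
            "4", "987654", "6", "54546676732")

# Element-by-element version, as in the answer above.
via_lapply <- unlist(lapply(xyz009, function(x) x[nchar(x) > 3]))

# Vectorized version: nchar() works on the whole vector at once.
via_index <- xyz009[nchar(xyz009) > 3]

# Check that both approaches give the same result.
identical(via_lapply, via_index)
```

The lapply form remains useful when the input is a list of vectors rather than a single vector.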
SSD93