
I have a data frame, "data". I searched for a pattern using the grep function, and I would like to put the result back into the data frame so that it lines up with the other rows.

data$CleanDim<-data$RAW_MATERIAL_DIMENSION[grep("^BAC",data$RAW_MATERIAL_DIMENSION)]

I would like to put the result into a new column, data$CleanDim, but I get the following error. Can someone please help me?

Error in `$<-.data.frame`(`*tmp*`, CleanDim, value = c(1393L, 1405L, 734L,  :  replacement has 2035 rows, data has 1881
Punintended
Toney Honar
  • Please provide a [reproducible sample](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) of your data, or all we can provide is function description and general advice. Generally, the `mutate` function in the `dplyr` package is the easiest way to modify or add columns to dataframes – Punintended Jun 06 '18 at 17:47
  • How do I add that to my grep function? Sorry, I am very new to R. I tried mutate, but I think I am not doing it right. – Toney Honar Jun 06 '18 at 18:05
  • Please post your new code and a `dput(head(data))`. And welcome to R / StackOverflow, we've all been there =) – Punintended Jun 06 '18 at 18:38

1 Answer


grep() returns a vector of indices of entries that match the given criteria.

The only way your code could work here is if the number of rows in data is an exact whole-number multiple of the number of matches grep() finds, in which case R recycles the matched values to fill the column.

Consider the following reproducible example:

data = data.frame(RAW_MATERIAL_DIMENSION = c("BAC","bBAC","aBAC","BACK","lbd"))
> data
  RAW_MATERIAL_DIMENSION
1                    BAC
2                   bBAC
3                   aBAC
4                   BACK
5                    lbd

> grep("^BAC",data$RAW_MATERIAL_DIMENSION)
[1] 1 4

data$CleanDim <- data$RAW_MATERIAL_DIMENSION[grep("^BAC",data$RAW_MATERIAL_DIMENSION)]
Error in `$<-.data.frame`(`*tmp*`, CleanDim, value = 1:2) : 
  replacement has 2 rows, data has 5

Note: this would run without error (though the result would be pretty weird) if the original data object had only its first four rows. In that case, the two matched values would simply be repeated to fill your new column.
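The recycling case in the note above can be checked with a small sketch. The name `data4` is hypothetical, and `stringsAsFactors = FALSE` is an assumption added here so that the column holds plain character strings regardless of R version:

```r
# Hypothetical four-row version of the example (first four rows only).
data4 <- data.frame(
  RAW_MATERIAL_DIMENSION = c("BAC", "bBAC", "aBAC", "BACK"),
  stringsAsFactors = FALSE
)

# grep() still finds two matches, at rows 1 and 4.
idx <- grep("^BAC", data4$RAW_MATERIAL_DIMENSION)

# 4 rows is a whole multiple of 2 matches, so the assignment succeeds
# and the two matched values are recycled down the column.
data4$CleanDim <- data4$RAW_MATERIAL_DIMENSION[idx]
print(data4)
```

The assignment runs, but rows 2 and 3 receive values that belong to other rows, which is why a silent success here is not the same as a correct result.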

What you want to do here is look at the result of grep("^BAC",data$RAW_MATERIAL_DIMENSION) and think about what would be sensible in your context. The assignment only works if the length of this result equals the number of rows in your data object, or at least if the number of rows is a whole multiple of that length.
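One possible reading of "sensible" is to keep matching values on their own rows and leave every other row as NA. The following is only a sketch under that assumption; the question does not say what non-matching rows should contain. `stringsAsFactors = FALSE` is also an assumption; if your column is a factor, converting it with as.character() first would be needed:

```r
data <- data.frame(
  RAW_MATERIAL_DIMENSION = c("BAC", "bBAC", "aBAC", "BACK", "lbd"),
  stringsAsFactors = FALSE
)

# Start with a column of NAs that has one entry per row.
data$CleanDim <- NA_character_

# Row positions where the pattern matches.
idx <- grep("^BAC", data$RAW_MATERIAL_DIMENSION)

# Fill only the matching rows; both sides have length(idx) elements,
# so the lengths always agree.
data$CleanDim[idx] <- data$RAW_MATERIAL_DIMENSION[idx]
print(data)
```

Because the indices returned by grep() are used on both sides of the assignment, the result stays aligned with the original rows no matter how many matches there are. grepl(), which returns one TRUE/FALSE per row instead of indices, could serve the same purpose.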

Michael Ohlrogge