
In my global environment, I have variables that have the following names:

  1. filtered_A
  2. unfiltered_A
  3. filtered_B
  4. unfiltered_B

and so on...

The variable filtered_A is a subset of unfiltered_A; both are data frames with a single column of gene names. I am trying to add a new column to unfiltered_A that holds one of two strings: "Passed" or "notPassed". A gene is "Passed" if it exists in filtered_A and "notPassed" if it does not. So basically I am matching the two data frames and recording the outcome for every gene.
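
As a minimal sketch of the result I am after, here is the idea on made-up data. The gene names are invented, and I am assuming the single column is called V1 (the name read.table assigns when there is no header); this uses `%in%` rather than my loop further down:

```r
# Toy data; the gene names and the column name V1 are assumptions for illustration
unfiltered_A <- data.frame(V1 = c("BRCA1", "TP53", "EGFR", "MYC"),
                           stringsAsFactors = FALSE)
filtered_A <- data.frame(V1 = c("TP53", "MYC"), stringsAsFactors = FALSE)

# Flag each gene in unfiltered_A by whether it also appears in filtered_A
unfiltered_A$Situation <- ifelse(unfiltered_A$V1 %in% filtered_A$V1,
                                 "Passed", "notPassed")
print(unfiltered_A)
```

Here TP53 and MYC should end up as "Passed" and the other two genes as "notPassed".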

I wrote the following code:

```{r unfiltered}
setwd("/home/alaa/Documents/Analysis/genes/WES/unfiltered")

# list that contains sample names 
vcfFiles <- list.files(getwd(), recursive = T)

for (i in vcfFiles) {
    print(i)
    assign(paste0("unfiltered_", i), read.table(i))
}
```


```{r filtered}
setwd("/home/alaa/Documents/Analysis/genes/WES/filtered")

for (i in vcfFiles) {
    print(i)
    assign(paste0("filtered_", i), read.table(i))
}
```

```{r matching}

for (i in vcfFiles){

  y <- grep(i, ls())


  filterd <- get(ls()[y[1]])
  unfilterd <- get(ls()[y[2]])

  name_filterd <- ls()[y[1]]
  name_unfilterd <- ls()[y[2]]

  assign(name_unfilterd, cbind(unfilterd, apply(unfilterd, 1, function(x) ifelse(any(x[1] == filterd), 'Passed','notPassed'))))

}

for (i in ls()){
  if (is.data.frame(get(i)) && ncol(get(i)) == 2 && grepl(pattern = "unfiltered_", x = i)) {
    print(i)
    j <- get(i)
    colnames(j)[2] <- "Situation"
    assign(i, j)
  }
}
#rm(i, j, filterd, unfilterd, name_filterd, name_unfilterd, y)
```

If I run this code, it fails the first time with:

    Error in apply(unfilterd, 1, function(x) ifelse(any(x[1] == filterd),  : 
    dim(X) must have a positive length

I do understand that this is due to unfilterd being dimensionless. However, if I rerun this code, it works without any problem.
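
To show what I mean by dimensionless, here is a small sketch with made-up values (not my real data). Calling apply() on a plain vector reproduces the message, while the same values wrapped in a data frame work:

```r
# A plain character vector has no dim attribute
x <- c("BRCA1", "TP53")
print(is.null(dim(x)))

# apply() needs dimensions, so it stops; capture the message instead of halting
msg <- tryCatch(apply(x, 1, function(g) g),
                error = function(e) conditionMessage(e))
print(msg)

# A data frame does have dimensions, so apply() runs row by row
df <- data.frame(V1 = x, stringsAsFactors = FALSE)
res <- apply(df, 1, function(g) g[1])
print(res)
```

So on the first run, whatever get() returns for unfilterd is presumably not a data frame, but I do not see why.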

Can someone please explain what is wrong and why it fails on the first attempt?

In case you are wondering why I am working in the global environment with ls(), it's because I have many data frames to match.

Thanks in advance.

Zen
  • You really need to provide a reproducible example. https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – CCurtis Sep 27 '17 at 14:37
  • You need to define x as vector before trying to assign a value to an indexed element. Use x<-as.vector(length(unfiltered)) prior to the apply statement. – Dave2e Sep 27 '17 at 17:42
  • @CCurtis thank you for the suggestion. I will be more careful next time! – Zen Sep 28 '17 at 15:23
  • @Dave2e thank you for your reply! I tried your suggestion, but it didn't work. Then I tried x <- as.data.frame(length(unfilterd)) and bingo, it worked :) Your suggestion didn't work because a vector is still dimensionless, whereas a data frame has dimensions. – Zen Sep 28 '17 at 15:23
  • I am glad you were able to solve the problem. Uncommented code without any sample data is very difficult to solve. It is sometimes difficult to make the right assumptions. – Dave2e Sep 28 '17 at 15:43
