
I'm new to R, and I'm trying to filter rows of a dataset based on column values. The dataset hg19 is in BED format with column headers, and I want only the rows in which the values of columns 7 and 8 are equal, while still keeping the column headers. This is what I've tried so far, but I just get a bunch of rows with zero columns and no column headers:

non_coding= subset(hg19, hg19[8] == hg[7])

ahi_mahi
  • Are you comparing the columns between two datasets (`hg19[8]` and `human_genes[7]`)? Please provide a reproducible example: http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – akrun Apr 27 '15 at 18:05
  • `non_coding = subset(hg19, hg19[,8] %in% human_genes[,7])` is what you need, but please read the link @akrun posted on how to ask questions. The above code might result in duplicates. If you explain more clearly what exactly you're trying to achieve, we will be able to help you better. – infominer Apr 27 '15 at 18:20
  • Sorry, that was a typo on my part; both columns are in the same data set. I'm trying to filter out non-coding genes. Columns 7 and 8 in the dataset are the coding start and coding end. If the values in those two columns are the same, then the gene is non-coding, and I want to pull those rows out to use later. – ahi_mahi Apr 27 '15 at 19:03
  • Let's say I have a data set like this:
        v6  v7   v8   v9
        x   123  123  x
        x   123  456  x
        x   789  789  x
        x   123  789  x
    How do I filter it so that only the rows in which the values of columns 7 and 8 are equal to each other are kept? I've also tried this code: hg19 -> hg19[which(hg19$V7 == hg19$V8),] – ahi_mahi Apr 27 '15 at 20:06
  • You don't seem to realize that you have a misspelling in the second argument to `"=="`. Your code will work once that typo is fixed. Voting to close. – IRTFM Apr 27 '15 at 20:21
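
For anyone landing here, the comparison discussed in the comments can be sketched on a toy data frame. This is only an illustration under stated assumptions: `hg19_toy` is a hypothetical stand-in for the real `hg19` data, and the column names `V6` to `V9` are assumed from the example in the comments rather than taken from the actual file.

```r
# Hypothetical toy data mirroring the example in the comments.
# The names V6..V9 are assumptions; the real hg19 headers may differ.
hg19_toy <- data.frame(
  V6 = c("x", "x", "x", "x"),
  V7 = c(123, 123, 789, 123),
  V8 = c(123, 456, 789, 789),
  V9 = c("x", "x", "x", "x")
)

# Keep only the rows where V7 equals V8. Indexing with [rows, ] returns
# a data frame, so the column headers are preserved.
non_coding <- hg19_toy[hg19_toy$V7 == hg19_toy$V8, ]

# The same filter with subset(), referring to the columns by name.
non_coding2 <- subset(hg19_toy, V7 == V8)

print(non_coding)
```

On the real data, the same idea should work by position, for example `hg19[hg19[[7]] == hg19[[8]], ]`, as long as both sides of `==` refer to the same data frame. Note that `subset()` drops rows where the comparison is `NA`, whereas plain logical indexing returns `NA` rows for them.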

0 Answers