
In R, I need to read in a tab-delimited text file, but keep only the rows where column 1 (strings) equals a specific string. I've been told I can do this using the with() function, but I have not been able to get it to work. I can do it in two statements, but I need to do it in one using with().

Here's how I've done it using the two statements:

dF <- read.table(file, header=TRUE, sep="\t", na='-999')
dF <- subset(dF,dF$C1=="value")[,-1]

Since I'm filtering on column 1, I'm also going to remove it in the new data frame.

Is this possible to do in one with() function? If so, can I also display the results in the same expression? Would indexing help? I can't figure out how to make indexing work for this.

Thank you in advance!

Nate
Alex B
  • Why do you need to use `with`? Also, you are doing a row subset, so I don't think `with` is appropriate anyway. – Rich Scriven Jun 11 '17 at 18:39
  • Are you confusing `with` with `which`? You can do subsets with `which` like so: `df[which(df[['C1']] == "value"), ]` EDIT: See also [this question](https://stackoverflow.com/questions/6918657/whats-the-use-of-which) – patrick Jun 11 '17 at 19:03
  • This is part of a homework assignment :( where the instructor is asking us to use the `with` function to create a logical test for the **value** rows and to use a negative index to remove the first column. Normally, I wouldn't bother squeezing the two lines into one, but I don't have much choice here. – Alex B Jun 12 '17 at 15:18

3 Answers


dF <- read.table(file, header=TRUE, sep="\t", na='-999')
dF <- dF[dF$C1=="value",]

Logical indexing works like an implicit which(). R lets you subset a data.frame by specifying the rows i and columns j in dF[i, j].
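
One caveat worth noting: logical indexing and an explicit which() are not quite identical when the comparison column contains NA. The following is a minimal sketch on made-up data (the data frame `df` and the column `x` are illustrative only, not taken from the question's file):

```r
# Hypothetical toy data; C1 contains one missing value
df <- data.frame(C1 = c("value", NA, "other", "value"),
                 x  = 1:4,
                 stringsAsFactors = FALSE)

# Logical indexing: the NA comparison produces an extra all-NA row
by_logical <- df[df$C1 == "value", ]

# which() keeps only the positions that are TRUE, so NA is dropped
by_which <- df[which(df$C1 == "value"), ]

nrow(by_logical)
nrow(by_which)
```

If C1 can never be missing in the real file, the two forms should give the same rows.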

troh

I wouldn't recommend trying to squeeze too much activity onto a single line when you're sticking to base R: reading a text file and processing the data are two separate activities, and it makes sense to keep them separate if you want readable code.

If you do like to keep your code tight, and you want to combine your two lines into one, I recommend the dplyr package, which provides a great little feature called the pipe, %>%. It lets you break up your one line of code into readable chunks:

library(dplyr)
dF = read.table(file, header=TRUE, sep="\t", na='-999') %>% filter(C1 == "value") %>% select(-C1)

Here it is again written over several lines:

dF = read.table(file, header=TRUE, sep="\t", na='-999') %>% 
    filter(C1 == "value") %>% # take only the rows where C1 is "value"
    select(-C1) # remove the C1 column
lebelinoz

Thank you for your help! I've concluded that I can't read the file in AND subset it in one command, so I'm subsetting from a previously created data frame. Here's what I ended up with:

newDf <- with(Df,Df[Df$C1=='value',-1])

Just another way of subsetting, I guess. R seems to have a ton of ways to get the same results. Pretty interesting program!
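
For readers wondering what with() contributes here: it evaluates its second argument with the data frame's columns visible as variables, so the test can refer to plain C1 instead of Df$C1. Below is a small sketch that uses inline text in place of the real file; the sample rows and the column names C2 and C3 are made up, and na.strings is the full name of the argument abbreviated as na in the question.

```r
# Hypothetical tab-delimited input standing in for the real file
txt <- "C1\tC2\tC3\nvalue\t1\t-999\nother\t2\t5\nvalue\t3\t7\n"
Df <- read.table(text = txt, header = TRUE, sep = "\t",
                 na.strings = "-999", stringsAsFactors = FALSE)

# Inside with(), C1 refers to Df$C1; the -1 drops the first column.
# The outer parentheses make R print the result as well as assign it.
(newDf <- with(Df, Df[C1 == "value", -1]))
```

If reading and filtering really must happen in a single expression, subset(read.table(...), C1 == "value", select = -C1) is one base R option, though it does not use with().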

Thanks again, all!

Alex B