
This is the code I used to look at a subset of data:

active <- clinic[
  (clinic$Days.since.injury.physio > 20 & clinic$Days.since.injury.physio < 35) &
    (clinic$Days.since.injury.F.U.1 > 27 & clinic$Days.since.injury.F.U.1 < 63),
]

I'd like to select a group of subjects based on two criteria and then analyze their results. To start, I was looking at the descriptive statistics for the subset when I noticed that it reported more NA's than exist in the entire data set.

Subsetting seemed to result in NA's. I've looked at several posts including these two below that seem relevant but I don't understand how to apply the answers.

  1. Why does subsetting produce NA's that don't exist in the full data set? (I think the answer from other posts is that there is an NA in another variable?)

  2. How do I work around this?

I'd like to be able to get values from the variables that are present rather than ignoring the whole row if there is a missing value.

Thank you.

Subsetting R data frame results in mysterious NA rows

NA when trying to summarize a subset of data (R)

dmd
  • Hey, please put your code in a code block so it's readable, and please provide the dataset somehow. That'd be awesome. Your post will get downvoted by others if you don't do so. – InfiniteFlash Feb 28 '16 at 21:36
  • To provide the dataset, edit your question and paste in the output of `dput(head(clinic))` – SymbolixAU Feb 28 '16 at 21:40
  • You could use `which` in your subsetting statement, like: `clinic[which((clinic$Days.since.injury.physio > 20 & clinic$Days.since.injury.physio < 35) & (clinic$Days.since.injury.F.U.1 > 27 & clinic$Days.since.injury.F.U.1 < 63)),]` – Jaap Feb 28 '16 at 21:55
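
The `which` suggestion above can be sketched on a toy data frame. This is only an illustration: `clinic_demo`, its `score` column, and all of its values are invented here, since the real `clinic` data was not posted. A logical index that contains NA makes `[` return a row filled with NA for each NA in the index, which is where the extra NA's come from. `which()` returns only the positions that are TRUE, so those rows disappear.

```r
# Toy stand-in for the real data; names other than the two Days.* columns
# and all values are assumptions made for this example.
clinic_demo <- data.frame(
  Days.since.injury.physio = c(25, 30, NA, 40, 22),
  Days.since.injury.F.U.1  = c(30, NA, 45, 50, 60),
  score                    = c(1, 2, 3, 4, 5)
)

# Same condition as in the question; it is NA wherever a needed value is missing.
keep <- (clinic_demo$Days.since.injury.physio > 20 &
           clinic_demo$Days.since.injury.physio < 35) &
  (clinic_demo$Days.since.injury.F.U.1 > 27 &
     clinic_demo$Days.since.injury.F.U.1 < 63)

# Logical indexing: every NA in `keep` yields a row that is NA in all columns,
# including `score`, which has no missing values in the full data.
with_na <- clinic_demo[keep, ]

# which() drops the NA positions and keeps only the TRUE ones.
without_na <- clinic_demo[which(keep), ]

print(with_na)
print(without_na)
```

Rows with a missing value in either criterion are excluded in the second version because the condition cannot be evaluated for them. Other columns in the rows that are kept are left untouched.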

1 Answer


This is a workaround, in response to your #2.

Looking at your code, there is a much easier way of subsetting data. Try this and check whether it solves your issue:

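library(dplyr)

# filter() keeps only rows where every condition is TRUE;
# rows where a condition evaluates to NA are dropped.
active <- clinic %>%
  filter(Days.since.injury.physio > 20,
         Days.since.injury.physio < 35,
         Days.since.injury.F.U.1 > 27,
         Days.since.injury.F.U.1 < 63)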

dplyr does wonders when it comes to subsetting and manipulation of data.

The %>% operator pipes the data frame into filter(), and filter() looks up column names inside that data frame, so you don't have to use the $ symbol.

If, for some bizarre reason, you don't like this, you should look at the subset function in base R, which also treats an NA condition as FALSE.
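
As a rough sketch of the base R route, here is `subset()` on a toy data frame, followed by per-column summaries that use whatever values are present. `clinic_demo`, `score`, and every value are made up for illustration and are not from the real `clinic` data.

```r
# Toy stand-in for the real data (all values are assumptions).
clinic_demo <- data.frame(
  Days.since.injury.physio = c(25, 30, NA, 40, 22),
  Days.since.injury.F.U.1  = c(30, NA, 45, 50, 60),
  score                    = c(10, 20, 30, 40, NA)
)

# subset() evaluates the condition inside the data frame (no $ needed)
# and treats rows where the condition is NA as not selected.
active_demo <- subset(
  clinic_demo,
  Days.since.injury.physio > 20 & Days.since.injury.physio < 35 &
    Days.since.injury.F.U.1 > 27 & Days.since.injury.F.U.1 < 63
)

# A kept row can still have NA in other columns (here, `score`).
# na.rm = TRUE summarises each column from the values that are present
# instead of discarding the whole row.
col_means <- sapply(active_demo, mean, na.rm = TRUE)

print(active_demo)
print(col_means)
```

This keeps a subject in the subset as long as both selection criteria are known and satisfied, and only skips that subject's missing values column by column when summarising.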

InfiniteFlash