I have a Strava dataset with 5 columns: edge_id, year, day, hour, commute_count. The day field is numeric and ranges from 1 to 365. Because counts are recorded hourly and per edge (street), each day value is repeated across many rows (see image for clarity). I need to remove every Saturday and Sunday record from the table. I thought I could filter out the multiples of 6 and 7 from the "day" column, but I can't figure out how. Any help will be appreciated.

user438383
cccosta
  • Please don't post data as images. Take a look at how to make a [great reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) for ways of showing data. The gold standard for providing data is using `dput(head(NameOfYourData))`, *editing* your question and putting the `structure()` output into the question. – Martin Gal Oct 19 '21 at 11:53
  • You could try `df[!((df$day %% 7) %in% c(0, 6)), ]`. – Martin Gal Oct 19 '21 at 11:56
  • Thank you for the advice on how to post data. Apologies for that, I'm new to this. – cccosta Oct 19 '21 at 12:22

1 Answer

You could use the `subset` function with a condition based on the modulo operator: `day %% 6` equals 0 when `day` is a multiple of 6, so the condition `day %% 6 != 0` keeps only the rows whose day is not a multiple of 6. Combining it with the same test for 7 drops the multiples of both.

new.df <- subset(df, day %% 6 != 0 & day %% 7 != 0)
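
One caveat: multiples of 6 and 7 do not generally line up with weekends. Weekends repeat every 7 days, so the multiples of 6 drift through the week, and which days are Saturdays depends on the weekday of 1 January. A more robust option may be to turn year and day into a real date and filter on the weekday. The sketch below is only an illustration: it assumes that day 1 means 1 January of `year` (the question does not say so), and the toy `df` is made-up data that only mimics the column names.

```r
# Hypothetical toy data with the question's column names (values are made up).
df <- expand.grid(edge_id = 1:2, year = 2021, day = 1:14, hour = c(8, 17))
df$commute_count <- 1L

# Assumption: day 1 is 1 January of `year`, so day d is that date plus d - 1.
df$date <- as.Date(paste0(df$year, "-01-01")) + (df$day - 1)

# POSIXlt weekday number: 0 = Sunday, 1 = Monday, ..., 6 = Saturday.
df$wday <- as.POSIXlt(df$date)$wday

# Keep Monday to Friday only.
weekdays_only <- df[!(df$wday %in% c(0, 6)), ]
```

Because the weekday comes from the calendar rather than from a fixed offset, the same filter should work for any year in the table, including leap years.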
glagla