I have a Strava dataset with 5 columns: edge_id, year, day, hour, commute_count.
The day field is a numeric field which ranges between 1 and 365. However, because the count happens hourly and per edge (street), days are repeated (see image for clarity).
I need to remove every record for Saturdays and Sundays from the table. I thought I could filter out the multiples of 6 and 7 from the "day" column, but I can't figure out how. Any help would be appreciated.

user438383

cccosta
-
Please don't post data as images. Take a look at how to make a [great reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) for ways of showing data. The gold standard for providing data is using `dput(head(NameOfYourData))`, *editing* your question and putting the `structure()` output into the question. – Martin Gal Oct 19 '21 at 11:53
-
You could try `df[!(df$day %% 7 %in% c(0, 6)), ]`. – Martin Gal Oct 19 '21 at 11:56
-
Thank you for the advice on how to post data. Apologies for that, I'm new to this. – cccosta Oct 19 '21 at 12:22
1 Answer
You could use the `subset` function with a condition based on the modulo operator: `day %% 6` equals 0 if `day` is a multiple of 6.
Therefore, the condition `day %% 6 != 0` keeps only the values that are not multiples of 6, and combining it with the same test for 7 drops the multiples of both:
new.df <- subset(df, day %% 6 != 0 & day %% 7 != 0)
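Note that multiples of 7 always fall on the same weekday within a year, but multiples of 6 drift through the week, so the filter above only removes weekends by coincidence. A minimal sketch of a date-based alternative follows, assuming `day` is the day of the year (1 = 1 January) and that `year` holds the calendar year. The toy data frame and its values are made up for illustration; only the column names come from the question.

```r
# Toy stand-in for the Strava table; all values are invented for illustration.
df <- data.frame(
  edge_id       = c(101, 101, 102, 102, 103, 103),
  year          = 2019,                      # assumed year, recycled to every row
  day           = c(1, 5, 6, 7, 12, 364),    # day of year
  hour          = c(8, 8, 17, 17, 9, 9),
  commute_count = c(3, 5, 2, 7, 1, 4)
)

# Convert (year, day of year) to a Date: day 1 is 1 January of that year.
df$date <- as.Date(paste0(df$year, "-01-01")) + (df$day - 1)

# POSIXlt stores the weekday as 0-6 with 0 = Sunday and 6 = Saturday,
# independent of the session locale.
wday <- as.POSIXlt(df$date)$wday

# Keep only Monday to Friday.
weekdays_only <- df[!(wday %in% c(0, 6)), ]
```

In this toy data, day 7 is a multiple of 7 but falls on a Monday, so the modulo filter would drop it while the date-based filter keeps it. Because the weekday is derived from the actual date, the same code should also work across several years and leap years.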

glagla