How to select certain rows from dataset?

Question

The dataset is Netflix stock prices with 8 variables (8 columns). I am picking the 3 columns I need by using

select("date", "open", "close")

   date        open close
   <date>     <dbl> <dbl>
 1 2011-01-03  25    25.5
 2 2011-01-04  25.9  25.9
 3 2011-01-05  25.9  25.7
 4 2011-01-06  25.2  25.4
 5 2011-01-07  25.5  25.6
 6 2011-01-10  25.7  26.8
 7 2011-01-11  27.1  26.7
 8 2011-01-12  26.9  27.0
 9 2011-01-13  26.9  27.4
10 2011-01-14  27.3  27.4

I want to pick only the rows where the opening price is higher than the previous day's closing price, and where the closing price is also higher than the opening price for that same day. For this dataset only 3 rows qualify: Jan 7th, Jan 10th and Jan 12th. If somebody could help me understand how to code this, I would really appreciate it.

It would be helpful to allow others to reproduce your data easily. See https://stackoverflow.com/q/5963269/6607497 for suggestions. — U. Windl, Jan 15 '21 at 07:01

score 0 · Answer 1 · answered Jan 15 '21 at 06:51

Using dplyr, you can do:

library(dplyr)
result <- df %>% filter(open > lag(close), close > open)
result
#        date open close
#1 2011-01-07 25.5  25.6
#2 2011-01-10 25.7  26.8
#3 2011-01-12 26.9  27.0
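
To see why this works, it can help to write the shifted column out explicitly. The sketch below rebuilds the same 10 rows as the data section of this answer; the column names `prev_close`, `gap_up` and `up_day` are introduced here only for illustration. `lag(close)` moves the closing prices down by one row, so the first row gets `NA`, and `filter()` drops it because it keeps only rows where the condition is `TRUE`.

```r
library(dplyr)

# Same 10 rows as in the "data" section of this answer
df <- data.frame(
  date  = as.Date("2011-01-03") + c(0, 1, 2, 3, 4, 7, 8, 9, 10, 11),
  open  = c(25, 25.9, 25.9, 25.2, 25.5, 25.7, 27.1, 26.9, 26.9, 27.3),
  close = c(25.5, 25.9, 25.7, 25.4, 25.6, 26.8, 26.7, 27, 27.4, 27.4)
)

checked <- df %>%
  mutate(
    prev_close = dplyr::lag(close),  # previous row's close; NA on the first row
    gap_up     = open > prev_close,  # opened above the previous close
    up_day     = close > open        # closed above the same day's open
  )
checked

# Keeping rows where both flags are TRUE selects the same rows as the one-line filter()
checked %>%
  filter(gap_up, up_day) %>%
  select(date, open, close)
```

Writing `dplyr::lag` explicitly is a precaution: base R also ships a `stats::lag` that behaves differently, and the short form `lag()` only refers to the dplyr version once dplyr is attached.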

And the same in base R and data.table:

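# Base R: prepend NA and drop the last close to get the previous row's close
subset(df, open > c(NA, close[-nrow(df)]) & close > open)

# data.table: shift() lags by one row by default; setDT() converts df by reference
library(data.table)
setDT(df)[open > shift(close) & close > open]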
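
All three versions compare each row with the row directly above it, so "previous day" really means "previous row": Jan 10th is compared with Jan 7th, the previous trading day. That only gives the intended result if the rows are in date order. A minimal sketch of sorting first, assuming the same 10 rows as the data section; the reversed input `unsorted` is made up here to simulate unsorted data:

```r
library(dplyr)

# Same 10 rows as in the "data" section of this answer
df <- data.frame(
  date  = as.Date("2011-01-03") + c(0, 1, 2, 3, 4, 7, 8, 9, 10, 11),
  open  = c(25, 25.9, 25.9, 25.2, 25.5, 25.7, 27.1, 26.9, 26.9, 27.3),
  close = c(25.5, 25.9, 25.7, 25.4, 25.6, 26.8, 26.7, 27, 27.4, 27.4)
)

# Hypothetical unsorted input: the same rows in reverse order
unsorted <- df[nrow(df):1, ]

# Sort by date before lagging so each row is compared with the prior trading day
result_sorted <- unsorted %>%
  arrange(date) %>%
  filter(open > dplyr::lag(close), close > open)
result_sorted
```

After sorting, this returns the same three rows (Jan 7th, Jan 10th and Jan 12th) as the version above.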

data

df <- structure(list(date = structure(c(14977, 14978, 14979, 14980, 
14981, 14984, 14985, 14986, 14987, 14988), class = "Date"), open = c(25, 
25.9, 25.9, 25.2, 25.5, 25.7, 27.1, 26.9, 26.9, 27.3), close = c(25.5, 
25.9, 25.7, 25.4, 25.6, 26.8, 26.7, 27, 27.4, 27.4)), row.names = c(NA, 
-10L), class = "data.frame")

I used dplyr and it works! You the man, thanks a lot!!! — Maiki, Jan 16 '21 at 02:56
