
I have OHLC (Open/High/Low/Close) data, which can be obtained from a finance API.

I want to create a target indicator (-1, 0, 1) on which I will build a stock classification model.

To create this target variable, I first need another indicator, log(tomorrow's CLOSE/today's CLOSE), which takes values in (-Inf, Inf).

Now, I want to create labels=c(-1, 0, 1) from breaks=c(-Inf, range_start, range_end, Inf) of log(tomorrow's CLOSE/today's CLOSE).
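To make the binning step concrete, here is a minimal sketch of `cut()` applied to a handful of log-returns. The `log_ret` values are invented for illustration, and the thresholds are the ones currently used in this question; nothing here depends on `masterDF`.

```r
# Hypothetical log-returns, invented for illustration only
log_ret <- c(-0.0100, -0.0010, 0.0000, 0.0012, 0.0200)

# Thresholds taken from the question's current choice
range_start <- -0.0015
range_end   <-  0.0015

# cut() uses right-closed intervals by default:
# (-Inf, range_start], (range_start, range_end], (range_end, Inf)
category <- cut(log_ret,
                breaks = c(-Inf, range_start, range_end, Inf),
                labels = c(-1, 0, 1))

# cut() returns a factor; go through character to recover numeric labels
category_num <- as.numeric(as.character(category))
print(category_num)
```

Note that the result of `cut()` is a factor, so calling `as.numeric()` on it directly would return the level indices (1, 2, 3) rather than the labels.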

My first question is how to create this target variable without looking into future data. My formula, log(tomorrow's CLOSE/today's CLOSE), looks into the future, which is wrong. I want to shift the dataframe/inputs backward by one row, treating today as tomorrow and so on,

and then calculate the target category (-1, 0, 1) from the breaks defined by range_start and range_end.
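The shift-and-label step described above might be sketched as follows in base R. The data frame `df` and its prices are made up for illustration, and the sketch assumes rows are sorted oldest first. It also makes one choice that differs from the code further down: the final row, whose next-day close is unknown, is dropped rather than filled with 0.

```r
# Hypothetical closing prices, oldest first (made up for illustration)
df <- data.frame(
  Date  = as.Date("2024-01-01") + 0:4,
  Close = c(100.0, 100.1, 99.0, 99.05, 101.0)
)

# Next day's close aligned to today's row; the last row has no next day
df$Close_next <- c(df$Close[-1], NA)

# Label for today's row: log(tomorrow's Close / today's Close)
df$TargetIndicator <- log(df$Close_next / df$Close)

# Drop the final row instead of filling it, since its label is unknown
df <- df[!is.na(df$TargetIndicator), ]

# Bin the log-return into the three classes using the question's thresholds
df$category <- cut(df$TargetIndicator,
                   breaks = c(-Inf, -0.0015, 0.0015, Inf),
                   labels = c(-1, 0, 1))
print(df)
```

With this alignment, each row carries values known on that day plus a label taken from the following day. Whether that counts as look-ahead depends on how the label is used: as a training target it is the quantity being predicted, whereas using it as an input feature would leak future information.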

My second question is how best to define these thresholds (range_start and range_end). For now I am using -0.0015 and 0.0015.
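One possible alternative to fixed thresholds, shown here only as a sketch and not as a recommended answer, is to derive range_start and range_end from quantiles of the log-returns so that the three classes come out roughly balanced. The `train_ret` vector below is simulated and stands in for the TargetIndicator values of a training period; computing the quantiles on training data only is an assumption made to avoid using future observations.

```r
set.seed(1)

# Simulated log-returns standing in for a training-period TargetIndicator
train_ret <- rnorm(500, mean = 0, sd = 0.01)

# Terciles of the training returns give roughly equal-sized classes
thresholds  <- quantile(train_ret, probs = c(1/3, 2/3), na.rm = TRUE)
range_start <- unname(thresholds[1])
range_end   <- unname(thresholds[2])

category <- cut(train_ret,
                breaks = c(-Inf, range_start, range_end, Inf),
                labels = c(-1, 0, 1))

# Inspect the class balance
class_counts <- table(category)
print(class_counts)
```

The trade-off is that quantile-based thresholds move with the data, while fixed values such as -0.0015 and 0.0015 keep the same meaning across instruments and periods.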

I would appreciate any comments and suggestions. Thanks.

library(dplyr)

# keep only the Date and Close columns
masterDF_close <- masterDF %>% dplyr::select('Date', 'Close')


# create a one-row matrix of NAs with as many columns as the data
temprow <- matrix(NA, nrow = 1, ncol = ncol(masterDF))
# make it a data.frame and give cols the same names as data
newrow <- data.frame(temprow)
colnames(newrow) <- colnames(masterDF)

# rbind the empty row on top of the data
masterDF <- rbind(newrow, masterDF)
###View(masterDF)

temprow2 <- matrix(NA, nrow = 1, ncol = ncol(masterDF_close))
# make it a data.frame and give cols the same names as data
newrow2 <- data.frame(temprow2)
colnames(newrow2) <- colnames(masterDF_close)

# rbind the empty row below the data so both frames have the same number of rows
masterDF_close <- rbind(masterDF_close, newrow2)
masterDF['Close_unshifted'] <- masterDF_close$Close
###View(masterDF)
# Shifting data backwards by one row: on each row, Close_unshifted now holds the Close of the day after that row's Close

# close <- masterDF$Close
# lead_close <- lag(close, k = -1)
# 
# close[1:10]
# lead_close[1:10]
# 
# log(close/lead_close)
# 
# plot(log(close/lead_close))
masterDF['TargetIndicator'] <- log(masterDF$Close_unshifted/masterDF$Close)
###View(masterDF)
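# drop the all-NA row that was added on top
masterDF <- masterDF[-1, ]
# the last row has no following Close, so its indicator is NA; set it to 0
masterDF$TargetIndicator[is.na(masterDF$TargetIndicator)] <- 0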


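# thresholds currently in use (see the second question above)
range_start <- -0.0015
range_end <- 0.0015

masterDF_ <- masterDF %>% mutate(category=cut(TargetIndicator, 
                                              breaks=c(-Inf, range_start, range_end, Inf), 
                                              labels=c(-1, 0, 1)))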

These are the two operations I am performing in the code.

Wisdom258