0

I'd like to calculate the 1st quartile of numbers in a column that shows the Gap. In this column there are both positive and negative numbers, so I'd like to filter out only positive numbers and calculate the quartile (1st) but I'm having problems with the function (see code below)

#I have tried a function with ">0" and "0.25" to get the 1st quartile but somehow is failing and I don't know why.

library(quantmod)

Symbols <- c("FB","AAPL","AMZN","NFLX","GOOG")

getSymbols(Symbols,from="2018-06-01")

stock_data = sapply(.GlobalEnv, is.xts)

all_stocks <- do.call(list, mget(names(stock_data)[stock_data]))

# Calculate gap

Gap_function <- function(x) {
stock_name <- stringi::stri_extract(names(x)[1], regex = "^[A-Z]+")
stock_name <- paste0(stock_name, ".Gap")
column_names <- c(names(x), stock_name)
x$rgap <- quantmod::Op(x) - lag(quantmod::Cl(x)) 
x <- setNames(x, column_names)
return(x)
}

all_stocks <- lapply(all_stocks, Gap_function)

# Calculate  1st Quantile of positive numbers in .GAP column,  currently it is not working.
# I don't know how to indicate to use only positive numbers (>0) and  (1st quartile 0.25)

Quartile1_function <- function(x) { 
stock_name <- stringi::stri_extract(names(x)[1], regex = "^[A-Z]+")
stock_name <- paste0(stock_name, ".GapQ1")
column_names <- c(names(x), stock_name)
x$quartile1 <- (quantile(x,na.rm=TRUE [,grep(".Gap" >0,      colnames(x))], 0.25))
x <- setNames(x, column_names)
return(x)
}

all_stocks <- lapply(all_stocks, Quartile1_function)


#The desired result is to get a 1st quartile of .Gap column  (taking   only positive numbers) in a new column =(7.609986)

#If you run this code you will get AMZN quartiles only for positive     numbers in the .Gap column. but I don't know what is failing to get a function to do this for all stocks.

quantile(all_stocks$AMZN$AMZN.Gap[which(all_stocks$AMZN$AMZN.Gap >0)])

       0%       25%       50%       75%      100% 
 0.060059  7.609986 11.709961 21.319946 88.640015 
Rick
  • 171
  • 1
  • 13

2 Answers2

1

Something like this ?

x <- c(-10,-5,-7,0:10)
quantile(x[x>=0],0.25)
# 25% 
# 2.5 
moodymudskipper
  • 46,417
  • 11
  • 121
  • 167
  • Hi Moddy, not exactly. What I'm looking for is a function similar to the "Quartile1_function" the get the 1st quartile of positive numbers of Gap column in a new column. I need this to be applied to all stocks in my list (all_stocks) that's why I need a function. – Rick Jan 24 '19 at 15:22
  • 1
    I don't understand what you're trying to do, could you include the expected output ? This line has bad syntax `x$quartile1 <- (quantile(x,na.rm=TRUE [,grep(".Gap" >0, colnames(x))], 0.25))` , you're subsetting `TRUE`, and you're comparing a string to a numeric. – moodymudskipper Jan 24 '19 at 15:28
  • Thanks Moody. You are correct this line is the one where the problem is, and I don't know why. I'd like to subset ".Gap" colum, then takes only the positive numbers ">0" and then calculate 1st quantile of the previously filtered positive numbers. – Rick Jan 24 '19 at 15:32
  • 1
    I think I showed you how to do that, I'll adapt to your specific example if you add the expected output – moodymudskipper Jan 24 '19 at 15:34
  • The expected result is a new column in all stocks of the list, for example for AMZN the value should be "7.609986" corresponding to the 1st quartile of the positive numbers in Gap column. quantile(all_stocks$AMZN$AMZN.Gap[which(all_stocks$AMZN$AMZN.Gap >0)]) – Rick Jan 24 '19 at 15:35
  • please take a look here to understand what is a reproducible example and what I mean by expected output : https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example . – moodymudskipper Jan 24 '19 at 15:40
1

I think you want something like below. This will return data in xxx.GapQ1 where the numbers are positive and they will be divided into NA 1, 2, 3, and 4. The NA's are the numbers below 0 (and 0). You can filter out what you don't need later on.

Quartile1_function <- function(x) { 
  stock_name <- stringi::stri_extract(names(x)[1], regex = "^[A-Z]+")
  stock_name <- paste0(stock_name, ".GapQ1")
  column_names <- c(names(x), stock_name)

  # use this grep the column name of interest and plug that in the next section.
  column_of_interest <- grep(".Gap", colnames(x))

  # Find the quantiles based only on positive numbers. 
  # The negative numbers or 0 will be returned as NA 
  x$quartile1 <- as.integer(cut(x[, column_of_interest],
                                quantile(x[x[, column_of_interest] > 0, column_of_interest], na.rm = TRUE),
                                include.lowest = TRUE))

  names(x) <- column_names
  return(x)
}

all_stocks <- lapply(all_stocks, Quartile1_function)
head(all_stocks$AMZN)

           AMZN.Open AMZN.High AMZN.Low AMZN.Close AMZN.Volume AMZN.Adjusted  AMZN.Gap AMZN.GapQ1
2018-06-01   1637.03   1646.73  1635.09    1641.54     3313400       1641.54        NA         NA
2018-06-04   1648.90   1665.68  1645.49    1665.27     3187700       1665.27  7.359985          1
2018-06-05   1672.99   1699.00  1670.06    1696.35     4782200       1696.35  7.719970          2
2018-06-06   1704.51   1714.50  1686.47    1695.75     5473200       1695.75  8.160034          2
2018-06-07   1698.56   1699.90  1676.11    1689.30     3765700       1689.30  2.810059          1
2018-06-08   1681.12   1689.44  1673.01    1683.99     2955100       1683.99 -8.180054         NA
phiver
  • 23,048
  • 14
  • 44
  • 56