I have a dataset that is a record of price changes, among other variables. I would like to mutate the price column into a categorical variable. I understand that the two functions of importance here in R seem to be dplyr
and/or cut
.
> head(btc_data)
time btc_price
1 2017-08-27 22:50:00 4,389.6113
2 2017-08-27 22:51:00 4,389.0850
3 2017-08-27 22:52:00 4,388.8625
4 2017-08-27 22:53:00 4,389.7888
5 2017-08-27 22:56:00 4,389.9138
6 2017-08-27 22:57:00 4,390.1663
>dput(btc_data)
("4,972.0700", "4,972.1763", "4,972.6563", "4,972.9188", "4,972.9763",
"4,973.1575", "4,974.9038", "4,975.0913", "4,975.1738", "4,975.9325",
"4,976.0725", "4,976.1275", "4,976.1825", "4,976.1888", "4,979.0025",
"4,979.4800", "4,982.7375", "4,983.1813", "4,985.3438", "4,989.2075",
"4,989.7888", "4,990.1850", "4,991.4500", "4,991.6600", "4,992.5738",
"4,992.6900", "4,992.8025", "4,993.8388", "4,994.7013", "4,995.0788",
"4,995.8800", "4,996.3338", "4,996.4188", "4,996.6725", "4,996.7038",
"4,997.1538", "4,997.7375", "4,997.7750", "5,003.5150", "5,003.6288",
"5,003.9188", "5,004.2113", "5,005.1413", "5,005.2588", "5,007.2788",
"5,007.3125", "5,007.6788", "5,008.8600", "5,009.3975", "5,009.7175",
"5,010.8500", "5,011.4138", "5,011.9838", "5,013.1250", "5,013.4350",
"5,013.9075"), class = "factor")), .Names = c("time", "btc_price"
), class = "data.frame", row.names = c(NA, -10023L))
The difficulty is in the categories I want to create. The categories -1,0,1
should be based upon the % change over the previous time-lag.
So for example, a 20% increase in price over the past 60 minutes would be labeled 1, otherwise 0. A 20% decrease in price over the past 60 minutes should be -1, otherwise 0.
Is this possible in R? What is the most efficient way to implement the change?
There is a similar question here and also here but these do not answer my question for two reasons-
a) I am trying to calculate % change, not simply the difference between 2 rows.
b) This calculation should be based on the max/min values for the rolling past time frame (ie- 20% decrease in the past hour = -1, 20% increase in the past hour = 1