Just getting started in R and would appreciate your help. I have a series of daily prices for 6 financial indexes (S&P 500, etc.) that I'm trying to analyze.
For whatever reason some prices are coming in as chr instead of num, but when I'm converting chr columns to num columns the numbers are becoming NAs and I can't seem to figure out why!! I've posted the tails so you can see.
The only weird thing I've noticed is in my csv file when the price gets into the thousands it is coded in Excel as "number" vs "general". So for example the S&P 500 index starts with prices in the low teens and in Excel it is coded as "general" but when it crosses into 1,000+ then it becomes "number" (don't know if that's the issue).
I'm not sure how to upload a copy of the csv file here, but would be happy to provide it.
> index_prices <- read.csv(file.choose(), header = TRUE, stringsAsFactors = FALSE, na.strings=c(""))
> index_prices$Date <- as.Date(index_prices$Date, "%m/%d/%Y")
> str(index_prices)
'data.frame': 30790 obs. of 7 variables:
$ Date : Date, format: "1936-03-31" "1936-04-01" ...
$ S.P.500.TR.USD.1936. : chr "14.96" "14.96" "14.96" "14.96" ...
$ BBgBarc.US.Agg.Bond.TR.USD : chr NA NA NA NA ...
$ MSCI.EAFE.NR.USD : chr NA NA NA NA ...
$ BBgBarc.Gbl.Agg.Ex.USD.TR.USD : num NA NA NA NA NA NA NA NA NA NA ...
$ FTSE.Nareit.All.Equity.REITs.TR.USD: chr NA NA NA NA ...
$ FTSE.Treasury.Bill.3.Mon.USD : num NA NA NA NA NA NA NA NA NA NA ...
> tail(index_prices)
Date S.P.500.TR.USD.1936. BBgBarc.US.Agg.Bond.TR.USD MSCI.EAFE.NR.USD
30785 2020-07-12 68,459.51 2,374.28 5,794.25
30786 2020-07-13 67,818.58 2,374.59 5,879.18
30787 2020-07-14 68,736.76 2,377.91 5,848.82
30788 2020-07-15 69,362.39 2,377.53 5,958.46
30789 2020-07-16 69,128.33 2,380.99 5,922.00
30790 2020-07-17 69,328.78 2,379.84 5,921.61
BBgBarc.Gbl.Agg.Ex.USD.TR.USD FTSE.Nareit.All.Equity.REITs.TR.USD
30785 492.20 18,440.27
30786 491.36 18,208.13
30787 492.10 18,366.16
30788 493.63 18,532.83
30789 494.14 18,296.35
30790 493.40 18,554.14
FTSE.Treasury.Bill.3.Mon.USD
30785 659.33
30786 659.33
30787 659.34
30788 659.34
30789 659.34
30790 659.34
> index_prices[,2:7] <- apply(index_prices[,2:7], 2, function(x) as.numeric((x)))
Warning messages:
1: In FUN(newX[, i], ...) : NAs introduced by coercion
2: In FUN(newX[, i], ...) : NAs introduced by coercion
3: In FUN(newX[, i], ...) : NAs introduced by coercion
4: In FUN(newX[, i], ...) : NAs introduced by coercion
> str(index_prices)
'data.frame': 30790 obs. of 7 variables:
$ Date : Date, format: "1936-03-31" "1936-04-01" ...
$ S.P.500.TR.USD.1936. : num 15 15 15 15 15 ...
$ BBgBarc.US.Agg.Bond.TR.USD : num NA NA NA NA NA NA NA NA NA NA ...
$ MSCI.EAFE.NR.USD : num NA NA NA NA NA NA NA NA NA NA ...
$ BBgBarc.Gbl.Agg.Ex.USD.TR.USD : num NA NA NA NA NA NA NA NA NA NA ...
$ FTSE.Nareit.All.Equity.REITs.TR.USD: num NA NA NA NA NA NA NA NA NA NA ...
$ FTSE.Treasury.Bill.3.Mon.USD : num NA NA NA NA NA NA NA NA NA NA ...
> tail(index_prices)
Date S.P.500.TR.USD.1936. BBgBarc.US.Agg.Bond.TR.USD MSCI.EAFE.NR.USD
30785 2020-07-12 NA NA NA
30786 2020-07-13 NA NA NA
30787 2020-07-14 NA NA NA
30788 2020-07-15 NA NA NA
30789 2020-07-16 NA NA NA
30790 2020-07-17 NA NA NA
BBgBarc.Gbl.Agg.Ex.USD.TR.USD FTSE.Nareit.All.Equity.REITs.TR.USD
30785 492.20 NA
30786 491.36 NA
30787 492.10 NA
30788 493.63 NA
30789 494.14 NA
30790 493.40 NA
FTSE.Treasury.Bill.3.Mon.USD
30785 659.33
30786 659.33
30787 659.34
30788 659.34
30789 659.34
30790 659.34