Data tidying: Best way to create time-series data out of a differently structured data.frame without using for loops in R

Question

I have a list of .csv files. Each file represents one day's worth of price data for a large set of securities. However, not all securities have an entry in every file.

What I'd like to do is to organise this into one dataframe with the price dates of the files in column 1 and a unique list of securities that appear in the files as column names. So basically time-series format.

I get there relatively quickly after importing the csvs using dplyr filter and distinct functions to the point where I have a dataframe of about 350x9300.

Where I'm stumped is filling that data.frame with the price data corresponding to each security on each day, if it has a price on that day.

The only solution I've been able to come up with is a double for loop that filters the original data.frame for the price date and the security identifier and then runs through each cell. Obviously that takes ages. Is there a better way of extracting the price data corresponding to each security on a specific day?

Original data structure is like this:

Name | Date | Price

ABC | 4/1/2019 | 2

BCD | 4/1/2019 | 3

CDE | 4/1/2019 | 4

ABC | 4/2/2019 | 3

CDE | 4/2/2019 | 5

and so on.

I'd need it like this:

Date | ABC | BCD | CDE

4/1/2019 | 2 | 3 | 4

4/2/2019 | 3 | NA | 5

I'm sure there's a much more elegant solution than for loops, but I just can't seem to think of it. Thanks for your input!

score -1 · Accepted Answer · answered Sep 03 '20 at 12:20

-1

A base R option using reshape

reshape(df,direction = "wide",idvar = "Date",timevar = "Name")

giving

      Date Price.ABC Price.BCD Price.CDE
1 4/1/2019         2         3         4
4 4/2/2019         3        NA         5

Data

> dput(df)
structure(list(Name = c("ABC", "BCD", "CDE", "ABC", "CDE"), Date = c("4/1/2019", 
"4/1/2019", "4/1/2019", "4/2/2019", "4/2/2019"), Price = c(2L,
3L, 4L, 3L, 5L)), class = "data.frame", row.names = c(NA, -5L
))

answered Sep 03 '20 at 12:20

ThomasIsCoding

96,636
9
24
81

Brilliant. That's it and computationally pretty efficient. Thanks a lot! – CalculatedRisk Sep 03 '20 at 13:35

Data tidying: Best way to create time-series data out of a differently structured data.frame without using for loops in R

1 Answers1