I am working with 10-K filings from the SEC EDGAR site and have been building a procedure to convert them into usable data frames so that I can begin analysis. After a reasonable amount of cleaning, my 2014 10-K consists of 3 columns (consolidated_balance_sheet..., yr ended dec_31_2012, yr ended dec_31_2013) and the remaining 10-Ks consist of 2 columns (consolidated_balance_sheet..., yr ended dec_31_yr). The first column contains the line item titles, while the yr columns contain the account values. My goal is to convert each data frame from long to wide so that the columns are the line items.
I wrote a function that is basic enough: it should transpose the data frame, use the line items in the first row as the column names, and then drop that first row. It is as follows:
t_func <- function(df) {
  df <- t(df)
  colnames(df) <- df[1, ]
  df <- df[-1, ]
}
The first two lines of the function body work identically for the 2014 10-K and for the remaining ones. The last line, however, only works as intended for the 2014 10-K. For the remaining data frames, running this last line removes the first row but also appears to undo the transpose: the result is no longer wide.
I am hoping for some input on why this might be happening and how to resolve it, because the behavior does not make a ton of sense to me.
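For reference, here is a minimal reproduction with made-up data (the item names, values, and the `t_func2` variant are hypothetical, not taken from my actual filings). One thing I noticed while putting it together is that `[` on a matrix defaults to `drop = TRUE`, so when only a single row is left the result is a plain vector rather than a 1-row matrix. I suspect that is related, but I am not certain that passing `drop = FALSE` is the right fix for my case:

```r
# Toy stand-ins for the cleaned 10-K tables; names and values are made up.
df3 <- data.frame(item = c("cash", "inventory"),
                  dec_31_2012 = c(10, 20),
                  dec_31_2013 = c(11, 21))
df2 <- df3[, c("item", "dec_31_2013")]

m3 <- t(df3)   # 3 x 2 character matrix
m2 <- t(df2)   # 2 x 2 character matrix

dim(m3[-1, ])                # two rows remain, so this is still a matrix
dim(m2[-1, ])                # NULL: the single remaining row became a vector
dim(m2[-1, , drop = FALSE])  # the single remaining row stays a 1 x 2 matrix

# Variant of t_func that keeps the matrix shape and returns the result visibly.
t_func2 <- function(df) {
  m <- t(df)
  colnames(m) <- m[1, ]
  m[-1, , drop = FALSE]
}
```

With this variant, `t_func2(df2)` and `t_func2(df3)` both come back as matrices with the line items as column names. The values are still character strings, because `t()` coerces a mixed data frame to a character matrix.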