-2

I have a list (selected_key_ratios) containing 4 data frames ($nestle ; $unilever ; $pepsico ; $abf). Each data frame contains financial data of a company. All dataframes have the same row index and almost the same columns (only currency differ sometimes). Here is a screenshot of the list.

enter image description here

I'm trying to make a new list where each item would be a column of the dataframe, grouped by company. Here is a graphical exemple:

enter image description here

And so on for each column of the dataframes. I tried things with lapply for hours now but nothing produces the desired result.

Do you have any clues ? Thanks a lot !

rmrndr
  • 31
  • 4
  • 1
    Please give a [mcve] (rather than a screenshot) and show some of your attempts with `lapply`. See this for making a nice reproducible example in R: https://stackoverflow.com/a/5963610/4996248 – John Coleman Feb 08 '18 at 14:40

2 Answers2

1

You could try something like this nested lapply:

# Recreation of your list of dataframes
w <- list(
  abc = data.frame(
    "eps_usd" = runif(10) * 10,
    "eps_gbp" = runif(10) * 8
  ),
  def = data.frame(
    "eps_usd" = runif(10) * 15,
    "eps_eur" = runif(10) * 13
  ),
  ghi = data.frame(
    "eps_gbp" = runif(10) * 35,
    "eps_aud" = runif(10) * 19
  ),
  jkl = data.frame(
    "eps_usd" = runif(10) * 2,
    "eps_aud" = runif(10) * 1.4
  )
)

# Create a new dataframe with the year column
result <- data.frame(year = 2007:2016)

# Apply to each name in the list
lapply(names(w), function(tbl) {
  # Apply to each colname of each df
  lapply(colnames(w[[tbl]]), function(col) {
    # Assign to the reult df column the corresponding column int he list of df's
    result[[paste0(tbl, "_", col)]] <<- w[[tbl]][[col]]
  })
})

Output:

> result
   year abc_eps_usd abc_eps_gbp def_eps_usd def_eps_eur ghi_eps_gbp  ghi_eps_aud jkl_eps_usd jkl_eps_aud
1  2007    8.107360    3.419094   11.660133   9.9744151    3.801628  1.936746299  1.36976914  0.58472812
2  2008    7.527040    2.342307   11.407357   5.6755403   13.433364  8.595490269  0.31085568  0.06655984
3  2009    5.155562    4.272123    8.506886   8.5367400   20.305427 18.191703109  0.01993349  0.31829031
4  2010    2.947270    2.983519    5.686625   5.2630734   14.064397  9.049538589  0.92122668  0.55233980
5  2011    8.645507    2.657100   12.445061   6.9406141    5.056093 18.787235097  0.41227465  0.01664083
6  2012    7.192367    5.695391    3.620765   9.1173421   26.452499  0.002014068  1.84031115  0.38873530
7  2013    4.878473    1.527182   11.769227   9.6991108   16.232696  6.934076956  1.07328960  0.28808505
8  2014    1.766486    5.272151   12.656086   0.7318888   32.855694 15.643783443  1.33677381  1.09871196
9  2015    9.428541    6.462755   11.473938   4.3658361    7.547359 17.634770134  1.27743503  1.35510589
10 2016    6.047083    3.437785   13.845070  12.9766045    7.401827 18.032713128  1.73208881  0.03394082
Eugene Brown
  • 4,032
  • 6
  • 33
  • 47
0

Without a dataset I made up one.

set.seed(5489)
n <- 20
df_list <- list(
    nestle = data.frame(A = runif(n), B = runif(n), C = runif(n)),
    unilever = data.frame(D = runif(n), E = runif(n), F = runif(n)),
    abf = data.frame(G = runif(n), H = runif(n), I = runif(n))
)

The code that follows assumes that you want to extract the first column of each data frame, and that you want to name the columns of the result with a combination of the names of the original df's names and of those first columns.

result <- as.data.frame(do.call(cbind, lapply(df_list, `[[`, 1)))
names(result) <- paste(names(result), sapply(df_list, function(DF) names(DF)[1]))
row.names(result) <- row.names(df_list[[1]])

head(result)
#   nestle A  unilever D     abf G
#1 0.2348625 0.007785561 0.6453142
#2 0.5951392 0.494773356 0.2167643
#3 0.3001674 0.381868381 0.7182713
#4 0.1745270 0.983473145 0.8829462
#5 0.3387269 0.178523104 0.6042962
#6 0.1103261 0.211874225 0.4545857
Rui Barradas
  • 70,273
  • 8
  • 34
  • 66