1

I keep extracting data from Eikon in the following format:

structure(list(GB0002404191 = c("DATE", "30/12/2022", "29/12/2022", 
"28/12/2022", "23/12/2022", "22/12/2022", "21/12/2022", "20/12/2022", 
"19/12/2022", "16/12/2022"), GB0002404191 = c("HIGH", "112.165", 
"112.127", "112.088", "112.41", "112.767", "112.712", "112.867", 
"113.963", "114.324"), GB0002404191 = c("LOW", "111.741", "111.821", 
"111.453", "111.943", "112.064", "112.36", "112.203", "113.046", 
"113.618"), GB0004893086 = c("DATE", "30/12/2022", "29/12/2022", 
"28/12/2022", "23/12/2022", "22/12/2022", "21/12/2022", "20/12/2022", 
"19/12/2022", "16/12/2022"), GB0004893086 = c("HIGH", "104.689", 
"104.879", "104.724", "105.353", "105.678", "105.606", "105.785", 
"107.244", "107.777"), GB0004893086 = c("LOW", "104.145", "104.21", 
"103.849", "104.653", "104.756", "105.002", "104.683", "105.955", 
"106.742"), GB0008932666 = c("DATE", "30/12/2022", "29/12/2022", 
"28/12/2022", "23/12/2022", "22/12/2022", "21/12/2022", "20/12/2022", 
"19/12/2022", "16/12/2022"), GB0008932666 = c("HIGH", "336.582", 
"337.641", "336.927", "337.308", "336.603", "336.053", "336.624", 
"341.35", "341.563"), GB0008932666 = c("LOW", "334.827", "335.018", 
"333.571", "334.383", "335.119", "334.512", "333.98", "337.689", 
"339.277"), GB0008983024 = c("DATE", "30/12/2022", "29/12/2022", 
"28/12/2022", "23/12/2022", "22/12/2022", "21/12/2022", "20/12/2022", 
"19/12/2022", "16/12/2022")), row.names = c(NA, -10L), class = c("data.table", 
"data.frame"), .internal.selfref = <pointer: 0x5649696b0a00>)

The column names are IDs, and the first row of data contains nested column names.

I want to change the data into the following format:

enter image description here

Where each group (ID) is stacked on top of each other and the values are the columns next to it.

When I look at this solution Reshaping data.frame from wide to long format I see that the issue with my data is that I have no clear defined id.vars as they have. My ID vars is the first row (not some set of columns).

Can someone please advise on how I can get the desired shape of the data.

zephryl
  • 14,633
  • 3
  • 11
  • 30
  • See [Reshaping data.frame from wide to long format](https://stackoverflow.com/questions/2185252/reshaping-data-frame-from-wide-to-long-format) – zephryl Feb 07 '23 at 15:06
  • I have seen this but my data shape is different. How would I choose id.vars for example? My id.vars is the first row. How can I specify this? – Jef van Cappellen Feb 07 '23 at 15:12
  • 1
    Read this: https://tidyr.tidyverse.org/reference/pivot_longer.html it explains everything. – Paul Stafford Allen Feb 07 '23 at 15:15
  • Your data has all been read in as character values because you have two header rows. This likely will cause problems. It would be better to properly import the data with numeric values first by dealing with the double header first. – MrFlick Feb 07 '23 at 15:22

1 Answers1

2

You basically have two sets of column names. You'll have to collapse these into just one set of column names, then you can pivot as usual, e.g. using tidyr::pivot_wider(). Including columns names in the first row of data also has the effect of making all columns character; you can use type.convert() or readr::type_convert() to coerce to the appropriate type.

library(tidyr)
library(readr)

cols1 <- names(dat)     # first set of column names
cols2 <- dat[1, ]       # second set of column names
dat2 <- dat[-1, ]       # actual data

# rename data with one collapsed set of column names
names(dat2) <- paste(cols1, cols2, sep = "_") 

# pivot and change column types
dat2 %>%
  pivot_longer(
    everything(),
    names_to = c("ID", ".value"),
    names_sep = "_"
  ) %>%
  type_convert(col_types = cols(DATE = col_date("%d/%m/%Y")))
# A tibble: 36 × 4
   ID           DATE        HIGH   LOW
   <chr>        <date>     <dbl> <dbl>
 1 GB0002404191 2022-12-30  112.  112.
 2 GB0004893086 2022-12-30  105.  104.
 3 GB0008932666 2022-12-30  337.  335.
 4 GB0008983024 2022-12-30   NA    NA 
 5 GB0002404191 2022-12-29  112.  112.
 6 GB0004893086 2022-12-29  105.  104.
 7 GB0008932666 2022-12-29  338.  335.
 8 GB0008983024 2022-12-29   NA    NA 
 9 GB0002404191 2022-12-28  112.  111.
10 GB0004893086 2022-12-28  105.  104.
# … with 26 more rows
zephryl
  • 14,633
  • 3
  • 11
  • 30