3

I'm trying to convert a dataframe from wide to long format using pivot_longer.

The original dataframe is:

df.start <- structure(list(amprise = 0.015, ampfull = 0.005, ampfall = 0.03, freq1 = 250L, freq2 = 500L, freq3 = 750L, relamp1 = 0.01, relamp2 = 0.3, relamp3 = 0.02), class = "data.frame", row.names = c(NA, -1L))

Which looks like:

> df.start
  amprise ampfull ampfall freq1 freq2 freq3 relamp1 relamp2 relamp3
1   0.015   0.005    0.03   250   500   750    0.01     0.3    0.02

I'd like to condense it down to six columns:

> df.end
  harmonic amprise ampfull ampfall freq relamp
1        1   0.015   0.005    0.03  250   0.01
2        2   0.015   0.005    0.03  500   0.30
3        3   0.015   0.005    0.03  750   0.02

I understand from this post that I can use cols = -c("amprise", "ampfull", "ampfall") to leave that group of columns in wide format.

I can also see that I should be able to use regex in names_pattern to remove the prefixes from the remaining column names to populate the harmonic column. But, I'm struggling to bring it all together correctly.

Any help would be greatly appreciated!

pajul
  • 123
  • 9

1 Answers1

6

You can utilise names_pattern :

tidyr::pivot_longer(df.start, 
                    cols = -c("amprise", "ampfull", "ampfall"), 
                    names_to = c('.value', 'harmonic'), 
                    names_pattern = '(.*?)(\\d+)')

#  amprise ampfull ampfall harmonic  freq relamp
#    <dbl>   <dbl>   <dbl> <chr>    <int>  <dbl>
#1   0.015   0.005    0.03 1          250   0.01
#2   0.015   0.005    0.03 2          500   0.3 
#3   0.015   0.005    0.03 3          750   0.02

When we use .value in names_to we want to maintain a part of the original column name as column in the reshaped dataframe. Using names_pattern we can define the groups in which we want to extract the column name.

(.*?)(\\d+) Here we define 2 groups of column names. First group (.*?) extracts everything until a number (\\d+) which is the second group encountered. Since we have used .value for the first part, that part in regex (.*?)) goes as separate column names.

Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
  • Thanks so much! :) Would you be able to explain a little about how the `names_to` and regex parameters work please? For some reason, I've really found it difficult to get my head around the official documentation. – pajul Jul 03 '20 at 12:43
  • 1
    @pajul I added some explanation in the answer. Hope it helps. – Ronak Shah Jul 03 '20 at 13:37