0

I have pet survey data from 6 households. The households are split into levels (a,b).

I would like to melt the dataframe by aminal name (id.var), household (var.name), abundance (value.name), whilst adding a new column ("level") for the levels a&b.

My dataframe looks like this: pet abundance data

I can split it using reshape2:melt, but I don't know how to cut the a, b, from the column names and make a new column of them? Please help.

raw_data = as.dataframe(raw_data)

melt(raw_data,
     id.variable = 'Animal', variable.name = 'Site', value.name = 'Abundance')
M--
  • 25,431
  • 8
  • 61
  • 93
Paul
  • 5
  • 3
  • It'd be most helpful if you could paste the output of `dput(head(raw_data))` here for answerers to use - screenshots are hard to work with and test. – Andy Baxter Feb 03 '23 at 16:30
  • Does this answer your question? [Reshaping data.frame from wide to long format](https://stackoverflow.com/questions/2185252/reshaping-data-frame-from-wide-to-long-format) – M-- Mar 31 '23 at 15:30

1 Answers1

0

Having a go on some simulated data, pivot_longer is your best bet:

library(tidyverse)

df <- tibble(
  Animal = c("dog", "cat", "fish", "horse"),
  `1a` = sample(1:10, 4),
  `1b` = sample(1:10, 4),
  `2a` = sample(1:10, 4),
  `2b` = sample(1:10, 4),
  `3a` = sample(1:10, 4),
  `3b` = sample(1:10, 4)
)


df |>
  pivot_longer(
    -Animal,
    names_to = c("Site", "level"),
    values_to = "Abundance",
    names_pattern = "(.)(.)"
  ) |> 
  arrange(Site, level)

#> # A tibble: 24 × 4
#>    Animal Site  level Abundance
#>    <chr>  <chr> <chr>     <int>
#>  1 dog    1     a             9
#>  2 cat    1     a             5
#>  3 fish   1     a             8
#>  4 horse  1     a             6
#>  5 dog    1     b             4
#>  6 cat    1     b             2
#>  7 fish   1     b             8
#>  8 horse  1     b            10
#>  9 dog    2     a             8
#> 10 cat    2     a             3
#> # … with 14 more rows
Andy Baxter
  • 5,833
  • 1
  • 8
  • 22
  • Thanks @Andy Baxter, I've had a look but can't see how to keep the list order of the animals? – Paul Feb 03 '23 at 16:51
  • Do you mean the order they appear in the raw dataframe? `pivot_longer` should preserve that automatically? – Andy Baxter Feb 03 '23 at 16:55
  • no, after using the 'pivot_longer' code, the animals names are bunched together as in your answer. – Paul Feb 03 '23 at 17:04
  • Ah I see! so you want arranged by `Site` and `level` so it lists the four(/six) animal types by each group? Easiest to arrange after pivoting, as in revised code posted above if that helps? – Andy Baxter Feb 03 '23 at 17:21
  • one last question - my actual column names are FS01a, FS01b, FS02a, FS02b and so on... the `names_pattern` code extracts the F and S as Site and Level, respectively. But I'd still like to extract the number (Site) and Level (letter) as before. Pls help. I've read the notes but don't think I understand how it actually extracts. Pls also explain? – Paul Feb 06 '23 at 09:20
  • Ah I see! `names_to = c("Site", "level"), names_pattern = "(.)(.)"` says "split column names into two columns by taking any character - represented by '.' - followed by any character". Brackets denote separate extracted parts. The [regex](https://stringr.tidyverse.org/articles/regular-expressions.html) pattern can be modified. Perhaps in your case `names_pattern = "FS(\\d+)(\\w)"` fits. This says "pattern starts with 'FS' (which is outside brackets so not extracted), followed by one or more digits (\\d+) for first extraction to 'Site' and a letter (\\w) for second extraction to 'Level'". – Andy Baxter Feb 06 '23 at 10:32
  • 1
    BOOM! That's it dude :) and thank you @andybaxter for explaining the code, that makes sense now. – Paul Feb 06 '23 at 10:56