0

Hello how can I convert from wide format to long format with this sample spatial data with R

Sample Data:

origin  origin_x    origin_y destination      dest_x       dest_y       n
Paris   6587.54     44547.58    London     456474.5   8346585.4       1577
Milan   3949.45    5406644.6    Manchester 5437374.464  846347.99     8436
Toronto 86866.44    886254.4    Alberta   8327646.3   676442.98       2000
    

Expected output:

Key                X              Y                     n           
Paris          6587.54          44547.58               1577         
London         456474.5        8346585.4               1577         
Milan          3949.45         5406644.6               8436         
Manchester     5437374.46      846347.99               8436         
Toronto        86866.44        886254.4                2000         
Alberta        8327646.3       8327646.3               2000 

Image attached for further insight

Andy Baxter
  • 5,833
  • 1
  • 8
  • 22
  • You can provide a useable dataset by copying the output of `dput(your_df)` into your question. – mlcyo Mar 16 '22 at 00:57
  • 1
    Does this answer your question? [Reshaping multiple sets of measurement columns (wide format) into single columns (long format)](https://stackoverflow.com/questions/12466493/reshaping-multiple-sets-of-measurement-columns-wide-format-into-single-columns) – zephryl Mar 16 '22 at 00:58

1 Answers1

2

Hopefully this works..I made a sample dataset from your image.

library(data.table)
library(tidyverse)
library(readr)
sheet = read_csv('book.csv')
sheet

> sheet
# A tibble: 3 x 7
  origin  origin_x origin_y destination  dest_x   dest_y     n
  <chr>      <dbl>    <dbl> <chr>         <dbl>    <dbl> <dbl>
1 Paris      6588.   44548. London      456474. 8346585.  1577
2 Milan      9386    23937  Manchester   39764  8239213   8436
3 Alberta     300      400  Toronto        200      100   2000

Separated it into origin and destination dfs.. Then merged it with rbindlist ..ordering it using the last column. This solution only works since n values are same a pair of origin-destination. If you had different n values..this will not work.

But this would only be one way of doing it. I think one can approach this in several ways.

origin = sheet %>% select(starts_with('origin'))
origin$n = sheet$n

destination = sheet %>% select(!starts_with('origin'))

x = rbindlist(list(origin, destination))[order(n)] %>% as.data.frame(.)
names(x) = c('Key', 'X', 'Y', 'n')

x

> x
         Key         X          Y    n
1      Paris   6587.54   44547.58 1577
2     London 456474.50 8346585.40 1577
3    Alberta    300.00     400.00 2000
4    Toronto    200.00     100.00 2000
5      Milan   9386.00   23937.00 8436
6 Manchester  39764.00 8239213.00 8436
kvad
  • 21
  • 1