0

I have an excel sheet that has turned into a large data dump and I need to organize the data. I was hoping to do this in R since I can edit the code to organize multiple items. In essence, I have the data extracted as shown below with each city in its own cell.

1 NYC Chicago Detroit LA Miami
2 Chicago Sacramento Seattle LA NYC
3 Detroit Miami Sacramento NYC Chicago

I need to extract it so that the numbers at the beginning of the row will be organized with the city. It should look like this:

NYC 1 2 3
Chicago 1 2 3
Detroit 1 3
LA 1 2
Miami 1 3
Sacramento 2 3
Seattle 2

What is the easiest way to do this? I was trying to do an "If, Then" statement, but that does not seem to be working and provides an error.

EDIT: This is a snipit of what the current data looks like. There are 107 rows and 12 columns for reference:

Current Data

I am currently using the code

library(readxl);df1 <- read_excel('C:\Users\Brittney\Desktop\Data_V1.xlsx') tidyr::separate_rows(df1, September)

Brittney
  • 11
  • 2
  • 1
    YOu can use `tidyr::separate_rows(df1, col2)` – akrun Jul 09 '20 at 20:28
  • What would be the easiest way to use this function (as in the full code)? I'm having trouble importing the excel sheet, so I'm wondering if I'm doing something wrong back there. There are also multiple columns. – Brittney Jul 09 '20 at 20:45
  • to read the excel you can use `library(readxl);df1 <- read_excel('yourfile.xlsx')` once you read the file, replace the `col2` with the name of the column name of your dataset – akrun Jul 09 '20 at 20:46
  • Unfortunately, that does not seem to be doing anything different. This is what the current data set looks like (and how ti appears after using that function). – Brittney Jul 09 '20 at 21:09
  • You meant it is not reading the file? – akrun Jul 09 '20 at 21:11
  • It is reading the file, but when I use the tidyr, it does not change the appearance of the imported data. – Brittney Jul 09 '20 at 21:13
  • what is the column name. If you could update your post by `edit`ing with `dput(head(yourdata))`, it would be helpful. AFter doing the transformation, you should update the object with assignment i.e. `df1 <- tidyr::separate_rows(df1, col2)` or to a new object – akrun Jul 09 '20 at 21:13
  • I've updated the main post with a picture of the excel data and the code that I am currently running. I want the excel sheet to run all 12 columns (so 12 columns x 107 rows is 1,284 cells). It would take each city (could be in any of the cells) and make those the names of the new columns. The row number that the city came from would then be the new data listed below each column name. – Brittney Jul 09 '20 at 21:22
  • looks like there are 4 columns instead of the one showed earlier with 2 columns. In that case, perhaps you need `df1 %>% mutate(rn = row_number()) %>% pivot_longer(cols = -rn)` – akrun Jul 09 '20 at 21:24
  • There are actually 12 columns. I only put 4 to make it look shorter (it would be a very large data set otherwise). I am now getting this error "Error in pivot_longer(., cols = -rn) : could not find function "pivot_longer". – Brittney Jul 09 '20 at 21:34
  • It is from tidyr – akrun Jul 09 '20 at 21:35
  • I get that error when I run df1 %>% mutate(rn = row_number()) %>% pivot_longer(cols = -rn) – Brittney Jul 09 '20 at 21:39

0 Answers0