0

I'm working with NAICS data for all the counties in the US, there are 435581 rows of data. Each county (county names are in column A and B) in the US has a series of businesses with associated codes which will be in column C. (Column D is a description of the business) Column E is the number of their employees. Each business has been given an individual row so you can imagine each county has tens of rows associated to it. I was wondering if there was a way to rearrange them in a way that each county has only one row, but multiple columns with business codes as their titles and then the number of employees.

I have added pictures so that you can see what I mean.

Before

enter image description here

What I'm looking for

enter image description here

Pᴇʜ
  • 56,719
  • 10
  • 49
  • 73
SamR
  • 45
  • 5
  • 1
    The answer is yes, it's possible, but it's hard to help without a [mcve]. Can you provide a sample of the data, referencing [How to Create a Great Reproducible in R](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) and update your question? – Mako212 Feb 01 '19 at 18:29
  • This type of thing is very easy to do in R. You can search stackoverflow for "long to wide" which is the usual name for reshaping data to fewer rows and many columns. Or try a web search for the `reshape2` or `tidyr` packages which have functions to do that kind of reshaping. – qdread Feb 01 '19 at 18:45
  • 1
    Is there a finite list of labels or can these change dynamically/unpredictably? Is this a one time thing, or something that you will need to repeat? – KingOfTheNerds Feb 01 '19 at 18:55
  • @KingOfTheNerds Essentially there is a finite number of codes but the number fluctuates depending on the number of businesses in each county. – SamR Feb 01 '19 at 21:02
  • @SamR - the suggestion from qdread is definitely the easiest approach, but perhaps this will work for you: https://www.listendata.com/2015/02/excel-formula-convert-data-from-long-to.html. Just make sure that you have less categories than columns. Also, it might be worth you explaining WHY you want to do this? It could be that a pivot table could meet your need or that this isn't necessary at all. – KingOfTheNerds Feb 01 '19 at 21:31
  • @qdread thank you very much, "Long to Wide" was all I needed. I didn't even know what to search for. – SamR Feb 01 '19 at 21:54
  • @KingOfTheNerds I'm doing this to then run diversity index analysis on each county and see if diversity of businesses in each county links to the resilience of each county. It's part of a city planning and geography research. The tidyr worked but I'm just playing with it to see if there is way to have two headings (both the codes and their names in the row beneath that). This is not important, just for clarity. – SamR Feb 01 '19 at 21:58

0 Answers0