0

I've been struggling with this problem for a week and I'm unable to form the logic for this one:

I have this Sort of migration table ( or pivot table):

Region County1 County2 County 3... County 75 Multiplier
County1  0.23    0.56   0.54   ...   0.75     1200
County2  0.54    0.33   0.66   ...   0.66     1321 
...
County75 0.92    0.21   0.12   ...   0.45     1600

I want to multiply the first row element of the Multiplier to the first County column (for example: 1200*(County1))

I was trying to automate something like this:

Table1$County1 <- Table1$Multiplier[1] * Table1$County1

I've been trying to think of either using loops/functions, dplyr, etc. but I'm unable to come up with a working code/my logic is severely flawed. Any help/advice is welcome, I've been scouring through Stack Overflow but I don't know what exactly to search for (or what this operation is called). I don't mean to cause any duplication if a similar solution already exists, and if it does I have been unable to find it :(.

(In excel this would be similar to using $ (dollar sign) with the column name and multiplying)

The output should look something like this:

Region County1 County2 ... County 75 Multiplier
County1  276    739.76 ...   1200     1200
County2  648    435.93 ...   1056     1321 
 ...
County75  1104   277.41...   720      1600

Thank you!

  • 1
    `Table1$County1 <- Table1$Multiplier[1] * Table1$County1` ? It's not clear what output you expect, could you lay it out in your question ? – moodymudskipper Jul 16 '18 at 12:31
  • Yes that's what I want. The multiplier is the future projection of the population for each county. So the first row of the Multiplier (i.e. 1200 will be multiplied to the entire County1 column, 1321 will be multiplied to the entire County2 column...1600 will be multiplied to the County75 column). Each value in the County1 - County75 are the ratio of a certain demographic to the population at a point in time (somewhere in the past). I hope this helps to explain a little bit, that's the requirement by the team lead :(. I know the logic is a little wonky on that one – Anurag Kaushik Jul 16 '18 at 12:37
  • @Moody_Mudskipper I've updated my question according to your comment. Sorry about the confusion – Anurag Kaushik Jul 16 '18 at 12:40
  • It's all good, tell me if my answer is what you expect – moodymudskipper Jul 16 '18 at 12:43
  • Unfortunately, the dplyr code isn't working according to my need. Let me see if I can explain the problem better because the logic is very wonky. So the column (county1, county2,..county75) need to be multiplied by the respective rows (first element of multiplier row, second element of multiplier row,...75th) RESPECTIVELY. It's not a direct gather, multiply and spread type application unfortunately and it's not such a simple mutate :(. That's why I used the [1] notation to show that only the first element multiplies to the entire County1 column – Anurag Kaushik Jul 16 '18 at 14:54

2 Answers2

4

Here is a tidyverse solution:

library(tidyverse)
df1 <- select(df,Region1 = Region,Multiplier)
df2 <- select(df,-Multiplier) %>% gather(Region1,value,-Region)
inner_join(df1,df2,by="Region1") %>%
  mutate(value = Multiplier * value) %>%
  select(-Multiplier) %>%
  spread(Region1,value)

#     Region County1 County2 County75
# 1  County1     276  739.76     1200
# 2  County2     648  435.93     1056
# 3 County75    1104  277.41      720

data

df <- read.table(text="Region County1 County2 County75 Multiplier
County1  0.23    0.56   0.75   1200
County2  0.54    0.33   0.66   1321 
County75 0.92    0.21   0.45   1600",strin=F,h=T)
moodymudskipper
  • 46,417
  • 11
  • 121
  • 167
  • Sorry about the earlier comment, I checked it again. There is actually a small problem with this, when it gathers the columns together, the above operation still doesn't give me the results I want. I also tried this: I transposed the Multiplier Column into a row and tried to add it to the bottom of each county (using rbind) but I don't know how to multiply that last row value to all the rows above it. Was I able to explain what I exactly want? It's not very easy to explain :( – Anurag Kaushik Jul 16 '18 at 14:49
  • As I asked in the comments of the question, you should lay out your precise expected output in your question. That's much faster than a long explanation. This question may be useful : https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example?s=1|2598.4385 – moodymudskipper Jul 16 '18 at 14:54
  • 1
    Oh okay! I'll try to put the output! Since the data is secretive I had to put fake numbers in there. Let me edit my question...Sorry – Anurag Kaushik Jul 16 '18 at 14:56
  • I've updated my question now with the output. Apologies, I'm new here so I'm still trying to understand the norms :) thanks for the link above – Anurag Kaushik Jul 16 '18 at 15:00
  • Okay, I'll try it out! Thanks, I would totally prefer dplyr because I would have to put this one in a larger part of a map (purrr) or a loop for all the different states! – Anurag Kaushik Jul 16 '18 at 17:06
  • Really sorry for the late reply! This one worked! :D Thanks a lot! – Anurag Kaushik Jul 30 '18 at 12:27
  • great, always happy to help! – moodymudskipper Jul 30 '18 at 12:53
2

This will Map the function * across elements of df[, 6] and columns of df[, -c(1, 6)] i.e. multiply the first "County" column by the first value of the sixth column, the second "County" column by the second value of the sixth column, etc.

df[, -c(1, 6)] <- Map(`*`, df[, -c(1, 6)], df[, 6])

Map input data:

df <- data.table::fread("
Region County1 County2 County  County75 Multiplier
County1  0.23    0.56   0.54     0.75     1200
County2  0.54    0.33   0.66     0.66     1321 
County2  0.44    0.26   0.46     0.85     1472 
County75 0.92    0.21   0.12     0.45     1600
", data.table = F)

Map output data

#     Region County1 County2 County County75 Multiplier
# 1  County1     276  739.76 794.88     1200       1200
# 2  County2     648  435.93 971.52     1056       1321
# 3  County2     528  343.46 677.12     1360       1472
# 4 County75    1104  277.41 176.64      720       1600
IceCreamToucan
  • 28,083
  • 2
  • 22
  • 38
  • 1
    Nice, that will be faster than mine, but you have to make sure you don't have missing rows/columns and that rows are sorted, as you don't use the `Region` column at all. – moodymudskipper Jul 16 '18 at 13:58
  • @Ryan Oh okay, Thank you! I think I'll stick with the dplyr method over base R as I've got to do this for a bunch of states and they might end up with some missing values. I struggle big time with the data.table so I'll come back to this once I start learning how to use it well :) – Anurag Kaushik Jul 16 '18 at 14:38
  • Unfortunately, the above dplyr code isn't working according to my need. Let me see if I can explain the problem better because the logic is very wonky. So the column (county1, county2,..county75) need to be multiplied by the respective rows (first element of multiplier row, second element of multiplier row,...75th) RESPECTIVELY. It's not a direct gather, multiply and spread type application unfortunately and it's not such a simple mutate :( – Anurag Kaushik Jul 16 '18 at 14:52
  • 1
    I see. I think `Map` should work in that case. see my edit. @AnuragKaushik – IceCreamToucan Jul 16 '18 at 14:56
  • @Ryan okay thanks, I'll try it out and post here ASAP! – Anurag Kaushik Jul 16 '18 at 15:01
  • Okay, I got it to work. My bad, I didn't get the data in the right format. Noob mistake (facepalm) the Map function worked! I have never encountered this function before. It's super cool – Anurag Kaushik Jul 16 '18 at 15:29