0

I'm trying to do a predictive analysis where metrics from year x will predict year x+1.

I want to do the equivalent of an UPDATE query in SQL using R. If I have data like this:

x <- c("Randy Watson", "Cleo McDowell", "Darryl Jenks", "Jaffe Joffer",
       "Randy Watson", "Cleo McDowell", "Darryl Jenks", "Jaffe Joffer",
       "Randy Watson", "Cleo McDowell", "Darryl Jenks", "Jaffe Joffer")
y <- c("2012", "2012", "2012", "2012", 
       "2013", "2013", "2013", "2013", 
       "2014", "2014", "2014", "2014")
z <- c(100, 50, 75, 0, 
       110, 75, 0, 25, 
       125, 25, 10, 50)

df <- data.frame(x, y, z)

colnames(df) <- c("Name", "Year", "Sales")

print(df)

            Name Year Sales
1   Randy Watson 2012   100
2  Cleo McDowell 2012    50
3   Darryl Jenks 2012    75
4   Jaffe Joffer 2012     0
5   Randy Watson 2013   110
6  Cleo McDowell 2013    75
7   Darryl Jenks 2013     0
8   Jaffe Joffer 2013    25
9   Randy Watson 2014   125
10 Cleo McDowell 2014    25
11  Darryl Jenks 2014    10
12  Jaffe Joffer 2014    50

I want the final output to look like:

print(df)
           Name YearX YearX1
1  Randy Watson   100   110
2 Cleo McDowell    50    75
3  Darryl Jenks    75     0
4  Jaffe Joffer     0    25

...

How can I do this in R? I know how to do this in SQL (though I'd rather not use sqldf unless its the best way).

Thanks.

Edit: The solution below isn't quite what I'm looking for. While it works if there are only two years my data has 10 years. I don't need Name, Year1, Year2, Year3, etc... I just want Name, YearX, YearX+1. Sorry if I was unclear about that.

Frank B.
  • 1,813
  • 5
  • 24
  • 44
  • 1
    Your current data format is typically called "long", and your target is called "wide". With those search terms, [this question should have you covered](http://stackoverflow.com/q/5890584/903061). – Gregor Thomas Jul 29 '14 at 19:29
  • How do you define `YearX'? Is it the first year in the series or? – Eric Fail Jul 30 '14 at 12:37
  • In regard to your edit, and to bump the point; how do you define `YearX'? – Eric Fail Aug 01 '14 at 10:30

1 Answers1

2

Like this

# I took the liberty of rearranging your working example a bit   
df <- data.frame(
           Name = c("Randy Watson", "Cleo McDowell", "Darryl Jenks", "Jaffe Joffer",
                    "Randy Watson", "Cleo McDowell", "Darryl Jenks", "Jaffe Joffer"),
           Year = c("2013", "2013", "2013", "2013", "2014", "2014", "2014", "2014"),
           Sales = c(100, 50, 75, 0, 110, 75, 0, 25))

reshape(df, idvar = "Name", timevar = "Year", direction = "wide")

           Name Sales.2013 Sales.2014
1  Randy Watson        100        110
2 Cleo McDowell         50         75
3  Darryl Jenks         75          0
4  Jaffe Joffer          0         25

or to follow your question real close

df_wide <- reshape(df, idvar = "Name", timevar = "Year", direction = "wide")

colnames(df_wide) <- c("Name", "Year0", "Year1") 

print(df_wide)
           Name Year0 Year1
1  Randy Watson   100   110
2 Cleo McDowell    50    75
3  Darryl Jenks    75     0
4  Jaffe Joffer     0    25

Some alternatives methods that should give you the same result

library(reshape)
cast(df, Name ~ Year)

xtabs(Sales  ~  Name + Year, data = df)
Eric Fail
  • 8,191
  • 8
  • 72
  • 128
  • Actually this isn't quite what I was looking for. While your solution works if there are only two years, my data is 10 years deep. I'd just want YearX and YearX+1 next to it and that's it. Make sense? – Frank B. Jul 29 '14 at 23:28
  • This is why it's key for you to supply a working example that fully reflects your problem. Maybe you can expand your working example? Also, did you study the examples Gregor and MrFlick referred to? – Eric Fail Jul 30 '14 at 01:51
  • I did look at those examples, but they weren't wasn't what I was looking for. Also, I can never get SO straight because if I post "too much" I get criticized; "too few" it's the same. It's tough to figure out what the Goldilocks Zone is. – Frank B. Jul 30 '14 at 02:13
  • I saw, in one of your previous questions MrFlick, om Jun 3 at 14:16, already pointed you to the thread [How to make a great R reproducible example?](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). You should study that thread. I would also point to the [How do I ask a good question?](https://stackoverflow.com/help/how-to-ask) article. Finally, my personal favorite, Jon Skeet's [Writing the perfect question](http://tinyurl.com/stack-hints). You will get the hang of it. – Eric Fail Jul 30 '14 at 02:36