0

I have a data set (df) below that needs a variable which for example grabs the date on a column (with names as years below) and grabs the variable start_year and finds the value of it for that year and type (newstart). I have many more years and types in my dataset than below so something that works across any number of years is needed.

Here's what I have for data (a simplification):

st type_num start_year end_year 2000 2001 2002 2003 2004
il 1 2000 2004 10 220 9 10 100
il 2 2001 2004 220 100 220 100 100
il 3 2000 2004 400 400 10 220 220
ak 1 2001 2003 10 220 9 10 100
ak 2 2001 2004 220 100 220 100 100
ak 3 2000 2003 400 400 10 220 220
wa 1 2001 2003 10 220 9 10 100
wa 2 2001 2004 220 100 220 100 100
wa 3 2000 2003 400 400 10 220 220
wa 4 2002 2003 500 600 700 800 900

Where's what I need:

st type_num start_year end_year 2000 2001 2002 2003 2004 newstart newend
il 1 2000 2004 10 220 9 10 100 10 100
il 2 2001 2004 220 100 220 100 100 100 100
il 3 2000 2004 400 400 10 220 220 400 220
ak 1 2001 2003 10 220 9 10 100 10 10
ak 2 2001 2004 220 100 220 100 100 100 100
ak 3 2000 2003 400 400 10 220 220 400 220
wa 1 2001 2003 10 220 9 10 100 220 10
wa 2 2001 2004 220 100 220 100 100 100 100
wa 3 2000 2003 400 400 10 220 220 400 220
wa 4 2002 2003 500 600 700 800 900 700 800

I was trying to get those variables using a couple indexes, tried this

which(colnames(df)==df$end_year[1])

Which seems to grab the column number of the matching date column, but wasn't able to figure out how to use it in an apply() to get it to do what this variable needs to do.

I also tried to make a less specific data set that got some suggestions to use rowwise() and get() but that didn't seem to work exactly, perhaps due to the less specific data. I tried to make something almost exactly what I intend to use for my real output.

Darren Tsai
  • 32,117
  • 5
  • 21
  • 51
Tom
  • 11
  • 1
  • 1
    Hi, if you can it would be helpful to add your data using `dput(df)` or `dput(head(df))`. Read through this post for some hints about how to format a question as a minimal reproducible example: https://stackoverflow.com/q/5963269/12400385 – nniloc Feb 15 '23 at 01:43
  • R typically averts using numeric values as column names (though not impossible). Is this actually what your data look like? – jpsmith Feb 15 '23 at 01:43
  • I think it's how the data looks like, yes. It may still code them as names but this is what's in there. Also, I have data that I cannot put a full head the actual data due to privacy reasons, so I have recreated it as best as I can. – Tom Feb 15 '23 at 01:50
  • 1
    The way you have it now is not easily reproducible. Please edit your question to make it reproducible. Look into manually making it a data frame (`data.frame(st = c(“Il”,…))`) or `dput`. If you read the link provided above it goes through all of this. The easier you make it for folks here the faster and better help you’ll get. (Just genuinely trying to help, not being snarky) – jpsmith Feb 15 '23 at 02:00

1 Answers1

0

I used pivot_longer() on the original data set I was merging in, solved the issue.

jpsmith
  • 11,023
  • 5
  • 15
  • 36
Tom
  • 11
  • 1