I have a blank encounter history dataframe with all zeros. I want to fill it with the value '1' where there is an encounter in a specific year.

My data file (datafile) looks somewhat like this:

Date               Name
2007-04-28          a
2007-05-19          a
2007-05-21          b                
2008-04-28          a
2009-05-06          c  

And the 'empty' data-frame (encounter) that has to be recoded

Name  2007   2008   2009   2010
a      0      0      0      0
b      0      0      0      0
c      0      0      0      0
d      0      0      0      0
e      0      0      0      0

I tried using an if statement:

datafile$Date%>%if(datafile$Date==between(01-01-07&31-12-07)) {encounter$2007=="1"}

But got an error

Error in between(1 - 1 - 7 & 31 - 12 - 7) : 
  between has been x of type logical
In addition: Warning message:
In if (.) datafile$Date == between(1 - 1 - 7 & 31 - 12 - 7) else { :
  the condition has length > 1 and only the first element will be used
  • Try `table(format(as.Date(df1$Date), "%Y"), df1$Name)` – akrun Mar 19 '19 at 15:01
  • Whether you are using [`dplyr::between`](https://www.rdocumentation.org/packages/dplyr/versions/0.7.8/topics/between) or [`data.table::between`](https://www.rdocumentation.org/packages/data.table/versions/1.12.0/topics/between) (or some other?), please read the help docs for it: the function (both versions) takes three arguments, not one. From there, your warning about "condition length" is a frequent one on SO (e.g., https://stackoverflow.com/q/34053043/3358272). Finally, `01-01-07` to R is negative seven, not a date as you might think; you should look into `as.Date`. – r2evans Mar 19 '19 at 15:43
  • @akrun Thank you, but it's giving me the same error – Malavika Madhavan Mar 19 '19 at 16:06
  • @r2evans I'm using between() from dplyr, and for the date, I already converted to the format %d-%m-%y with datafile$Date<- as.Date(datafile$Date, format="%d-%m-%y") Do you know what I could do? Really confused. Thanks! – Malavika Madhavan Mar 19 '19 at 16:07
  • You are using `if` and `between` incorrectly. Have you tried @akrun's suggested code? – r2evans Mar 19 '19 at 16:10
  • @r2evans Thank you. I now looked them up, and used in combination with akrun's code: table(format(as.Date(live$Datum), "%Y"), live$Ringnr) and then library(data.table) live$Datum%>% if(between(live$Datum,2007-01-01,2007-12-31, incbounds=TRUE)){enc$y07=="1"} I'm now getting a result that everything is false. Btw my original two data frames are called 'live' and 'enc', and read 'Datum' for date. Sorry for the confusion, and thanks so much for the help! – Malavika Madhavan Mar 19 '19 at 16:34
  • @r2evans I used the transmute and tally functions with tidyr, and it worked great. Thank you so much! – Malavika Madhavan Mar 20 '19 at 10:14

1 Answer

There are many ways to do what you said you need. (Data all the way at the bottom.)

library(dplyr)
datafile %>%
  transmute(Year = format(Date, "%Y"), Name) %>%
  xtabs(data = ., ~ Name + Year)
#     Year
# Name 2007 2008 2009
#    a    2    1    0
#    b    1    0    0
#    c    0    0    1

though that produces an object of class "xtabs" "table", not a data frame. To get a frame, you can use:

library(tidyr)
encounters <- datafile %>%
  transmute(Year = format(Date, "%Y"), Name) %>%
  group_by(Year, Name) %>%
  tally() %>%
  tidyr::spread(Year, n) %>%
  mutate_at(vars(-Name), ~ replace(., is.na(.), 0))
encounters
# # A tibble: 3 x 4
#   Name  `2007` `2008` `2009`
#   <chr>  <dbl>  <dbl>  <dbl>
# 1 a          2      1      0
# 2 b          1      0      0
# 3 c          0      0      1
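
Both results above are counts and only include the names and years that occur in the data. If you need the 0/1 layout from your question, including names and years with no encounters, here is a minimal base R sketch. It assumes the full sets of names (a to e) and years (2007 to 2010) are known up front; `all_names`, `all_years`, `mat`, and `idx` are names I made up for illustration.

```r
# Same sample data as at the bottom of this answer
datafile <- data.frame(
  Date = as.Date(c("2007-04-28", "2007-05-19", "2007-05-21",
                   "2008-04-28", "2009-05-06")),
  Name = c("a", "a", "b", "a", "c"),
  stringsAsFactors = FALSE
)

# Assumption: the complete sets of names and years are known in advance
all_names <- c("a", "b", "c", "d", "e")
all_years <- as.character(2007:2010)

# Start from an all-zero matrix with one row per name, one column per year
mat <- matrix(0L, nrow = length(all_names), ncol = length(all_years),
              dimnames = list(all_names, all_years))

# One (row, column) index pair per observed encounter
idx <- cbind(match(datafile$Name, all_names),
             match(format(datafile$Date, "%Y"), all_years))

# Mark each observed name/year cell; repeated encounters still give 1
mat[idx] <- 1L

encounter <- data.frame(Name = all_names, mat,
                        check.names = FALSE, stringsAsFactors = FALSE)
rownames(encounter) <- NULL
encounter
```

Because the frame is built from the fixed name and year lists, names d and e and year 2010 stay in the result as all-zero rows and columns.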

Some problems with your code.

I think you are intending to pass the Date column into between, so something like this might be closer to what you are trying to do:

datafile$Date %>%
  between(as.Date("2007-01-01"), as.Date("2007-12-31"))
# [1]  TRUE  TRUE  TRUE FALSE FALSE

That fixes your use of between, but a logical vector on its own does not assign any new values back into a frame.

Further, the %>% operator/function passes data forward; it does not immediately allow assigning elsewhere. You can fake it, but I don't think that is how it was intended to work. And since this conditional vector is created from datafile (which is one "shape") and you want to assign values into encounter (which is a completely different "shape"), you will run into logical problems that are best avoided.
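
If you do want to fill one year column at a time, here is a rough sketch of how the two shapes can be bridged without a pipe or `if`. It uses plain comparisons in place of between to stay in base R, and `in_2007` and `seen_2007` are hypothetical helper names. Note also that `encounter$2007` is not valid R (a non-syntactic column name needs backticks or `[[ ]]`), and that `==` compares where `<-` assigns.

```r
# Sample data and the all-zero frame from the question
datafile <- data.frame(
  Date = as.Date(c("2007-04-28", "2007-05-19", "2007-05-21",
                   "2008-04-28", "2009-05-06")),
  Name = c("a", "a", "b", "a", "c"),
  stringsAsFactors = FALSE
)
encounter <- data.frame(
  Name = c("a", "b", "c", "d", "e"),
  `2007` = 0, `2008` = 0, `2009` = 0, `2010` = 0,
  check.names = FALSE, stringsAsFactors = FALSE
)

# Logical vector with one element per row of datafile
in_2007 <- datafile$Date >= as.Date("2007-01-01") &
  datafile$Date <= as.Date("2007-12-31")

# Reduce it to the set of names seen in 2007 ...
seen_2007 <- unique(datafile$Name[in_2007])

# ... and use that set to index the rows of encounter, assigning with <-
encounter[["2007"]][encounter$Name %in% seen_2007] <- 1
encounter
```

The step through `seen_2007` is what handles the shape mismatch: the condition has one element per row of datafile, while the assignment needs one element per row of encounter.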


Data:

datafile <- read.table(header=TRUE, stringsAsFactors=FALSE, text='
Date               Name
2007-04-28          a
2007-05-19          a
2007-05-21          b                
2008-04-28          a
2009-05-06          c')
datafile$Date <- as.Date(datafile$Date)
r2evans