I use R. I hope my question will not be considered too "stupid", but I really can't understand the error I am making.

I have a national survey from 2002 to 2014. Each year it asks for the dimension (size, in number of workers) of the company where the interviewee works. A numeric code (1, 2, ...) is associated with each dimension class. From 2002 to 2006 there are six dimension classes, whereas from 2008 to 2014 there are seven:

    2002-2006                          2008-2014
    0-4 workers ->     1               0-4 workers ->      1   
    5-19 workers ->    2               5-15 workers ->     2
    20-49 workers ->   3               16-19 workers ->    3
    50-99 workers ->   4               20-49 workers ->    4
    100-499 workers -> 5               50-99 workers ->    5
    >500 workers ->    6               100-499 workers ->  6
                                       >500 workers ->     7

First, I want to change the code of class 3 (16-19 workers) in the years 2008-14 to code 2, so that it matches the dimension class (5-19 workers) that code 2 has in 2002-06. A small sample of the data:

d.d <- data.frame(id=c(1,2,3,4,5,6), yr=c("2002", "2004", "2006", "2008", "2010", "2014"), dim=c(1,2,3,3,4,7))

For example:

id   yr    dim
1    2002   1
2    2004   2
3    2006   3
4    2008   3
5    2010   4
6    2014   7

the desired output is:

id   yr    dim
1    2002   1
2    2004   2
3    2006   3
4    2008   2
5    2010   3
6    2014   6 

COMMAND 1

d.d$dim2 <- ifelse(d.d$dim=="3" & d.d$yr=="2008",2,
                    ifelse(d.d$dim=="3" & d.d$yr=="2010",2,
                           ifelse(d.d$dim=="3" & d.d$yr=="2012",2,
                                  ifelse(d.d$dim=="3" & d.d$yr=="2014",2,
                                         d.d$dim))))

where dim is the company dimension and yr is the year. This correctly changed class 3 to class 2 for the years 2008 to 2014.

Since the remaining codes still do not refer to the same dimension class (in 2002-06 code 3 is 20-49 workers, while in 2008-14 it is code 4 that is 20-49 workers), I tried to align the codes as before:

COMMAND 2

   d.d$dim2 <- ifelse(d.d$dim=="4" & d.d$yr=="2008",3,
                        ifelse(d.d$dim=="4" & d.d$yr=="2010",3,
                               ifelse(d.d$dim=="4" & d.d$yr=="2012",3,
                                      ifelse(d.d$dim=="4" & d.d$yr=="2014",3,
                                             d.d$dim))))

I noticed that the second command also changes the code already changed by COMMAND 1.

RESULT WITH COMMAND 1

d.d

      id   yr dim dim2
    1  1 2002   1    1
    2  2 2004   2    2
    3  3 2006   3    3
  **4  4 2008   3    2**
    5  5 2010   4    4
    6  6 2014   7    7

RESULT AFTER APPLYING COMMAND 2 (AFTER COMMAND 1)

d.d

  id     yr dim dim2
  1  1 2002   1    1
  2  2 2004   2    2
  3  3 2006   3    3
**4  4 2008   3    3**
  5  5 2010   4    3
  6  6 2014   7    7

I can't understand the error.

David Arenburg
Laura R.
    Why would you nest 4 `ifelse` calls if they are all doing the exact same thing? Why not ```d.d[d.d$dim == 4 & d.d$yr %in% c(2008, 2010, 2012, 2014), "dim2"] <- 3``` for instance? Also, why are your numeric values specified as characters? E.g., why `"4"` instead of just `4`? Also, please see how to make a [reproducible example in R](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – David Arenburg Oct 17 '16 at 16:31

1 Answer

Try this:

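# as.character() first, so the conversion is also correct when a column is a factor
d.d$yr = as.numeric(as.character(d.d$yr))
d.d$dim = as.numeric(as.character(d.d$dim))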

d.d$dim[ d.d$dim >= 3 & d.d$yr >= 2008 ] = d.d$dim[ d.d$dim >= 3 & d.d$yr >= 2008 ] - 1

First, change the year and dim information to numeric. This will simplify the condition for the subset you want modified.

Then subtract 1 from dim in every row where dim is 3 or more and the year is 2008 or later.
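As a rough check, here is the same idea applied to the sample data from the question. This is only a sketch: the helper name `recode` is made up for illustration, and the result is written to `dim2` rather than overwriting `dim`, so the original codes stay available for comparison.

```r
# Sample data from the question; stringsAsFactors = FALSE keeps yr as character
d.d <- data.frame(id  = 1:6,
                  yr  = c("2002", "2004", "2006", "2008", "2010", "2014"),
                  dim = c(1, 2, 3, 3, 4, 7),
                  stringsAsFactors = FALSE)

d.d$yr <- as.numeric(d.d$yr)

# Rows to shift down by one: code 3 or higher, from 2008 onwards
# (`recode` is a hypothetical helper name, not part of the question)
recode <- d.d$dim >= 3 & d.d$yr >= 2008

# Start from a copy of dim, then change only the selected rows
d.d$dim2 <- d.d$dim
d.d$dim2[recode] <- d.d$dim[recode] - 1

d.d$dim2
# 1 2 3 2 3 6, the desired output from the question
```

Doing the whole recode in one step also avoids what happened in the question: COMMAND 2 falls back to `d.d$dim` instead of `d.d$dim2`, so it rebuilds `dim2` from the original codes and discards the change made by COMMAND 1.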

If year or dim are factors, convert them with as.numeric(as.character(...)). Calling as.numeric() directly on a factor returns its internal level codes, not the original values.
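A small illustration of why the extra as.character() matters, using a made-up factor `yr_factor` (not part of the question's data):

```r
# A factor of years, as you might get from data.frame() in older R versions
yr_factor <- factor(c("2002", "2008", "2014"))

# Direct conversion returns the level codes, not the years
codes <- as.numeric(yr_factor)
codes
# 1 2 3

# Going through character first recovers the actual values
years <- as.numeric(as.character(yr_factor))
years
# 2002 2008 2014
```

With the level codes, a condition such as `yr >= 2008` would be FALSE for every row, so nothing would be recoded.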

R. Schifini