I would like to estimate a fixed-effects (within) model on panel data by hand: compute each individual's group mean and subtract it from that individual's observations, which removes the time-invariant effects. I can then run a simple linear model on the demeaned variables:

library(plm)
library(data.table)
data("Grunfeld", package = "plm")
df <- head(Grunfeld, 60)  # first 60 rows = firms 1 to 3, 20 years each
This gives three individuals (firms):



           firm year    inv  value capital  
        1     1 1935  317.6 3078.5     2.8
        2     1 1936  391.8 4661.7    52.6
        3     1 1937  410.6 5387.1   156.9
        4     1 1938  257.7 2792.2   209.2
        5     1 1939  330.8 4313.2   203.4
        6     1 1940  461.2 4643.9   207.2
        7     1 1941  512.0 4551.2   255.2
        8     1 1942  448.0 3244.1   303.7
        9     1 1943  499.6 4053.7   264.1
        10    1 1944  547.5 4379.3   201.6
        11    1 1945  561.2 4840.9   265.0
        12    1 1946  688.1 4900.9   402.2
        13    1 1947  568.9 3526.5   761.5
        14    1 1948  529.2 3254.7   922.4
        15    1 1949  555.1 3700.2  1020.1
        16    1 1950  642.9 3755.6  1099.0
        17    1 1951  755.9 4833.0  1207.7
        18    1 1952  891.2 4924.9  1430.5
        19    1 1953 1304.4 6241.7  1777.3
        20    1 1954 1486.7 5593.6  2226.3
        21    2 1935  209.9 1362.4    53.8
        22    2 1936  355.3 1807.1    50.5
        23    2 1937  469.9 2676.3   118.1
        24    2 1938  262.3 1801.9   260.2
        25    2 1939  230.4 1957.3   312.7
        26    2 1940  361.6 2202.9   254.2
        27    2 1941  472.8 2380.5   261.4
        28    2 1942  445.6 2168.6   298.7
        29    2 1943  361.6 1985.1   301.8
        30    2 1944  288.2 1813.9   279.1
        31    2 1945  258.7 1850.2   213.8
        32    2 1946  420.3 2067.7   132.6
        33    2 1947  420.5 1796.7   264.8
        34    2 1948  494.5 1625.8   306.9
        35    2 1949  405.1 1667.0   351.1
        36    2 1950  418.8 1677.4   357.8
        37    2 1951  588.2 2289.5   342.1
        38    2 1952  645.5 2159.4   444.2
        39    2 1953  641.0 2031.3   623.6
        40    2 1954  459.3 2115.5   669.7
        41    3 1935   33.1 1170.6    97.8
        42    3 1936   45.0 2015.8   104.4
        43    3 1937   77.2 2803.3   118.0
        44    3 1938   44.6 2039.7   156.2
        45    3 1939   48.1 2256.2   172.6
        46    3 1940   74.4 2132.2   186.6
        47    3 1941  113.0 1834.1   220.9
        48    3 1942   91.9 1588.0   287.8
        49    3 1943   61.3 1749.4   319.9
        50    3 1944   56.8 1687.2   321.3
        51    3 1945   93.6 2007.7   319.6
        52    3 1946  159.9 2208.3   346.0
        53    3 1947  147.2 1656.7   456.4
        54    3 1948  146.3 1604.4   543.4
        55    3 1949   98.3 1431.8   618.3
        56    3 1950   93.5 1610.5   647.4
        57    3 1951  135.2 1819.4   671.3
        58    3 1952  157.3 2079.7   726.1
        59    3 1953  179.5 2371.6   800.3
        60    3 1954  189.6 2759.9   888.9

I would generally do the following:

    # First I calculate the group mean of each variable by firm
    setDT(df)[, al_mean := mean(capital), by = firm]   # firm mean of capital
    setDT(df)[, all_mean := mean(value), by = firm]    # firm mean of value

    # Then I subtract the group mean from each observation
    df$Y_fix <- df$capital - df$al_mean
    df$X_fix <- df$value - df$all_mean

         firm year    inv  value capital al_mean    Y_fix all_mean     X_fix
    1:    1 1935  317.6 3078.5     2.8 648.435 -645.635 4333.845 -1255.345
    2:    1 1936  391.8 4661.7    52.6 648.435 -595.835 4333.845   327.855
    3:    1 1937  410.6 5387.1   156.9 648.435 -491.535 4333.845  1053.255
    4:    1 1938  257.7 2792.2   209.2 648.435 -439.235 4333.845 -1541.645
    5:    1 1939  330.8 4313.2   203.4 648.435 -445.035 4333.845   -20.645
    6:    1 1940  461.2 4643.9   207.2 648.435 -441.235 4333.845   310.055
    7:    1 1941  512.0 4551.2   255.2 648.435 -393.235 4333.845   217.355
    8:    1 1942  448.0 3244.1   303.7 648.435 -344.735 4333.845 -1089.745
    9:    1 1943  499.6 4053.7   264.1 648.435 -384.335 4333.845  -280.145
    10:    1 1944  547.5 4379.3   201.6 648.435 -446.835 4333.845    45.455
    11:    1 1945  561.2 4840.9   265.0 648.435 -383.435 4333.845   507.055
    12:    1 1946  688.1 4900.9   402.2 648.435 -246.235 4333.845   567.055
    13:    1 1947  568.9 3526.5   761.5 648.435  113.065 4333.845  -807.345
    14:    1 1948  529.2 3254.7   922.4 648.435  273.965 4333.845 -1079.145
    15:    1 1949  555.1 3700.2  1020.1 648.435  371.665 4333.845  -633.645
    16:    1 1950  642.9 3755.6  1099.0 648.435  450.565 4333.845  -578.245
    17:    1 1951  755.9 4833.0  1207.7 648.435  559.265 4333.845   499.155
    18:    1 1952  891.2 4924.9  1430.5 648.435  782.065 4333.845   591.055
    19:    1 1953 1304.4 6241.7  1777.3 648.435 1128.865 4333.845  1907.855
    20:    1 1954 1486.7 5593.6  2226.3 648.435 1577.865 4333.845  1259.755
    21:    2 1935  209.9 1362.4    53.8 294.855 -241.055 1971.825  -609.425
    22:    2 1936  355.3 1807.1    50.5 294.855 -244.355 1971.825  -164.725
    23:    2 1937  469.9 2676.3   118.1 294.855 -176.755 1971.825   704.475
    24:    2 1938  262.3 1801.9   260.2 294.855  -34.655 1971.825  -169.925
    25:    2 1939  230.4 1957.3   312.7 294.855   17.845 1971.825   -14.525
    26:    2 1940  361.6 2202.9   254.2 294.855  -40.655 1971.825   231.075
    27:    2 1941  472.8 2380.5   261.4 294.855  -33.455 1971.825   408.675
    28:    2 1942  445.6 2168.6   298.7 294.855    3.845 1971.825   196.775
    29:    2 1943  361.6 1985.1   301.8 294.855    6.945 1971.825    13.275
    30:    2 1944  288.2 1813.9   279.1 294.855  -15.755 1971.825  -157.925
    31:    2 1945  258.7 1850.2   213.8 294.855  -81.055 1971.825  -121.625
    32:    2 1946  420.3 2067.7   132.6 294.855 -162.255 1971.825    95.875
    33:    2 1947  420.5 1796.7   264.8 294.855  -30.055 1971.825  -175.125
    34:    2 1948  494.5 1625.8   306.9 294.855   12.045 1971.825  -346.025
    35:    2 1949  405.1 1667.0   351.1 294.855   56.245 1971.825  -304.825
    36:    2 1950  418.8 1677.4   357.8 294.855   62.945 1971.825  -294.425
    37:    2 1951  588.2 2289.5   342.1 294.855   47.245 1971.825   317.675
    38:    2 1952  645.5 2159.4   444.2 294.855  149.345 1971.825   187.575
    39:    2 1953  641.0 2031.3   623.6 294.855  328.745 1971.825    59.475
    40:    2 1954  459.3 2115.5   669.7 294.855  374.845 1971.825   143.675
    41:    3 1935   33.1 1170.6    97.8 400.160 -302.360 1941.325  -770.725
    42:    3 1936   45.0 2015.8   104.4 400.160 -295.760 1941.325    74.475
    43:    3 1937   77.2 2803.3   118.0 400.160 -282.160 1941.325   861.975
    44:    3 1938   44.6 2039.7   156.2 400.160 -243.960 1941.325    98.375
    45:    3 1939   48.1 2256.2   172.6 400.160 -227.560 1941.325   314.875
    46:    3 1940   74.4 2132.2   186.6 400.160 -213.560 1941.325   190.875
    47:    3 1941  113.0 1834.1   220.9 400.160 -179.260 1941.325  -107.225
    48:    3 1942   91.9 1588.0   287.8 400.160 -112.360 1941.325  -353.325
    49:    3 1943   61.3 1749.4   319.9 400.160  -80.260 1941.325  -191.925
    50:    3 1944   56.8 1687.2   321.3 400.160  -78.860 1941.325  -254.125
    51:    3 1945   93.6 2007.7   319.6 400.160  -80.560 1941.325    66.375
    52:    3 1946  159.9 2208.3   346.0 400.160  -54.160 1941.325   266.975
    53:    3 1947  147.2 1656.7   456.4 400.160   56.240 1941.325  -284.625
    54:    3 1948  146.3 1604.4   543.4 400.160  143.240 1941.325  -336.925
    55:    3 1949   98.3 1431.8   618.3 400.160  218.140 1941.325  -509.525
    56:    3 1950   93.5 1610.5   647.4 400.160  247.240 1941.325  -330.825
    57:    3 1951  135.2 1819.4   671.3 400.160  271.140 1941.325  -121.925
    58:    3 1952  157.3 2079.7   726.1 400.160  325.940 1941.325   138.375
    59:    3 1953  179.5 2371.6   800.3 400.160  400.140 1941.325   430.275
    60:    3 1954  189.6 2759.9   888.9 400.160  488.740 1941.325   818.575                                  

Now I can either run plm with the fixed-effects (within) model or a simple lm on the demeaned variables:

summary(plm(capital ~ value, df, index = c("firm", "year"), na.action = na.omit, model = "within"))
summary(lm(Y_fix ~ X_fix, df))
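
As a quick check that the two calls estimate the same slope, here is a minimal sketch. It uses base R's `ave()` for the demeaning instead of data.table, and the object names `fe_plm` and `fe_lm` are only illustrative. The slope should match, but the standard errors reported by `lm` will generally differ from those of `plm`, because `lm` does not count the firm means as estimated parameters when it computes the degrees of freedom.

```r
library(plm)

data("Grunfeld", package = "plm")
df <- head(Grunfeld, 60)  # firms 1 to 3

# Within transformation: subtract each firm's mean from its observations
df$Y_fix <- df$capital - ave(df$capital, df$firm)
df$X_fix <- df$value - ave(df$value, df$firm)

# Fixed-effects estimate from plm and from lm on the demeaned data
fe_plm <- plm(capital ~ value, data = df, index = c("firm", "year"), model = "within")
fe_lm  <- lm(Y_fix ~ X_fix, data = df)

# Compare the slope coefficients of the two fits
all.equal(unname(coef(fe_plm)["value"]), unname(coef(fe_lm)["X_fix"]))
```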

However, I'm not sure how to do this by hand when there are dozens of variables. Is there a shortcut that demeans all of them at once and keeps the intended structure, with X_fix, Y_fix, Z_fix and so on?
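
One possible shortcut, sketched under assumptions rather than offered as a definitive answer: data.table can apply the same function to many columns per group through `.SD` and `.SDcols`. The vector `vars` and the `<name>_fix` naming scheme are choices made for this illustration; they stand in for the X_fix, Y_fix, Z_fix structure described above.

```r
library(data.table)
library(plm)  # only needed here for the Grunfeld data

data("Grunfeld", package = "plm")
dt <- as.data.table(head(Grunfeld, 60))  # firms 1 to 3

# Columns to demean; extend this vector to cover dozens of variables
vars <- c("inv", "value", "capital")
fix_names <- paste0(vars, "_fix")  # illustrative naming: inv_fix, value_fix, ...

# For every firm, subtract the firm mean from each selected column
dt[, (fix_names) := lapply(.SD, function(x) x - mean(x)),
   by = firm, .SDcols = vars]

# The demeaned columns can then be used in a plain lm
fit <- lm(capital_fix ~ value_fix, data = dt)
summary(fit)
```

Because the new column names are built from `vars`, the model formula could also be assembled programmatically, for example with `reformulate()`, when the list of variables is long.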

  • It's easier to help you if you include a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Don't put random "..." in your sample data, because then we can't test with it. Give the exact output you want for your sample data. What exactly does "given multiple observations and variables" mean? How does `lm` relate to your desire to subtract the mean? – MrFlick Sep 27 '16 at 19:51
  • Hi @MrFlick, yes, I will try to add a reproducible example. –  Sep 27 '16 at 19:53
  • @MrFlick, so sorry again! I think it should be reproducible now. –  Sep 27 '16 at 20:17
