
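I have a panel data set of 10 observations (countries) and 3 variables, covering three years. That makes 30 rows in total: 10 rows (= countries) per year, with 2 columns for the migration parameters and 1 column for the respective year. My data frame consists of 3 annual data frames stacked on top of each other, so to speak.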

How can I apply stargazer to the whole period while taking into account that it is a panel data set (so max N=10)? That is, R should start over after each block of 10 rows. I'd like to get the pretty table of descriptive statistics.

The data set for the first three years:

structure(list(Population = c(21759420, 8696916, 1946351, 14689726, 
8212264, 491723, 18907008, 4345386, 11133861, 657229, 22549547, 
8944706, 1979882, 15141099, 8489031, 496963, 19432541, 4404230, 
11502786, 673252, 23369131, 9199259, 2014866, 15605217, 8766930, 
502384, 19970495, 4448525, 11887202, 689692), Distance..km. = c(7243L, 
4290L, 9500L, 3789L, 6452L, 2211L, 4667L, 5036L, 4047L, 9140L, 
7243L, 4290L, 9500L, 3789L, 6452L, 2211L, 4667L, 5036L, 4047L, 
9140L, 7243L, 4290L, 9500L, 3789L, 6452L, 2211L, 4667L, 5036L, 
4047L, 9140L), year = c(2008, 2008, 2008, 2008, 2008, 2008, 2008, 
2008, 2008, 2008, 2009, 2009, 2009, 2009, 2009, 2009, 2009, 2009, 
2009, 2009, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 
2010)), .Names = c("Population", "Distance..km.", "year"), row.names = c(1L, 
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 50L, 51L, 52L, 53L, 54L, 
55L, 56L, 57L, 58L, 59L, 99L, 100L, 101L, 102L, 103L, 104L, 105L, 
106L, 107L, 108L), class = "data.frame")

I still get descriptive statistics based on N=30, but it should be N=10, since I'm looking for the descriptive statistics over the whole period of three years, and each yearly data frame needs to be considered in isolation for that. I hope I have expressed the problem comprehensibly.

aluuusch
  • By **time series**, do you mean panel data? A time series is univariate, whereas panel data is multivariate and can have more than one entity. Also, `stargazer` is a package for printing well-formatted tables, not an analysis tool, so your question of _"R should start over to analyse after every 49th row."_ does not make any sense. – acylam Nov 10 '17 at 16:07
  • What exactly are you trying to do here? [stargazer](https://cran.r-project.org/web/packages/stargazer/index.html) just makes pretty tables and doesn't really do an analysis. You should provide some sort of minimal [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with data that can be used for testing and a clear description of the desired output. – MrFlick Nov 10 '17 at 16:19
  • Your sample data only has one row... Please provide panel data by copying and pasting the output of `dput(my_data)` into your question. – acylam Nov 10 '17 at 16:42
  • Please read my comment again, and provide the `dput(my_data)` version instead of what you have here. Also read MrFlick's link on [how to provide a minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – acylam Nov 10 '17 at 17:09

1 Answer


You can either use `split` + `lapply` from base R:

library(stargazer)

lapply(split(df, df$year), stargazer, type = "text")

or `by`:

by(df, df$year, stargazer, type = 'text')

Result:

===============================================================
Statistic     N      Mean        St. Dev.      Min      Max    
---------------------------------------------------------------
Population    10 9,083,988.000 7,541,970.000 491,723 21,759,420
Distance..km. 10   5,637.500     2,385.941    2,211    9,500   
year          10   2,008.000       0.000      2,008    2,008   
---------------------------------------------------------------

===============================================================
Statistic     N      Mean        St. Dev.      Min      Max    
---------------------------------------------------------------
Population    10 9,361,404.000 7,798,880.000 496,963 22,549,547
Distance..km. 10   5,637.500     2,385.941    2,211    9,500   
year          10   2,009.000       0.000      2,009    2,009   
---------------------------------------------------------------

===============================================================
Statistic     N      Mean        St. Dev.      Min      Max    
---------------------------------------------------------------
Population    10 9,645,370.000 8,065,676.000 502,384 23,369,131
Distance..km. 10   5,637.500     2,385.941    2,211    9,500   
year          10   2,010.000       0.000      2,010    2,010   
---------------------------------------------------------------
df$year: 2008
[1] ""                                                               
[2] "==============================================================="
[3] "Statistic     N      Mean        St. Dev.      Min      Max    "
[4] "---------------------------------------------------------------"
[5] "Population    10 9,083,988.000 7,541,970.000 491,723 21,759,420"
[6] "Distance..km. 10   5,637.500     2,385.941    2,211    9,500   "
[7] "year          10   2,008.000       0.000      2,008    2,008   "
[8] "---------------------------------------------------------------"
-------------------------------------------------------------------------- 
df$year: 2009
[1] ""                                                               
[2] "==============================================================="
[3] "Statistic     N      Mean        St. Dev.      Min      Max    "
[4] "---------------------------------------------------------------"
[5] "Population    10 9,361,404.000 7,798,880.000 496,963 22,549,547"
[6] "Distance..km. 10   5,637.500     2,385.941    2,211    9,500   "
[7] "year          10   2,009.000       0.000      2,009    2,009   "
[8] "---------------------------------------------------------------"
-------------------------------------------------------------------------- 
df$year: 2010
[1] ""                                                               
[2] "==============================================================="
[3] "Statistic     N      Mean        St. Dev.      Min      Max    "
[4] "---------------------------------------------------------------"
[5] "Population    10 9,645,370.000 8,065,676.000 502,384 23,369,131"
[6] "Distance..km. 10   5,637.500     2,385.941    2,211    9,500   "
[7] "year          10   2,010.000       0.000      2,010    2,010   "
[8] "---------------------------------------------------------------"

The disadvantage of these two methods is that they print the tables twice: once when stargazer prints its own output, and again when the value returned by lapply/by is printed. To get around this, you can use walk from purrr to call stargazer only for its side effects:

library(dplyr)
library(purrr)

df %>%
  split(.$year) %>%
  walk(~ stargazer(., type = "text"))

Result:

===============================================================
Statistic     N      Mean        St. Dev.      Min      Max    
---------------------------------------------------------------
Population    10 9,083,988.000 7,541,970.000 491,723 21,759,420
Distance..km. 10   5,637.500     2,385.941    2,211    9,500   
year          10   2,008.000       0.000      2,008    2,008   
---------------------------------------------------------------

===============================================================
Statistic     N      Mean        St. Dev.      Min      Max    
---------------------------------------------------------------
Population    10 9,361,404.000 7,798,880.000 496,963 22,549,547
Distance..km. 10   5,637.500     2,385.941    2,211    9,500   
year          10   2,009.000       0.000      2,009    2,009   
---------------------------------------------------------------

===============================================================
Statistic     N      Mean        St. Dev.      Min      Max    
---------------------------------------------------------------
Population    10 9,645,370.000 8,065,676.000 502,384 23,369,131
Distance..km. 10   5,637.500     2,385.941    2,211    9,500   
year          10   2,010.000       0.000      2,010    2,010   
---------------------------------------------------------------
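
If adding purrr is not desirable, a base R alternative is to wrap the `lapply` call in `invisible()`, which suppresses the auto-printing of the returned list while stargazer still prints each table itself. A minimal sketch; the small `df` below is a made-up stand-in for the question's data, and `by_year` is just an illustrative name:

```r
library(stargazer)

# Made-up stand-in for the panel data (two years, three rows each)
df <- data.frame(
  Population = c(100, 200, 300, 110, 210, 310),
  Distance   = c(5, 6, 7, 5, 6, 7),
  year       = rep(c(2008, 2009), each = 3)
)

# One data frame per year
by_year <- split(df, df$year)

# stargazer() prints each table; invisible() hides the returned list,
# so every table appears only once
invisible(lapply(by_year, stargazer, type = "text"))
```

This only hides the second printout at the top level; the tables themselves are identical to the ones above.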

Note:

All methods above also work for LaTeX output (type = "latex"). I only set type = "text" for demonstration purposes.
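
The same pattern should carry over to HTML (type = "html"). As a rough sketch: stargazer returns the table lines as a character vector (this is what `by` printed above), so the per-year tables can be collected and written to one file. The data, the file name `tables.html` and the titles are made up for illustration, and `capture.output()` is used here only to keep the console quiet:

```r
library(stargazer)

# Made-up stand-in for the panel data (two years, three rows each)
df <- data.frame(
  Population = c(100, 200, 300, 110, 210, 310),
  Distance   = c(5, 6, 7, 5, 6, 7),
  year       = rep(c(2008, 2009), each = 3)
)

# Build one HTML table per year and keep the returned lines
tables <- lapply(split(df, df$year), function(d) {
  out <- NULL
  # capture.output() swallows what stargazer prints; `out` keeps its return value
  capture.output(
    out <- stargazer(d, type = "html", title = paste("Year", d$year[1]))
  )
  out
})

# Concatenate all tables and write them to a single (hypothetical) file
html <- unlist(tables, use.names = FALSE)
writeLines(html, "tables.html")
```

Writing everything in one `writeLines()` call avoids the situation where each table is written to the same file separately and only the last one survives.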

acylam
  • Thanks a lot, that works! I still need to solve two more things. 1. When I open the result in my browser as an .htm file, it only shows me the last table (I may not use LaTeX). 2. I now have the tables for each year. How can I combine those tables into one, and thereby get the statistics for the years 2008-2010? – aluuusch Nov 13 '17 at 17:33
  • @aluuusch Not sure about No.1 as I can't reproduce your issue. For No.2, just write: `stargazer(df, type = 'text')`? – acylam Nov 13 '17 at 18:15