I have a large data frame (`Fertig`) with 815 variables and about 5000 observations. One of the columns, `date`, contains years as values. I would like to visualize the missing values of the different variables per year. The command `naniar::gg_miss_fct(Fertig, date)` works, but with this many variables the resulting plot is too crowded to read.

So, how can I visualize the first 20 variables, then the next 20 variables, and so on? Even better would be to split them by the first 5 letters of the variable name, since those letters group the variables. Thanks.

Part of my data structure:

    head(structure(Fertig),10)
      1Berlin_Briefkurs Staatsschuldscheine 4%
    1                                       NA
      1Berlin_Geldkurs Staatsschuldscheine 4% 1Berlin_BK Staatsschuldscheine 3,5%
    1                                      NA                                  NA
      1Berlin_GK Staatsschuldscheine 3,5% 1Berlin_BK Pr.-Englische Obligation 1830
    1                                  NA                                       NA
      1Berlin_GK Pr.-Englische Obligation 1830
    1                                       NA
      1Berlin_BK Prämienscheine Seehandlung 1Berlin_GK Prämienscheine Seehandlung
    1                                    NA                                    NA
      1Berlin_BK Kurmärkische Obligation 1Berlin_GK Kurmärkische Obligation
    1                                 NA                                 NA
      1Berlin_BK Neumärkische Interimsscheine
    1                                      NA
      1Berlin_GK Neumärkische Interimsscheine
    1                                      NA
      1Berlin_BK Berliner Stadtobligationen 4%
    1                                       NA
      1Berlin_GK Berliner Stadtobligationen 4%
    1                                       NA
      1Berlin_BK Berliner Stadtobligationen 3,5%

    > dput(head(Fertig[, 1:5]))
    structure(list(`1Berlin_Briefkurs Staatsschuldscheine 4%` = c(NA_real_, 
    NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `1Berlin_Geldkurs Staatsschuldscheine 4%` = c(NA_real_, 
    NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `1Berlin_BK Staatsschuldscheine 3,5%` = c(NA_real_, 
    NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `1Berlin_GK Staatsschuldscheine 3,5%` = c(NA_real_, 
    NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `1Berlin_BK Pr.-Englische Obligation 1830` = c(NA_real_, 
    NA_real_, NA_real_, NA_real_, NA_real_, NA_real_)), row.names = c(NA, 
    6L), class = "data.frame")
Sulz
  • Could you please share your data using `dput`? – Quinten Jul 04 '22 at 08:54
  • I added about half of the output of `head(dput(Fertig, 2))`, since it always gets really big... sorry. – Sulz Jul 04 '22 at 09:00
  • Why is `naniar::miss_var_summary()` not enough for you? Do you need a visualisation? Otherwise, try running `dput(head(Fertig[, 1:5]))` and post the result here. And have you considered looping through every 20 variables? – Stephan Jul 04 '22 at 09:15
  • Thanks a lot. I wasn't aware of `miss_var_summary()`; it has already helped a lot. I added the `dput` output. I thought about looping, but I failed even at visualizing only the first couple of variables. – Sulz Jul 04 '22 at 09:24
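
The looping idea raised in the comments could be sketched roughly as follows. This is only a minimal sketch, not a tested solution for the real data: `Fertig_toy`, its column names, the prefixes other than `1Berlin`, and the helper `plot_missing()` are all hypothetical stand-ins, and it assumes the year column is literally named `date`. The idea is to split the variable names into groups (blocks of 20, or by their first 5 characters) and call `naniar::gg_miss_fct()` once per group, always keeping `date` in the subset.

```r
library(naniar)

# Hypothetical toy data standing in for Fertig: a year column plus 45 variables
# with non-syntactic names and some missing values.
set.seed(42)
n_obs <- 50
Fertig_toy <- data.frame(date = rep(1830:1834, each = 10))
prefixes <- c("1Berlin", "2Hamburg", "3Wien")  # only "1Berlin" appears in the question
for (j in 1:45) {
  x <- rnorm(n_obs)
  x[sample(n_obs, 15)] <- NA
  col <- sprintf("%s_Serie %02d", prefixes[(j - 1) %% 3 + 1], j)
  Fertig_toy[[col]] <- x
}

# All variables except the grouping column
vars <- setdiff(names(Fertig_toy), "date")

# Option 1: consecutive blocks of 20 variables
chunks <- split(vars, ceiling(seq_along(vars) / 20))

# Option 2: groups defined by the first 5 characters of the variable name
by_prefix <- split(vars, substr(vars, 1, 5))

# Hypothetical helper: missingness plot for one group of variables,
# keeping the date column so gg_miss_fct() can facet on it
plot_missing <- function(df, vars) {
  naniar::gg_miss_fct(df[, c("date", vars), drop = FALSE], fct = date)
}

plots <- lapply(chunks, plot_missing, df = Fertig_toy)
prefix_plots <- lapply(by_prefix, plot_missing, df = Fertig_toy)

# Show one plot at a time, e.g.:
# print(plots[[1]])
# print(prefix_plots[["1Berl"]])
```

With the real data, replacing `Fertig_toy` with `Fertig` should be the only change needed, provided the column is named `date`. Each list element is an ordinary ggplot object, so it can be printed or saved individually.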