
Is it possible to customize the plot returned by the built-in function plot_timetrend of the keyATM R package? The code

fig_timetrend <- plot_timetrend(out,
                                time_index_label = df$year,
                                xlab = "Year of complaint submission",
                                scales = "fixed",
                                width = 5)

yields the following:

[image: the time-trend plot produced by plot_timetrend]

This graph is fine, but I would like to add a title (as with ggtitle) and customize the axes, which plot_timetrend does not support directly. Can I do this with ggplot2? And if yes, how?

Thank you in advance.

P.S. You can generate the output model with this code:

out <- keyATM(docs              = keyATM_docs,
              no_keyword_topics = 9,
              keywords          = keywords,
              model             = "dynamic",
              model_settings    = list(time_index = df_period$period,
                                       num_states = 2), 
              options           = list(seed = 400,
                                       store_theta = TRUE, 
                                       thinning = 5))

KR, Aleksandra

  • Can you share with us your `out` or the code and data used to generate it? We also would need your `df` object. You can use `dput()` on variables or the `reprex` package to generate a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Without your data or a sample of it, we cannot generate your plot and try to customize it. – Ben Norris Sep 27 '20 at 15:14
  • @BenNorris here is the link to the out file https://1drv.ms/u/s!AgVxX5DNjxxOjRdkgvTN1mA1q023?e=nnyzm8 – Aleksandra Butneva Sep 27 '20 at 16:28
  • And here is how you generate it: `df_period %>% select(year, period) out <- keyATM(docs = keyATM_docs, no_keyword_topics = 9, keywords = keywords, model = "dynamic", model_settings = list(time_index = df_period$period, num_states = 2), options = list(seed = 400,store_theta = TRUE, thinning = 5)) top_words(out)` – Aleksandra Butneva Sep 27 '20 at 16:31
  • So, what I want is this mean theta (posterior distribution over topics for each document) in a nicer ggplot format – Aleksandra Butneva Sep 27 '20 at 16:35
  • We can't use your code, because we don't have `keywords` or `df_period`. It's better to use examples from the help pages, which everyone should have. – user2554330 Sep 27 '20 at 16:45

1 Answer

The fig_timetrend object is a composite object, with the plot itself stored as fig_timetrend$figure. You can use that element as a ggplot2 object and add titles, labels and other components to it with `+`. For example, using code from the keyATM help topic:

  library(keyATM)
  library(quanteda)
  data(keyATM_data_bills)
  bills_keywords <- keyATM_data_bills$keywords
  bills_dfm <- keyATM_data_bills$doc_dfm  # quanteda dfm object
  keyATM_docs <- keyATM_read(bills_dfm)

  # keyATM Dynamic
  bills_time_index <- keyATM_data_bills$time_index
  # Time index should start from 1 and increase by 1
  bills_time_index <- as.integer(bills_time_index - 100)
  out <- keyATM(docs = keyATM_docs, model = "dynamic",
                no_keyword_topics = 5, keywords = bills_keywords,
                model_settings = list(num_states = 5,
                                      time_index = bills_time_index))
  fig_timetrend <- plot_timetrend(out,
                                  time_index_label = bills_time_index,
                                  xlab = "Year of complaint submission",
                                  scales = "fixed",
                                  width = 5)
  library(ggplot2)
  fig_timetrend$figure + labs(title = "My title")

[image: screenshot of the time-trend plot with the title "My title"]
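The question also asks about customizing the axes. Since fig_timetrend$figure behaves as a ggplot2 object, the usual ggplot2 additions should apply to it in the same way. Below is a minimal sketch that runs without keyATM: the data frame `toy` and the plot `p` are made-up stand-ins for fig_timetrend$figure (hypothetical values, not keyATM output), and the axis labels and break positions are arbitrary choices for illustration.

```r
library(ggplot2)

# Stand-in for fig_timetrend$figure: a small faceted line plot built from
# made-up numbers (hypothetical data, not keyATM output).
toy <- data.frame(
  time  = rep(2001:2005, times = 2),
  theta = c(0.10, 0.12, 0.15, 0.13, 0.18,
            0.30, 0.28, 0.25, 0.27, 0.22),
  topic = rep(c("Topic A", "Topic B"), each = 5)
)
p <- ggplot(toy, aes(x = time, y = theta)) +
  geom_line() +
  facet_wrap(~ topic)

# Add a title, relabel both axes, choose the x-axis breaks,
# and rotate the x-axis tick labels.
p_custom <- p +
  labs(title = "My title", x = "Year", y = "Mean topic proportion") +
  scale_x_continuous(breaks = seq(2001, 2005, by = 2)) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

print(p_custom)
```

With the real figure, replace `p` with fig_timetrend$figure. Two things are worth checking there: if the figure already defines a scale for x, adding another one replaces it (ggplot2 prints a message when this happens), and scale_x_continuous only applies if the x axis is continuous, which I have not verified for the keyATM figure.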

user2554330