I used the TextRank algorithm to rank the sentences of some articles. The number of sentences per article ranges from 10 to 71. Is there a way to determine the value of k when selecting the top k ranked sentences as the summary, or is k simply fixed to some number?

Pritam Deka

1 Answer

That's mostly determined by how long a summary you need. In other words, if the summary must fit some constraint (e.g., 400 characters or fewer; at least 50 words), which setting of k satisfies that constraint? In that sense, choosing k is similar to hyperparameter optimization in ML.

Quality is also affected by the choice. Too small a k yields summaries that probably aren't effective; FWIW, I generally try to use k >= 3. Too large a k makes the results less readable.
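
As a rough illustration of picking k from a length constraint, here is a minimal sketch. It assumes you already have the ranked sentences as a list of `(position, sentence)` pairs ordered best-first; that input shape and the helper names `summary_for_k`, `largest_k_within`, and `closest_k_to` are made up for this example and are not part of any TextRank library.

```python
def summary_for_k(ranked, k):
    """Join the top-k sentences, restored to their original document order.

    `ranked` is a hypothetical list of (position, sentence) pairs, best first.
    """
    top = sorted(ranked[:k])  # tuples sort by position first
    return " ".join(sentence for _, sentence in top)


def largest_k_within(ranked, max_chars):
    """Largest k whose summary fits in max_chars (0 if even k=1 is too long)."""
    best = 0
    for k in range(1, len(ranked) + 1):
        # Summary length only grows with k, so stop at the first overflow.
        if len(summary_for_k(ranked, k)) > max_chars:
            break
        best = k
    return best


def closest_k_to(ranked, target_words):
    """k whose summary word count is closest to target_words (ties go to the smaller k)."""
    candidates = range(1, len(ranked) + 1)
    return min(
        candidates,
        key=lambda k: abs(len(summary_for_k(ranked, k).split()) - target_words),
    )
```

For a hard limit such as 400 characters you could call `largest_k_within(ranked, 400)`; for a soft target such as roughly 50 words, `closest_k_to(ranked, 50)`. Either result can still be clamped to a floor such as 3 if very short summaries aren't useful to you.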

Paco
  • Thanks. So it is fixed to be a certain number according to our needs. – Pritam Deka Dec 03 '20 at 10:57
  • It may also be the case that you could vary the value of `k` for each document being summarized -- if there were criteria to use in that decision. For example, perhaps you wanted summaries of a certain target length, then you might iterate through a range to see which gets closest. – Paco Dec 04 '20 at 16:24