0

When you share a link on Facebook or Digg, the site generates a summary of the page. How would I do this in Perl? What algorithms are there for it?

For example:

If I go to Facebook and try to share this question as a link: How can I create a website summary with Perl?

It retrieves "Facebook/Digg get website summary? - Stack Overflow" as the title (which is just the title of the page) and [... incomplete question?]

Timmy

4 Answers

4

CPAN is your friend.

Some promising looking modules:

pimlottc
2

Assuming you mean sharing a link...

Usually the summary is written by the user submitting the URL. If you have to write a summary automagically this can be achieved by:

  • Using the first 100 or so characters of the document body (in itself not easy)
  • Using metadata like the description or keywords (often empty or spammed)
  • Context-relevant summaries like recreating Google snippets (sorry, it's PHP, but simple)
  • Tags/keywords from the document using something like the Yahoo Keyword Extractor API or your own keyword density function
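As a rough illustration of the first and last options in the list above, here is a minimal sketch in core Perl. The helper names (html_to_text, excerpt, top_keywords) and the tiny stop-word list are made up for this example, and the regex-based tag stripping is a simplification; a real HTML parser from CPAN would likely cope better with messy markup.

```perl
use strict;
use warnings;

# Hypothetical helper: reduce an HTML string to rough plain text.
# Regex stripping is an assumption made for brevity, not a robust parser.
sub html_to_text {
    my ($html) = @_;
    my ($body) = $html =~ m{<body\b[^>]*>(.*?)</body>}is;
    $body = $html unless defined $body;                  # no <body>: use everything
    $body =~ s{<(script|style)\b[^>]*>.*?</\1>}{ }gis;   # drop non-visible content
    $body =~ s/<[^>]+>/ /g;                              # drop remaining tags
    $body =~ s/\s+/ /g;                                  # collapse whitespace
    $body =~ s/^ //;
    $body =~ s/ $//;
    return $body;
}

# Hypothetical helper: first $max characters, minus the trailing
# (possibly partial) word, with an ellipsis appended.
sub excerpt {
    my ($text, $max) = @_;
    $max //= 100;
    return $text if length($text) <= $max;
    my $cut = substr($text, 0, $max);
    $cut =~ s/\s+\S*$//;
    return "$cut...";
}

# Hypothetical helper: the $n most frequent words of 3+ letters,
# ignoring a very small, illustrative stop-word list.
sub top_keywords {
    my ($text, $n) = @_;
    my %stop = map { $_ => 1 } qw(the and for with that this are was you);
    my %count;
    for my $word (lc($text) =~ /([a-z]{3,})/g) {
        $count{$word}++ unless $stop{$word};
    }
    # Most frequent first; ties broken alphabetically for stable output.
    my @ranked = sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count;
    $n = @ranked if $n > @ranked;
    return @ranked[0 .. $n - 1];
}

my $html = <<'HTML';
<html><head><title>Demo</title><style>p { color: red }</style></head>
<body><h1>Perl summaries</h1>
<p>Perl makes text processing easy. Perl modules on CPAN help with HTML parsing.</p>
</body></html>
HTML

my $text = html_to_text($html);
print excerpt($text, 40), "\n";
print join(', ', top_keywords($text, 3)), "\n";
```

Raw word counts like this tend to surface navigation and boilerplate words on real pages, which is part of why the excerpt approach is "in itself not easy".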

Your best bet is to ask the user!

Hope that helps somewhat :)

Al.
1

Basically, you want to scrape the URL and find the "most significant paragraph", which might be the first <div> or <p> element after the first <h2> or <h1>, depending on the layout of the page.
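One way this heuristic might look in core Perl is sketched below. The function name lead_paragraph is invented for the example, and the regexes assume reasonably tidy markup: nested <div> elements, for instance, would be cut at the first closing tag.

```perl
use strict;
use warnings;

# Hypothetical helper: text of the first <p> or <div> that follows the
# first <h1> or <h2>. Returns undef if there is no heading or no such element.
sub lead_paragraph {
    my ($html) = @_;

    # Find the end of the first <h1>/<h2>; the /g match in scalar
    # context records that position in pos($html).
    return unless $html =~ m{</h[12]\s*>}ig;
    my $rest = substr($html, pos($html));

    # Take the next <p> or <div> after the heading (non-greedy, so
    # nested elements of the same kind are not handled).
    return unless $rest =~ m{<(p|div)\b[^>]*>(.*?)</\1\s*>}is;
    my $text = $2;

    $text =~ s/<[^>]+>/ /g;   # strip inline tags such as <b> or <a>
    $text =~ s/\s+/ /g;       # collapse whitespace
    $text =~ s/^ //;
    $text =~ s/ $//;
    return $text;
}

my $page = '<div id="nav"><p>Home | About</p></div>'
         . '<h1>Title</h1><p>First <b>real</b> paragraph.</p><p>Second.</p>';

my $lead = lead_paragraph($page);
print defined $lead ? $lead : '(no lead paragraph found)', "\n";
```

Anchoring on the heading skips navigation that appears earlier in the page, but the result still depends entirely on the layout, as noted above.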

xkcd150
1

You could check and see if there is a meta description on the page, but that leaves you at the mercy of whoever wrote the meta description.
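A minimal sketch of that check in core Perl, falling back to the page title when the description is missing or empty. The names meta_description and page_title are made up for this example; the regexes assume attribute values are quoted and contain no ">", and HTML entities are left undecoded.

```perl
use strict;
use warnings;

# Hypothetical helper: content of <meta name="description">, or undef
# when the tag is absent or its content is empty.
sub meta_description {
    my ($html) = @_;
    while ($html =~ m{<meta\b([^>]*)>}ig) {
        my $attrs = $1;
        my %attr;
        # Collect quoted attribute pairs in whatever order they appear.
        while ($attrs =~ m{([\w:-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')}g) {
            $attr{lc $1} = defined $2 ? $2 : $3;
        }
        next unless lc($attr{name} // '') eq 'description';
        my $content = $attr{content} // '';
        $content =~ s/^\s+//;
        $content =~ s/\s+$//;
        return $content if length $content;
    }
    return;
}

# Hypothetical helper: whitespace-normalised <title> text, or undef.
sub page_title {
    my ($html) = @_;
    my ($title) = $html =~ m{<title\b[^>]*>(.*?)</title>}is;
    return unless defined $title;
    $title =~ s/\s+/ /g;
    $title =~ s/^ //;
    $title =~ s/ $//;
    return $title;
}

my $doc = <<'HTML';
<html><head>
<title>  How can I create a website summary with Perl? </title>
<meta content="Ways to summarise a page." name="description">
</head><body><p>Body text.</p></body></html>
HTML

# Prefer the author's description; fall back to the title.
my $summary = meta_description($doc) // page_title($doc) // '';
print "$summary\n";
```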

Bryan Denny