2

I was very impressed with the OpenCalais system. It is a web service: you send it your text, it analyzes it, and it returns a series of categorized (RDF-enabled) tags that describe your document.

At the moment, however, English is the only supported language.

Do you know of similar systems that handle documents in other languages? (I'm interested in Italian, but multi-language support is a plus, of course.)

David Riccitelli
Claudio

2 Answers

4

Apache Stanbol can analyze texts in many different languages. So far the following languages are supported (precision and recall values may vary according to the language):

  • English,
  • 中文 (Chinese),
  • Español (Spanish),
  • Русский (Russian),
  • Português (Portuguese),
  • Deutsch (German),
  • Italiano (Italian),
  • Nederlands (Dutch),
  • Svenska (Swedish),
  • Dansk (Danish),
  • العربية (Arabic),
  • עברית (Hebrew),
  • 日本語 (Japanese).

The analysis will return the discovered entities. The analysis output format can be:

  • JSON-LD,
  • RDF/XML,
  • RDF/JSON,
  • Turtle,
  • N-Triples.
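
As a rough illustration of how the output format is chosen, here is a minimal Python sketch that builds (but does not send) an HTTP request for the Stanbol enhancer. It assumes a local Stanbol instance exposing its enhancer at `http://localhost:8080/enhancer`, and it assumes the format is selected through the `Accept` header. The `MEDIA_TYPES` mapping and the `build_enhancer_request` helper are hypothetical names introduced for this example, and the media types are the standard ones for each format; a given deployment may expect different values.

```python
import urllib.request

# Assumed mapping from the output formats listed above to HTTP media types.
# These are the standard media types; a given Stanbol deployment may differ.
MEDIA_TYPES = {
    "JSON-LD": "application/ld+json",
    "RDF/XML": "application/rdf+xml",
    "RDF/JSON": "application/rdf+json",
    "Turtle": "text/turtle",
    "N-Triples": "application/n-triples",
}


def build_enhancer_request(text, output_format="Turtle",
                           endpoint="http://localhost:8080/enhancer"):
    """Build a POST request carrying plain text to an enhancer endpoint.

    The endpoint URL is an assumption (a locally running instance).
    The request is only constructed here, not sent.
    """
    return urllib.request.Request(
        endpoint,
        data=text.encode("utf-8"),  # the text to analyze, sent as the body
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            "Accept": MEDIA_TYPES[output_format],  # requested output format
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the text and return the enhancement results in the requested serialization, provided a server is actually listening at the assumed address.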

Entity detection, or tagging, can be further tailored through the system configuration. Ideally, any custom vocabulary can be plugged into the system.

There are a couple of demo end-points:

I am not sure whether all of the languages above are supported by the aforementioned endpoints.

RedLink GmbH is going to provide cloud services based on Apache Stanbol and related software.

The WordLift plugin for WordPress already provides text analysis within WordPress for all of the aforementioned languages (it is currently in the testing stage). You can try it out by installing the plugin in WordPress and submitting text in the post body.

You can also subscribe and write to the Apache Stanbol mailing list for specific requests or information.

David Riccitelli
0

OpenCalais supports both French and Spanish metadata tagging for entities. The set of entities will be extended in future releases. See our online documentation at http://www.opencalais.com/documentation/calais-web-service-api