
I have a document. After a few annotations, I write it into a new view using HtmlConverter.

Sample Input:

<p class="MsoNormal"><span data-bkmark="para10121"></span><span style="font-family:Arial; font-size:10pt; color:#color: #000000">[1] SJ. Goetsch,BD. Murphy,R. Schmidt,et al. "Physics of rotating gamma systems for stereotactic radiosurgery. "</span> <span style="font-family:Arial; font-size:10pt; color:#color: #000000">International Journal of Radiation Oncologybiologyphysics,</span> vol.<span style="font-family:Arial; font-size:10pt; color:#color: #000000">43, no.3, pp.689-696, 1999.</span><span data-bkmark="para10121"></span></p>

I am using HtmlConverter to create a new view, "plaintextview":

CONFIGURE(HtmlAnnotator, "onlyContent" = false);
Document{-> EXEC(HtmlAnnotator)};
Document{-> CONFIGURE(HtmlConverter, "inputView" = "_InitialView", "outputView" = "plaintextview"),
    EXEC(HtmlConverter, {TAG})};

After that, I run my own engine and create a few annotations manually:

try {
  // Look the view up once instead of on every use.
  CAS plainView = cas.getView("plaintextview");
  Feature bookmarkFtr = type.getFeatureByBaseName("RefBookmark");
  String test = " vol.43, no.3, pp.689-696, 1999.";
  for (AnnotationFS afs : CasUtil.select(plainView, type)) {
    String covered = afs.getCoveredText();
    System.out.println("\n Ref is " + covered);
    System.out.println("Start is " + afs.getBegin());
    System.out.println("End is " + afs.getEnd());
    if (covered.contains(test)) {
      // indexOf is relative to the covered text, so shift it by the annotation's begin.
      int start = covered.indexOf(test) + afs.getBegin();
      int end = start + test.length();
      testanno annotation = new testanno(plainView.getJCas());
      annotation.setBegin(start);
      annotation.setEnd(end);
      annotation.addToIndexes();
    }
  }
} catch (Exception e) {
  e.printStackTrace();
}

This code annotates the matching text in plaintextview. (Why there? Because the document in _InitialView has HTML spans in between the text, e.g. vol.43, no.3, <some html tags> pp.689-696, 1999.)
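The offset arithmetic in the loop above can be isolated and checked without UIMA. The following is only a minimal sketch: the class `SpanOffsets` and the method `locate` are hypothetical names introduced for illustration, not part of UIMA or Ruta.

```java
// Hypothetical helper, not a UIMA or Ruta API.
final class SpanOffsets {
    private SpanOffsets() {}

    /**
     * Finds needle inside an annotation's covered text and converts the
     * match to offsets in the view's document text.
     *
     * @return {begin, end} in document coordinates, or null if needle is absent
     */
    static int[] locate(String coveredText, int annotationBegin, String needle) {
        int relative = coveredText.indexOf(needle); // position inside the covered text
        if (relative < 0) {
            return null; // no match, nothing to annotate
        }
        int begin = annotationBegin + relative;     // shift to document coordinates
        return new int[] { begin, begin + needle.length() };
    }
}
```

The resulting offsets are only meaningful in the view whose text they were computed from. Applied to _InitialView, the same numbers would point into the HTML markup.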

So how do I get my annotations from plaintextview into the initial view, or use annotations from both views (i.e. _InitialView and plaintextview) inside my Ruta script?

1 Answer


In Ruta you cannot directly write rules for specific CAS views. (You could use EXEC to apply an analysis engine to a different view from within a Ruta script.)

The normal way to approach this is at the framework level, by either applying sofa mapping in an aggregate analysis engine or copying the view to the _InitialView of a new CAS.
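As a rough sketch of the sofa-mapping option, assuming uimaFIT is on the classpath (the original setup does not mention it), an aggregate could be assembled as follows. The class name `PlainTextViewMapping` and the parameter `rutaEngine` are hypothetical; the description of the Ruta engine is expected to be created elsewhere.

```java
import org.apache.uima.analysis_engine.AnalysisEngineDescription;
import org.apache.uima.cas.CAS;
import org.apache.uima.fit.factory.AggregateBuilder;
import org.apache.uima.resource.ResourceInitializationException;

// Hypothetical wrapper class, shown for illustration only.
final class PlainTextViewMapping {
    private PlainTextViewMapping() {}

    /**
     * Wraps the given engine in an aggregate so that what the engine sees as
     * its default view ("_InitialView") is the aggregate's "plaintextview".
     */
    static AnalysisEngineDescription mapToPlainTextView(AnalysisEngineDescription rutaEngine)
            throws ResourceInitializationException {
        AggregateBuilder builder = new AggregateBuilder();
        // View names come in pairs: component sofa name, then aggregate sofa name.
        builder.add(rutaEngine, CAS.NAME_DEFAULT_SOFA, "plaintextview");
        return builder.createAggregateDescription();
    }
}
```

With such a mapping, the Ruta rules run against the converted text and their annotations are stored in plaintextview. The mapping alone does not transfer anything back into _InitialView.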

DISCLAIMER: I am a developer of UIMA Ruta

Peter Kluegl