3

I have downloaded TreeTagger v3.2 for Windows and configured it as described in install.txt. I am trying to use it from R with the koRpus package. I set up the koRpus environment as follows:

set.kRp.env(TT.cmd="C:\\TreeTagger\\bin\\tag-english.bat", lang="en",
    preset="en", treetagger="manual", format="file",
    TT.tknz=TRUE, encoding="UTF-8")

The data to be tagged is in a file, and I call treetag("myfile.txt") on it, but it throws this error:

Error in matrix(unlist(strsplit(tagged.text, "\t")), ncol = 3, byrow = TRUE, : 'data' must be of a vector type, was 'NULL'

In addition: Warning message: running command 'C:\windows\system32\cmd.exe /c C:\TreeTagger\bin\tag-english.bat C:\Users\vivsingh\Desktop\NLP\tree_tag_ex.txt' had status 255

The standalone TreeTagger works fine on my Windows machine. Any idea how to get it working from R?

akrun
vivsingh
  • What if you set the path in `treetag`, e.g. `treetag(file = "myfile.txt", treetagger="C:/TreeTagger/bin/tag-english.bat", TT.options=c(path="C:/TreeTagger/"))`? – lukeA Jan 25 '16 at 08:56
  • I tried as treetag(file="C:\\Users\\vivsingh\\Desktop\\NLP\\tree_tag_ex.txt", treetagger = "C:/TreeTagger/bin/tag-english.bat", TT.options = list(path="C:/TreeTagger/"), lang = "en"). Warning is gone but the error remains- Error in matrix(unlist(strsplit(tagged.text, "\t")), ncol = 3, byrow = TRUE, : 'data' must be of a vector type, was 'NULL' – vivsingh Jan 27 '16 at 04:01
  • Not reproducible. Maybe update your package? Provide your data? Voting to close. – lukeA Jan 27 '16 at 08:04
  • :( The R version is 3.2.2 and the koRpus package version is 0.05-6; I guess these are the latest. The OS is Windows 7 64-bit. The data in the file is just plain text, e.g. sql server – vivsingh Jan 28 '16 at 05:06
  • I have exactly the same setup. And `writeLines(text = 'All human beings are born free and equal in dignity and rights. They are endowed with reason and conscience and should act towards one another in a spirit of brotherhood.', con = "myfile.txt"); treetag("myfile.txt") ` works fine. – lukeA Jan 28 '16 at 09:01
  • Could it be a Java issue? I have Java version 1.8.0_65 – vivsingh Jan 29 '16 at 03:50

4 Answers

2

I had the exact same error and warning while trying lemmatization on an R word vector, following the Bernhard Learns blog, on Windows 7 with R 3.4.1 (x64). The issue also appeared with the textstem package, even though TreeTagger ran properly in a cmd window.

I combined several answers from this post. Here are the steps and code that work for me:

Go to the rJava folder in the R library (~\Documents\R\win-library\3.4\rJava\jri\x64\), copy jri.dll from there, and use it to replace the jri.dll in the parent folder (thanks kravi!).

Close and restart R.

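library(koRpus)
library(dplyr)  # provides tbl_df()

# Register the TreeTagger batch file and defaults in the koRpus environment
set.kRp.env(TT.cmd="C:\\TreeTagger\\bin\\tag-english.bat", lang="en", preset="en", treetagger="manual", format="file", TT.tknz=TRUE, encoding="UTF-8")
# Tag a character vector held in memory (format="obj") rather than a file
lemma_tagged <- treetag(lemma_unique$word_clean, treetagger="manual", format="obj", TT.tknz=FALSE , lang="en", TT.options=list(path="c:/TreeTagger", preset="en"))
# Pull the tagging results out of the returned object as a table
lemma_tagged_tbl <- tbl_df(lemma_tagged@TT.res)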
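The jri.dll step above can also be scripted from R. This is only a minimal sketch: `use_x64_jri` is a hypothetical helper name, and it assumes the rJava folder has the layout described in this answer (an `x64` subfolder next to the top-level jri.dll). It keeps a backup of the file it overwrites.

```r
# Hypothetical helper: overwrite <jri_dir>/jri.dll with the 64-bit build
# found in <jri_dir>/x64, keeping a ".bak" copy of the old file.
use_x64_jri <- function(jri_dir) {
  src  <- file.path(jri_dir, "x64", "jri.dll")
  dest <- file.path(jri_dir, "jri.dll")
  if (!file.exists(src)) {
    message("No 64-bit jri.dll found at ", src)
    return(FALSE)
  }
  if (file.exists(dest)) {
    file.copy(dest, paste0(dest, ".bak"), overwrite = TRUE)  # backup first
  }
  file.copy(src, dest, overwrite = TRUE)  # TRUE on success
}

# Possible usage (left commented out because it modifies the rJava install):
# jri_dir <- system.file("jri", package = "rJava")  # "" if rJava is missing
# if (nzchar(jri_dir)) use_x64_jri(jri_dir)
```

R still needs to be restarted afterwards, as in the manual steps.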

Hope it helps.

Xochitl C.
1

I am posting this answer for the record. I hit the same error on Windows 8.1 with a 64-bit processor because the location of jri.dll was specified incorrectly. If we call set.kRp.env(TT.cmd="manual", lang="en", TT.options=list(path="/path/to/tree-tagger-windows-x.x/TreeTagger", preset="en")) and follow either of the two steps below, the error is resolved:

  1. If only the 64-bit version of R is installed, set these variables to the proper paths:

    LD_LIBRARY_PATH = /path/to/rJava/jri
    JAVA_HOME = /path/to/jdk1.x.x
    java.library.path = /path/to/rJava/jri/jri.dll
    CLASSPATH = /path/to/rJava/jri

  2. If both the 32-bit and 64-bit versions of R are already installed, copy jri.dll from /path/to/rJava/jri/x64/jri.dll and use it to replace /path/to/rJava/jri/jri.dll. The four variables listed above still need to be set.
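For a quick experiment, three of these variables can be set for the current R session with `Sys.setenv`. This is a sketch under stated assumptions: the paths are the placeholders from this answer and must be replaced with real ones, and `java.library.path` is left out because it is normally a JVM property rather than an environment variable. Whether session-level settings are sufficient depends on the setup; setting the variables system-wide is the safer route.

```r
# Placeholder paths copied from the answer; replace with your real locations.
jri_dir   <- "/path/to/rJava/jri"
java_home <- "/path/to/jdk1.x.x"

# Applies to this R session and to processes it launches, nothing else.
Sys.setenv(
  LD_LIBRARY_PATH = jri_dir,
  JAVA_HOME       = java_home,
  CLASSPATH       = jri_dir
)

# Show what the session now sees.
print(Sys.getenv(c("LD_LIBRARY_PATH", "JAVA_HOME", "CLASSPATH")))
```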

kravi
0

I ran into this issue (or a very similar one) and reported it on GitHub: https://github.com/unDocUMeantIt/koRpus/issues/7. The working solution for me turned out to be easier than expected: downgrading the koRpus package. This may change over time, but the version below should remain usable.

library("devtools")
install_github("unDocUMeantIt/koRpus", ref="0.06-5")

According to the replies on that issue, the package is not Java-related.
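To confirm which koRpus version is actually in use before and after the downgrade, something like the following can help. It is a small sketch; `korpus_version` is just an illustrative variable name.

```r
# Report the installed koRpus version, or NA if the package is not installed.
korpus_version <- if (requireNamespace("koRpus", quietly = TRUE)) {
  as.character(packageVersion("koRpus"))
} else {
  NA_character_
}
print(korpus_version)
```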

Peter.k
0

You can hit the same error while setting up the koRpus environment and fetching results from TreeTagger. For example, when you use:

tagged.text <- treetag(
  "C:/temp/sample_text.txt",
  treetagger = "manual",
  lang = "en",
  TT.options = list(
    path = "c:/Treetagger",
    preset = "en"
  ),
  doc_id = "sample"
)

You would receive a similar error:

Error: Awww, this should not happen: TreeTagger didn't return any useful data.

This can happen if the local TreeTagger setup is incomplete or different from what presets expected.
You should re-run your command with the option 'debug=TRUE'. That will print all relevant configuration.
Look for a line starting with 'sys.tt.call:' and try to execute the full command following it in a command line terminal. Do not close this R session in the meantime, as 'debug=TRUE' will keep temporary files that might be needed.
If running the command after 'sys.tt.call:' does fail, you'll need to fix the TreeTagger setup.
If it does not fail but produce a table with proper results, please contact the author!

Here you need to change the value of treetagger, from

treetagger = "manual"

to

treetagger = "kRp.env"

However, before that, remember to set the kRp.env as @Xochitl C. suggested in their answer:

set.kRp.env(TT.cmd="C:\\TreeTagger\\bin\\tag-english.bat", lang="en", preset="en", treetagger="manual", format="file", TT.tknz=TRUE, encoding="UTF-8")

Once you do this, you'll get the desired result.
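Putting the two pieces together might look like the sketch below. `tag_if_ready` is a hypothetical helper, the paths are the ones used in this answer, and `debug = TRUE` is the option the error message itself recommends. The call is guarded so it does nothing when koRpus, the TreeTagger batch file, or the input file is missing. Newer koRpus releases may additionally need a language package such as koRpus.lang.en to be loaded first; treat that as an assumption to verify for your version.

```r
# Paths from this answer; adjust them to your setup.
tt_cmd    <- "C:/TreeTagger/bin/tag-english.bat"
text_file <- "C:/temp/sample_text.txt"

# Hypothetical helper: tag a file only when everything it needs is present.
tag_if_ready <- function(tt_cmd, text_file) {
  ready <- requireNamespace("koRpus", quietly = TRUE) &&
    file.exists(tt_cmd) && file.exists(text_file)
  if (!ready) {
    message("koRpus, the TreeTagger batch file, or the input file is missing.")
    return(invisible(NULL))
  }
  # Store the TreeTagger command in the koRpus environment ...
  koRpus::set.kRp.env(TT.cmd = tt_cmd, lang = "en")
  # ... and tell treetag() to read its configuration from there.
  koRpus::treetag(text_file, treetagger = "kRp.env", lang = "en", debug = TRUE)
}

tagged.text <- tag_if_ready(tt_cmd, text_file)
```

If the call still fails, the `sys.tt.call:` line printed by `debug = TRUE` shows the exact command to try in a terminal.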

marc_s
Mudassar