
From the spark-nlp GitHub page I downloaded a .zip file containing a pre-trained NerCrfModel. The zip contains three folders: embeddings, fields, and metadata.

How do I load that into a Scala NerCrfModel so that I can use it? Do I have to drop it into HDFS or onto the host where I launch my Spark shell? How do I reference it?

Thiago Custodio
Marsellus Wallace

1 Answer


You just need to provide the path to the folder that contains the directories you mentioned:

import com.johnsnowlabs.nlp.annotators.ner.crf.NerCrfModel

val path = "path/to/unzipped/file/folder"
val model = NerCrfModel.read.load(path)

// use your model
model.setInputCols(someCol)   // the annotation column(s) your pipeline produces
model.transform(yourData)     // yourData must contain 'someCol'
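
For a bit more context, here is a rough sketch of wiring the loaded model into a DataFrame. The column names ("sentence", "token", "pos", "word_embeddings", "ner") and the annotatedData variable are placeholder assumptions; use whatever annotation columns your own pipeline actually produces and check the model's expected inputs:

val ner = NerCrfModel.read.load(path)
  .setInputCols(Array("sentence", "token", "pos", "word_embeddings")) // assumed upstream annotation columns
  .setOutputCol("ner")                                                // assumed output column name

// annotatedData is a DataFrame that has already been run through the upstream annotators
val withEntities = ner.transform(annotatedData)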

As far as I remember, you can place the folder on the local file system or on a distributed file system (e.g. HDFS). Hope this helps other users as well!
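
To illustrate (the directory names below are made-up placeholders): the load path should be resolved through Spark's Hadoop FileSystem handling, so both of these forms should work; note that with a plain local path on a multi-node cluster, the folder needs to be readable from every executor.

val localModel = NerCrfModel.read.load("file:///home/me/ner_crf_model")  // local file system path
val hdfsModel  = NerCrfModel.read.load("hdfs:///models/ner_crf_model")   // HDFS path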

best, Alberto.

AlbertoAndreotti
  • For users of Java: you need to cast the result of `load(...)` to the required class. And somehow I think the Spark NLP API here should be consistent with Spark's `PretrainedPipeline.fromDisk()`. – martin_wun Oct 12 '21 at 08:17