Receiving "An error was thrown and was not caught: The validation data provided must contain ..." when creating a Text Classifier Model with CreateML

Question

I am using a playground to create a Text Classifier Model with CreateML and keep getting this error:

Playground execution terminated: An error was thrown and was not caught:
▿ The validation data provided must contain class.
  ▿ type : 1 element
    - reason : "The validation data provided must contain class."

My code is relatively simple, using two columns from a data table. The textColumn is labeled "text" and the labelColumn is labeled "class":

import Cocoa
import CreateML

let data = try MLDataTable(contentsOf: URL(fileURLWithPath: "/Users/ ... .csv"))
let (trainingData, testingData) = data.randomSplit(by: 0.8, seed: 5)
let sentimentClassifier = try MLTextClassifier(trainingData: trainingData, textColumn: "text", labelColumn: "class")
let evaluationMetrics = sentimentClassifier.evaluation(on: testingData, textColumn: "text", labelColumn: "class")
let evaluationAccuracy = (1.0 - evaluationMetrics.classificationError) * 100

The only difference I can find between this and the code provided in the Apple Developer Documentation is that instead of

let evaluationMetrics = sentimentClassifier.evaluation(on: testingData, textColumn: "text", labelColumn: "class")

their documentation uses:

let evaluationMetrics = sentimentClassifier.evaluation(on: testingData)

and Xcode 11.2.1 fails if I try using the line from the Apple Developer Documentation.

Thanks in advance for any help you can offer.

Did you find any solution to this? – Jayprakash Dubey May 05 '20 at 16:18

score 0 · Accepted Answer · edited Sep 11 '20 at 22:36

Try this! It works for me. Note that the textColumn and labelColumn values are swapped relative to the code in the question:

let data = try MLDataTable(contentsOf: URL(fileURLWithPath: "/Users/justinmacbook/Desktop/twitter-sanders-apple3.csv"))

let (trainingData, testingData) = data.randomSplit(by: 0.8, seed: 5)

let sentimentClassifier = try MLTextClassifier(trainingData: trainingData, textColumn: "class", labelColumn: "text")

let evaluationMetrics = sentimentClassifier.evaluation(on: testingData, textColumn: "class", labelColumn: "text")

let evaluationAccuracy = (1.0 - evaluationMetrics.classificationError) * 100

Fabio Vinicius Binder · edited Sep 11 '20 at 22:36

Prince Rana · answered May 19 '20 at 09:28

Please format the code properly so it is readable and easy to understand for others (including future users). – MANU May 19 '20 at 10:06
That worked @PrinceRana. The only issue was that I had the textColumn and labelColumn labels swapped. Thanks. – Jerry Rufe May 29 '20 at 22:21
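
The swapped-column mistake described in the comment above can be spotted before training. What follows is only a minimal sketch, assuming the values of the two columns have already been pulled out into plain Swift arrays. `distinctCount` and `columnsLookSwapped` are hypothetical helpers (not CreateML API), the sample rows are invented, and the comparison is a rough heuristic rather than a rule CreateML applies.

```swift
// Hypothetical helper, not part of CreateML: counts distinct values in a column.
func distinctCount(_ values: [String]) -> Int {
    return Set(values).count
}

// Heuristic (an assumption, not a CreateML rule): a label column usually
// repeats a handful of class names, while a text column is mostly unique.
// If the supposed label column is more varied than the supposed text
// column, the two names are probably swapped.
func columnsLookSwapped(textValues: [String], labelValues: [String]) -> Bool {
    return distinctCount(labelValues) > distinctCount(textValues)
}

// Invented sample rows standing in for the CSV contents.
let labels = ["Pos", "Neg", "Pos", "Neutral"]
let texts = [
    "Love the new phone",
    "Battery died again",
    "Great update",
    "Store opens at nine"
]

// Arguments passed the wrong way round, as in the mistake described above.
let swapped = columnsLookSwapped(textValues: labels, labelValues: texts)
// Arguments passed the right way round.
let correct = columnsLookSwapped(textValues: texts, labelValues: labels)

print("wrong order flagged: \(swapped), right order flagged: \(correct)")
```

If the check flags the pair, compare the CSV header with the names passed as textColumn and labelColumn before calling MLTextClassifier.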

Madhur Ahuja · Answer 2 · 2020-06-27T16:11:27.147

This is the solution that worked for me. I believe the original code will only work on macOS 10.15+.

import Cocoa
import CreateML
import NaturalLanguage

let data = try MLDataTable(contentsOf: URL(fileURLWithPath: "/Users/m0a04y6/Desktop/iOS/ML/twitter-sanders-apple3.csv"))

let (trainingData, testingData) = data.randomSplit(by: 0.8, seed: 5)
    
// Note: the training table is reused as the validation data here.
let parameters = MLTextClassifier.ModelParameters(
  validationData: trainingData,
  algorithm: MLTextClassifier.ModelAlgorithmType.maxEnt(revision: 1),
  language: NLLanguage.english,
  textColumnValidationData: "text",
  labelColumnValidationData: "class"
)
let sentimentClassifier = try MLTextClassifier(
  trainingData: trainingData,
  textColumn: "text",
  labelColumn: "class",
  parameters: parameters
)

let evaluationMetrics = sentimentClassifier.evaluation(on: testingData, textColumn: "text", labelColumn: "class")

// Evaluation accuracy as a percentage
let evaluationAccuracy = (1.0 - evaluationMetrics.classificationError) * 100
print(evaluationAccuracy)

let metadata = MLModelMetadata(author: "Madhur Ahuja", shortDescription: "A model trained to classify movie review sentiment", version: "1.0")
try sentimentClassifier.write(to: URL(fileURLWithPath: "/Users/m0a04y6/Desktop/iOS/ML/sentiment.mlmodel"), metadata: metadata)

score 0 · Answer 3 · edited Apr 17 '23 at 18:20

Try this:

let (trainingData, testingData) = data.randomSplit(by: 0.8, seed: 5)

let sentimentClassifier = try MLTextClassifier(
  trainingData: trainingData,
  textColumn: "text",
  labelColumn: "class"
)

let evaluationMetrics = sentimentClassifier.evaluation(on: testingData, textColumn: "text", labelColumn: "class")

let evaluationAccuracy = (1.0 - evaluationMetrics.classificationError) * 100
