I'm trying to do zero-shot classification over a dataset with 5000 records. Right now I'm using a plain Python loop, but it is painfully slow. Is there a way to speed up the process using Transformers or Datasets structures? This is how my code looks right now:
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model='cross-encoder/nli-roberta-base')
# Create prediction list
candidate_labels = ["Self-direction: action", "Achievement", "Security: personal", "Security: societal", "Benevolence: caring", "Universalism: concern"]
predictions = []
for index, row in reduced_dataset.iterrows():
    res = classifier(row["text"], candidate_labels)
    # res["scores"] comes back sorted by descending score together with the
    # reordered res["labels"], so map the scores back to candidate_labels order
    scores_by_label = dict(zip(res["labels"], res["scores"]))
    partial_prediction = []
    for label in candidate_labels:
        if scores_by_label[label] >= 0.5:
            partial_prediction.append(1)
        else:
            partial_prediction.append(0)
    if index % 100 == 0:
        print(index)
    predictions.append(partial_prediction)
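
I was wondering whether something like the sketch below is what's meant by using Datasets structures: converting the DataFrame to a datasets.Dataset and streaming the "text" column through the pipeline with KeyDataset and a batch_size. I'm not sure this is the right approach, and the batch size of 16 is just a guess:

# Rough sketch (my assumption, not tested at scale): stream a datasets.Dataset
# through the pipeline in batches instead of calling it one row at a time.
from datasets import Dataset
from transformers import pipeline
from transformers.pipelines.pt_utils import KeyDataset

classifier = pipeline("zero-shot-classification", model='cross-encoder/nli-roberta-base')
dataset = Dataset.from_pandas(reduced_dataset)  # reduced_dataset is the pandas DataFrame above

predictions = []
for res in classifier(KeyDataset(dataset, "text"),
                      candidate_labels=candidate_labels,
                      batch_size=16):
    # map the sorted scores back to candidate_labels order, as in the loop above
    scores_by_label = dict(zip(res["labels"], res["scores"]))
    predictions.append([1 if scores_by_label[label] >= 0.5 else 0 for label in candidate_labels])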