ValueError: spacy.strings.StringStore size changed, may indicate binary incompatibility. Expected 80 from C header, got 64 from PyObject

Question

I am using Python 3.8.5 in a Jupyter notebook, with:

spaCy 3.0.5

neuralcoref 4.0

Below is the code I am running for testing:

import datetime
import re
import time

import pandas as pd

from formative_assessment.dataset_extractor import ConvertDataType
from formative_assessment.feature_extractor import FeatureExtractor


class AEGrading:
    """
        Automatically evaluates, grades and provides feedback to students' answers of the datasets.
        Provides feedback as dict including total data of the student answer.
    """

    def __init__(self, qid, stu_answer, dataset, dataset_path, score=5):

        self.qid = qid
        self.stu_answer = stu_answer
        self.dataset = dataset
        self.length_ratio = len(stu_answer) / len(dataset[qid]["desired_answer"])
        self.score = score
        self.fe = FeatureExtractor(qid, stu_answer, dataset, dataset_path)
        self.wrong_terms = {}

        self.feedback = {"id": self.qid, "question": self.dataset[self.qid]["question"],
                         "desired_answer": self.dataset[self.qid]["desired_answer"], "student_answer": stu_answer,
                         "length_ratio": self.length_ratio, "is_answered": "-", "is_wrong_answer": "not wrong answer",
                         "interchanged": "-", "missed_topics": "-", "missed_terms": "-", "irrelevant_terms": "-",
                         "score_avg": 0, "our_score": 0}

    def is_answered(self, default="not answered"):
        """
            Checks if the student answered or not given the default evaluator's string. Assigns score to 'zero' if not
            answered.

        :param default: str
            String to be checked if student not answered
        :return: bool
            True if student answered, else False
        """

        re_string = " *" + default + " *"

        if re.match(re_string, self.stu_answer.lower()):
            self.feedback["is_answered"] = "not answered"
            self.score = 0
            return False

        else:
            self.feedback["is_answered"] = "answered"
            return True

    def iot_score(self):
        """
            Checks if there are any interchange of topics or missed topics and deduce the score accordingly. Deduce
            nothing from the score if there are no interchange of topics or missed topics

        :return: None
        """
        iot = self.fe.get_interchanged_topics()

        interchanged = iot["interchanged"]
        missed_topics = iot["missed_topics"]
        total_relations = iot["total_relations"]
        topics_num = iot["total_topics"]

        self.feedback["interchanged"] = interchanged
        self.feedback["missed_topics"] = missed_topics

        if interchanged:
            iot_deduce = len(interchanged) / total_relations
            self.score = self.score - (iot_deduce * self.score)

        if missed_topics:
            missed_deduce = len(missed_topics) / topics_num
            self.score = self.score - (missed_deduce * self.score)

    def missed_terms_score(self):
        """
            Checks if there are any missed terms in the student answer and deduce score accordingly

        :return: None
        """

        missed_terms = self.fe.get_missed_terms()
        # Store a plain list rather than a dict_keys view so it serializes cleanly
        self.feedback["missed_terms"] = list(missed_terms.keys())

        total = round(sum(missed_terms.values()), 3)
        self.score = self.score - (total * self.score)  # self.score/2

    def irrelevant_terms_score(self):
        """
            Checks if there are any irrelevant terms in the student answer. We do not deduce score for this feature, as
            we consider any irrelevant term as noise.

        :return: None
        """
        self.feedback["irrelevant_terms"] = self.fe.get_irrelevant_terms()


if __name__ == '__main__':

    PATH = "dataset/mohler/cleaned/"
    max_score = 5

    # Convert the data into  dictionary with ids, their corresponding questions, desired answers and student answers
    convert_data = ConvertDataType(PATH)
    dataset_dict = convert_data.to_dict()

    id_list = list(dataset_dict.keys())
    data = []

    # random.seed(20)
    for s_no in id_list[:7]:

        # s_no = random.choice(id_list)
        question = dataset_dict[s_no]["question"]
        desired_answer = dataset_dict[s_no]["desired_answer"]

        student_answers = dataset_dict[s_no]["student_answers"]
        scores = dataset_dict[s_no]["scores"]
        # score_me = dataset_dict[s_no]["score_me"]
        # score_other = dataset_dict[s_no]["score_other"]

        for index, _ in enumerate(student_answers):
            # index = random.randint(0, 12)
            start = time.time()
            student_answer = str(student_answers[index])

            print(s_no, student_answer)
            aeg = AEGrading(s_no, student_answer, dataset_dict, PATH, max_score)

            if aeg.is_answered():
                aeg.iot_score()
                aeg.missed_terms_score()
                aeg.irrelevant_terms_score()
                if aeg.score == 0:
                    aeg.feedback["is_wrong_answer"] = "wrong_answer"

            # aeg.feedback["score_me"] = score_me[index] # Only for mohler data
            # aeg.feedback["score_other"] = score_other[index]
            aeg.feedback["score_avg"] = scores[index]
            aeg.feedback["our_score"] = round((aeg.score * 4)) / 4  # Score in multiples of 0.25

            data.append(aeg.feedback)
            print(aeg.feedback)
            print("It took ", time.time() - start, " secs")
            print("----------------------------------------------------------")

            if len(data) % 50 == 0:
                df = pd.DataFrame(data)
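                # Timestamp without ':' characters, which are not allowed in Windows file names
                SAVE_PATH = "outputs/automatic_evaluation/II_NN/" + datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S") + ".csv"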
                df.to_csv(SAVE_PATH, sep=",")

    df = pd.DataFrame(data)
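    # Timestamp without ':' characters, which are not allowed in Windows file names
    SAVE_PATH = "outputs/automatic_evaluation/II_NN/" + datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S") + ".csv"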
    df.to_csv(SAVE_PATH, sep=",")

After running the code above, I get the following error:

ValueError                                Traceback (most recent call last)
in
      9 import pandas as pd
     10
---> 11 from formative_assessment.dataset_extractor import ConvertDataType
     12 from formative_assessment.feature_extractor import FeatureExtractor
     13

~\Desktop\FYP\Automatic-Formative-Assessment-main\formative_assessment\dataset_extractor.py in
      8 import pandas as pd
      9
---> 10 from formative_assessment.utilities.utils import Utilities
     11

~\Desktop\FYP\Automatic-Formative-Assessment-main\formative_assessment\utilities\utils.py in
     11 from typing import List
     12
---> 13 import neuralcoref
     14 import numpy as np
     15 import pytextrank

~\anaconda3\lib\site-packages\neuralcoref\__init__.py in
     12 warnings.filterwarnings("ignore", message="spacy.strings.StringStore size changed,")
     13
---> 14 from .neuralcoref import NeuralCoref
     15 from .file_utils import NEURALCOREF_MODEL_URL, NEURALCOREF_MODEL_PATH, NEURALCOREF_CACHE, cached_path
     16

strings.pxd in init neuralcoref.neuralcoref()

ValueError: spacy.strings.StringStore size changed, may indicate binary incompatibility. Expected 80 from C header, got 64 from PyObject

I tried uninstalling neuralcoref and reinstalling it from source with the following commands:

pip uninstall neuralcoref

pip install neuralcoref --no-binary neuralcoref

but the problem is still the same. I hope someone can help me; I would really appreciate it.

Same here; downgrading the spaCy version may help. – Dammio, Apr 26 '21 at 18:34

score 3 · Answer 1 · answered Aug 18 '21 at 13:13

As others have already pointed out, neuralcoref does not work with spaCy 3 and higher. Based on this comment, the spaCy team is actively working on coreference resolution and intends to include it in the library itself, so stay tuned.
In the meantime, if you need this library right now, create a separate environment and run the following:

git clone https://github.com/huggingface/neuralcoref.git
cd neuralcoref
pip install -r requirements.txt
pip install -e .
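After reinstalling, it can help to confirm that both packages import cleanly before running the full notebook. The following is a minimal sketch, not part of neuralcoref or spaCy: `try_import` is a hypothetical helper name, and it only assumes that a binary mismatch surfaces as a `ValueError` at import time, as in the traceback in the question.

```python
import importlib


def try_import(module_name):
    """Import a module by name and report whether it loaded.

    Returns (True, "") on success, or (False, reason) when the import
    raises ImportError (module missing) or ValueError (the exception
    type shown in the question's traceback for a binary mismatch).
    """
    try:
        importlib.import_module(module_name)
    except (ImportError, ValueError) as exc:
        return False, f"{type(exc).__name__}: {exc}"
    return True, ""
```

In the new environment you could call `try_import("spacy")` and then `try_import("neuralcoref")`; if the second call returns a reason starting with `ValueError`, neuralcoref was most likely compiled against a different spaCy than the one installed.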

score 2 · Answer 2 · answered Aug 15 '21 at 12:54

Please have a look at this answer https://stackoverflow.com/a/62844213/1264899

For neuralcoref to work, you need to use spaCy version 2.1.0 and Python version 3.7. That is the only combination I found neuralcoref to work with on Ubuntu 16.04 and on Mac.
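To check whether an environment has the spaCy release mentioned above, a small comparison like the one below may be enough. This is an illustrative sketch: `same_release` is a hypothetical helper, the default `"2.1.0"` is simply the version this answer reports, and the parsing assumes plain numeric version strings (no suffixes such as `a1` or `.dev0`).

```python
def same_release(installed, required="2.1.0"):
    """Return True when two dotted version strings name the same release.

    Only the numeric major.minor.patch components are compared; missing
    components are treated as 0, so "2.1" equals "2.1.0".
    """
    def parts(version):
        nums = [int(p) for p in version.split(".")[:3]]
        nums += [0] * (3 - len(nums))  # pad "2.1" to (2, 1, 0)
        return tuple(nums)

    return parts(installed) == parts(required)
```

With spaCy installed, `same_release(spacy.__version__)` would tell you whether the active environment matches; for the version in the question, `same_release("3.0.5")` is `False`.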

godidier


score 1 · Answer 3 · answered Apr 27 '21 at 13:25

In my case, I had to downgrade to Python 3.7.4, and then it worked. Have a look at https://pypi.org/project/neuralcoref/#files, where you can see that neuralcoref supports Python 3.5, 3.6, and 3.7 only.
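A quick way to see whether the running interpreter falls in that range is sketched below. The `SUPPORTED` set just encodes the versions listed in this answer, and `python_supported` is a hypothetical helper, not something neuralcoref provides.

```python
import sys

# Interpreter versions this answer reports as listed on the PyPI page.
SUPPORTED = {(3, 5), (3, 6), (3, 7)}


def python_supported(version=None):
    """Return True if the (major, minor) pair is in SUPPORTED.

    `version` may be any tuple starting with (major, minor); when it is
    omitted, the running interpreter's version is checked.
    """
    if version is None:
        version = sys.version_info[:2]
    return tuple(version[:2]) in SUPPORTED
```

For the versions in this thread, `python_supported((3, 7, 4))` is `True` and `python_supported((3, 8, 5))` is `False`.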

Dammio


score -1 · Answer 4 · answered May 15 '21 at 15:28

For me, it worked when I used the method explained under the section 'Install NeuralCoref from source' (https://github.com/huggingface/neuralcoref).

I installed Cython and spaCy first and then followed that process.

WaveEye

A link to a solution is welcome, but please ensure your answer is useful without it: [add context around the link](//meta.stackexchange.com/a/8259) so your fellow users will have some idea what it is and why it’s there, then quote the most relevant part of the page you're linking to in case the target page is unavailable. [Answers that are little more than a link may be deleted.](/help/deleted-answers) – Anonymous May 15 '21 at 17:18
