How to solve "ValueError: setting an array element with a sequence"

Question

Here is an example of my data set:

d = {'TEXT': ['History: A 59  year  old female, was sent to R/O lung nodule. Findings:  Lungs and airway:  The study reveals a speculated nodule with pleural tagging at anterior basal segment of LLL, measured 1.9x1.4x2.0 cm in size. Pleural tagging is seen. Partial encasement of subsegmental bronchi is seen.  CA lung is considered.','History: A 59  year  old woman with history of lung cancer S/P left lower lobectomy with close to pleural margin and left adrenal nodule , was sent for evaluation before post  operative RT. Findings: Comparison is made to the prior study on 03/02/2009. Chest:   The study reveals evidence of left lower lobectomy with compensatory hyperinflation of the LUL.']}
df2 = pd.DataFrame(data=d)

I want to implement Latent Dirichlet Allocation (LDA) for context generation for each sentence. I have trained my model separately and want to test it on this data.

To get to LDA, I tokenize the text into sentences, because I want to classify each sentence with a topic. After sentence tokenization, I apply TF-IDF and then LDA. The error occurs when I reach the LDA step. My code is below.

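import nltk
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import TfidfVectorizer

# identity_tokenizer is a custom function (definition not shown here)
df2["sent_token"] = df2["TEXT"].apply(nltk.sent_tokenize)
vectoriser = TfidfVectorizer(tokenizer=identity_tokenizer, stop_words='english', lowercase=False)
df2['tfidf1'] = vectoriser.fit_transform(df2['sent_token'])
lda = LatentDirichletAllocation(n_components=5)
df2['tfidf_lda'] = lda.fit_transform(df2['tfidf1'])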

The last line is where I get the error "ValueError: setting an array element with a sequence." Reading about similar errors, I found that it may be caused by rows having different numbers of sentences, which produces sequences of different lengths. But this heterogeneity is inherent in my data, and I am not sure what the actual problem is. Please help!
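
One possible cause, stated as an assumption rather than a confirmed diagnosis: `fit_transform` returns a single matrix for the whole corpus, so storing it in a DataFrame column and then passing that column to LDA may hand scikit-learn an object-typed Series that it cannot convert to a numeric array. A minimal sketch of the pipeline described above, keeping the matrices in plain variables and using one row per sentence, could look like this. The sample sentences, the names `sentences`, `report_id` and `topic_dist`, and `n_components=2` are illustrative only, and the default tokenizer is used in place of `identity_tokenizer`, which is not shown.

```python
import pandas as pd
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import TfidfVectorizer

# Illustrative stand-in for df2["sent_token"]: one list of sentences per report
# (already split here so the sketch does not depend on NLTK data downloads).
df = pd.DataFrame({
    "sent_token": [
        ["Pleural tagging is seen.", "CA lung is considered."],
        ["The study reveals evidence of left lower lobectomy."],
    ]
})

# One row per sentence, remembering which report each sentence came from.
sentences = (
    df.explode("sent_token")
      .reset_index()
      .rename(columns={"index": "report_id", "sent_token": "sentence"})
)

# Keep the TF-IDF output as a sparse matrix variable, not a DataFrame column.
vectoriser = TfidfVectorizer(stop_words="english")
tfidf = vectoriser.fit_transform(sentences["sentence"])  # (n_sentences, n_terms)

# LDA takes the sparse matrix directly and returns a dense array
# of shape (n_sentences, n_components).
lda = LatentDirichletAllocation(n_components=2, random_state=0)
topic_dist = lda.fit_transform(tfidf)

# Only a 1-D result (the most probable topic per sentence) goes back into the frame.
sentences["topic"] = topic_dist.argmax(axis=1)
print(sentences[["report_id", "sentence", "topic"]])
```

With a vectoriser and an LDA model that were already fitted elsewhere, `transform` would be called on each instead of `fit_transform`, so that the test data is mapped into the vocabulary and topics learned during training.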

I cannot debug this. I have no clue about the underlying data. Can you provide some bogus data to make it easier? - powerPixie, Sep 23 '19 at 19:13
Please show the identity_tokenizer function you have written for the tokenizer. - qaiser, Sep 24 '19 at 09:34

0 Answers