I am currently learning how to pre-process text, but I am running into a "too many values to unpack" error during extraction. I believe the issue is caused by how a function currently outputs my lists.
My goal is to have each word in a sentence as an element of a list, and one outer list containing all sentences.
Currently, if I print my training_data[0], the output is:
[[('B-Actor', 'steve_PRPVBP'), ('I-Actor', 'mcqueen_VBN'), ('O', 'provided_VBN'), ('O', 'a_DT'), ('B-Plot', 'thrilling_NN'), ('I-Plot', 'motorcycle_NN'), ('I-Plot', 'chase_NN'), ('I-Plot', 'in_IN'), ('I-Plot', 'this_DT'), ('B-Opinion', 'greatest_JJS'), ('I-Opinion', 'of_IN'), ('I-Opinion', 'all_DT'), ('B-Plot', 'ww_NNP'), ('I-Plot', '2_NNP'), ('I-Plot', 'prison_NNP'), ('I-Plot', 'escape_NN'), ('I-Plot', 'movies_NNS')]]
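As far as I can tell, this extra wrapping list is what triggers the error. Here is a minimal sketch that reproduces it, assuming the extraction step loops over each entry and unpacks (label, token) pairs. The loop, the shortened sample, and the names sentence_wrapper and try_unpack are my own stand-ins, not the actual extraction code:

```python
# Shortened, hypothetical sample with the same nesting as training_data[0]
sentence_wrapper = [[('B-Actor', 'steve_PRPVBP'),
                     ('I-Actor', 'mcqueen_VBN'),
                     ('O', 'provided_VBN')]]


def try_unpack(items):
    """Try to unpack every item into (label, token); return the error text on failure."""
    try:
        for label, token in items:
            pass
        return "ok"
    except ValueError as exc:
        return str(exc)


# Iterating the wrapper yields the whole sentence (3 tuples), which cannot fit 2 names
nested_result = try_unpack(sentence_wrapper)
# Iterating the inner list yields one 2-tuple at a time, which unpacks cleanly
flat_result = try_unpack(sentence_wrapper[0])

print(nested_result)
print(flat_result)
```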
Is there any way to restructure my lists so that each entry is a flat list of tuples, i.e. [(), ...] instead of [[(), ...]]? I believe I currently have one level of nesting too many. Below is a snippet of my desired output:
[('B-Actor', 'steve_PRPVBP'), ('I-Actor', 'mcqueen_VBN'), ('O', 'provided_VBN'), ('O', 'a_DT'), ('B-Plot', 'thrilling_NN'), ('I-Plot', 'motorcycle_NN'), ('I-Plot', 'chase_NN'), ('I-Plot', 'in_IN'), ('I-Plot', 'this_DT'), ('B-Opinion', 'greatest_JJS'), ('I-Opinion', 'of_IN'), ('I-Opinion', 'all_DT'), ('B-Plot', 'ww_NNP'), ('I-Plot', '2_NNP'), ('I-Plot', 'prison_NNP'), ('I-Plot', 'escape_NN'), ('I-Plot', 'movies_NNS')]
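One way I could imagine getting there is to unwrap the inner list of every entry with a nested list comprehension. This is only a sketch under the assumption that every entry of training_data is a list wrapping one (or more) sentence lists; the sample data and the name flattened are made up for illustration:

```python
# Hypothetical sample mirroring the nesting of training_data
training_data = [
    [[('B-Actor', 'steve_PRPVBP'), ('I-Actor', 'mcqueen_VBN'), ('O', 'provided_VBN')]],
    [[('O', 'a_DT'), ('B-Plot', 'thrilling_NN')]],
]

# Remove one level of nesting: pull each sentence out of its wrapping list
flattened = [sentence for wrapper in training_data for sentence in wrapper]

print(flattened[0])
```

If a wrapper ever held more than one sentence, this would emit each of them as its own entry, so the outer length could grow beyond the original.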
To give more context, I am currently running the following checks:
print(len(training_data))
print(len(training_data[0]))
print(len(training_data[0][0]))
I get an output of:
7816
1
17
I want to restructure my list so that the same checks give an output of:
7816
17
2
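To make the target shape concrete, here is a small sketch that runs the same three length checks on a single-sentence sample before and after removing the extra level. It assumes each entry wraps exactly one sentence, which matches the length of 1 reported above for training_data[0]; the names nested, unwrapped and shape are hypothetical:

```python
# The example sentence from above, 17 (label, token) tuples
sentence = [('B-Actor', 'steve_PRPVBP'), ('I-Actor', 'mcqueen_VBN'), ('O', 'provided_VBN'),
            ('O', 'a_DT'), ('B-Plot', 'thrilling_NN'), ('I-Plot', 'motorcycle_NN'),
            ('I-Plot', 'chase_NN'), ('I-Plot', 'in_IN'), ('I-Plot', 'this_DT'),
            ('B-Opinion', 'greatest_JJS'), ('I-Opinion', 'of_IN'), ('I-Opinion', 'all_DT'),
            ('B-Plot', 'ww_NNP'), ('I-Plot', '2_NNP'), ('I-Plot', 'prison_NNP'),
            ('I-Plot', 'escape_NN'), ('I-Plot', 'movies_NNS')]

# Current structure: one entry, with the sentence wrapped in an extra list
nested = [[sentence]]


def shape(data):
    """Return the lengths at the first three nesting levels."""
    return (len(data), len(data[0]), len(data[0][0]))


# Assumption: every entry holds exactly one sentence, so entry[0] is that sentence
unwrapped = [entry[0] for entry in nested]

print(shape(nested))     # (1, 1, 17)
print(shape(unwrapped))  # (1, 17, 2)
```

With the full data set the first number would be 7816 in both cases; only the second and third values change.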