I am currently learning how to pre-process text, and I am running into a "too many values to unpack" error during extraction. I believe the issue is caused by the way a function currently outputs my lists.

My goal is to have each word in a sentence as part of a list, and a list containing all sentences.

Currently, if I print my training_data[0], the output is:

[[('B-Actor', 'steve_PRPVBP'), ('I-Actor', 'mcqueen_VBN'), ('O', 'provided_VBN'), ('O', 'a_DT'), ('B-Plot', 'thrilling_NN'), ('I-Plot', 'motorcycle_NN'), ('I-Plot', 'chase_NN'), ('I-Plot', 'in_IN'), ('I-Plot', 'this_DT'), ('B-Opinion', 'greatest_JJS'), ('I-Opinion', 'of_IN'), ('I-Opinion', 'all_DT'), ('B-Plot', 'ww_NNP'), ('I-Plot', '2_NNP'), ('I-Plot', 'prison_NNP'), ('I-Plot', 'escape_NN'), ('I-Plot', 'movies_NNS')]]

Is there any way to restructure my lists so that each sentence is just a list of tuples, i.e. [(...), ...]? I believe I currently have one nested list too many. Below is a snippet of my desired output:

[('B-Actor', 'steve_PRPVBP'), ('I-Actor', 'mcqueen_VBN'), ('O', 'provided_VBN'), ('O', 'a_DT'), ('B-Plot', 'thrilling_NN'), ('I-Plot', 'motorcycle_NN'), ('I-Plot', 'chase_NN'), ('I-Plot', 'in_IN'), ('I-Plot', 'this_DT'), ('B-Opinion', 'greatest_JJS'), ('I-Opinion', 'of_IN'), ('I-Opinion', 'all_DT'), ('B-Plot', 'ww_NNP'), ('I-Plot', '2_NNP'), ('I-Plot', 'prison_NNP'), ('I-Plot', 'escape_NN'), ('I-Plot', 'movies_NNS')]

To give more context, when I run the following checks:

print(len(training_data))
print(len(training_data[0]))
print(len(training_data[0][0]))

I get an output of:

7816
1
17

I want to restructure my list so that the checks above give an output of:

7816
17
2
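
For reference, the error can be reproduced on a small made-up sample with the same nesting. The `nested_data` contents and the `try_unpack` helper are hypothetical; the actual extraction code is not shown here, so the `for tag, word in ...` loop is only an assumption about what it does.

```python
# Hypothetical miniature of training_data: each sentence is wrapped in one extra list.
nested_data = [
    [[("B-Actor", "steve_PRPVBP"), ("I-Actor", "mcqueen_VBN"), ("O", "provided_VBN")]],
    [[("O", "a_DT"), ("B-Plot", "thrilling_NN"), ("I-Plot", "chase_NN")]],
]


def try_unpack(sentence):
    """Return the error message raised while unpacking (tag, word) pairs, or None."""
    try:
        for tag, word in sentence:
            pass
    except ValueError as exc:
        return str(exc)
    return None


# Iterating the outer list yields the whole inner list (3 tuples), not one tuple,
# so unpacking it into two names fails.
error_message = try_unpack(nested_data[0])
print(error_message)

# One level deeper, each item is a (tag, word) tuple and unpacking succeeds.
print(try_unpack(nested_data[0][0]))
```

The first call fails because the loop sees a single item, the inner list, and tries to unpack all of its tuples into `tag` and `word`.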
Loai Alnouri
  • Try `print(training_data[0][0])` – Andrej Kesely Jul 25 '21 at 15:53
  • @AndrejKesely Thanks for your comment. I know I could do that, but I was looking for a function to restructure my data rather than just access it. I don't want unnecessary nesting, but I am unsure how to reprocess it into a different format. – Loai Alnouri Jul 25 '21 at 15:55
  • I think this is already answered here: https://stackoverflow.com/questions/952914/how-to-make-a-flat-list-out-of-a-list-of-lists – croncroncron Jul 25 '21 at 16:01
  • Ah okay, I'm still very new, so I couldn't find the appropriate terminology to search for flattening lists. Thank you! – Loai Alnouri Jul 25 '21 at 16:03

1 Answer

Use `chain.from_iterable` from `itertools`:

import itertools

# Remove one level of nesting from each sentence, keeping one list per sentence
flat_list = [list(itertools.chain.from_iterable(sentence)) for sentence in training_data]
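
As a quick check, here is a minimal sketch on a small made-up sample with the same shape as in the question (`sample_data` and its contents are hypothetical). It also shows a plain-indexing alternative, which is not part of the approach above and only works under the assumption that every sentence is wrapped in exactly one inner list.

```python
import itertools

# Hypothetical sample mirroring the question's shape: [sentence][0][token]
sample_data = [
    [[("B-Actor", "steve_PRPVBP"), ("I-Actor", "mcqueen_VBN"), ("O", "provided_VBN")]],
    [[("O", "a_DT"), ("B-Plot", "thrilling_NN")]],
]

# Flatten one level inside each sentence.
flat_list = [list(itertools.chain.from_iterable(sentence)) for sentence in sample_data]

# Alternative (assumption: exactly one inner list per sentence): just take it.
unwrapped = [sentence[0] for sentence in sample_data]

print(len(flat_list))        # number of sentences
print(len(flat_list[0]))     # tokens in the first sentence
print(len(flat_list[0][0]))  # items in each (tag, word) tuple
```

On this sample the three checks print 2, 3 and 2, which matches the sentences / tokens / pair layout asked for. If some sentences could contain more than one inner list, `chain.from_iterable` would merge them, whereas `sentence[0]` would silently drop the rest.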
Corralien