Another working answer, slightly different, with some explanatory comments:
sentences = [['its', 'a', 'great', 'show'], ['nice', 'movie'], ['good', 'series']]
labels = [['O', 'O', 'O', 'B_A'], ['O', 'B_A'], ['O', 'B_A']]
filename = "data.txt"
outputstring = ""
# Construct the output string with zip.
# First we're zipping the elements of the source lists,
# which gives a sequence of pairs like this:
# (sentences[0], labels[0]), (sentences[1], labels[1]), etc.
# Then we iterate over that sequence and zip up the contents of
# each pair of lists in the same way, and concatenate those strings
# with the outputstring, followed by a single newline character.
# After that, an extra newline is added to break up the groups.
for sentence, label in zip(sentences, labels):
    for i, j in zip(sentence, label):
        outputstring += i + " " + j + "\n"
    outputstring += "\n"
# This removes the extra whitespace at the end.
outputstring = outputstring.rstrip()
# Finally, you can just write the string to your output file.
with open(filename, "w") as f:
    f.write(outputstring)
And here is a second example without using zip:
sentences = [['its', 'a', 'great', 'show'], ['nice', 'movie'], ['good', 'series']]
labels = [['O', 'O', 'O', 'B_A'], ['O', 'B_A'], ['O', 'B_A']]
filename = "data.txt"
outputstring = ""
# Check the length of each list of lists and make sure they're the same:
sentenceslen = len(sentences)
labelslen = len(labels)
if sentenceslen != labelslen:
    print("Malformed data!")
    raise SystemExit
# Iterate over both lists using their lengths to define the range of indices.
for i in range(sentenceslen):
    # Check the lengths of each pair of sublists and make sure they're the same:
    subsentenceslen = len(sentences[i])
    sublabelslen = len(labels[i])
    if subsentenceslen != sublabelslen:
        print("Malformed data!")
        raise SystemExit
    # Iterate over each pair of sublists using their lengths to define the range of indices:
    for j in range(subsentenceslen):
        # Construct the outputstring by using both indices to drill down to the right strings,
        # ending with newline:
        outputstring += sentences[i][j] + " " + labels[i][j] + "\n"
    # Break up groups with newline again:
    outputstring += "\n"
# Remove whitespace at end:
outputstring = outputstring.rstrip()
# Write outputstring to file:
with open(filename, "w") as f:
    f.write(outputstring)
I do not recommend actually using the code in the second example. It is unnecessarily complicated; I include it for completeness and to illustrate how much effort the zip function saves in the first example. The zip function also does not care if you feed it lists of different lengths, so your script won't crash if you do that without checking for it. It will yield pairs of values up to the length of the shorter list and ignore the remaining values in the longer one, which also means malformed data passes through without any warning.
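To make that last point concrete, here is a minimal sketch of what zip does with lists of different lengths. The lists `words` and `tags` are made-up sample data for illustration, not taken from the question:

```python
# Hypothetical sample data: one more word than there are labels.
words = ['its', 'a', 'great', 'show']
tags = ['O', 'O', 'O']

# zip stops as soon as the shorter list runs out,
# so the last word ('show') is silently dropped.
pairs = list(zip(words, tags))
print(pairs)

# If silent truncation would hide a data problem,
# compare the lengths yourself before zipping.
lengths_match = len(words) == len(tags)
print(lengths_match)
```

If you are on Python 3.10 or later, you can also pass `strict=True` to zip, which raises a ValueError on a length mismatch instead of truncating.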