I have the following data:

import pandas as pd
df2 = pd.DataFrame([[ "I am new at programming."],
                   [ "Leaves are falling from tree."]], columns = ['Text'])

Printing the input gives:

    Text
0   I am new at programming.
1   Leaves are falling from tree.

and I have code that performs an NLP task:

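NewListA = []

# nlp is assumed to be an already initialised pipeline (the attributes used
# below match Stanza's Document/Sentence/Word objects)
for inputs in df2['Text']:
    t = nlp(inputs)
    res_A = {}
    for sent in t.sentences:
        for word in sent.words:
            # map each token to its universal POS tag
            # (a dict keeps only one entry per distinct token text)
            res_A[word.text] = word.upos

    NewListA.append(list(res_A.items()))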

It outputs:

[[('I', 'PRON'), ('am', 'AUX'), ('new', 'ADJ'), ('at', 'ADP'), ('programming', 'NOUN'), ('.', 'PUNCT')], [('Leaves', 'NOUN'), ('are', 'AUX'), ('falling', 'VERB'), ('from', 'ADP'), ('tree', 'NOUN'), ('.', 'PUNCT')]]

This results in an additional outermost list bracket. I want to remove the outermost bracket and get:

[('I', 'PRON'), ('am', 'AUX'), ('new', 'ADJ'), ('at', 'ADP'), ('programming', 'NOUN'), ('.', 'PUNCT')], [('Leaves', 'NOUN'), ('are', 'AUX'), ('falling', 'VERB'), ('from', 'ADP'), ('tree', 'NOUN'), ('.', 'PUNCT')]

so that I can convert it to a DataFrame and end up with this:

    POS
0   [('I', 'PRON'), ('am', 'AUX'), ('new', 'ADJ'), ('at', 'ADP'), ('programming', 'NOUN'), ('.', 'PUNCT')]
1   [('Leaves', 'NOUN'), ('are', 'AUX'), ('falling', 'VERB'), ('from', 'ADP'), ('tree', 'NOUN'), ('.', 'PUNCT')]

I looked into this solution and this one; however, they remove all brackets inside the list, which is not what I am looking for.

Note: an example of my desired result is:

[['A','B'],['B','C']] --> ['A','B'],['B','C']

Red
Bilgin

3 Answers


There isn't a need to remove the brackets in order to produce your desired result:

import pandas as pd

NewListA = [[('I', 'PRON'), ('am', 'AUX'), ('new', 'ADJ'), ('at', 'ADP'), ('programming', 'NOUN'), ('.', 'PUNCT')], [('Leaves', 'NOUN'), ('are', 'AUX'), ('falling', 'VERB'), ('from', 'ADP'), ('tree', 'NOUN'), ('.', 'PUNCT')]]

df = pd.DataFrame({'POS':NewListA})
print(df)

Independently of the size of NewListA, you can handle each inner list by working with the rows of df.
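To illustrate that the number of inner lists does not matter, here is a minimal sketch. The helper name `to_pos_frame` and the three-sentence sample data are made up for illustration and are not part of the original code; the helper only wraps the same constructor call used above.

```python
import pandas as pd

def to_pos_frame(tagged_sentences):
    """Put each inner list of (word, tag) tuples into its own row of a 'POS' column."""
    # Hypothetical helper: it only wraps the DataFrame constructor call shown above.
    return pd.DataFrame({'POS': tagged_sentences})

# Illustrative sample data with three inner lists instead of two.
sample = [
    [('I', 'PRON'), ('am', 'AUX')],
    [('Leaves', 'NOUN'), ('are', 'AUX'), ('falling', 'VERB')],
    [('Hi', 'INTJ')],
]

df = to_pos_frame(sample)
print(df.shape)          # one row per inner list, a single 'POS' column
print(df.loc[1, 'POS'])  # the second inner list, still a list of tuples
```

No variables such as lst1, lst2, ... are needed: each inner list becomes one row, however many there are.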

MrNobody33
Red
  • Thanks for the answer, but the size of the list changes. Right now I have two lists, but this can change, and I don't want to write them out as lst1, lst2, ... – Bilgin Jun 29 '20 at 18:39

You are working with a two-dimensional list (NewListA), so each inner list can be accessed by its index.

Use NewListA[0] to get:

[('I', 'PRON'), ('am', 'AUX'), ('new', 'ADJ'), ('at', 'ADP'), ('programming', 'NOUN'), ('.', 'PUNCT')]

Use NewListA[1] to get:

[('Leaves', 'NOUN'), ('are', 'AUX'), ('falling', 'VERB'), ('from', 'ADP'), ('tree', 'NOUN'), ('.', 'PUNCT')]
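When the number of inner lists is not known in advance, the same indexing idea can be applied in a loop rather than with hard-coded indices. The following is only a sketch; the variable `tags_per_sentence` is introduced here for illustration and does not appear in the question.

```python
NewListA = [
    [('I', 'PRON'), ('am', 'AUX'), ('new', 'ADJ'), ('at', 'ADP'),
     ('programming', 'NOUN'), ('.', 'PUNCT')],
    [('Leaves', 'NOUN'), ('are', 'AUX'), ('falling', 'VERB'), ('from', 'ADP'),
     ('tree', 'NOUN'), ('.', 'PUNCT')],
]

# Visit every inner list without naming NewListA[0], NewListA[1], ... explicitly.
tags_per_sentence = []
for i, sentence in enumerate(NewListA):
    # sentence is the same object as NewListA[i]
    tags = [tag for _, tag in sentence]
    tags_per_sentence.append(tags)
    print(i, tags)
```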
Red
Ahmed
  • Thanks for the answer, but my input size changes. Do you have any other suggestions for storing the elements or getting rid of the outermost bracket? – Bilgin Jun 29 '20 at 18:32

If you remove the outermost bracket from NewListA, the result will be an implicit tuple, which should behave identically to a list in your code.

Example

[list1, list2, list3] is a list containing three lists.

list1, list2, list3 is equivalent to (list1, list2, list3), which is a 3-tuple.
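A small sketch of this point, using short made-up lists: writing the lists without the outer brackets builds a tuple, which can be indexed and iterated like the list of lists. One difference worth keeping in mind is that a tuple cannot be modified in place.

```python
# Short illustrative lists; the names follow the example above.
list1 = [('I', 'PRON')]
list2 = [('Leaves', 'NOUN')]
list3 = [('.', 'PUNCT')]

as_list = [list1, list2, list3]
as_tuple = list1, list2, list3   # no outer brackets: Python builds a tuple

is_tuple = isinstance(as_tuple, tuple)
same_items = list(as_tuple) == as_list        # same elements in the same order
same_first = as_tuple[0] is as_list[0]        # indexing returns the same inner list

print(type(as_tuple).__name__, is_tuple, same_items, same_first)
```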

Emerson Harkin
  • Thanks for the comments. Do you mean saving the lists in a tuple at the end instead of a list of lists? – Bilgin Jun 29 '20 at 18:41
  • @Bilgin A tuple of lists and a list of lists are equivalent in your case. There's probably no problem with your code, as [others have suggested](https://stackoverflow.com/a/62644609/13568555). – Emerson Harkin Jun 29 '20 at 18:45