
I have this list of data that I want to convert to dataframe:

enter image description here

There are 22 indices (let's call them docs), and if we examine a single doc (for example, doc 0):

enter image description here

Now, I want to convert this triple-nested list into a DataFrame whose columns are 'Word', 'Pos', and 'Biotag'. Taking the example from the second picture, the table would be:

Word        Pos      Biotag
____________________________
S7892537B1  NNP      O
-           :        O
High        JJ       O
...

However, this is only the data from the first doc. I want to combine all 22 documents into a single DataFrame and add a 'Docs' column that indicates which document each entry comes from, so that it looks like this:

Word        Pos      Biotag    Docs
___________________________________
S7892537B1  NNP      O         0
-           :        O         0
High        JJ       O         0
...
encoding    VBG      O         2
Dev.        NNP      I         2
...
et          NNP      I         22 

I've tried several approaches, but the resulting columns never matched. Any help is appreciated, thank you.

Mr.Riply
rayyar

1 Answer


For a single doc:

import pandas as pd
df_new = pd.DataFrame(df['Value'].values.tolist(), columns=['Word', 'Pos', 'Biotag'])

How do you want to process multiple docs?
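
If the goal is one table with a 'Docs' column, one possible approach is to flatten the nested list while keeping track of each document's index. The sketch below assumes the data is a list of docs, where each doc is a list of [word, pos, biotag] triples. The screenshots are not available, so that layout is a guess. The names `docs`, `rows`, and `df_all` are made up for illustration, and the sample values are only the few entries quoted in the question.

```python
import pandas as pd

# Hypothetical stand-in for the real data: a list of docs, each doc being
# a list of [word, pos, biotag] triples (assumed layout, not confirmed).
docs = [
    [["S7892537B1", "NNP", "O"], ["-", ":", "O"], ["High", "JJ", "O"]],
    [["encoding", "VBG", "O"], ["Dev.", "NNP", "I"]],
]

# Flatten to one row per token and attach the index of the doc it came from.
rows = [
    (word, pos, biotag, doc_id)
    for doc_id, doc in enumerate(docs)
    for word, pos, biotag in doc
]

df_all = pd.DataFrame(rows, columns=["Word", "Pos", "Biotag", "Docs"])
print(df_all)
```

If each doc actually has one more level of nesting (for example, sentences inside a doc), the comprehension would need an extra `for` clause for that level.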

aprilangel