Given a flat list stored in a dict, how can I unpack it into a pandas DataFrame like this:

data = {u'2344': ["id", "value1", "value2", "01", "Addf112", "Addf113", "02", " ", "Addf213"]}

>> id  value1   value2
   01  Addf112  Addf113
   02           Addf213
ArchieTiger
    Use the first three values for the `columns` argument and the rest (taken in groups of three) for the data. If the list is named `a` then `zip(a[::3], a[1::3], a[2::3])` should get you started. – wwii Mar 18 '15 at 21:02
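A minimal sketch of the grouping described in this comment, assuming the list is named `a` (as in the comment) and its length is a multiple of three. The names `groups`, `header`, and `rows` are only illustrative. Note that the first tuple produced by `zip` holds the column names.

```python
import pandas as pd

a = ["id", "value1", "value2", "01", "Addf112", "Addf113", "02", " ", "Addf213"]

# Each slice picks every third item, offset by 0, 1 and 2;
# zipping them regroups the flat list into tuples of three.
groups = list(zip(a[::3], a[1::3], a[2::3]))

# The first tuple is the header; the remaining tuples are data rows.
header, rows = groups[0], groups[1:]
df = pd.DataFrame(rows, columns=list(header))
print(df)
```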

1 Answer

You'd have to extract the individual elements for the column names and then construct a list consisting of 2 lists for your 2 rows of data:

In [23]:

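import pandas as pd

data = {u'2344': ["id", "value1", "value2", "01", "Addf112", "Addf113", "02", " ", "Addf213"]}

# The first three items are the column names; each later group of three is one row.
pd.DataFrame(columns=data['2344'][:3], data=[data['2344'][3:6], data['2344'][6:]])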
Out[23]:
   id   value1   value2
0  01  Addf112  Addf113
1  02           Addf213

A dynamic method would be to use a chunker (modified from one of the answers to this question) to build the dict and use this to construct the df:

In [59]:

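def chunker(seq, stride):
    # The first `stride` items are the column names.
    cols = seq[:stride]
    # Of the remaining items, every `stride`-th one starting at `pos` belongs to column `pos`.
    data = [seq[stride:][pos::stride] for pos in range(stride)]
    return dict(zip(cols, data))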

pd.DataFrame(chunker(data['2344'],3))

Out[59]:
   id   value1   value2
0  01  Addf112  Addf113
1  02           Addf213
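As a possible variation on the chunker above, the rows can be built directly and the column names passed explicitly, so the column order does not depend on dict ordering (plain dicts only guarantee insertion order from Python 3.7). This is a sketch; `rows_from_flat` is a hypothetical helper name, and it assumes the list length is a multiple of `stride`.

```python
import pandas as pd

def rows_from_flat(seq, stride):
    """Hypothetical helper: split a flat list into a header and rows of `stride` items."""
    header = seq[:stride]
    # Step through the remainder of the list one row (stride items) at a time.
    rows = [seq[i:i + stride] for i in range(stride, len(seq), stride)]
    return header, rows

data = {'2344': ["id", "value1", "value2", "01", "Addf112", "Addf113", "02", " ", "Addf213"]}

header, rows = rows_from_flat(data['2344'], 3)
df = pd.DataFrame(rows, columns=header)
print(df)
```

Adding more rows to the list needs no change to the call, as long as each row still has `stride` items.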
EdChum
  • Is it possible to get the rows dynamically, without specifying them manually as in `data=[data['2344'][3:6], data['2344'][6:]]`? – ArchieTiger Mar 18 '15 at 21:12
  • You'd have to know your data structure beforehand; in this case you'd have to know that the first three values are the columns and each following group of three values is one row of data – EdChum Mar 18 '15 at 21:14
  • I've added some code that should be able to handle a variable number of rows in the list – EdChum Mar 18 '15 at 21:29
  • Do you happen to have a workaround for data whose items are whole rows wrapped in single quotes, e.g. `data = {u'2344': ['"id", "value1", "value2"', '"01", "Addf112", "Addf113"',' "02", " ", "Addf213"']}`? – ArchieTiger Mar 19 '15 at 09:54
  • My code would still work; you'd pass the `stride` value as `1` in that case: `pd.DataFrame(chunker(data['2344'],1))` – EdChum Mar 19 '15 at 09:58
  • All the values come with double quotes; how do I remove them? – ArchieTiger Mar 19 '15 at 11:07
  • Please post another question; it doesn't help anyone to keep changing your data and question and using comments to seek further answers – EdChum Mar 19 '15 at 11:20
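For the quoted variant raised in the comments above, one possible approach (not from the answer) is to parse each item with the standard library's `csv` module. This sketch assumes every list item is one comma-separated row of double-quoted fields; the names `lines` and `parsed` are only illustrative.

```python
import csv
import pandas as pd

data = {'2344': ['"id", "value1", "value2"',
                 '"01", "Addf112", "Addf113"',
                 ' "02", " ", "Addf213"']}

# Trim stray whitespace around each row string before parsing.
lines = [line.strip() for line in data['2344']]

# skipinitialspace=True ignores the space after each comma, so the
# double quotes are recognised as field quotes and removed by the parser.
parsed = list(csv.reader(lines, skipinitialspace=True))

# The first parsed row is the header; the rest are data rows.
df = pd.DataFrame(parsed[1:], columns=parsed[0])
print(df)
```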