I got this:

columns = ['a','b','c']
data = [1,2,3],[3,4],[4,5,5]
df = pandas.DataFrame({i:pandas.Series(j) for i in columns for j in data})
print(df)

Output:

   a  b  c
0  4  4  4
1  5  5  5
2  5  5  5

I need:

   a  b  c
0  1  3  4
1  2  4  5
2  3     5

I really don't understand why this is not working. I know I'm accessing the elements in data in the right way.

Any tips?

teller.py3

2 Answers

This should do it:

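import pandas as pd

columns = ['a', 'b', 'c']
data = [[1, 2, 3], [3, 4], [4, 5, 5]]

# Each inner list becomes a row (short rows are padded with NaN),
# so transpose to turn the lists into columns.
df = pd.DataFrame(data).transpose()
df.columns = columns
print(df)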

Output:

    a    b    c
0  1.0  3.0  4.0
1  2.0  4.0  5.0
2  3.0  NaN  5.0
sobek

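The two for clauses in your dict comprehension form a nested loop, so every key is assigned each element of data in turn and only the last one, [4, 5, 5], survives. Your code is equivalent to: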

import pandas


columns = ['a','b','c']
data = [1,2,3],[3,4],[4,5,5]

myDict = {}
for i in columns:
    for j in data:
        myDict[i]=j
print(pandas.DataFrame(myDict))

That's why your data is overwritten. What you want is to pair each column name with the element of data at the same position:

myDict = {}
for i,key in enumerate(columns):
    myDict[key] = data[i]
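
For comparison, here is a minimal sketch in plain Python (no pandas needed) that puts the nested comprehension from the question next to a pairwise one built with zip. The names nested and paired are only illustrative and do not appear in the question.

```python
columns = ['a', 'b', 'c']
data = [1, 2, 3], [3, 4], [4, 5, 5]

# Nested loops: for each key, j runs over all of data, so the last value wins.
nested = {i: j for i in columns for j in data}

# Pairwise: zip walks columns and data in lockstep, one list per key.
paired = {i: j for i, j in zip(columns, data)}

print(nested)  # {'a': [4, 5, 5], 'b': [4, 5, 5], 'c': [4, 5, 5]}
print(paired)  # {'a': [1, 2, 3], 'b': [3, 4], 'c': [4, 5, 5]}
```

The paired dictionary holds the same contents as myDict above.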

However, passing myDict to pandas.DataFrame fails, because the lists have different lengths:

raise ValueError('arrays must all be same length')
ValueError: arrays must all be same length

Which has a well-described solution here
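
One common way around the length mismatch, which may or may not be the approach the linked answer takes, is to wrap each list in a pandas.Series as the question already tried to do. pandas aligns Series on their index and fills the gaps with NaN. A minimal sketch, assuming the same columns and data as above:

```python
import pandas as pd

columns = ['a', 'b', 'c']
data = [1, 2, 3], [3, 4], [4, 5, 5]

# Wrapping each list in a Series lets pandas align on the index
# and fill the missing position in the shorter column with NaN.
df = pd.DataFrame({key: pd.Series(values) for key, values in zip(columns, data)})
print(df)
```

Column b is stored as floats here, since NaN cannot be held in a plain integer column.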

Nina