I have many lists, each consisting of 1-D data, like below:

list1 = [1,2,3,4...]
list2 = ['a','b','c'...] 

Now I need to create a DataFrame like the one below:

df = [[1,'a'],[2,'b'],[3,'c']]

I need this dataframe so that I can profile each column using pandas_profiling. Please suggest.

I have tried

list1+list2

but it gives data like below:

list3=[1,2,3,4...'a','b'...]

I also tried NumPy's hstack, but that does not work either:

import pandas as pd
import pandas_profiling
import numpy as np

list3 = np.hstack([[list1],[list2]])

array([[1,2,3,4,'a','b','c'..]],dtype='<U5')
Simon
MSNGH
  • Numpy is for number arrays only, use pandas, google some pandas tutorial it should be in the first couple chapters – IcedLance Jun 19 '19 at 11:00
  • `zip` is the correct answer but this is what you could have done in numpy: `np.vstack((list1, list2)).T` – Dan Jun 19 '19 at 12:28
  • For lists `+` is a simple join. With the `hstack` expression you are concatenating (1,n) arrays on the last dimension, producing a (1,2n) array. `column_stack` will make a (n,2) array, but string dtype. The python `zip` does a better job of interleaving the number and string elements. – hpaulj Jun 19 '19 at 15:21
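
To make the difference between `+` and `zip` from the comments concrete, here is a minimal pure-Python sketch. It assumes both lists have four elements (the question elides the rest with `...`), and the names `joined` and `rows` are just illustrative.

```python
list1 = [1, 2, 3, 4]
list2 = ['a', 'b', 'c', 'd']

# `+` joins the two lists end to end: one flat list of length 8.
joined = list1 + list2

# zip pairs up the i-th elements: four (number, letter) rows,
# and each element keeps its original Python type.
rows = list(zip(list1, list2))

print(joined)  # [1, 2, 3, 4, 'a', 'b', 'c', 'd']
print(rows)    # [(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd')]
```

Note that `zip` stops at the shortest input, so lists of unequal length are silently truncated.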

2 Answers

You can do it this way:

import pandas as pd

list1 = [1,2,3,4]
list2 = ['a','b','c','d']
list3 = zip(list1, list2)
df = pd.DataFrame(list3, columns=('list1', 'list2'))
print(df)

Output:

   list1 list2
0      1     a
1      2     b
2      3     c
3      4     d
Joe

You can use the zip function, described in the answer to this question, to create your nested list.

Note that you should not pass the zip object directly: in Python 3, zip returns an iterator rather than a list, which could lead to an error, so wrap it in list() first.

The solution would be:

import pandas as pd

list1 = [1,2,3]
list2 = ['a','b','c']
df = pd.DataFrame(list(zip(list1,list2)), columns=['list1', 'list2'])
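
Since the question mentions many lists, a possible alternative (not the zip approach used above) is to pass a dict that maps column names to lists. This is only a sketch: the column names are made up for illustration, and it assumes every list has the same length.

```python
import pandas as pd

list1 = [1, 2, 3]
list2 = ['a', 'b', 'c']

# Hypothetical column names; each dict key becomes a column
# and each list becomes that column's values.
data = {'list1': list1, 'list2': list2}
df = pd.DataFrame(data)

# Each column keeps its own dtype, which is what per-column profiling needs.
print(df.dtypes)
```

Unlike zip, which silently truncates to the shortest list, this raises a ValueError when the lists differ in length.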
AUBSieGUL