OK, it looks like your goal is to have a second DataFrame with the same number and name of columns, where each column contains the unique values of the initial DataFrame. I don't know if that's the best way to go about it, so I'll show you how I'd do it your way, and then suggest another way to print it nicely.
As mentioned, the error you're getting is because you're trying to create a DataFrame with columns of different lengths. You can still build one with some finagling if you're OK with NaN values in the "empty" cells. I would approach it like this:
- Get your column names, and save them in a list.
- Create a list to hold the unique values from each column of df2 as a new Series.
- Iterate through each column name, storing the new Series of each column's unique values.
- Figure out which column is the longest, and create an empty (filled with NaNs) DataFrame based on the number of columns and the longest list of unique values.
- Lastly, replace what NaN values you can with actual values, and print the DF.
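import numpy as np
import pandas as pd

# Collect the unique values of each column as a named Series.
colNames = df2.columns.tolist()
uniqueValsList = []
for each in colNames:
    uniqueVals = list(df2[each].unique())
    uniqueValsList.append(pd.Series(data=uniqueVals, name=each))

# Length of the longest list of unique values.
maxlen = max((len(each) for each in uniqueValsList), default=0)

# NaN-filled frame (np.empty would leave arbitrary values, not NaN).
fillerData = np.full((maxlen, len(colNames)), np.nan)
dfDiff = pd.DataFrame(columns=colNames, data=fillerData)

# Overwrite each column with its unique values; shorter columns stay NaN at the end.
for i in range(len(uniqueValsList)):
    dfDiff[colNames[i]] = uniqueValsList[i]

dfDiff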
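For what it's worth, I believe the same result can be had in far fewer lines, because the DataFrame constructor aligns Series of different lengths on their index and pads the short ones with NaN. This is only a sketch: the small df2 below is a made-up stand-in for yours, and its column names are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for the question's df2.
df2 = pd.DataFrame({
    "color": ["red", "blue", "red", "green"],
    "size": ["S", "M", "S", "M"],
})

# One Series of unique values per column. The constructor aligns them on
# their integer index, so shorter columns are padded with NaN.
dfDiff = pd.DataFrame({col: pd.Series(df2[col].unique()) for col in df2.columns})
print(dfDiff)
```

Here "color" has three unique values and "size" has two, so the last cell of "size" comes out as NaN.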
This will allow you to print out a DF with your unique values, but it will look weird with all the NaN values. I would recommend doing it with HTML and the tabulate module, as in this answer. For example:
from IPython.display import HTML, display
import tabulate

# Build one row per column: the column name followed by its unique values.
listOfLists = []
for i in range(len(uniqueValsList)):
    thisList = [colNames[i]]
    for each in uniqueValsList[i].tolist():
        thisList.append(each)
    listOfLists.append(thisList)

# Render the rows as an HTML table in the notebook.
display(HTML(tabulate.tabulate(listOfLists, tablefmt='html')))
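Note that the table above puts each column's values on one row. If you'd rather keep them in columns but with blanks instead of NaN, one option is to pad the shorter lists with empty strings using itertools.zip_longest. This is just a sketch using the standard library; the uniqueByColumn dict is a hypothetical stand-in for the values held in uniqueValsList.

```python
from itertools import zip_longest

# Hypothetical unique values per column, standing in for uniqueValsList.
uniqueByColumn = {
    "color": ["red", "blue", "green"],
    "size": ["S", "M"],
}

# zip_longest turns the ragged columns into rows, padding short columns with "".
header = list(uniqueByColumn)
rows = list(zip_longest(*uniqueByColumn.values(), fillvalue=""))

# Print a simple fixed-width table: header first, then the padded rows.
for line in [header, *rows]:
    print("  ".join(str(cell).ljust(8) for cell in line))
```

The resulting rows could also be handed to tabulate.tabulate(rows, headers=header, tablefmt='html') if you want the HTML rendering instead of plain text.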
I'm not familiar with LaTeX in Jupyter Notebooks, so if you've found a better way to do this I'd be interested to know! I tried messing with the tablefmt values in the display(HTML()) call, to no avail.