Error while splitting a column of lists into different columns

Question

I have a dataframe named amplitude_df1 as shown below.

                                            amplitudes
0    [1.8224, 9.10515, 10.187, 4.67473, 1.60665, 1....
1    [4.40045, 15.4495, 27.3758, 17.6756, 3.21038, ...
2    [2.11535, 11.9202, 18.2254, 7.32574, 4.11506, ...
3    [3.51715, 5.90878, 14.3854, 11.5154, 8.16267, ...
4    [5.33236, 19.8225, 33.4585, 15.5712, 9.21001, ...
..                                                 ...
196  [1.18488, 2.8276, 9.20956, 17.0281, 9.59571, 3...
197  [0.878292, 2.50281, 2.9185, 9.55309, 9.55309, ...
198  [0.220521, 0.503399, 2.16432, 2.92407, 2.92407...
199  [0.572135, 2.4478, 4.80103, 4.65729, 3.54338, ...
200  [1.14716, 1.58989, 3.63487, 6.12651, 4.42284, ...

[201 rows x 1 columns]

I was trying to split the amplitudes column, which contains lists, into separate columns as shown below:

amplitude_df = pd.DataFrame(amplitude_df1['amplitudes'].values.tolist(), columns=list(range(len(amplitude_df1["amplitudes"][0]))))

But this raises the following error:

ValueError: Shape of passed values is (201, 1), indices imply (201, 49617)

Any help is appreciated. If you have any questions, let me know in the comments.

EDIT: This is my whole code as requested by an answerer

import json
import os
import re

import pandas as pd

# directory, channel and jsonNew are defined earlier (not shown)
for filename in os.listdir(directory):
    if re.search("part-00001", filename):
        json_file_path = os.path.join(directory, filename)
        with open(json_file_path) as f:
            jsonData = json.load(f)
            #print(jsonData)
            if jsonData["channel"] == channel:
                df = {"amplitudes": jsonData["amplitudes"], "time": jsonData["time"],
                      "channel": jsonData["channel"]}
                df1 = pd.DataFrame(df, index=[0])
                jsonNew = jsonNew.append(df1)

jsonDF = jsonNew
# sorting values by timestamp
jsonSorted = jsonDF.sort_values("time")
# resetting index after sorting
newJson = jsonSorted.reset_index(drop=True)
newJson_amp = newJson.drop(['time', 'channel'], axis=1)
print(newJson_amp)
amplitude_df1 = newJson_amp
#mytry
amplitude_df = pd.DataFrame(amplitude_df1['amplitudes'].values.tolist(), columns=list(range(len(amplitude_df1["amplitudes"][0]))))

This is what I want: "Pandas split column of lists into multiple columns"

Please check if the values in your `amplitude_df1['amplitudes']` are indeed lists rather than string representation of lists. For example, add `print(type(amplitude_df1['amplitudes'][1]))` before `#mytry`. — Błotosmętek, Feb 18 '20 at 11:12
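The check suggested in this comment can be sketched as follows. This is only a minimal illustration on a hypothetical two-row frame, and it assumes the cells are strings that look like lists (which the reported shape of (201, 1) would be consistent with, but which the question does not confirm). If that assumption holds, `ast.literal_eval` from the standard library is one way to turn each string into a real list before expanding it into columns.

```python
import ast

import pandas as pd

# Hypothetical miniature of amplitude_df1: each cell is a *string* that
# merely looks like a list (an assumption, not confirmed by the question).
amplitude_df1 = pd.DataFrame(
    {"amplitudes": ["[1.8224, 9.10515, 10.187]", "[4.40045, 15.4495, 27.3758]"]}
)

# Diagnose: if this prints <class 'str'>, the cells are not real lists.
cell_type = type(amplitude_df1["amplitudes"][0])
print(cell_type)

# len() of a string counts characters, so range(len(...)) can imply far
# more columns than there are values in the list.
print(len(amplitude_df1["amplitudes"][0]))

# Parse each string into a real list, then expand the lists into columns.
parsed = amplitude_df1["amplitudes"].apply(ast.literal_eval)
amplitude_df = pd.DataFrame(parsed.tolist(), index=amplitude_df1.index)
print(amplitude_df)
```

If the check prints `<class 'list'>` instead, the parsing step is unnecessary and the cause of the error lies elsewhere.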

score -1 · Answer 1 · edited Feb 18 '20 at 10:54


Try using .to_numpy() instead of .tolist() (you will need to import numpy). If that doesn't work, store amplitude_df1['amplitudes'].values.to_numpy() in a variable and reshape it with .reshape().

Here is the doc: https://docs.scipy.org/doc/numpy/reference/generated/numpy.reshape.html

Regards
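For reference, a small sketch of how `.to_numpy()` and `.values` relate, using a hypothetical Series of real lists rather than the asker's data. `to_numpy()` is a method of the pandas Series itself, while `.values` already returns a NumPy array, so chaining the two is not expected to work.

```python
import numpy as np
import pandas as pd

# Hypothetical Series whose cells are real Python lists.
s = pd.Series([[1.0, 2.0], [3.0, 4.0]])

arr = s.to_numpy()   # Series method; returns a NumPy object array
same = s.values      # attribute; also a NumPy array

# .values is already an ndarray, and ndarray has no to_numpy method.
has_method = hasattr(same, "to_numpy")
print(type(arr), has_method)
```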

answered Feb 18 '20 at 10:31 by Santiago, edited Feb 18 '20 at 10:54 by Ajil T.P

The first suggestion gives an error saying that the numpy ndarray has no attribute to_numpy(). – Ajil T.P Feb 18 '20 at 10:34
Can you tell me what you meant by reshape? – Ajil T.P Feb 18 '20 at 10:36
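As a rough illustration of what reshape does, here is a minimal sketch on hypothetical arrays (not the asker's data). `reshape` only rearranges the elements an array already has into a new shape. It does not split a string into numbers, so it would probably not help if the cells turn out to be strings.

```python
import numpy as np

# Hypothetical data: six numbers in a flat array.
flat = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# reshape rearranges the same six elements into 2 rows x 3 columns.
grid = flat.reshape(2, 3)
print(grid.shape)

# An array of two strings holds only two elements, so it cannot be
# reshaped into 2 x 3, whatever the strings contain.
strings = np.array(["[1.0, 2.0, 3.0]", "[4.0, 5.0, 6.0]"])
try:
    strings.reshape(2, 3)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)
```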
