
So I have a set of data loaded into Python using pandas; it looks like this:

          V     n             I
0    -0.400   0.0 -6.611865e-05
1    -0.384   0.0 -6.340880e-05
2    -0.368   0.0 -6.063826e-05
3    -0.352   0.0 -5.789697e-05
4    -0.336   0.0 -5.512495e-05
...     ...   ...           ...
4483  0.336  83.0  1.905807e-10
4484  0.352  83.0  2.146759e-10
4485  0.368  83.0  2.452133e-10
4486  0.384  83.0  2.511581e-10
4487  0.400  83.0  2.704376e-10

[4488 rows x 3 columns]

Each data set is marked by an n value. I want to use that n value to separate the I and V of each set from the others so I can plot them all on the same graph. The V range is pretty much identical in each set, while I varies.

To plot all 84 data sets on one graph, I used:

import pandas as pd
import matplotlib.pyplot as plt

#store data using pandas
data = pd.read_csv( f, sep = '\t', comment = '#', names = ['V','n','I'] )

#observe data format
print(data)

#plot data
fig, ax = plt.subplots()

data = data.pivot(index='V', columns='n', values='I')

data.plot()
plt.legend(loc='best')
plt.show()

But this gives me:

ValueError: Index contains duplicate entries, cannot reshape

I tried something similar on another data set with the same structure and it worked fine, but not here. I need to keep those values even if they are identical. Does anybody have ideas I can try? Thanks!

sleepy
  • What do you expect to happen? Do you want both values displayed in the same cell? Or the average (or some other aggregation) of them? – Paul H Jan 18 '21 at 19:58
  • Sorry, I failed to properly mention what I wanted. I want to plot V on x and I on y, using n to distinguish between the data sets in the plot. – sleepy Jan 19 '21 at 09:14
  • `df.groupby(by=['n']).plot()`? – Paul H Jan 19 '21 at 15:23

2 Answers

A pivot table cannot, by definition, hold duplicate index/column combinations, so you have to specify an aggregating function for the duplicated entries. You are better off using pivot_table, which accepts one through its aggfunc argument.
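To see which rows trigger the error before choosing an aggregation, something like the following may help. It is only a sketch: the small DataFrame is a made-up stand-in for the real file, with one (V, n) pair deliberately repeated.

```python
import pandas as pd

# Hypothetical sample standing in for the real data: the pair (V=-0.4, n=0) occurs twice.
data = pd.DataFrame({
    "V": [-0.4, -0.4, 0.0, 0.4, -0.4, 0.0, 0.4],
    "n": [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0],
    "I": [-6.6e-05, -6.5e-05, 1.0e-08, 6.0e-05, -5.0e-05, 2.0e-08, 5.0e-05],
})

# keep=False flags every row of a repeated (V, n) pair, not only the later copies.
dupes = data[data.duplicated(subset=["V", "n"], keep=False)]
print(dupes)

# pivot() refuses to reshape while such rows exist ...
try:
    data.pivot(index="V", columns="n", values="I")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# ... whereas pivot_table() collapses them with the chosen aggfunc.
table = data.pivot_table(index="V", columns="n", values="I", aggfunc="mean")
print(table)
```

If dupes comes back empty on the real data, the duplicates are not the cause and the file parsing is worth checking instead.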

Is this what you're after?

data.csv
V   n   I
-0.85   0   0.060058
-0.85   0   -0.022989
-0.85   0   0.061704
-0.85   0   0.077374
-0.85   0   -0.03107
0.96    22  -0.07421
0.96    22  -0.011674
0.96    22  -0.090547
0.96    22  -0.018355
0.96    22  0.096896
-0.2    88  0.011591
-0.2    88  0.030667
-0.2    88  0.095687
-0.2    88  -0.030725

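import numpy as np
import pandas as pd

# sep assumes the file is whitespace-delimited, as displayed above
data = pd.read_csv(r'D:\path\data.csv', sep=r'\s+')
# average the I values that share the same (V, n) pair
data_piv = pd.pivot_table(data, index=['V'], columns=['n'], values='I', aggfunc=np.mean, fill_value=0)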

print(data_piv)
data_piv.plot()

You can use any aggregating function from NumPy and set fill_value to any value (you'll get NaNs if you don't specify it).

nick

This works:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

#store data using pandas
data = pd.read_csv( f, sep = '\t', comment = '#', names = ['V','n','I'] )

#observe data format
print(data)

data['n']=data['n'].astype('category')

fig, ax = plt.subplots()
sns.lineplot(data = data, x = 'V', y = 'I', hue = 'n', ax=ax)
plt.legend( bbox_to_anchor=(.05, 1),loc='upper left')
plt.show()
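
If the repeated (V, n) rows should be drawn as they are rather than averaged (seaborn's lineplot by default aggregates repeated x values within a hue level into a mean with a confidence band), a plain matplotlib loop over groupby is one alternative, along the lines of the groupby suggestion in the comments. This is a sketch only; the DataFrame below is a made-up stand-in for the real file.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line for an interactive window
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample standing in for the real data (two sets, n = 0 and n = 1).
data = pd.DataFrame({
    "V": [-0.4, -0.4, 0.0, 0.4, -0.4, 0.0, 0.4],
    "n": [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0],
    "I": [-6.6e-05, -6.5e-05, 1.0e-08, 6.0e-05, -5.0e-05, 2.0e-08, 5.0e-05],
})

fig, ax = plt.subplots()

# One line per n value; every row is plotted, duplicates included.
for n_value, group in data.groupby("n"):
    ax.plot(group["V"], group["I"], label=f"n = {n_value:g}")

ax.set_xlabel("V")
ax.set_ylabel("I")
ax.legend(loc="best")
# In an interactive session, call plt.show() here.
```

With 84 sets the legend gets long, so placing it outside the axes or dropping it may read better.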
sleepy