I'm trying to import data from a space-delimited file into a pandas DataFrame so that I can plot it. My data file looks like this:
    Vmeas -5.00E+000 -4.50E+000 -4.00E+000 -3.50E+000 ...
    vfd3051 -3.20E-008 -1.49E-009 1.38E-008 -1.17E-008 ...
    vfd3151 -3.71E-008 -6.58E-009 -6.58E-009 -6.58E-009 ...
    vfd3251 -4.73E-008 3.59E-009 8.68E-009 -1.68E-008 ...
    vfd3351 -2.18E-008 -3.71E-008 3.60E-009 -3.20E-008 ...
So each row is a test location, and the columns increase in voltage from left to right, up to 20 V.
My code to read the data file into the dataframe is:
    if __name__ == '__main__':
        file_path = str(input("Enter the filename to open: "))
        save = str(input('Do you wish to save a pdf of the IV plots? (y/n): '))
        df = pd.read_csv(file_path, index_col="Vmeas", delim_whitespace=True, header=0)
        df = df.T
        df.reset_index(inplace=True)
        df.index.names = ['Voltage']
        df.columns.names = ['Die_numbers']
        df.drop('index', axis=1, inplace=True)
        make_plots(df, save)
The actual plotting is done by:
    def make_plots(df, save):
        voltage = np.arange(-5, 20, 0.5)
        plt.figure(figsize=(10, 7))
        for col in df:
            plt.plot(voltage, col, legend=False)
        plt.show()
At first, I ran into problems because pandas treated the voltages as strings, and because pandas doesn't play nicely with float indexes. Plotting with that index made my diode current-voltage curve start at 0 (https://i.stack.imgur.com/i2XOY.jpg). I then re-indexed the DataFrame, but plotting still didn't work. Now I've re-indexed the DataFrame and dropped the old index column, and when I check df.head() everything looks right:
    Die_numbers       vfd3051       vfd3151       vfd3251       vfd3351
    Voltage
    0           -3.202241e-08 -3.711351e-08 -4.728576e-08 -2.184733e-08
    1           -1.493095e-09 -6.580329e-09  3.594383e-09 -3.710431e-08
    2            1.377107e-08 -6.581644e-09  8.683344e-09  3.595368e-09
Except now I keep running into a ValueError in matplotlib. I think it is related to the col values being strings instead of floats, which I don't understand, because the currents were printing properly before.
Admittedly, I'm new to pandas, but it seems like I get stopped at every turn, by my own ignorance no doubt, and it's getting tiresome. Is there a better way to do this? Perhaps I should just ignore the first row of the log file? Can I convert from scientific notation while reading the file in? Or should I just keep plugging away?
Thanks.
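On the question of converting scientific notation while reading: as far as I know, `read_csv` already parses values such as `-3.20E-008` as floats, so no extra conversion step should be needed. The sketch below is only an illustration under a few assumptions: the `sample` text is a shortened, made-up stand-in for the log file, and `sep=r"\s+"` is swapped in for `delim_whitespace=True`, which recent pandas releases flag as deprecated. It also converts the transposed header labels to floats instead of discarding them.

```python
import io

import pandas as pd

# Shortened stand-in for the log file (an assumption, not the real data):
# two dies and four voltage steps.
sample = io.StringIO(
    "Vmeas -5.00E+000 -4.50E+000 -4.00E+000 -3.50E+000\n"
    "vfd3051 -3.20E-008 -1.49E-009 1.38E-008 -1.17E-008\n"
    "vfd3151 -3.71E-008 -6.58E-009 -6.58E-009 -6.58E-009\n"
)

# sep=r"\s+" splits on runs of whitespace; the header row supplies the labels.
# Transposing puts the voltages in the index and the dies in the columns.
df = pd.read_csv(sample, sep=r"\s+", index_col="Vmeas").T

# After the transpose the index holds the header strings ("-5.00E+000", ...),
# so convert them to floats rather than dropping them.
df.index = df.index.astype(float)
df.index.name = "Voltage"
df.columns.name = "Die_numbers"

print(df.dtypes)          # every column should be float64
print(df.index.tolist())  # voltages as floats
```

With a float index like this, `df.index` can serve directly as the x values for plotting, so a separately built voltage array may not be needed.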
df.info() is:
    Int64Index: 51 entries, 0 to 50
    Columns: 1092 entries, vfd3051 to vfd6824
    dtypes: float64(1092)
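The 51 rows reported above are worth comparing with the x array built in `make_plots`. `np.arange` excludes its stop value, so `np.arange(-5, 20, 0.5)` gives 50 points and would not match a 51-row column. A small check, with `np.linspace` shown as one possible alternative (the variable names are just for illustration):

```python
import numpy as np

# arange excludes the stop value, so 20.0 is missing: 50 points ending at 19.5.
voltage_arange = np.arange(-5, 20, 0.5)

# linspace includes both endpoints: 51 points from -5 to 20 in 0.5 steps.
voltage_linspace = np.linspace(-5, 20, 51)

print(len(voltage_arange), voltage_arange[-1])      # 50 19.5
print(len(voltage_linspace), voltage_linspace[-1])  # 51 20.0
```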
Everything seems to load into pandas correctly, but matplotlib doesn't like something in the data. The columns are floats, and I'm not using the integer index. If the column names were being read in as my first row of data, the columns would have str or object dtype. The error is:
    Traceback (most recent call last):
      File "D:\Python\el_plot_top_10\IV_plot_all.py", line 51, in <module>
        make_plots(df, save)
      File "D:\Python\el_plot_top_10\IV_plot_all.py", line 21, in make_plots
        plt.plot(voltage, col, legend=False)
      File "C:\Anaconda3\lib\site-packages\matplotlib\pyplot.py", line 2987, in plot
        ret = ax.plot(*args, **kwargs)
      File "C:\Anaconda3\lib\site-packages\matplotlib\axes.py", line 4139, in plot
        for line in self._get_lines(*args, **kwargs):
      File "C:\Anaconda3\lib\site-packages\matplotlib\axes.py", line 319, in _grab_next_args
        for seg in self._plot_args(remaining, kwargs):
      File "C:\Anaconda3\lib\site-packages\matplotlib\axes.py", line 278, in _plot_args
        linestyle, marker, color = _process_plot_format(tup[-1])
      File "C:\Anaconda3\lib\site-packages\matplotlib\axes.py", line 131, in _process_plot_format
        'Unrecognized character %c in format string' % c)
    ValueError: Unrecognized character f in format string
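The traceback seems consistent with one specific cause: iterating over a DataFrame yields the column labels, not the column data. In that case `plt.plot(voltage, col)` would receive a string such as `'vfd3051'` and try to read it as a format string, where `v` is a valid marker and `f` is not. A minimal sketch of a loop that passes the data instead is below. The DataFrame contents are made up, the `Agg` backend is chosen only so it runs without a display, and `legend=False` is left out because, as far as I know, it is a keyword of `DataFrame.plot` rather than of matplotlib's `plot`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in data: 51 voltage steps and two dies with made-up currents.
voltage = np.linspace(-5, 20, 51)
df = pd.DataFrame({"vfd3051": voltage * 1e-9, "vfd3151": voltage * 2e-9})

# Iterating over a DataFrame yields the column labels (strings), not the data.
labels = [col for col in df]
print(labels)  # ['vfd3051', 'vfd3151']

fig, ax = plt.subplots(figsize=(10, 7))
for col in df:
    # Index the frame with the label to get the column of floats.
    ax.plot(voltage, df[col])
ax.set_xlabel("Voltage")
```

In an interactive session, `plt.show()` after the loop would display the figure.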