I am trying to parse some data to generate a histogram.
The data is in multiple columns, but the only relevant columns for me are the two below.
X
AB 42
CD 77
AB 33
AB 42
AB 33
CD 54
AB 33
Only for the rows with AB, I want to plot a histogram of the values in column 2. The histogram should be sorted by value and plot:
33 - 3
42 - 2
(even though 42 occurs first, I want to plot 33 first).
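To make the expected result concrete, here is a minimal sketch of the counting step, assuming the two relevant columns are whitespace-separated as in the sample above. The function name `count_values` and the variable names are hypothetical, and it uses only the standard library rather than pandas.

```python
from collections import Counter

# Sample rows copied from the question: label in column 1, value in column 2.
rows = """AB 42
CD 77
AB 33
AB 42
AB 33
CD 54
AB 33"""


def count_values(text, label):
    """Count column-2 values for rows whose first column equals `label`."""
    counts = Counter()
    for line in text.splitlines():
        parts = line.split()
        # Skip blank or malformed lines, and rows with a different label.
        if len(parts) >= 2 and parts[0] == label:
            counts[int(parts[1])] += 1
    # Sort by value so 33 comes before 42 regardless of first occurrence.
    return sorted(counts.items())


ab_counts = count_values(rows, "AB")
for value, n in ab_counts:
    print(value, "-", n)
# prints:
# 33 - 3
# 42 - 2
```

The sorted (value, count) pairs could then be handed to a bar plot, for example matplotlib's `plt.bar`.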
I have a lot of columns, but it needs to match the string 'AB' and only consider those rows. Can anyone help?
UPDATE: Data is in a csv file and there are several columns.
EDIT: I now have the data in a csv file in this format.
Addresses,Data
FromAP,42
FromAP,42
FromAP,33
ToAP,77
FromAP,54
FromAP,42
FromAP,42
FromAP,33
ToAP,42
FromAP,42
FromAP,33
If I use the code from @dranxo,
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('data.csv', sep=',')
df_useful = df[df['Addresses'] == 'FromAP']
df_useful.hist()
plt.show()
I get the following error:
Laptop@ubuntu:~/temp$ ./a.py
/usr/lib/pymodules/python2.7/matplotlib/axes.py:8261: UserWarning: 2D hist input should be nsamples x nvariables;
this looks transposed (shape is 0 x 1)
'this looks transposed (shape is %d x %d)' % x.shape[::-1])
Traceback (most recent call last):
  File "./a.py", line 11, in <module>
    df_useful.hist()
  File "/usr/lib/python2.7/dist-packages/pandas/tools/plotting.py", line 2075, in hist_frame
    ax.hist(data[col].dropna().values, **kwds)
  File "/usr/lib/pymodules/python2.7/matplotlib/axes.py", line 8312, in hist
    xmin = min(xmin, xi.min())
  File "/usr/lib/python2.7/dist-packages/numpy/core/_methods.py", line 21, in _amin
    out=out, keepdims=keepdims)
ValueError: zero-size array to reduction operation minimum which has no identity
I do have the pandas, numpy, and matplotlib packages installed. Thanks
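The "shape is 0 x 1" in the warning suggests that the data passed to `hist()` was empty, so one thing worth checking is whether the filter on `Addresses` matched any rows at all. Below is a hedged sketch of that check using the standard library's `csv` module instead of pandas, so an empty match is reported explicitly. The function name `filtered_counts` is hypothetical, and the assumption that stray whitespace in the file could cause a mismatch is only a guess.

```python
import csv
import io
from collections import Counter

# CSV contents copied from the question; a real script would open data.csv.
csv_text = """Addresses,Data
FromAP,42
FromAP,42
FromAP,33
ToAP,77
FromAP,54
FromAP,42
FromAP,42
FromAP,33
ToAP,42
FromAP,42
FromAP,33
"""


def filtered_counts(text, label, label_col="Addresses", value_col="Data"):
    """Count values in `value_col` for rows where `label_col` equals `label`."""
    # skipinitialspace tolerates a space after each comma (an assumed cause).
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    counts = Counter()
    for row in reader:
        if row[label_col].strip() == label:
            counts[int(row[value_col])] += 1
    if not counts:
        # Fail with a clear message instead of plotting an empty selection.
        raise ValueError("no rows matched %r; check header and label" % label)
    return sorted(counts.items())


fromap_counts = filtered_counts(csv_text, "FromAP")
for value, n in fromap_counts:
    print(value, "-", n)
```

If this raises on the real file while working on the sample above, the file's contents probably differ from what is shown here (for example in the header name or label spelling).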