pandas plot of bar chart of positive and negative values from a csv file

Question

I have a CSV file with a column that contains both positive and negative values. I need to plot the mean of this column as two bars, one for the negative values and one for the positive values. Here is a sample of my data:

timestamp,heure,lat,lon,ampl,type
2006-01-01 00:00:00,13:58:43,33.837,-9.205,10.3,1
2006-01-02 00:00:00,00:07:28,34.5293,-10.2384,17.7,1
2007-02-01 00:00:00,23:01:03,35.0617,-1.435,-17.1,2
2007-02-02 00:00:00,01:14:29,36.5685,0.9043,36.8,1
....
2011-12-31 00:00:00,05:03:51,34.1919,-12.5061,-48.9,1

I am using this code to plot my data:

import pandas as pd
names = ["timestamp", "heure", "lat", "lon", "ampl", "type"]
data = pd.read_csv('flash.txt',names=names, parse_dates=['timestamp'],index_col=['timestamp'], dayfirst=True)
data['ampl'] = data['ampl'].abs()
yearly = data.groupby(data.index.month)['ampl'].count()
ax = yearly.plot(kind='bar')

So I need to split the values of this column by sign and get two bars instead of one. How can I proceed?

Without data it is a bit hard to say, but if you change `yearly = data.groupby(data.index.month)['ampl'].count()` to `yearly = data.groupby([data.index.month, 'type'])['ampl'].count().unstack(fill_value=0)` it should work. – jezrael, Jun 22 '17 at 16:23
I just edited my question; you can take a look at my data now. – Mar, Jun 22 '17 at 16:26

score 1 · Accepted Answer · edited Sep 27 '17 at 16:42

First create a new column `sign` with `numpy.sign` and map its values to labels with a dict.

Then add the new column name to `groupby`, aggregate with `size`, and reshape with `unstack`:

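import numpy as np
# label each row by the sign of its amplitude
data['sign'] = np.sign(data['ampl']).map({1:'+', -1:'-', 0:'0'})
# keep only the magnitudes
data['ampl'] = data['ampl'].abs()
# count rows per (month, sign); unstack turns the sign labels into columns
yearly = data.groupby([data.index.month, 'sign'])['ampl'].size().unstack()
yearly.plot(kind='bar')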
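The steps above can be tried end to end with a self-contained sketch. It assumes the five sample rows from the question, read from an in-memory string instead of `flash.txt`. The `means` table is an added variation, since the question asks for the mean while the answer counts rows; it is not part of the original answer.

```python
import io

import numpy as np
import pandas as pd

# The five sample rows from the question, inlined in place of flash.txt.
csv_text = """timestamp,heure,lat,lon,ampl,type
2006-01-01 00:00:00,13:58:43,33.837,-9.205,10.3,1
2006-01-02 00:00:00,00:07:28,34.5293,-10.2384,17.7,1
2007-02-01 00:00:00,23:01:03,35.0617,-1.435,-17.1,2
2007-02-02 00:00:00,01:14:29,36.5685,0.9043,36.8,1
2011-12-31 00:00:00,05:03:51,34.1919,-12.5061,-48.9,1
"""

data = pd.read_csv(
    io.StringIO(csv_text), parse_dates=["timestamp"], index_col="timestamp"
)

# Label each row by the sign of its amplitude, then keep only magnitudes.
data["sign"] = np.sign(data["ampl"]).map({1: "+", -1: "-", 0: "0"})
data["ampl"] = data["ampl"].abs()

grouped = data.groupby([data.index.month, "sign"])["ampl"]

# Rows per (month, sign); missing combinations become 0.
counts = grouped.size().unstack(fill_value=0)

# Mean magnitude per (month, sign); missing combinations stay NaN.
means = grouped.mean().unstack()

print(counts)
print(means)

# With matplotlib installed, either table gives one bar per sign for each month:
# means.plot(kind="bar")
```

Each column of the unstacked table becomes its own bar series, which is what produces the two bars per month.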

What is the difference between size and count in pandas?
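In short, for a grouped column `size` counts every row in the group, including rows where the value is NaN, while `count` counts only non-null values. A minimal sketch on a hypothetical toy frame (the `key` column and its values are made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: group "a" has one missing amplitude.
df = pd.DataFrame({"key": ["a", "a", "b"], "ampl": [1.0, np.nan, 3.0]})

sizes = df.groupby("key")["ampl"].size()    # rows per group, NaN included
counts = df.groupby("key")["ampl"].count()  # non-null values per group

print(sizes.to_dict())
print(counts.to_dict())
```

If the `ampl` column has no missing values, the two give the same numbers.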

edited Sep 27 '17 at 16:42 by Graham

answered Jun 22 '17 at 16:30 by jezrael

I did as you said, and I got this: KeyError: 'type' – Mar Jun 22 '17 at 16:34
What does `print (df.columns.tolist())` show? – jezrael Jun 22 '17 at 16:35
It gives this: ['heure', 'lat', 'lon', 'ampl'] – Mar Jun 22 '17 at 16:36
Hmm, that is interesting, because in your sample the last column is `type`. What if you remove the `names` parameter? Change `data = pd.read_csv('flash.txt',names=names, parse_dates=['timestamp'],index_col=['timestamp'], dayfirst=True)` to `data = pd.read_csv('flash.txt', parse_dates=['timestamp'],index_col=['timestamp'], dayfirst=True)`. `names` is used when the csv has no header. How does it work now? – jezrael Jun 22 '17 at 16:40
It's working now, but I can't understand the meaning of the result. I'll send you the result via email, since I don't have the right to share images. – Mar Jun 22 '17 at 16:53
Sure, no problem. – jezrael Jun 22 '17 at 16:55
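The point about `names` in the comments above can be checked with a small sketch. It assumes a file that starts with a header row, as in the sample from the question, and uses an in-memory string in place of `flash.txt`. It shows how the header row is treated in each case and does not try to reproduce the exact KeyError.

```python
import io

import pandas as pd

# Two sample rows from the question, with the header line included.
csv_text = """timestamp,heure,lat,lon,ampl,type
2006-01-01 00:00:00,13:58:43,33.837,-9.205,10.3,1
2007-02-01 00:00:00,23:01:03,35.0617,-1.435,-17.1,2
"""
names = ["timestamp", "heure", "lat", "lon", "ampl", "type"]

# names= without header=: the first line is read as a data row.
with_names = pd.read_csv(io.StringIO(csv_text), names=names)

# No names=: the first line is used as the column header.
with_header = pd.read_csv(io.StringIO(csv_text))

# names= with header=0: the file's header row is replaced, not kept as data.
renamed = pd.read_csv(io.StringIO(csv_text), names=names, header=0)

print(len(with_names), len(with_header), len(renamed))
print(with_names.iloc[0].tolist())
```

In the first case the header text ends up inside the data, so `ampl` is no longer a numeric column.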
