
I'm just learning Python and have built a histogram that looks like this:

[image: histogram]

I've been going crazy trying to figure out how to display the same data in a table format, i.e.:

  0-5  = 50,500
  5-10 = 24,000
 10-50 = 18,500

and so on...

There is only one field in df, and it contains the number of residents in towns/cities. Any help is greatly appreciated.

EDIT:

Using the answer from the suggested duplicate question, I get an error:

bins = [0,5,10,50,150,500,2500,5000,8000]
groups = df.groupby(['Total_Residents', pd.cut(df.Total_Residents, bins)])
groups.size().unstack()

AttributeError                            Traceback (most recent call last)
in ()
      1 bins = [0,5,10,50,150,500,2500,5000,8000]
----> 2 groups = df.groupby(['Total_Residents', pd.cut(df.Total_Residents, bins)])
      3 groups.size().unstack()

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py in __getattr__(self, name)
   4370             if self._info_axis._can_hold_identifiers_and_holds_name(name):
   4371                 return self[name]
-> 4372             return object.__getattribute__(self, name)
   4373
   4374     def __setattr__(self, name, value):

AttributeError: 'Series' object has no attribute 'Total_Residents'

EDIT: For sample data, you can use the bin values plus 1:

df = pd.Series([1,6,11,51,151,501,2501,5001,8001], name = 'Total_Residents')

For what it's worth, my data wasn't causing the issue. The problem was that I was using pandas code meant for a DataFrame on a Series.
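To make the Series-versus-DataFrame point concrete, here is a minimal sketch using the sample data above. The names `s`, `frame`, and `binned` are only for illustration and are not from my original code; `to_frame()` is one way to get a one-column DataFrame, assuming the Series has a name.

```python
import pandas as pd

bins = [0, 5, 10, 50, 150, 500, 2500, 5000, 8000]

# Sample data from the question; 8001 falls outside the last bin.
s = pd.Series([1, 6, 11, 51, 151, 501, 2501, 5001, 8001], name='Total_Residents')

# A Series has no columns, so looking up a column name as an attribute fails.
try:
    s.Total_Residents
    raised = False
except AttributeError:
    raised = True

# Wrapping the Series in a one-column DataFrame makes the column accessible.
frame = s.to_frame()
binned = pd.cut(frame.Total_Residents, bins)
```

Values that fall outside every bin (8001 here) come back from `pd.cut` as NaN.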

user76595
  • Possible duplicate of [Pandas groupby with bin counts](https://stackoverflow.com/questions/34317149/pandas-groupby-with-bin-counts) – Yuca Sep 18 '18 at 22:40
  • I tried that and it still didn't work. – user76595 Sep 18 '18 at 22:51
  • include sample data, otherwise we can't replicate your error – Yuca Sep 18 '18 at 23:01
  • it says df is a series, change your code to `df =dataset[['Total_Residents']]` – Yuca Sep 18 '18 at 23:06
  • the sample data is just integers. any random integers would work. – user76595 Sep 19 '18 at 00:32
  • yes, so give us the code to run your random integers, we're not going to do the extra work for you – Yuca Sep 19 '18 at 00:36
  • Just use the bin values +1 for the integers. I didn't think that would have been any more work than for me to copy and paste them. My apologies, I should have just done that in the last edit. – user76595 Sep 19 '18 at 15:37

1 Answer


Figured it out. I wasn't able to convert the Series to a DataFrame, but pandas can work with a Series directly:

  bins = [0,5,10,50,150,500,2500,5000,8000]
  df.value_counts(bins=bins)

I needed to use the `value_counts` method.

The answer from the suggested duplicate would only have worked if I had another column to group the data by.
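For completeness, here is a small self-contained sketch of the same idea on the sample data from the question. The `sort=False` argument and the relabelling step with `pd.cut` are my own additions for illustration (to keep the rows in bin order and to get labels like `0-5`), and the names `counts`, `labels`, and `table` are made up for this example.

```python
import pandas as pd

bins = [0, 5, 10, 50, 150, 500, 2500, 5000, 8000]
s = pd.Series([1, 6, 11, 51, 151, 501, 2501, 5001, 8001], name='Total_Residents')

# sort=False keeps the rows in bin order rather than ordering them by count.
counts = s.value_counts(bins=bins, sort=False)
print(counts)

# Optional: relabel the intervals as "0-5", "5-10", ... to match the
# table layout asked for in the question.
labels = [f"{lo}-{hi}" for lo, hi in zip(bins[:-1], bins[1:])]
table = pd.cut(s, bins=bins, labels=labels).value_counts().sort_index()
print(table)
```

With the sample data, every bin ends up with a count of 1, and 8001 is left out because it lies above the last bin edge.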

user76595
  • It's great that this works for you, but it isn't supported by the documentation. According to https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.value_counts.html, `bins` is only supposed to be an integer, but looking at the source, the parameter is passed to `cut`, which does accept a sequence of scalars: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.cut.html. So I think your approach is safe despite the documentation conflict. – beldaz Jan 14 '20 at 20:10
  • @beldaz: or it might be a case of a legit, but underdocumented, feature in pandas? (I find such things occasionally). Have you considered raising a [pandas issue](https://github.com/pandas-dev/pandas/issues/)? – smci Feb 19 '20 at 00:48