I am working on a geospatial project where I need to do some calculations between groups of data within a data frame. The data I am using spans over several different years and specific to the Local Authority District code, each year has a numerical ID.
I need to be able to calculate the mean average of a group of years within that data set relative to the LAD code.
LAC LAN JAN FEB MAR APR MAY JUN ID
K04000001 ENGLAND AND WALES 56597 43555 49641 88049 52315 42577 5
E92000001 ENGLAND 53045 40806 46508 83504 49413 39885 5
I can use groupby to calculate the mean based on a LAC, but what I can't do is calculate the mean grouped by LAC for ID 1:3 for example.
What is more efficient, seperate in to seperate dataframes stored in an dict for example, or keep in one dataframe and use an ID?
df.groupby('LAC').mean()
I come frome a matlab background so just getting the hang of the best way to do things.
Secondly, once these operatons are complete, I would like to do the following:
(mean of id - 1:5 - mean id:6) using LAC as the key.
Sorry if I haven't explained this very well!
Edit: Expected output.
To be able to average a group of rows by specific ID for a given value of LAC.
For example: Average monthly values for E92000001 rows with ID 3
LAC JAN FEB MAR APR MAY JUN ID
K04000001, 56706 43653 49723 88153 52374 42624 5
K04000001 56597 43555 49641 88049 52315 42577 5
E92000001 49186 36947 42649 79645 45554 36026 5
E92000001 53045 40806 46508 83504 49413 39885 3
E92000001 68715 56476 62178 99174 65083 55555 4
E92000001 41075 28836 34538 71534 37443 27915 3
E92000001 54595 42356 48058 85054 50963 41435 1
Rows to be averaged:
E92000001 53045 40806 46508 83504 49413 39885 3
E92000001 41075 28836 34538 71534 37443 27915 3
Result
E92000001 47060 34821 40523 77519 43428 33900 3
edit: corrected error.