
I want to split a dataframe into several dataframes, one for each pair of consecutive values in split, so that each one holds the rows whose lon lies between those two values. The bounds change each time but are connected, for example:

dfre0 = dfres[(dfres["lon"] > split[0]) & (dfres["lon"] <= split[1])]
dfre1 = dfres[(dfres["lon"] > split[1]) & (dfres["lon"] <= split[2])]

Where the vector split is:

>>> split = np.linspace(-180.0, 180.0, num=10)
>>> split
array([-180., -140., -100.,  -60.,  -20.,   20.,   60.,  100.,  140., 180.])

The line with the for-loop is something like:

for i in range(len(split) - 1):
    dfres[(dfres["lon"] > split[i]) & (dfres["lon"] <= split[i+1])]

But how do I change the name each time?

Instead of doing it by hand each time, is there any way to do it inside a loop?
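One way to avoid a new variable name on each iteration is to store the pieces in a dict keyed by the interval index. This is only a minimal sketch: the sample rows in `dfres` are made up for illustration, and the name `dfre` for the dict is an assumption. Note that 10 values in `split` give 9 intervals, so the loop stops at `len(split) - 1`.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the real dataframe `dfres`.
dfres = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "lat": [10.0, 20.0, 30.0, 40.0],
    "lon": [-170.0, -140.0, 0.0, 180.0],
})

split = np.linspace(-180.0, 180.0, num=10)  # 10 edges -> 9 intervals

# One entry per interval (split[i], split[i + 1]]; the dict key replaces
# the numbered variable names dfre0, dfre1, ...
dfre = {}
for i in range(len(split) - 1):
    mask = (dfres["lon"] > split[i]) & (dfres["lon"] <= split[i + 1])
    dfre[i] = dfres[mask]
```

With this layout `dfre[0]` plays the role of `dfre0`, `dfre[1]` of `dfre1`, and so on. Because the lower bound is strict, a row with lon exactly equal to `split[0]` would not land in any interval.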

[Image: sample of the data, with columns name, lat, lon]

smci
LaSanton
  • Instantiate a `list` before the loop, and then append each `DataFrame` to it in the loop – Andrew Apr 17 '20 at 11:58
  • This is just *binning*. You don't need to physically split the dataframe, just use a `groupby()` on `lon`, possibly preceded by `pd.cut()`. Presumably `split` is a (sorted) vector of 9 values defining the bin values, right? We need you to show us an example of its values. – smci Apr 17 '20 at 12:00
  • Related: [Binning column with python pandas](https://stackoverflow.com/questions/45273731/binning-column-with-python-pandas) – smci Apr 17 '20 at 12:02
  • In general creating copies is a bad workflow with large dataframes. That's why the [***Split-Apply-Combine*** paradigm](https://pandas.pydata.org/pandas-docs/stable/getting_started/10min.html#grouping) was invented. Please see the tutorial. – smci Apr 17 '20 at 12:06
  • Hello, and thanks everyone for the answers! My main project is the following. I have some geospatial data concerning hotels and coffee shops. I want to grid them, like gridding a map, and then cross join each dataframe with the other. For example, in the split area [-100, -60] I get 2 dataframes: one with the coffee shops and another with the hotels in that area. Then I want to cross join them using the haversine function. The values are as shown in the image above: the 1st column is name, the 2nd is lat and the 3rd is lon. Thank you in advance! – LaSanton Apr 18 '20 at 15:55
  • **Show us the values in `split`**. Please edit that into your question. Is it a sorted list of 9 bin cutoff values? – smci Apr 19 '20 at 03:12
  • I just edited it, sir. Exactly. `split` is a sorted np.array with the values that I want to use as bins. – LaSanton Apr 19 '20 at 17:24
  • Ok. We still need your dataset, or at least a snippet of it, to make this reproducible for the rest of us. Please post the URL or a snippet of say 10 lines of data, in your question. – smci Apr 20 '20 at 06:30
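The binning approach suggested in the comments, combined with the per-area cross join described by the asker, could look roughly like the sketch below. Since no dataset was posted, the `coffees` and `hotels` rows are invented, and the helper names `add_bin` and `haversine_km`, the column name `lon_bin`, and the Earth radius of 6371 km are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real datasets were not posted.
coffees = pd.DataFrame({
    "name": ["cafe_a", "cafe_b", "cafe_c"],
    "lat": [40.0, 41.0, 48.0],
    "lon": [-75.0, -74.0, 2.0],
})
hotels = pd.DataFrame({
    "name": ["hotel_x", "hotel_y"],
    "lat": [40.5, 48.5],
    "lon": [-74.5, 2.5],
})

split = np.linspace(-180.0, 180.0, num=10)  # bin edges from the question


def add_bin(df, edges):
    """Return a copy of df with a 'lon_bin' column of interval indices."""
    out = df.copy()
    # labels=False yields integer bin indices; intervals are (left, right],
    # and include_lowest=True also keeps rows with lon == edges[0].
    out["lon_bin"] = pd.cut(out["lon"], bins=edges, labels=False,
                            include_lowest=True)
    return out


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * np.arcsin(np.sqrt(a))


# Merging on the bin column pairs every coffee shop with every hotel in the
# same longitude bin, i.e. a cross join restricted to each bin.
pairs = add_bin(coffees, split).merge(
    add_bin(hotels, split), on="lon_bin", suffixes=("_coffee", "_hotel")
)
pairs["dist_km"] = haversine_km(
    pairs["lat_coffee"], pairs["lon_coffee"],
    pairs["lat_hotel"], pairs["lon_hotel"],
)
```

No separate dataframe per bin is created here; if one bin at a time is needed, iterating over `add_bin(coffees, split).groupby("lon_bin")` gives the same pieces. One limitation of binning on lon alone is that a coffee shop and a hotel sitting close together on opposite sides of a bin edge are never paired.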

0 Answers