Firstly - apologies but I am unable to reproduce this error using code. I will try and describe it as best as possible using screenshots of the data and errors.
I've got a large dataframe indexed by 'Year' and 'Season', with values for latitude, longitude and Rainfall, plus some others, which looks like this:
This is organised to respect the annual sequence of 'Winter', 'Spring', 'Summer', 'Autumn' (numbered 1 to 4 in the Season column), and I need to keep this sequence after conversion to an xarray Dataset too. But if I try to convert straight to a Dataset, I get an error:
future = future.to_xarray()
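I can't share the real data, but I believe the failure comes from repeated index entries. Here is a minimal sketch with made-up values (the `toy` frame and the `keys_are_unique` helper are hypothetical, not part of my actual code) that checks which column combinations identify a row uniquely:

```python
import pandas as pd

# Made-up stand-in for the real data: two grid points, two months each.
toy = pd.DataFrame(
    {
        "Year": [1970, 1970, 1970, 1970],
        "Season": ["1", "1", "1", "1"],
        "latitude": [51.12, 51.12, 51.37, 51.37],
        "longitude": [-10.88, -10.88, -10.88, -10.88],
        "time": pd.to_datetime(
            ["1970-01-31", "1970-02-28", "1970-01-31", "1970-02-28"]
        ),
        "tg": [5.1, 5.3, 4.9, 5.0],
    }
)


def keys_are_unique(frame, keys):
    """Return True if no combination of the key columns repeats."""
    return not frame.duplicated(subset=keys).any()


# (Year, Season) repeats for every grid point and month.
unique_by_year_season = keys_are_unique(toy, ["Year", "Season"])
# (latitude, longitude) repeats for every time step.
unique_by_lat_lon = keys_are_unique(toy, ["latitude", "longitude"])
# Adding time makes each row distinct in this toy example.
unique_by_lat_lon_time = keys_are_unique(toy, ["latitude", "longitude", "time"])

print(unique_by_year_season, unique_by_lat_lon, unique_by_lat_lon_time)
```

In this toy case only the last combination is unique, which matches what I see on the real dataframe.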
So it is clear I need to index by unique identifiers. I tried using just lat and lon, but this gives the same error (as there are duplicates). Resetting the index and then setting a new index on lat, lon and time, like so:
future = future.reset_index()
future.head()
future.set_index(['latitude', 'longitude', 'time'], inplace=True)
future.head()
allows the conversion to work:
future = future.to_xarray()
The problem is that this has now lost its annual sequencing: you can see from the Season variable in the dataset that it starts at '1', '1', '1' for the first 3 months of the year but then jumps to '3', '3', '3', meaning we're going from winter to summer and skipping spring.
This is only the case after re-indexing the dataframe, but I can't convert it to a Dataset without re-indexing, and I can't seem to re-index without disrupting the annual sequence. Is there some way to fix this?
I hope this is clear and the error is illustrated enough for someone to be able to help!
EDIT: I think the issue here is that when it indexes by date it automatically orders the dates chronologically (e.g. 1952 follows 1951 etc.). I don't want this: I want it to maintain the sequence in the initial dataframe, which is organised seasonally but could have a spring from 1955 followed by a summer from 2000 followed by an autumn from 1976. I need to retain this sequence.
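To illustrate the reordering, here is a small sketch with made-up rows (not my real data; Season is stored as integers here for simplicity, and xarray must be installed for `to_xarray` to work). As far as I can tell, the coordinate values are taken from the sorted index levels, so the original row order is not kept:

```python
import pandas as pd

# Made-up rows in a deliberately non-chronological seasonal sequence:
# a spring from 1955, then a summer from 2000, then an autumn from 1976.
toy = pd.DataFrame(
    {
        "latitude": [51.12, 51.12, 51.12],
        "longitude": [-10.88, -10.88, -10.88],
        "time": pd.to_datetime(["1955-04-30", "2000-07-31", "1976-10-31"]),
        "Season": [2, 3, 4],
    }
)

# Same indexing as in the question, then the conversion (uses xarray).
ds = toy.set_index(["latitude", "longitude", "time"]).to_xarray()

original_order = [int(s) for s in toy["Season"]]
converted_order = [int(s) for s in ds["Season"].values.ravel()]

# The time coordinate comes back in chronological order.
time_is_sorted = bool(pd.DatetimeIndex(ds["time"].values).is_monotonic_increasing)

print(original_order)   # [2, 3, 4]
print(converted_order)  # [2, 4, 3]
print(time_is_sorted)   # True
```

The dataframe rows are still in my intended sequence after `set_index`; the order only changes in the resulting Dataset.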
EDIT 2:
When I set 'Year' as the index, or just keep the generic index, the conversion works, but I need the tg variable to have lat/lon associated with it, so that the dataset looks like this:
<xarray.Dataset>
Dimensions:    (Year: 190080)
Coordinates:
  * Year       (Year) int64 1970 1970 1970 1970 1970 1970 1970 1970 1970 ...
Data variables:
    Season     (Year) object '1' '1' '2' '2' '2' '3' '3' '3' '4' '4' '4' '1' ...
    latitude   (Year) float64 51.12 51.12 51.12 51.12 51.12 51.12 51.12 ...
    longitude  (Year) float64 -10.88 -10.88 -10.88 -10.88 -10.88 -10.88 ...
    seasdif    (Year) float32 -0.79192877 -0.79192877 -0.55932236 ...
    tg         (Year, latitude, longitude) float32 nan nan nan nan nan nan nan nan nan nan nan ...
    time       (Year) datetime64[ns] 1970-01-31 1970-02-28 1970-03-31 ...