4

I have the following array:

array([1963, 1964, 1965, 1966, 1967, 1968, 1969, 1970, 1971,
       1972, 1973, 1974, 1975, 1976, 1977, 1978, 1979, 1980, 1981, 1982,
       1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993,
       1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004,
       2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013], dtype=object)

I would like to divide it into separate lists, one for each decade (e.g. 1970-1979 is one decade).

Right now, I am looping through the years and dividing into separate lists. Is there a more pythonic way to go about it?

TigerhawkT3
user308827

3 Answers

8

You can use itertools.groupby with a key that floor-divides each year by 10 (i // 10), so that all years in the same decade share one key. Wrap the groups in a list comprehension to build a new array.

>>> import numpy as np
>>> from itertools import groupby
>>> np.array([list(g) for k,g in groupby(a, lambda i: i // 10)])
array([[1963, 1964, 1965, 1966, 1967, 1968, 1969],
       [1970, 1971, 1972, 1973, 1974, 1975, 1976, 1977, 1978, 1979],
       [1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989],
       [1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999],
       [2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009],
       [2010, 2011, 2012, 2013]], dtype=object)

Note that groupby only merges consecutive items that share a key, so the sequence must already be sorted (which your data appears to be).
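
A variant of the same idea that returns plain Python lists may be useful, since recent NumPy releases refuse to build an array from lists of unequal length unless dtype=object is passed. This is only a sketch: it assumes the years are integers, the name `group_by_decade` is made up for illustration, and the `sorted` call is there just to satisfy groupby's ordering requirement.

```python
from itertools import groupby


def group_by_decade(years):
    """Return a dict mapping each decade's first year to its list of years."""
    # groupby only merges neighbouring items, so sort first.
    ordered = sorted(int(y) for y in years)
    return {decade * 10: list(group)
            for decade, group in groupby(ordered, key=lambda y: y // 10)}


decades = group_by_decade(range(1963, 2014))
print(decades[1960])  # [1963, 1964, 1965, 1966, 1967, 1968, 1969]
print(list(decades))  # [1960, 1970, 1980, 1990, 2000, 2010]
```

Keying the result by decade avoids having to remember which position in the outer list belongs to which decade.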

Cory Kramer
7

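A simpler approach is to round each year down to the start of its decade with np.floor. Note that this yields the distinct decade labels rather than the grouped years: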

    import numpy as np
    years = np.array([1963, 1964, 1965, 1966, 1967, 1968, 1969, 1970, 1971,
          1972, 1973, 1974, 1975, 1976, 1977, 1978, 1979, 1980, 1981, 1982,
          1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993,
          1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004,
          2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013])

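    # Round each year down to the start of its decade (e.g. 1963 -> 1960).
    decades = []
    for each in years:
        decade = int(np.floor(each / 10) * 10)
        decades.append(decade)

    # A set has no guaranteed order, so sort the distinct decades before printing.
    print(sorted(set(decades)))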
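
The loop above computes decade labels one year at a time. A possible vectorized sketch, which also splits the years into one sub-array per decade as the question asks, is shown below. It assumes an integer array (an object array like the one in the question could be converted with `years.astype(int)` first), and the names `decade_of` and `by_decade` are illustrative only.

```python
import numpy as np

years = np.arange(1963, 2014)

# Integer floor division is vectorized, so no Python loop is needed.
decade_of = years // 10 * 10

# One sub-array per distinct decade, keyed by the decade's first year.
by_decade = {int(d): years[decade_of == d] for d in np.unique(decade_of)}

print(sorted(by_decade))         # [1960, 1970, 1980, 1990, 2000, 2010]
print(by_decade[2010].tolist())  # [2010, 2011, 2012, 2013]
```

Unlike itertools.groupby, the boolean mask does not depend on the input being sorted.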
Raghav Gurung
0

For a pandas DataFrame, a list comprehension keeps the code short:

df["decade"] = [int(np.floor(year / 10) * 10) for year in np.array(df["year"])]
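
The same column could probably be computed without a Python-level loop by using integer floor division on the whole column. The sketch below assumes a frame with an integer "year" column. The sample data and the name `years_by_decade` are hypothetical.

```python
import pandas as pd

# Hypothetical frame with a "year" column, as in the line above.
df = pd.DataFrame({"year": list(range(1963, 2014))})

# Floor-divide the whole column at once, then scale back up.
df["decade"] = df["year"] // 10 * 10

# Collect the years of each decade into a list, indexed by decade.
years_by_decade = df.groupby("decade")["year"].apply(list)

print(years_by_decade.loc[2010])  # [2010, 2011, 2012, 2013]
```

Grouping on the new column then gives the per-decade lists the question asks for.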
Adrian Mole