I am trying to group certain rows from an array called "frog". The array looks something like this:

285,1944,10,12,579
286,1944,11,13,540
287,1944,12,14,550
285,1945,10,12,536
286,1945,11,13,504
287,1945,12,14,508
285,1946,10,12,522
286,1946,11,13,490
287,1946,12,14,486

The columns are "Day of the Year", Year, Month, "Day of Month", and money. I want to group together all rows that share the same "Day of the Year", Month, and "Day of Month", so there are three constraints. An example output array would be something like:

285,1944,10,12,579 
285,1945,10,12,536
285,1946,10,12,522

I am unsure how to go about this. Is there possibly a faster way than using a while loop or for loop in this situation? Please let me know if you would like me to explain more.

Thanks

BBHuggin
  • Is your frog array 1 or 2d and is your output just an ordered grouping based on the "Day of the Year" and "Day of the Month" indexes? – Red Shift Feb 25 '15 at 01:21

2 Answers

Python has a sort function which takes a key function, which can be arbitrarily defined. In this case, we can define a simple function, or even a lambda to do what we want.

However, as @Vasif mentions, there will be issues with leap years: day 285 is October 12 in a common year but October 11 in a leap year, which makes it trickier to require that 3-tuple as a constraint...
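
As a quick check of that leap-year shift, here is a minimal sketch using only the standard `datetime` module. The helper name `day_of_year_to_date` is made up for illustration:

```python
import datetime as dt

def day_of_year_to_date(year, day_of_year):
    # Hypothetical helper: day 1 is January 1, so add (day_of_year - 1) days.
    return dt.date(year, 1, 1) + dt.timedelta(days=day_of_year - 1)

print(day_of_year_to_date(1944, 285))  # leap year   -> 1944-10-11
print(day_of_year_to_date(1945, 285))  # common year -> 1945-10-12
```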

In any event:

# let's assume you've read in your file with something like csvreader
# so you've got a list of lists, similar to what @Vasif shows
sorted_a = sorted(a, key=lambda row: (row[0], row[2], row[3]))

This will create a new array, where everything is ordered first by "Day of Year" (so all the 285's will be together), then by "Month", then by "Day".
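
If you want the rows collected into actual groups rather than just sitting next to each other, one possible follow-up is `itertools.groupby` on the sorted list. This is only a sketch: `frog` is a small sample copied from the question, and `group_key` and `groups` are names I made up.

```python
from itertools import groupby

# Sample rows from the question: [day_of_year, year, month, day_of_month, money]
frog = [
    [285, 1944, 10, 12, 579], [286, 1944, 11, 13, 540], [287, 1944, 12, 14, 550],
    [285, 1945, 10, 12, 536], [286, 1945, 11, 13, 504], [287, 1945, 12, 14, 508],
    [285, 1946, 10, 12, 522], [286, 1946, 11, 13, 490], [287, 1946, 12, 14, 486],
]

def group_key(row):
    # The three constraints: day of year, month, day of month.
    return (row[0], row[2], row[3])

# groupby only merges adjacent rows, so the list must be sorted by the same key first.
sorted_rows = sorted(frog, key=group_key)
groups = {key: list(rows) for key, rows in groupby(sorted_rows, key=group_key)}

print(groups[(285, 10, 12)])
# [[285, 1944, 10, 12, 579], [285, 1945, 10, 12, 536], [285, 1946, 10, 12, 522]]
```

Because `sorted` is stable, rows inside each group keep their original (here, chronological) order.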

For completeness, we can operate on the array in place:

a.sort(key=lambda row: (row[0], row[2], row[3]))

And for more complex things (not necessary here, but may be nice to see):

def keyfunc(row):
    # could do anything you want with more complex data:
    # maybe row[0] is an index into a database that you query, or 
    # a URL that you request the page of, parse, and process somehow, etc...
    return (row[0], row[2], row[3])

sorted_a = sorted(a, key=keyfunc)
## or again:
a.sort(key=keyfunc)
dwanderson

I am giving you a solution below. However, I'm not sure about your expected output: 1944 is a leap year, so day 285 of 1944 is October 11, not October 12.

import datetime as dt

a = [[285,1944,10,12,579],
[286,1944,11,13,540],
[287,1944,12,14,550],
[285,1945,10,12,536],
[286,1945,11,13,504],
[287,1945,12,14,508],
[285,1946,10,12,522],
[286,1946,11,13,490],
[287,1946,12,14,486]]

def solution(frog):
    # Keep only the rows whose month and day match their day of the year.
    goodlist = []
    for row in frog:
        if isGood(row):
            print(row)
            goodlist.append(row)
        else:
            print(row, 'rejected')
    return goodlist


def isGood(row):
    days, year, month, day, money = row
    # http://stackoverflow.com/questions/2427555/python-question-year-and-day-of-year-to-date
    # Day 1 is January 1, so add (days - 1) days to the start of the year.
    date = dt.datetime(year, 1, 1) + dt.timedelta(days - 1)
    return date.month == month and date.day == day


# print(isGood([285, 1944, 10, 12, 579]))
print(solution(a))
Vasif
  • I accounted for leap years by deleting all the 2-29 rows in my data set. My data set is a large 2D array that has Index (Day of the Year), Year, Month, Day, and Money as headers. What I need to do with the sorted data is take the average of each day's money output across the years. So the money output for every October 5th within my 100-plus years of data will need to be averaged. In the end I am looking for 365 averages of money output, one for each day, over the course of 100 years. Sorry if I made things more confusing. – BBHuggin Feb 25 '15 at 15:22
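
Given that goal, grouping on (Month, Day) alone may be enough, since a calendar date is the same every year regardless of leap years. A rough sketch in plain Python follows; `rows` is sample data copied from the question, and `totals` and `averages` are illustrative names, not anything from the original code.

```python
from collections import defaultdict

# Sample rows from the question: [day_of_year, year, month, day_of_month, money]
rows = [
    [285, 1944, 10, 12, 579], [286, 1944, 11, 13, 540], [287, 1944, 12, 14, 550],
    [285, 1945, 10, 12, 536], [286, 1945, 11, 13, 504], [287, 1945, 12, 14, 508],
    [285, 1946, 10, 12, 522], [286, 1946, 11, 13, 490], [287, 1946, 12, 14, 486],
]

# (month, day) -> [running sum of money, number of rows seen]
totals = defaultdict(lambda: [0, 0])
for _day_of_year, _year, month, day, money in rows:
    totals[(month, day)][0] += money
    totals[(month, day)][1] += 1

# One average per calendar day, taken across all years.
averages = {key: total / count for key, (total, count) in totals.items()}

for key in sorted(averages):
    print(key, round(averages[key], 2))
```

This makes a single pass over the data and needs no sorting. With the Feb 29 rows removed, a full data set would yield 365 keys.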