I am trying to build an efficient algorithm that can process thousands of rows of data that contains zip codes of customers. I would then want to cross check those zip codes against a grouping of around 1000 zip codes, but I have about 100 columns of 1000 zip codes. A lot of these zip codes are consecutive numbers, but there is also a lot of random zip codes thrown in there. So what I would like to do is group consecutive zip codes together that I can then just check to see if the zip code falls within that range instead of checking it against every single zip code.
Sample data -
90001
90002
90003
90004
90005
90006
90007
90008
90009
90010
90012
90022
90031
90032
90033
90034
90041
This should be grouped as follows:
{ 90001-90010, 90012, 90022, 90031-90034, 90041 }
Here's my idea for the algorithm:
public struct gRange {
public int start, end;
public gRange(int a, int b) {
start = a;
if(b != null) end = b;
else end = a;
}
}
function groupZips(string[] zips){
List<gRange> zipList = new List<gRange>();
int currZip, prevZip, startRange, endRange;
startRange = 0;
bool inRange = false;
for(int i = 1; i < zips.length; i++) {
currZip = Convert.ToInt32(zips[i]);
prevZip = Convert.ToInt32(zips[i-1]);
if(currZip - prevZip == 1 && inRange == false) {
inRange = true;
startRange = prevZip;
continue;
}
else if(currZip - prevZip == 1 && inRange == true) continue;
else if(currZip - prevZip != 1 && inRange == true) {
inRange = false;
endRange = prevZip;
zipList.add(new gRange(startRange, endRange));
continue;
}
else if(currZip - prevZip != 1 && inRange == false) {
zipList.add(new gRange(prevZip, prevZip));
}
//not sure how to handle the last case when i == zips.length-1
}
}
So as of now, I am unsure of how to handle the last case, but looking at this algorithm, it doesn't strike me as efficient. Is there a better/easier way to be sorting a group of numbers like this?