Given two arrays; $births
containing a list of birth years indicating when someone was born, and $deaths
containing a list of death years indicating when someone died, how can we find the year on which the population was highest?
For example given the following arrays:
$births = [1984, 1981, 1984, 1991, 1996];
$deaths = [1991, 1984];
The year on which the population was highest should be 1996
, because 3
people were alive during that year, which was the highest population count of all those years.
Here's the running math on that:
| Birth | Death | Population | |-------|-------|------------| | 1981 | | 1 | | 1984 | | 2 | | 1984 | 1984 | 2 | | 1991 | 1991 | 2 | | 1996 | | 3 |
Assumptions
We can safely assume that the year on which someone is born the population can increase by one and the year on which someone died the population can decrease by one. So in this example, 2 people were born on 1984 and 1 person died on 1984, meaning the population increased by 1 on that year.
We can also safely assume that the number of deaths will never exceed the number of births and that no death can occur when the population is at 0.
We can also safely assume that the years in both $deaths
and $births
will never be negative or floating point values (they're always positive integers greater than 0).
We cannot assume that the arrays will be sorted or that there won't be duplicate values, however.
Requirements
We must write a function to return the year on which the highest population occurred, given these two arrays as input. The function may return 0
, false
, ""
, or NULL
(any falsey value is acceptable) if the input arrays are empty or if the population was always at 0 throughout. If the highest population occurred on multiple years the function may return the first year on which the highest population was reached or any subsequent year.
For example:
$births = [1997, 1997, 1997, 1998, 1999];
$deaths = [1998, 1999];
/* The highest population was 3 on 1997, 1998 and 1999, either answer is correct */
Additionally, including the Big O of the solution would be helpful.
My best attempt at doing this would be the following:
function highestPopulationYear(Array $births, Array $deaths): Int {
sort($births);
sort($deaths);
$nextBirthYear = reset($births);
$nextDeathYear = reset($deaths);
$years = [];
if ($nextBirthYear) {
$years[] = $nextBirthYear;
}
if ($nextDeathYear) {
$years[] = $nextDeathYear;
}
if ($years) {
$currentYear = max(0, ...$years);
} else {
$currentYear = 0;
}
$maxYear = $maxPopulation = $currentPopulation = 0;
while(current($births) !== false || current($deaths) !== false || $years) {
while($currentYear === $nextBirthYear) {
$currentPopulation++;
$nextBirthYear = next($births);
}
while($currentYear === $nextDeathYear) {
$currentPopulation--;
$nextDeathYear = next($deaths);
}
if ($currentPopulation >= $maxPopulation) {
$maxPopulation = $currentPopulation;
$maxYear = $currentYear;
}
$years = [];
if ($nextBirthYear) {
$years[] = $nextBirthYear;
}
if ($nextDeathYear) {
$years[] = $nextDeathYear;
}
if ($years) {
$currentYear = min($years);
} else {
$currentYear = 0;
}
}
return $maxYear;
}
The algorithm above should work in polynomial time given it is at worst O(((n log n) * 2) + k)
where n
is number of elements to be sorted from each array and k
is number of birth years (since we know that k
is always k >= y
) where y
is number of death years. However, I'm not sure if there is a more efficient solution.
My interests are purely in an improved Big O of computational complexity upon the existing algorithm. Memory complexity is of no concern. Nor is the runtime optimization. At least it's not a primary concern. Any minor/major runtime optimizations are welcome, but not the key factor here.