I have run into this phenomenon many times before, but no example has demonstrated my point as clearly as the following:
The source code for pwhois has two functions:
w_lookup_all_pwhois(whois_session_params * wsess, char *addr)
w_lookup_as_pwhois(whois_session_params * wsess, char *addr)
These are 40-to-60-line functions, and are identical except that the ...as_pwhois function only saves the "origin-as" field, whereas ...all_pwhois saves all fields.
If I were writing these, I would instead write a single function with an extra parameter that says whether to fetch all fields or just one. Depending on the application, this parameter could even be a set of flags selecting which fields to fetch. One advantage is that someone reading the code from scratch would not have to read two near-identical functions (which are not adjacent in the code) to figure out that they do almost exactly the same thing. Another is that, when changing the shared behavior, I would not have to visit every one of the related functions and repeat the change there. The disadvantage is a more complicated function, which seems minor by comparison.
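To make the idea concrete, here is a minimal sketch of the single-function version. It is not the pwhois code: the FIELD_* selector constants, the lookup_result struct, the parse_response name, and the "key: value" response format (including the "prefix" and "as-path" keys) are all assumptions made up for illustration. Only the "origin-as" field name comes from the functions above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical field selectors; these names are not from the pwhois source. */
enum {
    FIELD_ORIGIN_AS = 1u << 0,
    FIELD_PREFIX    = 1u << 1,
    FIELD_AS_PATH   = 1u << 2,
    FIELD_ALL       = FIELD_ORIGIN_AS | FIELD_PREFIX | FIELD_AS_PATH
};

/* Hypothetical result record standing in for whatever the real code fills. */
struct lookup_result {
    char origin_as[32];
    char prefix[32];
    char as_path[128];
};

/* Copy value into dst, truncating if needed; dst is always NUL-terminated. */
static void save_field(char *dst, size_t dstlen, const char *value)
{
    snprintf(dst, dstlen, "%s", value);
}

/*
 * Parse a response made of "key: value" lines (an assumed format) and save
 * only the fields selected in the `fields` bit mask.
 * Returns the number of fields saved.
 */
int parse_response(const char *response, unsigned fields,
                   struct lookup_result *out)
{
    int saved = 0;
    const char *line = response;

    assert(response != NULL && out != NULL);
    memset(out, 0, sizeof *out);

    while (*line != '\0') {
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t)(end - line) : strlen(line);
        char buf[256];
        char *value;

        /* Work on a bounded, NUL-terminated copy of the current line. */
        if (len >= sizeof buf)
            len = sizeof buf - 1;
        memcpy(buf, line, len);
        buf[len] = '\0';

        /* Split "key: value" in place; lines without ": " are skipped. */
        value = strstr(buf, ": ");
        if (value != NULL) {
            *value = '\0';
            value += 2;
            if ((fields & FIELD_ORIGIN_AS) && strcmp(buf, "origin-as") == 0) {
                save_field(out->origin_as, sizeof out->origin_as, value);
                saved++;
            } else if ((fields & FIELD_PREFIX) && strcmp(buf, "prefix") == 0) {
                save_field(out->prefix, sizeof out->prefix, value);
                saved++;
            } else if ((fields & FIELD_AS_PATH) && strcmp(buf, "as-path") == 0) {
                save_field(out->as_path, sizeof out->as_path, value);
                saved++;
            }
        }

        if (end == NULL)
            break;
        line = end + 1;
    }
    return saved;
}
```

With something like this, the two existing entry points could presumably remain as one-line wrappers that pass FIELD_ALL or FIELD_ORIGIN_AS, so callers would not need to change while the parsing logic lives in one place.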
However, most developers in my company seem to prefer the multi-function approach, and judging by the open-source code available on the internet, the same seems to be true everywhere else. As a result, pwhois has on the order of 50 functions, and I have to remember which one does what, when 10 multi-purpose functions could easily do the job. What am I missing that makes the 50-function approach preferable? Is there a way to read source code from scratch that avoids reading these very similar functions more than once? (Since the functions are not adjacent in the code, I would guess there may be some "standard" comment file that I have not run into yet.)