148

We all know that premature optimization is the root of all evil because it leads to unreadable/unmaintainable code. Even worse is pessimization, when someone implements an "optimization" because they think it will be faster, but it ends up being slower, as well as being buggy, unmaintainable, etc. What is the most ridiculous example of this that you've seen?

Mark Harrison
  • 297,451
  • 125
  • 333
  • 465
dsimcha
  • 67,514
  • 53
  • 213
  • 334

42 Answers

207

I think the phrase "premature optimization is the root of all evil" is way, way over used. For many projects, it has become an excuse not to take performance into account until late in a project.

This phrase is often a crutch for people to avoid work. I see this phrase used when people should really say "Gee, we really didn't think of that up front and don't have time to deal with it now".

I've seen many more "ridiculous" examples of dumb performance problems than examples of problems introduced by "pessimization":

  • Reading the same registry key thousands (or tens of thousands) of times during program launch.
  • Loading the same DLL hundreds or thousands of times.
  • Wasting megabytes of memory by needlessly keeping full paths to files.
  • Not packing data structures, so they take up far more memory than they need.
  • Sizing all strings that store file names or paths to MAX_PATH.
  • Gratuitous polling for things that have events, callbacks, or other notification mechanisms.

What I think is a better statement is this: "optimization without measuring and understanding isn't optimization at all - it's just random change".

Good performance work is time consuming - often more so than the development of the feature or component itself.
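Several of the items above reduce to the same fix: do the expensive lookup once and cache it. A minimal sketch in Java, where `loadSetting` is a hypothetical stand-in for whatever the expensive operation is (a registry read, a DLL load, a file parse):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SettingsCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    // Hypothetical expensive lookup (registry read, DLL load, file parse...).
    private String loadSetting(String key) {
        return "value-for-" + key;
    }

    // Hits the expensive path at most once per key; every later call
    // is a cheap map lookup.
    public String get(String key) {
        return cache.computeIfAbsent(key, this::loadSetting);
    }

    public static void main(String[] args) {
        SettingsCache settings = new SettingsCache();
        for (int i = 0; i < 10_000; i++) {
            settings.get("install.path"); // one real read, 9,999 cache hits
        }
        System.out.println(settings.get("install.path"));
    }
}
```

The point is that this is not an "optimization" in the scary sense - it is a one-line structural fix that removes thousands of redundant reads.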

simon
  • 12,666
  • 26
  • 78
  • 113
Foredecker
  • 7,395
  • 4
  • 29
  • 30
  • Strange that this was voted into negative territory. Informative post, anyhow. – Joseph Ferris Nov 23 '08 at 03:27
  • 46
    "Premature" is the key word of that quote. Your rephrasing it to "optimization without measuring and understanding" doesn't seem to change the meaning one bit. That is precisely what Knuth meant. – Bill the Lizard Nov 23 '08 at 03:52
  • Re: packing data structures... I knew a guy who always put `#pragma pack 1` at the head of each source file. It eventually bit him on some platform that required alignment for DWORD values when he passed a pointer to a DWORD struct member and got random looking values instead. – Hasturkun Nov 23 '08 at 12:08
  • 13
    @Foredecker: right on. Too many people forget the context, which puts that quote solidly against *micro*-optimization. Analyzing a problem to pick the proper algorithm before implementing it isn't premature, yet too often that quote gets thrown up to justify the laziest, most inefficient solution. – Shog9 Nov 23 '08 at 16:30
  • HI Bill-the-lizard :) I completely understand what you mean - this is what Knuth meant. But the phrase itself is WAY over used. – Foredecker Nov 23 '08 at 22:21
  • Typo: "poling" instead of "polling" sorry, can't help myself :) +1 anyway – MarkJ Feb 28 '09 at 19:40
  • +1 One example from some code I have to work on - reading exactly the same value from the database 100s of times during ONE operation. I'm gradually working my way through trying to refactor it, but it's tough going. – Mark Pim Mar 26 '09 at 09:14
  • 2
    @Foredecker: I understand what you mean about the phrase being overused. If you're saying it out loud, it's probably a sign that you need to measure your code. :) – Bill the Lizard Mar 26 '09 at 16:16
  • 5
    It really depends on the individual case, there are more cases where premature optimization becomes a problem, than inadequate optimization planning becomes a problem – Mark Rogers Mar 26 '09 at 17:40
  • 13
    -1: There's a difference between "optimization" and proper design. For those who can't tell, a good rule of thumb is that an "optimization" makes the code tougher to read, but faster or more efficient. A better design will make the code easier to read (or at least no worse) *and* more efficient. – T.E.D. Mar 26 '09 at 20:43
  • 1
    I completely agree. I reported a bug to Microsoft where a .NET library did some interop and called LoadLibrary hundreds and hundreds of times when it only needed about 30 calls. I wrote similar code that ran in seconds instead of minutes. MS did not employ this fix. :) – BobbyShaftoe Mar 27 '09 at 06:28
  • Hi Tid :) I completely disagree - there is nothing inherent in fixing performance bugs (what may be called optimizing) that makes the code harder to read. – Foredecker Mar 27 '09 at 18:39
  • 5
    If it's way overused, then the population asking questions on SO is heavily weighted toward the outliers. :D – dkretz Mar 27 '09 at 21:41
  • 1
    It's way overused, and I'm sick of that phrase too! I was just thinking that yesterday! – JAL Mar 29 '09 at 06:54
  • "Wasting mega bytes of memory by keeping full paths to files needlessly" is so funny in that context. Unintentionally though, I guess? – Dun3 Apr 01 '09 at 16:07
  • I see it used to justify complete wastes of CPU all the time. – doug65536 Feb 14 '14 at 20:38
  • It is way overused because it's a selective quotation. You need to read the entire paragraph, and the context. For example, it doesn't include choice of algorithm, database index design, ... – user207421 Jan 31 '16 at 23:53
  • Come on, you guys. It's "*love of* premature optimization is the root of all evil". – Don Hatch Aug 04 '17 at 07:23
114

Databases are pessimization playland.

Favorites include:

  • Split a table into multiples (by date range, alphabetic range, etc.) because it's "too big".
  • Create an archive table for retired records, but continue to UNION it with the production table.
  • Duplicate entire databases by (division/customer/product/etc.)
  • Resist adding columns to an index because it makes it too big.
  • Create lots of summary tables because recalculating from raw data is too slow.
  • Create columns with subfields to save space.
  • Denormalize into fields-as-an-array.

That's off the top of my head.

JYelton
  • 35,664
  • 27
  • 132
  • 191
dkretz
  • 37,399
  • 13
  • 80
  • 138
  • Resisting the need to index is just painful to think about. – Bill the Lizard Nov 23 '08 at 02:58
  • 2
    Yes, I know someone who works for a big US oil company where nearly all their tables have an associated archive table, and most queries select from views that UNION the pairs of tables. Performance is as you would expect! – Tony Andrews Mar 26 '09 at 13:32
  • Hah, I think every DBA must have gone down the union with archive table route at some point. It always seem *so* reasonable at the time. – Cruachan Mar 26 '09 at 19:39
  • 3
    I add: divide the database in several different databases (customers a-c, customers d-f, etc.) – friol May 01 '09 at 08:13
  • Could you elaborate on "Denormalize into fields-as-an-array."? What do you mean here? – Bart van Heukelom Sep 27 '11 at 09:10
  • Fields-as-an-array - adding columns to a table for, say, each month of the year for monthly activity; or columns for large, medium, and small for sizes; or red, blue, green for colors. Normalization requires a new child table for these attributes. – dkretz Sep 27 '11 at 18:56
  • @le dorfier it now comes to my mind that usually the fields-as-an-array idea is implemented without adding columns. They just encode the array into the existing column by constructing huge ever-growing XML strings and storing that in the column. Sigh. A year later, they invest in a hardware XML cruncher box. This will never end. – TheBlastOne Jun 17 '13 at 15:14
87

I think there is no absolute rule: some things are best optimized upfront, and some are not.

For example, I worked in a company where we received data packets from satellites. Each packet cost a lot of money, so all the data was highly optimized (i.e., packed). For example, latitude/longitude was not sent as absolute values (floats), but as offsets relative to the "north-west" corner of a "current" zone. We had to unpack all the data before it could be used. But I think this is not pessimization, it is intelligent optimization to reduce communication costs.

On the other hand, our software architects decided that the unpacked data should be formatted into a very readable XML document, and stored in our database as such (as opposed to having each field stored in a corresponding column). Their idea was that "XML is the future", "disk space is cheap", and "processors are cheap", so there was no need to optimize anything. The result was that our 16-byte packets were turned into 2 kB documents stored in one column, and even for simple queries we had to load megabytes of XML documents into memory! We received over 50 packets per second, so you can imagine how horrible the performance became (BTW, the company went bankrupt).

So again, there is no absolute rule. Yes, sometimes optimization too early is a mistake. But sometimes the "cpu/disk space/memory is cheap" motto is the real root of all evil.
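The offset trick is easy to sketch. Assuming (hypothetically) a known zone corner and a fixed resolution - the real protocol's values are not given here - two floating-point coordinates shrink to a pair of 16-bit deltas packed into a single 32-bit word:

```java
public class PackedPosition {
    // Hypothetical zone corner and resolution; a real protocol would
    // negotiate these per "current" zone.
    static final double ZONE_LAT = 48.0, ZONE_LON = 2.0;
    static final double SCALE = 1e-4; // about 11 m per step

    // Pack an absolute position into two 16-bit offsets:
    // 4 bytes on the wire instead of 16 for two doubles.
    static int pack(double lat, double lon) {
        int dLat = (int) Math.round((lat - ZONE_LAT) / SCALE);
        int dLon = (int) Math.round((lon - ZONE_LON) / SCALE);
        return (dLat & 0xFFFF) << 16 | (dLon & 0xFFFF);
    }

    // Unpack back to absolute coordinates; the (short) casts
    // sign-extend so offsets can be negative.
    static double[] unpack(int packed) {
        short dLat = (short) (packed >>> 16);
        short dLon = (short) (packed & 0xFFFF);
        return new double[] { ZONE_LAT + dLat * SCALE, ZONE_LON + dLon * SCALE };
    }

    public static void main(String[] args) {
        double[] pos = unpack(pack(48.8584, 2.2945));
        System.out.printf("%.4f %.4f%n", pos[0], pos[1]);
    }
}
```

The unpacking step is exactly the kind of code that looks like "premature optimization" out of context, and is entirely justified when every byte transmitted costs money.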

MiniQuark
  • 46,633
  • 36
  • 147
  • 183
81

On an old project we inherited some (otherwise excellent) embedded systems programmers who had massive Z-8000 experience.

Our new environment was 32-bit Sparc Solaris.

One of the guys went and changed all ints to shorts to speed up our code, since grabbing 16 bits from RAM was quicker than grabbing 32 bits.

I had to write a demo program to show that grabbing 32-bit values on a 32-bit system was faster than grabbing 16-bit values, and explain that to grab a 16-bit value the CPU had to make a 32-bit wide memory access and then mask out or shift the bits not needed for the 16-bit value.
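The extra work can be mimicked in code: on a word-oriented 32-bit machine, fetching a 16-bit value means fetching the enclosing 32-bit word and then shifting and masking, while fetching a 32-bit value is just the load. A sketch of that extraction step (the word value is arbitrary):

```java
public class HalfwordFetch {
    // A 32-bit "memory word" holding two 16-bit values.
    static int word = 0xBEEF_CAFE;

    // 32-bit access: the load is the whole operation.
    static int load32() {
        return word;
    }

    // 16-bit access on word-oriented hardware: load the full word,
    // then shift and mask to isolate the halfword -- extra steps,
    // which is why the shorts were slower, not faster.
    static int load16(int halfIndex) {
        return (word >>> (16 * halfIndex)) & 0xFFFF;
    }

    public static void main(String[] args) {
        System.out.printf("%08X %04X %04X%n", load32(), load16(1), load16(0));
    }
}
```

As the comments below note, the trade-off can flip on cache- and bandwidth-bound workloads, where halving the data size may win back more than the shift/mask costs.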

Mark Harrison
  • 297,451
  • 125
  • 333
  • 465
  • 16
    Hey where did you learn your math? 2 instructions with 1 cache/RAM access is obviously faster than 1 instruction with 1 cache/RAM access! – Razor Storm Jun 24 '10 at 02:24
  • 3
    @RazorStorm On later machines where bandwidth and cache are more precious, the opposite would be true. The bitmask/shift is cheap, but you want to fit as much in cache as possible and also minimize bandwidth. – Jed Jul 22 '13 at 18:02
  • @Jed completely with you on this. When you do data oriented design for memory intensive stuff, like DSP process, lots of otherwise 128bits+ hardware (AVX, GPU..) would gain speed by working with arrays of `shorts`. – v.oddou Mar 31 '21 at 03:01
75

Oh good Lord, I think I have seen them all. More often than not it is an effort to fix performance problems by someone who is too darn lazy to troubleshoot their way down to the CAUSE of those performance problems, or even to research whether there actually IS a performance problem. In many of these cases I wonder if it isn't just a case of that person wanting to try a particular technology and desperately looking for a nail that fits their shiny new hammer.

Here's a recent example:

Data architect comes to me with an elaborate proposal to vertically partition a key table in a fairly large and complex application. He wants to know what type of development effort would be necessary to adjust for the change. The conversation went like this:

Me: Why are you considering this? What is the problem you are trying to solve?

Him: Table X is too wide, we are partitioning it for performance reasons.

Me: What makes you think it is too wide?

Him: The consultant said that is way too many columns to have in one table.

Me: And this is affecting performance?

Him: Yes, users have reported intermittent slowdowns in the XYZ module of the application.

Me: How do you know the width of the table is the source of the problem?

Him: That is the key table used by the XYZ module, and it is like 200 columns. It must be the problem.

Me (Explaining): But module XYZ in particular uses most of the columns in that table, and the columns it uses are unpredictable because the user configures the app to show the data they want to display from that table. It is likely that 95% of the time we'd wind up joining all the tables back together anyway which would hurt performance.

Him: The consultant said it is too wide and we need to change it.

Me: Who is this consultant? I didn't know we hired a consultant, nor did they talk to the development team at all.

Him: Well, we haven't hired them yet. This is part of a proposal they offered, but they insisted we needed to re-architect this database.

Me: Uh huh. So the consultant who sells database re-design services thinks we need a database re-design....

The conversation went on and on like this. Afterward, I took another look at the table in question and determined that it probably could be narrowed with some simple normalization, with no need for exotic partitioning strategies. This, of course, turned out to be a moot point once I investigated the performance problems (previously unreported) and tracked them down to two factors:

  1. Missing indexes on a few key columns.
  2. A few rogue data analysts who were periodically locking key tables (including the "too-wide" one) by querying the production database directly with MSAccess.

Of course the architect is still pushing for a vertical partitioning of the table hanging on to the "too wide" meta-problem. He even bolstered his case by getting a proposal from another database consultant who was able to determine we needed major design changes to the database without looking at the app or running any performance analysis.

JohnFx
  • 34,542
  • 18
  • 104
  • 162
  • Aaag MSAccess to prod. We wrote a procedure to drop all Access connections every few minutes to finally get across the message that it was bad. – Nat Mar 26 '09 at 20:37
  • 1
    We had a similar job, but it had gone into disuse. To be fair, Access isn't the problem, it just makes it easy for neophytes to create/run nonperformant queries. – JohnFx Mar 26 '09 at 21:19
  • We have a dependency at our company on legacy ad hoc Access connections to the production DB. Nothing like a few casual SQL'ers to forget a WHERE clause and deadlock the main tables! – HardCode Mar 28 '09 at 05:09
  • 35
    "I heard mauve has the most RAM" – Piskvor left the building Mar 28 '09 at 12:37
  • It could have been worse. The Excel Query Editor locks the full database when you use it. When I didn't know about that, I left an instance open for the better part of the day while I worked on something else. The worst part is that MS SQL Server didn't report the correct username/machine that was holding the lock. I realized hours later that I was the cause of the locking: the locked tables were part of the view I was querying, and I had checked everything else first. – Esteban Küber Jan 17 '10 at 05:13
58

I have seen people using alphadrive-7 to totally incubate CHX-LT. This is an uncommon practice. The more common practice is to initialize the ZT transformer so that bufferication is reduced (due to greater net overload resistance) and create java style bytegraphications.

Totally pessimistic!

Zalaga
  • 337
  • 2
  • 2
54

Nothing Earth-shattering, I admit, but I've caught people using StringBuffer to concatenate Strings outside of a loop in Java. It was something simple like turning

String msg = "Count = " + count + " of " + total + ".";

into

StringBuffer sb = new StringBuffer("Count = ");
sb.append(count);
sb.append(" of ");
sb.append(total);
sb.append(".");
String msg = sb.toString();

It used to be quite common practice to use the technique in a loop, because it was measurably faster. The thing is, StringBuffer is synchronized, so there's actually extra overhead if you're only concatenating a few Strings. (Not to mention that the difference is absolutely trivial on this scale.) Two other points about this practice:

  1. StringBuilder is unsynchronized, so should be preferred over StringBuffer in cases where your code can't be called from multiple threads.
  2. Modern Java compilers will turn readable String concatenation into optimized bytecode for you when it's appropriate anyway.
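Both points are easy to verify: the two forms produce identical strings, and `javac` rewrites the readable form into a builder chain anyway (as a later comment confirms by decompiling). A small sketch using `StringBuilder`, the unsynchronized replacement:

```java
public class ConcatDemo {
    // Readable form: the compiler turns this into a builder chain itself.
    static String simple(int count, int total) {
        return "Count = " + count + " of " + total + ".";
    }

    // Hand-rolled form from the answer above, with StringBuilder
    // standing in for the synchronized StringBuffer.
    static String manual(int count, int total) {
        StringBuilder sb = new StringBuilder("Count = ");
        sb.append(count);
        sb.append(" of ");
        sb.append(total);
        sb.append(".");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Same string either way; only readability differs.
        System.out.println(simple(3, 10).equals(manual(3, 10)));
    }
}
```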
Bill the Lizard
  • 398,270
  • 210
  • 566
  • 880
  • First: If you're not using Java 5, you don't have StringBuilder. Second: You can't count on JVM optimization. Taking the first and second into account, the first statement CREATES 5 OBJECTS, whereas the second one creates one, possibly two, objects. – MetroidFan2002 Nov 23 '08 at 06:39
  • A second point: Compilers may use StringBuffer behind the scenes anyways: http://java.sun.com/docs/books/jls/third_edition/html/expressions.html – MetroidFan2002 Nov 23 '08 at 06:43
  • 3
    First: Why wouldn't you be using at least Java 5? Second: Yes you can. How is it that you can count to 5 in the first example, but not the second? It uses the same String literals as the first. Write readable code and let the compiler decide when to use StringBuffer behind the scenes. – Bill the Lizard Nov 23 '08 at 13:06
  • The second uses one StringBuffer and then creates a String. That's two objects. The first creates a StringBuffer instance for every string concatenated. Secondly, workspace politics sometimes prevent the use of Java 5. For example, if you deploy to Websphere 6.0, you are limited to Java 1.4. – MetroidFan2002 Dec 31 '08 at 18:18
  • I've seen the same in .NET (the ONLY difference is the name of the StringBuffer class -> StringBuilder) – Andrei Rînea Mar 26 '09 at 10:16
  • 4
    @MetroidFan2002: The String literals in the second example are objects too. As I said in the answer, the differences are trivial at this scale. – Bill the Lizard Mar 26 '09 at 11:53
  • @Bill - the second example doesn't create another StringBuffer for each string. The first example may be (behind the scenes) equivalent to String msg = new StringBuffer("Count = ").append(new StringBuffer(count).append(...) - new StringBuffers for each string. The second example doesn't do this. – MetroidFan2002 Mar 26 '09 at 17:37
  • @MetroidFan2002: Section 15.18.1.2 from your link above: An implementation may choose to perform conversion and concatenation in one step to avoid creating and then discarding an intermediate String object. (continued...) – Bill the Lizard Mar 26 '09 at 18:16
  • To increase the performance of repeated string concatenation, a Java compiler may use the StringBuffer class or a similar technique to reduce the number of intermediate String objects that are created by evaluation of an expression. – Bill the Lizard Mar 26 '09 at 18:17
  • 1
    That doesn't mean it replaces each String with its own StringBuffer. The optimization that the compiler performs reduces the number of objects created. – Bill the Lizard Mar 26 '09 at 18:18
  • The first method takes one line and no method calls. The second one takes 6 lines, 5 method calls and one new call. Readability above all! – Eric Mar 26 '09 at 19:19
  • 3
    @Eric: String msg = "Count = " + count + " of " + total + "."; is often compiled in Java to String msg = new StringBuffer().append("Count").append(count).append(" of ").append(total).append(".").toString(); ... which is precisely what the second example does. – Grant Wagner Mar 26 '09 at 21:35
  • @Eric: Confirmed. javac 1.6.0_12 decompiles to String s = (new StringBuilder()).append("Count = ").append(i).append(" of ").append(j).append(".").toString(); - SIX method calls and one new call. Not saying explicit SB is better, but + concatenation does not remove the method calls, just hides them. – Grant Wagner Mar 26 '09 at 21:54
  • 3
    Mr. Wagner the thing is that YOU have to look at all these method calls, not the compiler. You have to write them and comprehend them later. The compiler does the same anyway. So readability is more important in this case. – ypnos Mar 27 '09 at 16:08
  • 1
    +1 I'm seeing even better. Using StringBuilder to concatenate a bunch of string *constants* in VB.NET instead of concatenating them with &. The latter emits just one big string constant in MSIL, the former creates a pretty heavy object and calls its methods. For every single query. – Tomek Szpakowicz Apr 04 '09 at 20:57
  • I've been a technical reviewer at a book on programming showing this pessimization. Ofcourse I reported it instantly! – Andrei Rînea Apr 26 '09 at 22:18
47

I once saw a MSSQL database that used a 'Root' table. The Root table had four columns: GUID (uniqueidentifier), ID (int), LastModDate (datetime), and CreateDate (datetime). All tables in the database were Foreign Key'd to the Root table. Whenever a new row was created in any table in the db, you had to use a couple of stored procedures to insert an entry in the Root table before you could get to the actual table you cared about (rather than the database doing the job for you with a few simple triggers).

This created a mess of useless overhead and headaches, required anything written on top of the database to use sprocs (eliminating my hopes of introducing LINQ to the company; it was possible, just not worth the headache), and to top it off didn't even accomplish what it was supposed to do.

The developer that chose this path defended it under the assumption that this saved tons of space because we weren't using Guids on the tables themselves (but...isn't a GUID generated in the Root table for every row we make?), improved performance somehow, and made it "easy" to audit changes to the database.

Oh, and the database diagram looked like a mutant spider from hell.

Dusda
  • 3,347
  • 5
  • 37
  • 58
42

How about POBI -- pessimization obviously by intent?

A colleague of mine in the 90s was tired of getting kicked in the ass by the CEO just because the CEO spent the first day of every release of the ERP software (a custom one) locating performance issues in the new functionality. Even if the new functionality crunched gigabytes and made the impossible possible, he always found some detail, or even a seemingly major issue, to whine about. He believed he knew a lot about programming and got his kicks by kicking programmer asses.

Due to the incompetent nature of the criticism (he was a CEO, not an IT guy), my colleague never managed to get it right. If you do not have a performance problem, you cannot eliminate it...

Until, for one release, he put a lot of Delay(200) function calls (it was Delphi) into the new code. It took just 20 minutes after go-live before he was ordered to appear in the CEO's office to fetch his overdue insults in person.

The only unusual thing was that my colleague was mute when he returned, smiling, joking, going out for a BigMac or two, while normally he would kick tables, flame about the CEO and the company, and spend the rest of the day dejected.

Naturally, my colleague then rested for one or two days at his desk, improving his aiming skills in Quake -- then on the second or third day he deleted the Delay calls, rebuilt, and released an "emergency patch", spreading the word that he had spent 2 days and 1 night fixing the performance holes.

This was the first (and only) time that evil CEO said "great job!" to him. That's all that counts, right?

This was real POBI.

But it also is a kind of social process optimization, so it's 100% ok.

I think.

TheBlastOne
  • 4,291
  • 3
  • 38
  • 72
  • 10
    I remember someone writing about a data processing app that was sold at different levels, where the "Lite" version could crunch only a few data sets per second and the "superduper" version thousands. The only source code difference being Sleep(N)'s. – peterchen Jul 28 '10 at 21:53
  • 1
    Brilliant! I'd recommend that as standard practice in a situation like that: at the beginning of development allocate a big chunk of memory and add some sleep calls, and whenever you need to eke out some performance, just whittle them down. It's called being a miracle worker ;) – RCIX Jul 28 '10 at 23:02
  • Unfortunately, patching Sleeps to NOPs is easy, so that lite version can be cracked very easily. This "optimization" reserve might require an executable packer to make debugging and patching harder. – TheBlastOne Aug 17 '10 at 09:15
32

"Database Independence". This meant no stored procs, triggers, etc - not even any foreign keys.

chris
  • 36,094
  • 53
  • 157
  • 237
  • 8
    Is this "independence" in the sense that you're so far above the database, you've forgotten what data is? Unnecessarily abstracting over databases "to avoid migration pains" is a pet peeve; you ain't gonna need it. – Rob Mar 26 '09 at 19:22
  • 8
    Pretty much. Architecture astronauts at work. I've been building web apps since there was a web, and in all that time I've never actually moved from one db platform to another. – chris Mar 27 '09 at 00:53
  • +1, Chris: I've been working on shrink-wrap and web apps since, oh, 1998 and never once have I needed to change out DBMS products (with the exception of supporting Sybase ASE and SQL Server, which is pretty easy to do) – Matt Rogish Mar 27 '09 at 14:42
  • It does happen. Ca. 10 years ago, I had to support some software that had to change from Sybase to Oracle. Another project I know of is currently in the process of migrating from Sybase to MySQL. – mfx Mar 27 '09 at 16:37
  • 5
    I'm sure it happens, but it's rare enough that you're an idiot if you design your architecture around that possibility. – chris Mar 27 '09 at 17:48
  • pretty much. tons of people make this "database independence" argument too. pick a database, learn it, and exploit the heck out of it. – Cory R. King Mar 29 '09 at 06:37
  • 1
    Really, guys? What about products for resale, like FogBugz, which supports SQL Server, Access, and MySql? Does this make Joel an "idiot" or an "architecture astronaut"? – harpo Mar 30 '09 at 19:33
  • 1
    Indeed, I've worked on numerous projects that had to (or chose to) change dbs. We didn't have stupid rules about avoiding foreign keys, though, as we did assume that whatever db we used would have at least some modicum of sanity as an RDBMS. And porting triggers was done if needed (though avoided). – jsight Mar 30 '09 at 21:33
  • 6
    harpo, that's a different situation - it's a requirement in that case. I'm talking about when it's not a requirement, but the AA decides that it "might be" at some point. – chris Mar 30 '09 at 23:35
  • 3
    @All: DB independence can cost you, yes, but our product is run in environments where the DB vendor is chosen by bids, and we basically have to play along. Some developers don't get the luxury of a vertically-integrated software stack and have to make do despite that. – Chris R Mar 31 '09 at 04:00
  • Why not just use MySQL, MariaDB, or PostgreSQL? All free, no purchase needed. – Demi May 27 '15 at 16:07
  • @Demetri: you have obviously never worked in a government environment, where paid-for support for a commercial product is basically a requirement. Picking a product without having someone you can blame is a career-limiting move in bureaucrat-think. – chris May 27 '15 at 17:22
  • I understand what the issue is now -- you will be blamed if there is no company to be blamed. If you used Postgres, you could still distribute the DBMS with the product though (Postgres is BSD licensed), though you would need to provide/contract for support. – Demi May 27 '15 at 23:09
  • @chris Now I understand -- you can't choose the DBMS because it must be paid for, and thus put on bids. – Demi Jul 31 '15 at 00:33
31
var stringBuilder = new StringBuilder();
stringBuilder.Append(myObj.a + myObj.b + myObj.c + myObj.d);
string cat = stringBuilder.ToString();

Best use of a StringBuilder I've ever seen.

luft
  • 16
  • 3
  • 4
26

Using a regex to split a string when a simple string.split suffices.
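In Java the same trap has a wrinkle, since `String.split` itself takes a regex: the pessimized version rebuilds the regex machinery on every call, where a precompiled pattern (or, for a fixed single-character separator, plain `indexOf`/`substring`) avoids the repeated cost. A sketch:

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class SplitDemo {
    // Pessimized: compiles the regex on every single call.
    static List<String> splitSlow(String csv) {
        return Arrays.asList(Pattern.compile(",").split(csv));
    }

    // Compile once, reuse forever; same result, none of the
    // per-call compilation overhead.
    private static final Pattern COMMA = Pattern.compile(",");
    static List<String> splitFast(String csv) {
        return Arrays.asList(COMMA.split(csv));
    }

    public static void main(String[] args) {
        System.out.println(splitSlow("a,b,c").equals(splitFast("a,b,c")));
    }
}
```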

Cherian
  • 19,107
  • 12
  • 55
  • 69
26

Very late to this thread I know, but I saw this recently:

bool isFinished = GetIsFinished();

switch (isFinished)
{
    case true:
        DoFinish();
        break;

    case false:
        DoNextStep();
        break;

    default:
        DoNextStep();
}

Y'know, just in case a boolean had some extra values...

Damovisa
  • 19,213
  • 14
  • 66
  • 88
  • 22
    True, False and FileNotFound of course – Ikke Jun 01 '09 at 10:55
  • Hey, you should always have a default/case else/etc. What happens when some bright person changes that boolean to an enum to reflect some other status, then the next person adds to the enum and forgets to modify the procedure? Having a default when you don't need one costs no execution time and very little development time. Tracking down an accidentally introduced logical error that occurs during runtime... That costs time, money, and reputation. A stitch in time saves nine. – Oorang Mar 18 '10 at 02:24
  • 2
    @Oorang... why would you have it as a switch anyway? It's a boolean - an if/else is all that's required. – Damovisa Mar 18 '10 at 03:01
  • @Damovisa *facepalm* right... very well then:) Missed that:) – Oorang Apr 03 '10 at 02:45
  • There's nothing like good ol' `enBooleanDummy`. `Yes`, `No` and `Undefined` being the possible values, defaulting to Yes... –  Feb 14 '11 at 12:56
  • 2
    It was Nullable... :) – Giorgi Chakhidze Jul 05 '11 at 08:00
25

Worst example I can think of is an internal database at my company containing information on all employees. It gets a nightly update from HR and has an ASP.NET web service on top. Many other apps use the web service to populate things like search/dropdown fields.

The pessimism is that the developer thought repeated calls to the web service would be too slow to make repeated SQL queries. So what did he do? The application start event reads in the entire database and converts it all to objects in memory, stored indefinitely until the app pool is recycled. This code was so slow, it would take 15 minutes to load fewer than 2,000 employees. If you inadvertently recycled the app pool during the day, it could take 30 minutes or more, because each web service request would start multiple concurrent reloads. For this reason, new hires wouldn't appear in the database on their first day and therefore would not be able to access most internal apps for their first couple of days, twiddling their thumbs.

The second level of pessimism is that the development manager doesn't want to touch it for fear of breaking dependent applications, but yet we continue to have sporadic company-wide outages of critical applications due to poor design of such a simple component.
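For contrast, a lazy per-employee cache with a time-to-live would have avoided both the repeated queries and the 15-minute startup. A minimal sketch, where `queryEmployee` is a hypothetical stand-in for the real per-employee SQL lookup:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class EmployeeCache {
    private static final class Entry {
        final String data;
        final long loadedAt;
        Entry(String data, long loadedAt) { this.data = data; this.loadedAt = loadedAt; }
    }

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    private final long ttlMillis;

    EmployeeCache(long ttlMillis) { this.ttlMillis = ttlMillis; }

    // Hypothetical stand-in for the real per-employee SQL query.
    String queryEmployee(String id) {
        return "record-" + id;
    }

    // Loads one employee on first use and refreshes after the TTL:
    // startup stays instant, and new hires appear within one TTL
    // instead of waiting for an app-pool recycle.
    String get(String id) {
        long now = System.currentTimeMillis();
        Entry e = cache.get(id);
        if (e == null || now - e.loadedAt > ttlMillis) {
            e = new Entry(queryEmployee(id), now);
            cache.put(id, e);
        }
        return e.data;
    }

    public static void main(String[] args) {
        EmployeeCache employees = new EmployeeCache(60_000);
        System.out.println(employees.get("1234"));
    }
}
```

Nothing here is exotic; the tragedy of the story is that the eager-load design was chosen to avoid a cost that a per-key cache makes negligible anyway.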

spoulson
  • 21,335
  • 15
  • 77
  • 102
  • 28
    Management at their finest - "No, let's not put one-off 80 programmer hours into fixing this app, that's too expensive. Let's just keep it, so its bugs can drain 200+ user hours per month, plus 10 programmer hours a month for 'maintenance'." AAAAAAAAAUGH!!! – Piskvor left the building Mar 28 '09 at 12:45
25

No one seems to have mentioned sorting, so I will.

Several different times, I've discovered that someone had hand-crafted a bubblesort, because the situation "didn't require" a call to the "too fancy" quicksort algorithm that already existed. The developer was satisfied when their handcrafted bubblesort worked well enough on the ten rows of data they were using for testing. It didn't go over quite as well after the customer had added a couple of thousand rows.
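The scaling cliff is easy to reproduce: a hand-rolled bubblesort does on the order of n²/2 comparisons, so it keeps up with the library sort at ten rows and collapses at a few thousand. A sketch contrasting the two:

```java
import java.util.Arrays;
import java.util.Random;

public class SortDemo {
    // Hand-crafted bubble sort: fine on ten test rows,
    // roughly n*n/2 comparisons once the customer loads thousands.
    static void bubbleSort(int[] a) {
        for (int pass = 0; pass < a.length - 1; pass++) {
            for (int i = 0; i < a.length - 1 - pass; i++) {
                if (a[i] > a[i + 1]) {
                    int tmp = a[i]; a[i] = a[i + 1]; a[i + 1] = tmp;
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] a = new Random(42).ints(5_000, 0, 100_000).toArray();
        int[] b = a.clone();
        bubbleSort(a);   // ~12.5 million comparisons at n = 5,000
        Arrays.sort(b);  // the "too fancy" existing sort: O(n log n)
        System.out.println(Arrays.equals(a, b)); // same result, very different cost
    }
}
```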

Dan Breslau
  • 11,472
  • 2
  • 35
  • 44
20

I once worked on an app that was full of code like this:

 1 tuple *FindTuple( DataSet *set, int target ) {
 2     tuple *found = null;
 3     tuple *curr = GetFirstTupleOfSet(set);
 4     while (curr) {
 5         if (curr->id == target)
 6             found = curr;
 7         curr = GetNextTuple(curr);
 8     }
 9     return found;
10 }

Simply removing found, returning null at the end, and changing the sixth line to:

            return curr;

Doubled the app performance.

erickson
  • 265,237
  • 58
  • 395
  • 493
Dour High Arch
  • 21,513
  • 29
  • 75
  • 90
  • 1
    I worked once in a company where the coding guidelines demanded "just one return at the end" (for maintenance). And indeed some spit out code like yours, because they don't think (the obvious solution most of the time was using a goto to the proc exit, or changing the exit condition of loops) – flolo Nov 23 '08 at 09:26
  • The place that had this code also had the "one return at the end" rule. – Dour High Arch Nov 24 '08 at 05:18
  • 12
A return curr here produces notably different behavior. When you return curr you end up getting the FIRST match, whereas the code you pasted returns the LAST match. – SoapBox Nov 28 '08 at 01:08
  • 2
@SoapBox: You are right. @Dour High Arch: The increase in performance had nothing to do with the single return rule; as flolo said, changing the loop condition to (curr && !found) would have the same effect. GOTO to the proc exit is horrible, and defeats the purpose of the single return guideline. – Akusete Nov 28 '08 at 01:23
  • 1
    Using break; would be better than adding an extra condition to the while(). – Adam Pierce Nov 28 '08 at 02:58
  • 2
    Good comments everyone. In this case there was supposed to be only a single tuple with each ID. – Dour High Arch Nov 29 '08 at 03:09
  • 1
    This has nothing to do with a coding standard and everything to do with a coding bug. Sure a return would have worked, but adding a couple of curly braces and a break statement would also have worked. – jussij Feb 19 '09 at 04:44
  • The old code actually has a performance bug that adding a "break;" would have fixed nicely. – ksuralta Mar 26 '09 at 09:21
  • @Akusete: Actually, I don't think that GOTO here would be particularly harmful. It operates no differently on the procedure than a break does. GOTO is not universally harmful, it's harmful only when it causes method entry invariants to be violated. – Chris R Mar 31 '09 at 03:57
@Chris R: Using GOTO adds a burden to the maintenance programmer - they have to determine that it is really just a break in disguise. Also, it is possible that a goto that acts like a break today will skip over some code added just after the loop tomorrow. I say keep it simple – Tom Leys May 05 '09 at 04:35
  • 7
    That's not a "Pessimization" though, is it? It is simply an optimisation waiting to happen. – Tim Long May 12 '09 at 19:14
  • Your version returns the first match while the original function returns the last match. – Adrian McCarthy May 26 '09 at 23:09
20

I once had to attempt to modify code that included these gems in the Constants class

public static String COMMA_DELIMINATOR=",";
public static String COMMA_SPACE_DELIMINATOR=", ";
public static String COLIN_DELIMINATOR=":";

Each of these were used multiple times in the rest of the application for different purposes. COMMA_DELIMINATOR littered the code with over 200 uses in 8 different packages.

KitsuneYMG
  • 12,753
  • 4
  • 37
  • 58
  • At least something like that is easy to Find/Replace out of the source - still, my sympathies. – Erik Forbes Mar 26 '09 at 19:12
  • 12
Also - Deliminator? I thought it was spelled 'delimiter'. Deliminator sounds like a bad mid-90s movie that somehow got 3 sequels........... – Erik Forbes Mar 26 '09 at 19:13
  • 54
    Deliminator III: Rise of the Commas – Rob Mar 26 '09 at 19:19
  • 33
    On another note, I'm pleased to see proper delimiting of Colins. Every programmer worth his salt knows that if there's one thing you absolutely must separate out properly, it's the damn Colins. – Rob Mar 26 '09 at 19:20
  • 2
    It's not that easy to do a proper find and replace. Since each one is used for different purposes. Any good programmer would have at least done something like this: COUNTRY_LIST_DELIM=... CLASSIFICATION_DELIM=... etc – KitsuneYMG Mar 26 '09 at 20:56
  • When that comma becomes a semicolon, the const makes a potentially time consuming task into a few clicks in the IDE. – Rich Remer Jul 23 '14 at 20:02
19

The big all time number one which I run into time and time again in inhouse software:

Not using the features of the DBMS for "portability" reasons because "we might want to switch to another vendor later".

Read my lips. For any inhouse work: IT WILL NOT HAPPEN!

Peter Stuer
  • 1,945
  • 2
  • 19
  • 18
17

I had a co-worker who was trying to outwit our C compiler's optimizer and routinely rewrote code that only he could read. One of his favorite tricks was changing a readable method like (making up some code):

int some_method(int input1, int input2) {
    int x;
    if (input1 == -1) {
        return 0;
    }
    if (input1 == input2) {
        return input1;
    }
    ... a long expression here ...
    return x;
}

into this:

int some_method(int input1, int input2) {
    return (input1 == -1) ? 0 : (input1 == input2) ? input1 :
           ... a long expression ...
           ... a long expression ...
           ... a long expression ...
}

That is, the first line of a once-readable method would become "return" and all the other logic would be replaced by deeply nested ternary expressions. When you tried to argue about how unmaintainable this was, he would point to the fact that the assembly output of his method was three or four assembly instructions shorter. It wasn't necessarily any faster, but it was always a tiny bit shorter. This was an embedded system where memory usage occasionally did matter, but there were far easier optimizations available that would have left the code readable.

Then, after this, for some reason he decided that ptr->structElement was too unreadable, so he started changing all of these into (*ptr).structElement on the theory that it was more readable and faster as well.

Turning readable code into unreadable code for, at most, a 1% improvement, and sometimes actually slower code.

Eddie
  • 53,828
  • 22
  • 125
  • 145
  • If said module is being called millions and millions of times per loop, then I would approve of that optimization as long as he commented the heck out of it. – Michael Dorgan Jul 06 '10 at 14:37
  • 2
    @Michael: I wouldn't, unless there were measurements indicating that it was **faster**, not just **shorter**. – dsimcha Aug 31 '10 at 02:47
  • In most situations the ternary operator is *more* readable than `if`. Insistence on statements over expressions in C is cultural/religious dogma, *not* any kind of objective practice. (Better guideline: if the nested ternary is too long to read, you shouldn't be using `if` either.) – Alex Celeste Apr 14 '15 at 20:39
  • 2
    The issue here is taking an **entire function** and replacing it with a single statement, a return, thus replacing all of the logic of the entire function with nested ternaries. If you saw it, you would understand. This isn't a religious "I hate ternary operators" thing. I'm not talking about taking a single `if` in a function and replacing it with a ternary. That's fine, and often more readable. I'm talking about replacing an entire 30+ line method with a single return statement and nested ternaries. No-one thought the new code was more readable, but one developer thought it was faster. – Eddie Apr 15 '15 at 00:07
  • that way of expressing code is more functional. one return point. – v.oddou Mar 31 '21 at 05:50
15

In one of my first jobs as a full-fledged developer, I took over a project for a program that was suffering scaling issues. It would work reasonably well on small data sets, but would completely crash when given large quantities of data.

As I dug in, I found that the original programmer had sought to speed things up by parallelizing the analysis - launching a new thread for each additional data source. However, he'd made a mistake in that all the threads required a shared resource, on which they were deadlocking. Of course, all benefits of concurrency disappeared. Moreover, launching 100+ threads only to have all but one of them lock crashed most systems. My beefy dev machine was an exception in that it churned through a 150-source dataset in around 6 hours.

So to fix it, I removed the multi-threading components and cleaned up the I/O. With no other changes, execution time on the 150-source dataset dropped below 10 minutes on my machine, and from infinity to under half an hour on the average company machine.
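A minimal sketch of the shape of the fix, assuming the per-source analysis is independent: bounded concurrency instead of a thread per source, so 150 sources never means 150 live threads fighting over one resource. (`analyze` and the data here are hypothetical stand-ins.)

```python
from concurrent.futures import ThreadPoolExecutor

def analyze(source):
    # Hypothetical stand-in for the real per-source analysis.
    return sum(source)

sources = [[i, i + 1, i + 2] for i in range(150)]

# A thread per source would mean 150 live threads; a small pool caps it.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(analyze, sources))

assert len(results) == 150
assert results[0] == 3  # sum of [0, 1, 2]
```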

Jeffrey Blake
  • 9,659
  • 6
  • 43
  • 65
14

I suppose I could offer this gem:

unsigned long isqrt(unsigned long value)
{
    unsigned long tmp = 1, root = 0;
    #define ISQRT_INNER(shift) \
    { \
        if (value >= (tmp = ((root << 1) + (1 << (shift))) << (shift))) \
        { \
            root += 1 << shift; \
            value -= tmp; \
        } \
    }

    // Find out how many bytes our value uses
    // so we don't do any unneeded work.
    if (value & 0xffff0000)
    {
        if ((value & 0xff000000) == 0)
            tmp = 3;
        else
            tmp = 4;
    }
    else if (value & 0x0000ff00)
        tmp = 2;

    switch (tmp)
    {
        case 4:
            ISQRT_INNER(15);
            ISQRT_INNER(14);
            ISQRT_INNER(13);
            ISQRT_INNER(12);
        case 3:
            ISQRT_INNER(11);
            ISQRT_INNER(10);
            ISQRT_INNER( 9);
            ISQRT_INNER( 8);
        case 2:
            ISQRT_INNER( 7);
            ISQRT_INNER( 6);
            ISQRT_INNER( 5);
            ISQRT_INNER( 4);
        case 1:
            ISQRT_INNER( 3);
            ISQRT_INNER( 2);
            ISQRT_INNER( 1);
            ISQRT_INNER( 0);
    }
#undef ISQRT_INNER
    return root;
}

Since the square-root was calculated at a very sensitive place, I got the task of looking into a way to make it faster. This small refactoring reduced the execution time by a third (for the combination of hardware and compiler used, YMMV):

unsigned long isqrt(unsigned long value)
{
    unsigned long tmp = 1, root = 0;
    #define ISQRT_INNER(shift) \
    { \
        if (value >= (tmp = ((root << 1) + (1 << (shift))) << (shift))) \
        { \
            root += 1 << shift; \
            value -= tmp; \
        } \
    }

    ISQRT_INNER (15);
    ISQRT_INNER (14);
    ISQRT_INNER (13);
    ISQRT_INNER (12);
    ISQRT_INNER (11);
    ISQRT_INNER (10);
    ISQRT_INNER ( 9);
    ISQRT_INNER ( 8);
    ISQRT_INNER ( 7);
    ISQRT_INNER ( 6);
    ISQRT_INNER ( 5);
    ISQRT_INNER ( 4);
    ISQRT_INNER ( 3);
    ISQRT_INNER ( 2);
    ISQRT_INNER ( 1);
    ISQRT_INNER ( 0);

#undef ISQRT_INNER
    return root;
}

Of course there are both faster AND better ways to do this, but I think it's a pretty neat example of a pessimization.

Edit: Come to think of it, the unrolled loop was actually also a neat pessimization. Digging though the version control, I can present the second stage of refactoring as well, which performed even better than the above:

unsigned long isqrt(unsigned long value)
{
    unsigned long tmp = 1 << 30, root = 0;

    while (tmp != 0)
    {
        if (value >= root + tmp) {
            value -= root + tmp;
            root += tmp << 1;
        }
        root >>= 1;
        tmp >>= 2;
    }

    return root;
}

This is exactly the same algorithm, albeit a slightly different implementation, so I suppose it qualifies.
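The final version ports almost directly to other languages, which makes it easy to sanity-check. A Python transliteration of the loop above, verified against the standard library's integer square root:

```python
import math

def isqrt32(value):
    """Bit-by-bit integer square root, same loop as the final C version."""
    tmp, root = 1 << 30, 0
    while tmp != 0:
        if value >= root + tmp:
            value -= root + tmp
            root += tmp << 1
        root >>= 1
        tmp >>= 2
    return root

for v in list(range(10_000)) + [2**32 - 1]:
    assert isqrt32(v) == math.isqrt(v)
```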

Christoffer
  • 12,712
  • 7
  • 37
  • 53
11

This might be at a higher level that what you were after, but fixing it (if you're allowed) also involves a higher level of pain:

Insisting on hand rolling an Object Relationship Manager / Data Access Layer instead of using one of the established, tested, mature libraries out there (even after they've been pointed out to you).

Gordon Hartley
  • 489
  • 4
  • 10
  • It's not always a bad idea to roll your own code. Like a wise man once said, find the dependencies and eliminate them. If it's a core business function, do it yourself. – Kibbee Nov 23 '08 at 02:58
I never implied it's always a bad idea. Unless you're, say, Frans Bouma or similar, I doubt ORM/DAL stuff is a core business function. It's extremely cost-inefficient to write your own equivalent, a case of reinventing the (square) wheel, usually due to NIH syndrome. – Gordon Hartley Nov 24 '08 at 00:25
@Kibbee - I agree. It's better to roll your own and understand it than use 3rd party dependencies. When it breaks (and it will) at least you can fix it then. I've found bugs in Hibernate and Apache Commons in the past that were absolutely killing our app's performance. – CodingWithSpike Feb 06 '09 at 20:31
  • Thing is, a decent ORM is very hard to write. Sure, modeling records as object with convenient insert/update methods is easy, but as soon as you try provide an API that supports joins AND produces efficient SQL, you're entering a world of pain. – SpoonMeiser Mar 26 '09 at 11:09
  • 4
    Hand-rolling one is really your only option if none of the established ones have a critical feature you need. – staticsan Apr 14 '09 at 04:45
  • 3
Actually, given some of the comments above, some more perspective: another pessimization is to try to get the ORM to do absolutely everything. It's often useful for 95%+ of cases. For that final 5% it's much easier to drop down to hand-crafted persistence code/direct stored procedure calls etc. for performance, simplicity or both. – Gordon Hartley Nov 30 '10 at 06:09
10

All foreign-key constraints were removed from a database, because otherwise there would be so many errors.

Guge
  • 4,569
  • 4
  • 35
  • 47
8

Checking before EVERY javascript operation whether the object you are operating upon exists.

if (myObj) { //or its evil cousin, if (myObj != null) {
    label.text = myObj.value; 
    // we know label exists because it has already been 
    // checked in a big if block somewhere at the top
}

My problem with this type of code is that nobody seems to care what happens if it doesn't exist. Just do nothing? Give no feedback to the user?

I agree that the Object expected errors are annoying, but this is not the best solution for that.

Svante
  • 50,694
  • 11
  • 78
  • 122
Chetan S
  • 23,637
  • 2
  • 63
  • 78
What is the best solution then? I think it's sloppy to write code where errors occasionally occur, even if they have no direct consequences. Of course you shouldn't do it if you don't expect the object to be null in any circumstances - maybe this is what you meant. – simon Mar 31 '09 at 13:35
8

This doesn't exactly fit the question, but I'll mention it anyway as a cautionary tale. I was working on a distributed app that was running slowly, and flew down to DC to sit in on a meeting primarily aimed at solving the problem. The project lead started to outline a re-architecture aimed at resolving the delay. I volunteered that I had taken some measurements over the weekend that isolated the bottleneck to a single method. It turned out there was a missing record in a local lookup, causing the application to have to go to a remote server on every transaction. By adding the record back to the local store, the delay was eliminated - problem solved. Note that the re-architecture wouldn't have fixed the problem.

Jack BeNimble
  • 35,733
  • 41
  • 130
  • 213
7

How about YAGNI extremism? It is a form of premature pessimization. It seems like any time you apply YAGNI, you end up needing it, resulting in 10 times the effort to add it compared to if you had added it at the beginning. If you create a successful program, then odds are YOU ARE GOING TO NEED IT. If you are used to creating programs whose life runs out quickly, then continue to practice YAGNI, because then I suppose YAGNI.

Dunk
  • 1,704
  • 1
  • 14
  • 19
  • 3
    Thank you, I'm sick of these lame 'extreme programming' acronyms and how people use them to support lazy, counter-productive practices. – JAL Mar 29 '09 at 07:05
Studies of actual projects show that the actual factor between one-time and reusable code averages at about 3. So 10 is just the "felt" value, but you are right by intent. – peterchen Jul 28 '10 at 21:55
  • @peterchen - are you saying that studies show it takes three times as long to write reusable code as one-off code, or that they show it takes three times as long to **convert** one-off-code to reusable code than it does to write the reusable code in the first place? – Jeff Sternal Jul 29 '10 at 18:01
@jeff: IIRC they compared some complexity measure (whatever you think of them) of inline snippets that were moved to separate methods. Complexity increases due to additional cases supported, parameter checking etc. (which makes me assume the methods were rather small). Let me try to dig out a reference. – peterchen Jul 29 '10 at 22:06
6

Someone in my department once wrote a string class. An interface like CString, but without the Windows dependence.

One "optimization" they did was to not allocate any more memory than necessary. Apparently not realizing that the reason classes like std::string do allocate excess memory is so that a sequence of += operations can run in O(n) time.

Instead, every single += call forced a reallocation, which turned repeated appends into an O(n²) Schlemiel the Painter's algorithm.
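A toy model of the difference, counting copied characters instead of timing. The classes are illustrative, not the coworker's code: one reallocates to the exact size on every append, the other over-allocates geometrically the way std::string does.

```python
class ExactFitBuffer:
    """The pessimization: reallocate to exactly the needed size on every append."""
    def __init__(self):
        self.data, self.copied = "", 0

    def append(self, s):
        # Conceptually: allocate len(data) + len(s), copy old contents plus s.
        self.copied += len(self.data) + len(s)
        self.data += s

class DoublingBuffer:
    """Geometric over-allocation: old contents are only copied on growth."""
    def __init__(self):
        self.data, self.capacity, self.copied = "", 1, 0

    def append(self, s):
        needed = len(self.data) + len(s)
        if needed > self.capacity:
            while self.capacity < needed:
                self.capacity *= 2
            self.copied += len(self.data)  # copy into the bigger buffer
        self.copied += len(s)
        self.data += s

exact, doubling = ExactFitBuffer(), DoublingBuffer()
for _ in range(10_000):
    exact.append("x")
    doubling.append("x")

assert exact.data == doubling.data
assert exact.copied == 50_005_000      # ~n^2/2: Schlemiel the Painter
assert doubling.copied < 30_000        # O(n) amortized
```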

dan04
  • 87,747
  • 23
  • 163
  • 198
6

Not exactly premature optimisation - but certainly misguided - this was read on the BBC website, from an article discussing Windows 7.

Mr Curran said that the Microsoft Windows team had been poring over every aspect of the operating system to make improvements. "We were able to shave 400 milliseconds off the shutdown time by slightly trimming the WAV file shutdown music.

Now, I haven't tried Windows 7 yet, so I might be wrong, but I'm willing to bet that there are other issues in there that are more important than how long it takes to shut down. After all, once I see the 'Shutting down Windows' message, the monitor is turned off and I'm walking away - how do those 400 milliseconds benefit me?

belugabob
  • 4,302
  • 22
  • 22
  • You will probably find that the other issues aren't as easy to explain to non-programmers on a BBC website. – Tom Leys May 05 '09 at 20:52
  • Now that's an angle that I didn't consider - maybe I'm starting to lose my cynicism :-) – belugabob May 06 '09 at 14:14
  • That 400 ms is 400 ms of power draw. Probably insignificant, but maybe it adds up over time. Still, not something I would worry about. – ZachS Oct 02 '09 at 22:16
  • 1
    I've lost plenty of hours in total waiting for XP VMs to shut down so that I can move on to the next thing. I'm very grateful for faster shutdown. – James Jan 20 '10 at 19:27
  • 1
Interestingly, the WAV files are played asynchronously, so as long as the shutdown fanfare is shorter than the time needed to shut down, trimming the WAV file does nothing. And even more interestingly, if they optimized the shutdown soooo much, how come every Windows box I shut down needs aeons until it really is down? (Except when using the big red button, of course.) – TheBlastOne Jul 07 '10 at 07:34
  • If only they could spend some effort reducing the amount of time it takes to power up! Any enthusiam I have, upon arrival at the office, is severely dented by the interminable wait for my machine to become usable. :-( – belugabob Jul 07 '10 at 10:43
  • Well it might not benefit you, but the planet maybe. As ZachS was getting at, the power involved in running a PC for 400ms, x (each of the 365 days in a year), x(god knows how many millions of computers) is not insignificant. How long could it have taken them to crop the wav file? 10mins? I'd say that was a good decision. – UpTheCreek Jul 27 '10 at 13:14
  • @UpTheCreek: That's a good point, and one that could be extrapolated to any one of a multitude of operations that just take far too long - for example, what the hell is Firefox doing during startup?, how many cpu cycles (and the associated wattage) is wasted on antivirus scanning? etc, etc... – belugabob Jul 27 '10 at 15:19
5

An ex-coworker of mine (a s.o.a.b., actually) was assigned to build a new module for our Java ERP that was supposed to collect and analyze customers' data (retail industry). He decided to split EVERY Calendar/Datetime field into its components (seconds, minutes, hours, day, month, year, day of week, bimester, trimester (!)) because "how else would I query for 'every monday'?"
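For the record, most engines can answer "every monday" without shredding the timestamp. A sketch against SQLite, whose spelling is strftime('%w', ...) (MySQL has DAYOFWEEK(), Postgres EXTRACT(DOW FROM ...)); the table here is a hypothetical stand-in:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sales (id INTEGER, ts TEXT)")
db.executemany("INSERT INTO sales VALUES (?, ?)", [
    (1, "2008-11-24 09:00:00"),  # a Monday
    (2, "2008-11-25 09:00:00"),  # a Tuesday
    (3, "2008-12-01 09:00:00"),  # a Monday
])

# strftime('%w', ...) yields the day of week as '0'..'6', with '1' = Monday.
mondays = [row[0] for row in db.execute(
    "SELECT id FROM sales WHERE strftime('%w', ts) = '1' ORDER BY id")]
assert mondays == [1, 3]
```

As one commenter notes, splitting dates into dimensions is a legitimate data-warehouse design; doing it to every field up front, for this reason, was not.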

Joril
  • 19,961
  • 13
  • 71
  • 88
  • 3
    That's not a premature optimization, he thought he needed to do that for correctness – Pyrolistical Nov 28 '08 at 01:24
  • Sure, he *thought* he needed that, but since most DBMSes have some kind of DAYOFWEEK(timestamp) function, doing that mess up-front is premature enough in my opinion :) – Joril Dec 02 '08 at 12:40
  • 1
    I wouldn't use it for OLTP, but if you were "analyzing customer's data" then that's actually a very flexible way of designing a data warehouse (as long as the date and time are split into different dimensions). Would you really want to call DAYOFWEEK() against millions of rows of data or just do an index seek against an integer field? – Tim M. Dec 16 '10 at 16:56
  • Well I don't know if there were so much rows, but surely that's not the explanation that was given :) – Joril Dec 16 '10 at 17:19
3

No offense to anyone, but I just graded an assignment (java) that had this

import java.lang.*;
Overflown
  • 1,830
  • 2
  • 19
  • 25
  • 1
    Unless this is an upper level class, I think you need to cut this student a little slack unless you taught her enough to know why this isn't a good idea. – Bryan Oakley Mar 26 '09 at 11:05
  • Oh, I didn't hold it against the student, but I wondered what would make a student include the line (it's not like it wouldn't compile before) – Overflown Mar 26 '09 at 11:28
Java is smart enough to *not* import what it doesn't need when you use the glob (*) on imports. This isn't nearly as bad as needless #includes in C and C++. That said, the fact that the student feels the need to import from java.lang is troubling. :) – Bill the Lizard Mar 26 '09 at 12:07
  • 25
    Am I going to be the only one to note the irony of a teacher calling WTF on the code of a student that he/she is responsible for teaching to program correctly? – JohnFx Mar 26 '09 at 15:54
  • @JohnFx: incredibly funny. Remember though that it doesn't take a bad teacher to have a bad student. – Dinah Mar 26 '09 at 15:59
  • 3
Yeah, I can't see that this would hurt. At worst it's superfluous. Students tend to resort to rigid consistency while they are learning, and importing java.lang is rigidly consistent with what the student has learned about importing. – cygil Mar 26 '09 at 18:04
  • 1
    Thank you all for telling me the obvious. It was a Computational Biology assignment and I didn't count it, nor even mention it. – Overflown Mar 26 '09 at 21:49
  • 2
    @JohnFX: The grader and teacher aren't always the same person. – Eddie Jul 06 '10 at 19:51
  • Isn't this necessary in C# to get the basic language stuff imported? – Dan Rosenstark Jan 01 '11 at 03:07
  • This sounds like a good time saver for java students to use in a template when they don't know for sure what they will be using ... of course the final submitted version should be fixed, but in the interim it is a PITA to continuously add/remove imports because it is time consuming and takes you away from the task at hand. As a C programmer, Google's foobar was an interesting way to learn Java that required testing multiple ways of doing the same task to meet the time requirements - each requiring a different set of imports. I spent nearly as much time looking up imports as I did coding. – technosaurus May 18 '17 at 15:44
3

Maybe just having a quick glance over the system early on will help point to the possible bottlenecks.

"This part doesn't need to be fast" (archiving logs). "This part must be hella fast" (accepting new connections).

Then the very fast parts usually don't need to be extra-optimised with dirty quirks; usually decent hardware and well-coded parts will suffice.

Just answering the simple question "Do I gain anything from having this part of code very fast?" will be a great guideline. I mean, using common sense optimises other parts of the project!

Eric
  • 19,525
  • 19
  • 84
  • 147
3
while true; do echo 3 > /proc/sys/vm/drop_caches; sleep 3600; done

This caused the kernel to spend time clearing out disk cache, and once it succeeded, everything ran slowly until the cache got repopulated. Caused by a misapprehension that disk cache prevented memory from being available for applications to use.

dhasenan
  • 1,177
  • 7
  • 15
3

I don't think pessimization is rare. From my experience doing performance tuning, a lot of the poor performance is caused by "good programming practice" justified in the name of "efficiency". Examples:

  • Map collections or "dictionaries"
    These usually make use of some kind of hash-coding, so they will have O(1) performance, but will only break even when filled with far more items than are typically used.

  • Iterators
    These are justified as being possibly optimized into efficient inline code, when it is seldom checked to see if they actually are.

  • Notifications and event handling as a way to keep data consistent
    Since data structure is seldom normalized, inconsistency must be managed, and notification is the usual method because it supposedly takes care of the problem "immediately". However, there is a big difference between immediacy and efficiency. Also "properties", when Get or Set, are encouraged to reach deep into the data structure to try to keep it consistent. These "short leash" methods can cause large amounts of wasted computation. "Long leash" methods, such as periodically cycling through the data structure to "repair" it, can be a little less "immediate" but much more efficient.


Community
  • 1
  • 1
Mike Dunlavey
  • 40,059
  • 14
  • 91
  • 135
2

Some colleagues of mine, who were on an "optimization" project for existing server-side batches (written in C++), "optimized" the logging class (!) to death, using Win32-specific code and functions.

Maybe the bottleneck was in logger.write(...), who knows...

friol
  • 6,996
  • 4
  • 44
  • 81
  • highly possible, if every function happens to invoke blocking disk IO when called... – Jimmy Mar 27 '09 at 07:23
Then it's a problem in the outside code, not in the logging class (I don't want to log every single byte of what happens in the production system). Batches are usually I/O-bound against the database. – friol Mar 27 '09 at 17:48
2

One co-worker had to check access to the page for a specific role - "Admin" only. This is what she wrote:


if (CurrentUser.CurrentRole == "Role1" || CurrentUser.CurrentRole == "Role2")
{
    // Access denied
}
else
{
    // Access granted
}

instead of

if (!CurrentUser.CurrentRole.equals("Admin"))
{
    // access denied
}

So whenever a new role was added to the system, the new role had access to all confidential pages.
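The general rule this violates: access checks should be allow-lists, so anything not explicitly granted is denied. A minimal sketch of both shapes (role names hypothetical):

```python
ADMIN_ROLES = {"Admin"}           # allow-list: grant only what is named
KNOWN_ROLES = {"Role1", "Role2"}  # the deny-list from the code above

def can_access(role):
    """Allow-list: a newly added role is denied until someone grants it."""
    return role in ADMIN_ROLES

def can_access_denylist(role):
    """The bug: a newly added role is granted access by default."""
    return role not in KNOWN_ROLES

assert can_access("Admin") and not can_access("Auditor")
assert can_access_denylist("Auditor")  # new role silently gets in
```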


The same coworker also joined the production and archive tables in all queries.

Abhishek
  • 359
  • 1
  • 8
2

Another fancy performance trick :)

if (!loadFromDb().isEmpty) {
    resultList = loadFromDb();
    // do something with results
}

For the small price of an extra DB hit, you save yourself the ten or so lines of code that probably wouldn't do much on an empty list anyway. And things like this were scattered all over the code :)
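A sketch with a call counter makes the cost explicit (load_from_db is a hypothetical stand-in for the real query):

```python
calls = 0

def load_from_db():
    """Stand-in for the real query; counts round trips to the database."""
    global calls
    calls += 1
    return [1, 2, 3]

# The pattern above: one query to peek, a second to actually fetch.
if load_from_db():
    result_list = load_from_db()
assert calls == 2

# The fix: fetch once, then test the result you already have.
calls = 0
result_list = load_from_db()
if result_list:
    pass  # do something with results
assert calls == 1
```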

Slartibartfast
  • 8,735
  • 6
  • 41
  • 45
2

One company I visited as a consultant many years ago had written a sort function that looked something like this:

procedure sort(string[] values, string direction)
begin
  while not sorted do
  begin
    for every value in values
    begin
      if direction="Ascending" then
      begin
        ... swap values in ascending order
      end
      else if direction="Descending" then
      begin
        ... swap values in descending order
      end
    end;
  end;
end;

That is a bubblesort algorithm (which is inefficient in itself) with a string comparison for the direction in the inner loop! I could hardly believe my eyes and explained that yes, I could probably make a couple of speed improvements here (they were my clients, after all, so I had to be diplomatic about the fact that this was the most non-optimal code I'd ever seen :-) )
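Independent of replacing the bubblesort, the immediate fix is to resolve the direction once, outside the loops, e.g. by choosing a comparator up front. In Python the whole routine collapses to a one-liner:

```python
def sort_values(values, direction):
    # Decide the direction once, instead of string-comparing
    # `direction` on every pass of the inner loop.
    return sorted(values, reverse=(direction == "Descending"))

assert sort_values(["b", "a", "c"], "Ascending") == ["a", "b", "c"]
assert sort_values(["b", "a", "c"], "Descending") == ["c", "b", "a"]
```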

Ville Krumlinde
  • 7,021
  • 1
  • 33
  • 41
1

I was going to mention StringBuilder for tiny/non-looping string concats, but its been mentioned.

Putting a method's variables into private class members to prevent them from getting "garbage collected every time the method runs." The variables are value types.

Jeremy
  • 107
  • 1
  • 4
1

An application that used an Integer field to allocate, bitwise, which application access groups our clients could add their users to. That meant at the time we could create a grand total of 32 groups to be shared across all 500+ clients.

Aaaah, but a bitwise comparison is faster than an equals, and waaay faster than a join, right?

Unfortunately, when I completely (and rather vocally) freaked out at this code and its author, I discovered the author was my boss's boss. A rather authoritarian dude, it turns out.

P.S.

I know what you are thinking: it totally should have been a binary string, right? :)
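For the curious, the scheme presumably looked something like this (a hypothetical reconstruction): fast membership tests, but a hard ceiling of 32 groups for the entire system.

```python
def add_to_group(mask, group):
    """Pack group membership into one 32-bit integer: group n is bit n."""
    if not 0 <= group < 32:
        raise ValueError("only 32 groups fit in a 32-bit field")
    return mask | (1 << group)

def in_group(mask, group):
    return bool(mask & (1 << group))

mask = add_to_group(0, 3)
assert in_group(mask, 3) and not in_group(mask, 4)

try:
    add_to_group(mask, 32)  # the 33rd group hits the ceiling
    assert False, "unreachable"
except ValueError:
    pass
```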

Nat
  • 14,175
  • 5
  • 41
  • 64
  • No, I think it's not an example of pessimization, it's the wrong design for the right feature. There is no way to optimize the kludge so it works, and the pessimization does not hurt the show, it stops, or avoids it. – TheBlastOne Jul 07 '10 at 07:39
  • I had exactly the same somewhere I worked. Funny as the rest of the system was OK. I think it must be a hang up from old timers trying to save every last drop of memory. – NimChimpsky Aug 14 '12 at 10:52
1

A lot of programmers don't know or don't want to know SQL, so they find "tricks" to avoid really using SQL and get the data into an array instead. Arrays make some people happy. (I love both cursors and arrays. Coke and Pepsi.) I have found these two blocks of code in a few object-oriented programmers' code, who complained that relational databases are slow. (The answer is not more memory or more processors.)

The table in this case is a huge table where uniqueid_col is a unique id for each row.

Load this data into arrayX (because arrays must be faster)

   Select uniqueid_col, col2, col3
     from super_big_tbl
 

(pseudo code)


Loop 
   arrayX.next_record
    if uniqueid_col = '829-39-3984'
      return col2
    end if
end loop
 

(My answer is at bottom.)

This next one is a simple mistake I have also seen. The idea is you never get a duplicate this way:

   Select uniqueid_col, col2, col3
     from super_big_tbl
 group by uniqueid_col, col2, col3
   having uniqueid_col = '829-39-3984'

Correct syntax should be

   Select uniqueid_col, col2, col3
     from super_big_tbl
    where uniqueid_col = '829-39-3984'
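Both lookups side by side against SQLite (a hypothetical stand-in for super_big_tbl): the array version drags every row into the application to find one, while the WHERE version lets the database use its primary-key index.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE super_big_tbl "
           "(uniqueid_col TEXT PRIMARY KEY, col2 TEXT, col3 TEXT)")
db.executemany("INSERT INTO super_big_tbl VALUES (?, ?, ?)",
               [(f"id-{i}", f"v{i}", f"w{i}") for i in range(10_000)])

# The array way: fetch all 10,000 rows, then scan in application code.
all_rows = db.execute(
    "SELECT uniqueid_col, col2, col3 FROM super_big_tbl").fetchall()
hit = next(r[1] for r in all_rows if r[0] == "id-829")

# The SQL way: one indexed lookup.
row = db.execute("SELECT col2 FROM super_big_tbl WHERE uniqueid_col = ?",
                 ("id-829",)).fetchone()
assert hit == row[0] == "v829"
```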
   
Stradas
  • 1,768
  • 16
  • 17
  • These are the sort of programmers who need a DBA holding their hand, and occasionally slapping them in the face. – Tom Leys May 05 '09 at 20:50
  • I found code using the "group by" and "having" in open source code once too. The code ran fast until it got more than few records then hit the wall. You could say that it didn't scale well. – Stradas May 21 '09 at 15:43
  • Scalability is a bug, not a feature. Mh. – TheBlastOne Jul 07 '10 at 07:42
0

I've got an intentional one... I once implemented sorting through backtracking... just as a proof of concept ;)) needless to say, its performance was horrific.

luvieere
  • 37,065
  • 18
  • 127
  • 179
0

Any significant optimization effort that isn't based on triaged reports from a profiler tool earns a big WTF from me.

JeffH
  • 10,254
  • 2
  • 27
  • 48
  • Be careful. Profiling is critical, but it's not the *only* way to glean performance issues. – Jon Adams Mar 27 '09 at 13:59
  • Well don't tease... how about mentioning some if the other specific ways? – JeffH Mar 27 '09 at 17:43
Since you said "profiler tool" specifically, I'm assuming he means methods that don't rely on being able to run the entire app under your favorite profiler. I.e., logging in key places to find bottlenecks (a kind of manual profiling). A lot of the hardest problems can only be found that way. – jsight Mar 30 '09 at 21:43
It's not the tool that counts, it's the brain using it (or not). It's like saying "I will WTF your book and will not read it if you do not spellcheck it with Word, and show me the checkmark Word displayed first." – TheBlastOne Jul 07 '10 at 07:43