First of all, lazy evaluation wasn't invented in Haskell, so it's incorrect to attribute it to that language.
Second, Haskell also offers eager (strict) evaluation, the opposite of lazy: a value is computed immediately rather than deferred until it is first needed.
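The same eager/lazy choice shows up outside Haskell too, which may make the distinction concrete. A minimal Python sketch (function names here are purely illustrative): a list comprehension evaluates every element right away, while a generator expression evaluates nothing until you pull from it.

```python
def noisy_square(x):
    # Prints when actually evaluated, so we can observe the evaluation order.
    print(f"evaluating {x}")
    return x * x

eager = [noisy_square(x) for x in range(3)]   # all three calls happen immediately
lazy = (noisy_square(x) for x in range(3))    # no call has happened yet
first = next(lazy)                            # forces exactly one element
```

Running this prints three "evaluating" lines for the list and only one for the generator, until more elements are demanded.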
Third, lazy evaluation facilities are readily available in other languages and technologies: Python's generators (recall Python 2's xrange
function), Meyers' singleton and template instantiation in C++, and delayed symbol resolution in runtime linkers are all examples of this idea.
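The generator case is easy to demonstrate. A sketch of an xrange-like lazy sequence (the name lazy_range is made up for illustration): elements are produced on demand, so an absurdly large range costs nothing until traversed.

```python
import itertools

def lazy_range(n):
    """A lazy sequence 0..n-1, in the spirit of Python 2's xrange."""
    i = 0
    while i < n:
        yield i        # suspend here; resume only when the next element is requested
        i += 1

# No trillion-element list is ever built; only five values are computed.
first_five = list(itertools.islice(lazy_range(10**12), 5))
```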
So anyway, internalizing that idea and the corresponding vocabulary will never harm a software engineer.
As to the pros & cons, you named the primary ones. A few more can be added (and remember, you can build these in virtually any language that has data structures and function calls):
Recursive data structures, where you can create, for example, a list whose elements are arranged in a circle, the head being the next
element of the "last" one; traversing such a list yields an infinitely repeating sequence of elements. Probably not the most motivating example, but you can do the same with trees, graphs and so on.
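In a language with mutable references this circle can be tied explicitly. A sketch (the Node/ring/traverse names are invented for the example): the "last" node's next pointer is set back to the head, and traversal yields the repeating sequence.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

def ring(values):
    """Link the values into a circle: the last node points back to the head."""
    head = Node(values[0])
    cur = head
    for v in values[1:]:
        cur.next = Node(v)
        cur = cur.next
    cur.next = head          # close the circle
    return head

def traverse(node, count):
    """Follow next-pointers, yielding an infinitely repeating sequence; take `count`."""
    out = []
    for _ in range(count):
        out.append(node.value)
        node = node.next
    return out
```

In a lazy language like Haskell the same structure is written without mutation, e.g. `let xs = 1 : 2 : 3 : xs`.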
Arranging control flow with lazy data structures instead of built-in primitives; think of building coroutines out of nothing but lazy lists. This is actually the other side of the coin of a more complex and convoluted evaluation order (i.e. one of your cons turning into an advantage).
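A sketch of that idea in Python, where a generator plays the role of a lazy list (round_robin and counter are hypothetical names): each generator is a suspended computation, and a tiny scheduler resumes them in turn, giving cooperative multitasking with no threading primitives.

```python
def counter(name, n):
    for i in range(n):
        yield f"{name}:{i}"      # yield suspends this "coroutine" until resumed

def round_robin(*tasks):
    """Interleave several lazy sequences, resuming each one in turn."""
    tasks = list(tasks)
    while tasks:
        task = tasks.pop(0)
        try:
            yield next(task)     # resume the task for one step
            tasks.append(task)   # not finished: put it back in the queue
        except StopIteration:
            pass                 # finished: drop it

interleaved = list(round_robin(counter("a", 2), counter("b", 2)))
```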
Semi-automatic parallelisation of computations. This is more an advantage of referential transparency than of lazy evaluation per se; still, the two features fuse together very organically.
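The point is that a referentially transparent function can be mapped over its inputs in any order, or concurrently, without changing the result. A minimal sketch using Python's standard executor (note: in CPython a thread pool won't speed up CPU-bound work because of the GIL; a process pool would be used for that, but the safety argument is the same):

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):
    # Pure: no side effects, so the calls may run in any order or in parallel.
    return x * x

with ThreadPoolExecutor() as pool:
    results = list(pool.map(square, range(5)))   # result order is still preserved
```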
Performance-wise, memoization often comes to mind when musing on lazy evaluation, although doing it automatically is a hard (probably still unsolved) problem with lots of details and pitfalls.
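Manual (opt-in) memoization, by contrast, is a one-liner in many languages. A sketch with Python's standard functools.lru_cache, on the classic naive Fibonacci:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    # Without the cache this recursion is exponential; with it, each fib(k)
    # is computed once and every repeated call is a dictionary lookup.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

fib(90)   # completes instantly; the uncached version would take ages
```

The "details and pitfalls" show up exactly here: the cache only makes sense because fib is pure, and it trades memory for time.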
So, basically, if you look deeper, every aspect comes with possibilities and trade-offs; your task as a software engineer is to know them all and choose wisely based on the details of the concrete problem.