Refactoring is about reorganizing a program to improve its structure without changing its behavior ("lift code block to method"), or sometimes about changing the functionality slightly to make it easier to add new functionality later ("add a parameter").
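As a concrete sketch of those two refactorings in Go (a made-up example; the function names are hypothetical):

```go
package main

import "fmt"

// Before: one monolithic function.
func reportBefore(items []float64) {
	total := 0.0
	for _, v := range items {
		total += v
	}
	fmt.Printf("total: %.2f\n", total)
}

// After "lift code block to method": the summing block becomes its own
// function. Behavior is unchanged; the structure is improved.
func sum(items []float64) float64 {
	total := 0.0
	for _, v := range items {
		total += v
	}
	return total
}

// After "add a parameter": the label parameter slightly changes the
// interface so that future functionality (other kinds of reports) is
// easier to add.
func report(label string, items []float64) {
	fmt.Printf("%s: %.2f\n", label, sum(items))
}

func main() {
	data := []float64{1.5, 2.5, 4.0}
	reportBefore(data)
	report("total", data) // same output, new structure
}
```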
One can argue about whether refactoring should be allowed to change other properties of the program, such as readability (improving readability is a common reason people refactor) or execution time. If I have a real-time system, and your refactoring step breaks its ability to meet its deadlines, then your refactoring has broken my program.
Let's talk about time for a bit.
Computer programs are made out of computing primitives (add, compare, ...) and communications that move data between those primitives. This is much easier to see when one draws a dataflow diagram of the code (check out the C dataflow graph). The operations themselves are astonishingly fast: fractions of a nanosecond on a modern CPU.
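To make the dataflow view concrete, here is a trivial computation with its dataflow graph sketched in comments (a hypothetical example, not drawn from any particular diagram):

```go
package main

import "fmt"

// d = (a + b) * (a - c)
//
// Dataflow graph:
//
//   a ──┬──> [+] ──┐
//   b ───────┘     ├──> [*] ──> d
//   a ──┬──> [-] ──┘
//   c ───────┘
//
// Each box is a computing primitive; each arrow is a communication
// (a movement of data) between primitives.
func compute(a, b, c float64) float64 {
	s := a + b      // primitive: add
	t := a - c      // primitive: subtract
	return s * t    // primitive: multiply; the arrows carry s and t here
}

func main() {
	fmt.Println(compute(3, 4, 1)) // (3+4)*(3-1) = 14
}
```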
Just as all operations take time, all dataflows between operations take time. But for most compiled code running on a single CPU, that time amounts to register-to-register or register-to-cache transfers (typically a few nanoseconds on modern CPUs), so we tend to ignore it and act as if it were free. But even on a single CPU, moving data between cache and main memory is expensive: 40-60 ns on modern systems. So we already see significant delays in moving data between some operations.
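A crude way to feel that cost (a sketch, not a calibrated benchmark; the exact numbers depend heavily on the machine): traverse a large array in order, where the cache and prefetcher keep up, versus in a random order, where most accesses pay the main-memory latency:

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

func traverse(data []int64, order []int32) time.Duration {
	start := time.Now()
	var sink int64
	for _, i := range order {
		sink += data[i]
	}
	elapsed := time.Since(start)
	_ = sink // keep the loop from being optimized away
	return elapsed
}

func main() {
	const n = 1 << 24 // ~16M int64s, far larger than any cache
	data := make([]int64, n)
	order := make([]int32, n)
	for i := range data {
		data[i] = int64(i)
		order[i] = int32(i)
	}

	// Sequential order: mostly cache/prefetcher-friendly.
	seq := traverse(data, order)

	// Random order: most accesses miss cache and each pays tens of
	// nanoseconds of main-memory latency.
	rand.Shuffle(n, func(i, j int) { order[i], order[j] = order[j], order[i] })
	rnd := traverse(data, order)

	fmt.Printf("sequential: %v  random: %v\n", seq, rnd)
}
```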
We remark that single-CPU applications can be multithreaded; that just means there are several points where computation can proceed at any one moment ("threads"). This gives us pseudo-parallelism, by multiplexing the one CPU across the various live threads. We take advantage of this with multicore systems, to run those threads actually in parallel; done properly, multicore systems provide real speedups. But communication of data between the cores is more expensive: a cache line transfer from one CPU to another takes considerably longer than a cache line access by a single core.
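A minimal sketch of "done properly" in Go, assuming a straightforward chunked sum: each goroutine works on its own chunk and writes its own result slot, so the cores avoid communicating until the final combine step:

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// Sum a slice with one goroutine per core. Cross-core data movement is
// limited to the per-worker partial results at the end.
func parallelSum(data []int64) int64 {
	workers := runtime.NumCPU()
	chunk := (len(data) + workers - 1) / workers
	partial := make([]int64, workers) // one slot per worker

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		lo, hi := w*chunk, (w+1)*chunk
		if lo > len(data) {
			lo = len(data)
		}
		if hi > len(data) {
			hi = len(data)
		}
		wg.Add(1)
		go func(w, lo, hi int) {
			defer wg.Done()
			var s int64
			for _, v := range data[lo:hi] {
				s += v
			}
			partial[w] = s // the only write another core will read
		}(w, lo, hi)
	}
	wg.Wait()

	var total int64
	for _, p := range partial {
		total += p
	}
	return total
}

func main() {
	data := make([]int64, 1_000_000)
	for i := range data {
		data[i] = int64(i)
	}
	fmt.Println(parallelSum(data))
}
```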
A key difference between single-process programs and "distributed programs" is that the dataflows between the operations usually take significantly longer.
This is because distributed systems tend to have their CPUs physically far apart, relative to how far light/electricity travels in a nanosecond. We use messaging primitives (one of the Cray supercomputers built these into hardware) to move the data from one part of the distributed system to another.
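In a real system that messaging goes over a network (sockets, MPI, and so on). As a stand-in, here is the same send/receive shape modeled with Go channels between two goroutines, with the caveat that a real network message costs microseconds to milliseconds rather than nanoseconds:

```go
package main

import "fmt"

// Two "nodes" of a toy distributed computation. Node A produces values;
// node B consumes them. The channel models the system's messaging
// primitive; in a real deployment each send would be a network message
// whose latency dwarfs the arithmetic on either end.
func nodeA(out chan<- int) {
	for i := 1; i <= 3; i++ {
		out <- i * i // "send": move data to the other part of the system
	}
	close(out)
}

func nodeB(in <-chan int, done chan<- int) {
	total := 0
	for v := range in { // "receive"
		total += v
	}
	done <- total
}

func main() {
	link := make(chan int) // the communication link between the nodes
	done := make(chan int)
	go nodeA(link)
	go nodeB(link, done)
	fmt.Println("sum of squares:", <-done) // 1 + 4 + 9 = 14
}
```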
But this doesn't change the essence of our computation. We are still composing computing primitives with communications. All distribution does is expose the time cost of those communications as a significant element that we must account for in deciding whether our program is satisfactory.
For me, refactoring must take into account the time cost of operations and communications, and generally must preserve or improve the program's properties with respect to time. (If you don't believe this, talk to somebody in the scientific computing space.)
I note that "lift code block to method" tends to hurt performance (there is now an extra subroutine-call cost to pay), yet we accept it as a valid refactoring.
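A rough illustration of that call cost (machine-dependent, and modern compilers often inline small functions away, which is why this sketch pins the helper with Go's //go:noinline directive):

```go
package main

import (
	"fmt"
	"time"
)

//go:noinline
func addOne(x int64) int64 { return x + 1 } // the "lifted" code block

func main() {
	const n = 100_000_000

	start := time.Now()
	var a int64
	for i := 0; i < n; i++ {
		a = addOne(a) // pays a subroutine call per iteration
	}
	viaCall := time.Since(start)

	start = time.Now()
	var b int64
	for i := 0; i < n; i++ {
		b = b + 1 // the same work with the block left in place
	}
	inPlace := time.Since(start)

	fmt.Println(a == b, "call:", viaCall, "in place:", inPlace)
}
```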
Refactoring distributed systems is, in principle, no different from refactoring single-process software.
If you are doing it with tools, those tools need to be a lot smarter than most refactoring tools, because they have to account for communication costs and for the fact that the distributed system may be written in multiple languages.