Refactoring is about reorganizing a program to improve its structure without changing its behavior ("lift code block to method"), or sometimes about changing the functionality slightly to make it easier to add new functionality later ("add a parameter").
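As a concrete sketch of those two refactorings in Go (a made-up example; the function names are hypothetical):

```go
package main

import "fmt"

// Before: one monolithic function.
func reportBefore(items []float64) {
	total := 0.0
	for _, v := range items {
		total += v
	}
	fmt.Printf("total: %.2f\n", total)
}

// After "lift code block to method": the summing block becomes its own
// function. Behavior is unchanged; the structure is improved.
func sum(items []float64) float64 {
	total := 0.0
	for _, v := range items {
		total += v
	}
	return total
}

// After "add a parameter": the label parameter slightly changes the
// interface so that future functionality (other kinds of reports) is
// easier to add.
func report(label string, items []float64) {
	fmt.Printf("%s: %.2f\n", label, sum(items))
}

func main() {
	data := []float64{1.5, 2.5, 4.0}
	reportBefore(data)
	report("total", data) // same output, new structure
}
```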
One can argue about whether refactoring should be allowed to change other properties of the program, such as readability (improving readability is a common reason people refactor) or execution time. If I have a real-time system, and your refactoring step breaks its ability to meet its deadlines, then your refactoring has broken my program.
Let's talk about time for a bit.
Computer programs are made out of computing primitives (add, compare, ...) and communications that move data between those primitives. This is much easier to see when one draws a dataflow diagram of the code (check out the C dataflow graph). The operations themselves are astonishingly fast: fractions of a nanosecond on a modern CPU.
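To make the dataflow view concrete, here is a trivial computation with its dataflow graph sketched in comments (a hypothetical example, not drawn from any particular diagram):

```go
package main

import "fmt"

// d = (a + b) * (a - c)
//
// Dataflow graph:
//
//   a ──┬──> [+] ──┐
//   b ───────┘     ├──> [*] ──> d
//   a ──┬──> [-] ──┘
//   c ───────┘
//
// Each box is a computing primitive; each arrow is a communication
// (a movement of data) between primitives.
func compute(a, b, c float64) float64 {
	s := a + b      // primitive: add
	t := a - c      // primitive: subtract
	return s * t    // primitive: multiply; the arrows carry s and t here
}

func main() {
	fmt.Println(compute(3, 4, 1)) // (3+4)*(3-1) = 14
}
```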
Just as all operations take time, all dataflows between operations take time. But for most compiled code running on a single CPU, that time amounts to register-to-register or register-to-cache transfers (typically a few nanoseconds on modern CPUs), so we tend to ignore it and act as if it were free. But even on a single CPU, moving data between cache and main memory is expensive: 40-60 ns on modern systems. So we already see significant delays in moving data between some operations.
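A crude way to feel that cost (a sketch, not a calibrated benchmark; the exact numbers depend heavily on the machine): traverse a large array in order, where the cache and prefetcher keep up, versus in a random order, where most accesses pay the main-memory latency:

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

func traverse(data []int64, order []int32) time.Duration {
	start := time.Now()
	var sink int64
	for _, i := range order {
		sink += data[i]
	}
	elapsed := time.Since(start)
	_ = sink // keep the loop from being optimized away
	return elapsed
}

func main() {
	const n = 1 << 24 // ~16M int64s, far larger than any cache
	data := make([]int64, n)
	order := make([]int32, n)
	for i := range data {
		data[i] = int64(i)
		order[i] = int32(i)
	}

	// Sequential order: mostly cache/prefetcher-friendly.
	seq := traverse(data, order)

	// Random order: most accesses miss cache and each pays tens of
	// nanoseconds of main-memory latency.
	rand.Shuffle(n, func(i, j int) { order[i], order[j] = order[j], order[i] })
	rnd := traverse(data, order)

	fmt.Printf("sequential: %v  random: %v\n", seq, rnd)
}
```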
We remark that single-CPU applications can be multithreaded; that just means there are several points where computation can proceed at any one moment ("threads"). This gives us pseudo-parallelism, by multiplexing the one CPU across the various live threads. We take advantage of this with multicore systems, to run those threads actually in parallel; done properly, multicore systems provide real speedups. But communication of data between the cores is more expensive: a cache line transfer from one CPU to another takes considerably longer than a cache line access by a single core.
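A minimal sketch of "done properly" in Go, assuming a straightforward chunked sum: each goroutine works on its own chunk and writes its own result slot, so the cores avoid communicating until the final combine step:

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// Sum a slice with one goroutine per core. Cross-core data movement is
// limited to the per-worker partial results at the end.
func parallelSum(data []int64) int64 {
	workers := runtime.NumCPU()
	chunk := (len(data) + workers - 1) / workers
	partial := make([]int64, workers) // one slot per worker

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		lo, hi := w*chunk, (w+1)*chunk
		if lo > len(data) {
			lo = len(data)
		}
		if hi > len(data) {
			hi = len(data)
		}
		wg.Add(1)
		go func(w, lo, hi int) {
			defer wg.Done()
			var s int64
			for _, v := range data[lo:hi] {
				s += v
			}
			partial[w] = s // the only write another core will read
		}(w, lo, hi)
	}
	wg.Wait()

	var total int64
	for _, p := range partial {
		total += p
	}
	return total
}

func main() {
	data := make([]int64, 1_000_000)
	for i := range data {
		data[i] = int64(i)
	}
	fmt.Println(parallelSum(data))
}
```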
A key difference between single-process programs and "distributed programs" is that the dataflows between the operations usually take significantly longer.
This is because distributed systems tend to have their CPUs physically far apart, relative to how far light/electricity travels in a nanosecond. We use messaging primitives (one of the Cray supercomputers built these into hardware) to move the data from one part of the distributed system to another.
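In a real system that messaging goes over a network (sockets, MPI, and so on). As a stand-in, here is the same send/receive shape modeled with Go channels between two goroutines, with the caveat that a real network message costs microseconds to milliseconds rather than nanoseconds:

```go
package main

import "fmt"

// Two "nodes" of a toy distributed computation. Node A produces values;
// node B consumes them. The channel models the system's messaging
// primitive; in a real deployment each send would be a network message
// whose latency dwarfs the arithmetic on either end.
func nodeA(out chan<- int) {
	for i := 1; i <= 3; i++ {
		out <- i * i // "send": move data to the other part of the system
	}
	close(out)
}

func nodeB(in <-chan int, done chan<- int) {
	total := 0
	for v := range in { // "receive"
		total += v
	}
	done <- total
}

func main() {
	link := make(chan int) // the communication link between the nodes
	done := make(chan int)
	go nodeA(link)
	go nodeB(link, done)
	fmt.Println("sum of squares:", <-done) // 1 + 4 + 9 = 14
}
```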
But this doesn't change the essence of our computation. We are still composing computing primitives with communications. All distribution does is expose the time cost of those communications as a significant element that we must account for in deciding whether our program is satisfactory.
For me, refactoring must take into account the time cost of operations and communications, and generally must preserve or improve the program's properties with respect to time. (If you don't believe this, talk to somebody in the scientific computing space.)
I note that "lift code block to method" tends to hurt performance (there is now an extra subroutine-call cost to pay), yet we accept it as a valid refactoring.
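A rough illustration of that call cost (machine-dependent, and modern compilers often inline small functions away, which is why this sketch pins the helper with Go's //go:noinline directive):

```go
package main

import (
	"fmt"
	"time"
)

//go:noinline
func addOne(x int64) int64 { return x + 1 } // the "lifted" code block

func main() {
	const n = 100_000_000

	start := time.Now()
	var a int64
	for i := 0; i < n; i++ {
		a = addOne(a) // pays a subroutine call per iteration
	}
	viaCall := time.Since(start)

	start = time.Now()
	var b int64
	for i := 0; i < n; i++ {
		b = b + 1 // the same work with the block left in place
	}
	inPlace := time.Since(start)

	fmt.Println(a == b, "call:", viaCall, "in place:", inPlace)
}
```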
Refactoring distributed systems is, in principle, no different from refactoring single-process software.
If you are doing it with tools, those tools need to be a lot smarter than most refactoring tools, because they have to account for communication costs and for the fact that the distributed system may be written in multiple languages.