This question is similar to "IPC performance: Named Pipe vs Socket" but focuses on anonymous instead of named pipes: what is the performance difference between an anonymous pipe and a TCP connection on different operating systems and with different transfer sizes?
- On Linux, I don't think there's any difference between named and anonymous pipes. A named pipe is just a mechanism to establish a pipe between two processes other than creating it in a common parent that forks/execs them. (Other than setup overhead of course: having to open() via a pathname instead of making a pipe() system call.) – Peter Cordes Jul 12 '21 at 17:48
1 Answer
I benchmarked this using BenchmarkDotNet with the code attached at the end of this post. When the program starts, it initializes BenchmarkDotNet, which in turn invokes the `GlobalSetup()` method once per benchmark case and the two benchmarked methods (`Pipe()` and `Tcp()`) many times.
In `GlobalSetup()`, two child processes are started: one for pipe communication and one for TCP communication. Once the child processes are ready, they wait for a trigger signal carrying the number of values `N` to be transferred (provided via `stdin`) and then start sending data.
When the benchmarked methods (`Pipe()` and `Tcp()`) are invoked, they send the trigger signal with the number of values `N` and wait for the incoming data.
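Because `Stream.Read` may return fewer bytes than requested, both sides of this trigger/response exchange need to loop until the expected number of bytes has arrived. The following is a minimal sketch of such a loop; the class and method names (`StreamHelpers`, `FillBuffer`) are made up for illustration and are not part of the benchmark code.

```csharp
using System;
using System.IO;

public static class StreamHelpers
{
    // Reads from the stream until the whole buffer is filled.
    // Throws if the stream ends before that happens.
    public static void FillBuffer(Stream stream, Span<byte> buffer)
    {
        int offset = 0;
        while (offset < buffer.Length)
        {
            // Read may deliver only part of what was asked for
            int read = stream.Read(buffer.Slice(offset));
            if (read == 0)
                throw new EndOfStreamException("The stream ended before the buffer was filled.");
            offset += read;
        }
    }
}
```

A child process could use it to receive the 4-byte trigger, e.g. `StreamHelpers.FillBuffer(stdin, buffer)` followed by `BitConverter.ToInt32(buffer)` to obtain `N`.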
It turned out to be important to set `TcpClient.NoDelay = true` to disable Nagle's algorithm, which holds back small writes until previously sent data has been acknowledged or enough data for a full segment has accumulated. Interestingly, this affects only the Linux tests with `N = 10000`: with `NoDelay = false` (the default), the average time for this test jumps from ~40 μs to ~40 ms.
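As an illustration of where the flag has to be set, here is a small self-contained sketch that opens a loopback connection and sets `NoDelay` on both endpoints. The class and method names are hypothetical, and the port is assigned by the OS rather than fixed to 55555 as in the benchmark.

```csharp
using System.Net;
using System.Net.Sockets;

public static class NoDelayExample
{
    // Opens a loopback connection on an OS-assigned port and disables
    // Nagle's algorithm on both endpoints. Returns the two NoDelay flags.
    public static (bool Client, bool Server) ConnectWithNoDelay()
    {
        var listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        try
        {
            int port = ((IPEndPoint)listener.LocalEndpoint).Port;

            // connecting side: set the flag before connecting
            using var client = new TcpClient(AddressFamily.InterNetwork) { NoDelay = true };
            client.Connect(IPAddress.Loopback, port);

            // accepting side: set the flag on the accepted connection
            using var server = listener.AcceptTcpClient();
            server.NoDelay = true;

            return (client.NoDelay, server.NoDelay);
        }
        finally
        {
            listener.Stop();
        }
    }
}
```

The option applies per socket, which is why the benchmark sets it both in the child process and on the accepted client in the host.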
Here are the results:
Legend
- N : N = number of int32 values to be transmitted
- Mean : Arithmetic mean of all measurements
- Error : Half of 99.9% confidence interval
- StdDev : Standard deviation of all measurements
- Median : Value separating the higher half of all measurements (50th percentile)
- Ratio : Mean of the ratio distribution ([Current]/[Baseline])
- RatioSD : Standard deviation of the ratio distribution ([Current]/[Baseline])
- 1 μs : 1 microsecond (0.000001 sec)
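Note that the Ratio column is the mean of a ratio distribution, not the quotient of the two Mean columns, so the two can differ (for example, a Ratio of 1.39 next to means of 31.42 μs and 27.33 μs, whose quotient is about 1.15). The sketch below shows the difference on made-up numbers; it assumes the ratios are formed pairwise, which may not match exactly how BenchmarkDotNet builds its distribution, and the names are hypothetical.

```csharp
using System;
using System.Linq;

public static class RatioExample
{
    // Mean of the element-wise ratios current[i] / baseline[i]
    public static double MeanOfRatios(double[] baseline, double[] current) =>
        baseline.Zip(current, (b, c) => c / b).Average();

    // Ratio of the two arithmetic means
    public static double RatioOfMeans(double[] baseline, double[] current) =>
        current.Average() / baseline.Average();
}
```

With the hypothetical samples `baseline = { 10, 20, 40 }` and `current = { 20, 20, 20 }`, the element-wise ratios are 2, 1, and 0.5, so `MeanOfRatios` is above 1 while `RatioOfMeans` is below 1. A noisy baseline can therefore shift the Ratio column noticeably.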
Virtual Machine (Ubuntu 20.04):
BenchmarkDotNet=v0.13.0, OS=ubuntu 20.04
AMD Opteron(tm) Processor 4334, 4 CPU, 4 logical and 4 physical cores
.NET SDK=5.0.102
[Host] : .NET 5.0.2 (5.0.220.61120), X64 RyuJIT
DefaultJob : .NET 5.0.2 (5.0.220.61120), X64 RyuJIT
Method | N | Mean | Error | StdDev | Median | Ratio | RatioSD |
---|---|---|---|---|---|---|---|
Pipe | 1 | 27.33 μs | 1.660 μs | 4.895 μs | 30.75 μs | 1.00 | 0.00 |
Tcp | 1 | 31.42 μs | 0.620 μs | 0.713 μs | 31.24 μs | 1.39 | 0.21 |
Pipe | 100 | 26.72 μs | 1.990 μs | 5.867 μs | 26.63 μs | 1.00 | 0.00 |
Tcp | 100 | 38.95 μs | 2.146 μs | 6.327 μs | 43.34 μs | 1.53 | 0.43 |
Pipe | 10000 | 42.45 μs | 2.804 μs | 8.268 μs | 47.09 μs | 1.00 | 0.00 |
Tcp | 10000 | 46.97 μs | 3.057 μs | 9.013 μs | 53.93 μs | 1.16 | 0.34 |
Pipe | 1000000 | 1,621.87 μs | 116.924 μs | 344.752 μs | 1,893.49 μs | 1.00 | 0.00 |
Tcp | 1000000 | 1,707.25 μs | 8.066 μs | 7.545 μs | 1,707.24 μs | 0.94 | 0.13 |
Pipe | 10000000 | 21,013.86 μs | 166.250 μs | 129.797 μs | 21,007.89 μs | 1.00 | 0.00 |
Tcp | 10000000 | 20,548.03 μs | 407.779 μs | 814.379 μs | 20,713.44 μs | 0.96 | 0.03 |
Notebook (Ubuntu 20.04 on Windows 10 + WSL2):
BenchmarkDotNet=v0.13.0, OS=ubuntu 20.04
Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R), 1 CPU, 8 logical and 4 physical cores
.NET SDK=5.0.301
[Host] : .NET 5.0.7 (5.0.721.25508), X64 RyuJIT
DefaultJob : .NET 5.0.7 (5.0.721.25508), X64 RyuJIT
Method | N | Mean | Error | StdDev | Median | Ratio | RatioSD |
---|---|---|---|---|---|---|---|
Pipe | 1 | 44.66 μs | 0.882 μs | 1.051 μs | 44.45 μs | 1.00 | 0.00 |
Tcp | 1 | 54.42 μs | 0.411 μs | 0.364 μs | 54.34 μs | 1.21 | 0.03 |
Pipe | 100 | 45.07 μs | 0.895 μs | 1.496 μs | 44.63 μs | 1.00 | 0.00 |
Tcp | 100 | 55.27 μs | 0.735 μs | 0.614 μs | 55.17 μs | 1.21 | 0.05 |
Pipe | 10000 | 52.30 μs | 1.018 μs | 1.131 μs | 52.32 μs | 1.00 | 0.00 |
Tcp | 10000 | 55.47 μs | 0.590 μs | 0.523 μs | 55.32 μs | 1.06 | 0.03 |
Pipe | 1000000 | 4,034.01 μs | 77.978 μs | 65.115 μs | 4,035.58 μs | 1.00 | 0.00 |
Tcp | 1000000 | 1,398.62 μs | 24.230 μs | 21.479 μs | 1,395.20 μs | 0.35 | 0.01 |
Pipe | 10000000 | 69,767.35 μs | 4,993.492 μs | 14,723.423 μs | 64,169.46 μs | 1.00 | 0.00 |
Tcp | 10000000 | 24,660.43 μs | 1,746.809 μs | 4,955.406 μs | 23,947.15 μs | 0.38 | 0.14 |
Notebook (Windows 10):
BenchmarkDotNet=v0.13.0, OS=Windows 10.0.19043.1083 (21H1/May2021Update)
Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R), 1 CPU, 8 logical and 4 physical cores
.NET SDK=5.0.203
[Host] : .NET 5.0.6 (5.0.621.22011), X64 RyuJIT
DefaultJob : .NET 5.0.6 (5.0.621.22011), X64 RyuJIT
Method | N | Mean | Error | StdDev | Median | Ratio | RatioSD |
---|---|---|---|---|---|---|---|
Pipe | 1 | 22.60 μs | 0.441 μs | 1.013 μs | 22.21 μs | 1.00 | 0.00 |
Tcp | 1 | 27.42 μs | 0.535 μs | 1.019 μs | 27.51 μs | 1.21 | 0.08 |
Pipe | 100 | 21.93 μs | 0.146 μs | 0.122 μs | 21.94 μs | 1.00 | 0.00 |
Tcp | 100 | 26.06 μs | 0.506 μs | 0.474 μs | 25.99 μs | 1.19 | 0.02 |
Pipe | 10000 | 29.59 μs | 0.126 μs | 0.099 μs | 29.58 μs | 1.00 | 0.00 |
Tcp | 10000 | 33.25 μs | 0.655 μs | 0.919 μs | 33.01 μs | 1.14 | 0.04 |
Pipe | 1000000 | 1,675.35 μs | 32.862 μs | 43.870 μs | 1,685.37 μs | 1.00 | 0.00 |
Tcp | 1000000 | 2,553.07 μs | 58.100 μs | 167.631 μs | 2,505.34 μs | 1.63 | 0.10 |
Pipe | 10000000 | 23,421.61 μs | 141.337 μs | 132.207 μs | 23,380.19 μs | 1.00 | 0.00 |
Tcp | 10000000 | 28,182.91 μs | 375.644 μs | 313.679 μs | 28,114.22 μs | 1.20 | 0.01 |
Benchmark code:
Benchmark.csproj
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net5.0</TargetFramework>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="BenchmarkDotNet" Version="0.13.0" />
  </ItemGroup>
</Project>
Program.cs
using BenchmarkDotNet.Running;
using System;
using System.IO;
using System.Linq;
using System.Net.Sockets;
using System.Runtime.InteropServices;

namespace Benchmark
{
    public class Program
    {
        public const int MIN_LENGTH = 1;
        public const int MAX_LENGTH = 10_000_000;

        static void Main(string[] args)
        {
            if (!args.Any())
            {
                // no arguments: this is the host process, run the benchmarks
                BenchmarkRunner.Run<PipeVsTcp>();
            }
            else
            {
                // child process: prepare the payload (0, 1, 2, ... as int32) once
                var data = MemoryMarshal
                    .AsBytes<int>(
                        Enumerable
                            .Range(0, MAX_LENGTH)
                            .ToArray())
                    .ToArray();

                // the trigger always arrives via stdin
                using var readStream = Console.OpenStandardInput();

                if (args[0] == "pipe")
                {
                    using var pipeStream = Console.OpenStandardOutput();
                    RunChildProcess(readStream, pipeStream, data);
                }
                else if (args[0] == "tcp")
                {
                    using var tcpClient = new TcpClient()
                    {
                        NoDelay = true
                    };
                    tcpClient.Connect("localhost", 55555);
                    using var tcpStream = tcpClient.GetStream();
                    RunChildProcess(readStream, tcpStream, data);
                }
                else
                {
                    throw new Exception("Invalid argument (args[0]).");
                }
            }
        }

        static void RunChildProcess(Stream readStream, Stream writeStream, byte[] data)
        {
            Span<byte> buffer = stackalloc byte[sizeof(int)];

            while (true)
            {
                // wait for start signal: a 4-byte int32 holding N
                // (Read may return fewer bytes than requested, so loop)
                var received = 0;
                while (received < buffer.Length)
                {
                    var length = readStream.Read(buffer.Slice(received));
                    if (length == 0)
                        throw new Exception("The host process terminated early.");
                    received += length;
                }

                var N = BitConverter.ToInt32(buffer);

                // write
                writeStream.Write(data, 0, N * sizeof(int));
            }
        }
    }
}
PipeVsTcp.cs
using BenchmarkDotNet.Attributes;
using System;
using System.Buffers;
using System.Diagnostics;
using System.IO;
using System.Net;
using System.Net.Sockets;
using System.Reflection;
using System.Runtime.InteropServices;

namespace Benchmark
{
    [MemoryDiagnoser]
    public class PipeVsTcp
    {
        private Process _pipeProcess;
        private Process _tcpProcess;
        private TcpClient _tcpClient;

        [GlobalSetup]
        public void GlobalSetup()
        {
            // assembly path
            // under Linux the Location property was an empty string
            // (reason unknown), so I replaced it with a hard-coded
            // string there
            var assemblyPath = Assembly.GetExecutingAssembly().Location;

            // run pipe process (path quoted in case it contains spaces)
            var pipePsi = new ProcessStartInfo("dotnet")
            {
                Arguments = $"\"{assemblyPath}\" pipe",
                UseShellExecute = false,
                RedirectStandardInput = true,
                RedirectStandardOutput = true,
                RedirectStandardError = true
            };

            _pipeProcess = new Process() { StartInfo = pipePsi };
            _pipeProcess.Start();

            // start listening before launching the tcp process so that
            // its Connect() call cannot race ahead of the listener
            var tcpListener = new TcpListener(IPAddress.Parse("127.0.0.1"), 55555);
            tcpListener.Start();

            // run tcp process
            var tcpPsi = new ProcessStartInfo("dotnet")
            {
                Arguments = $"\"{assemblyPath}\" tcp",
                UseShellExecute = false,
                RedirectStandardInput = true,
                RedirectStandardOutput = true,
                RedirectStandardError = true
            };

            _tcpProcess = new Process() { StartInfo = tcpPsi };
            _tcpProcess.Start();

            _tcpClient = tcpListener.AcceptTcpClient();
            _tcpClient.NoDelay = true;

            // only one connection is needed
            tcpListener.Stop();
        }

        [GlobalCleanup]
        public void GlobalCleanup()
        {
            _tcpClient?.Dispose();
            _pipeProcess?.Kill();
            _tcpProcess?.Kill();
        }

        [Params(Program.MIN_LENGTH, 100, 10_000, 1_000_000, Program.MAX_LENGTH)]
        public int N;

        [Benchmark(Baseline = true)]
        public Memory<byte> Pipe()
        {
            var pipeReadStream = _pipeProcess.StandardOutput.BaseStream;
            var pipeWriteStream = _pipeProcess.StandardInput.BaseStream;

            // the buffer goes back to the pool when this method returns,
            // so the returned Memory<byte> must not be used by callers
            using var owner = MemoryPool<byte>.Shared.Rent(N * sizeof(int));

            return ReadFromStream(pipeReadStream, pipeWriteStream, owner.Memory);
        }

        [Benchmark()]
        public Memory<byte> Tcp()
        {
            var tcpReadStream = _tcpClient.GetStream();
            // the trigger still goes through the child's stdin
            var triggerStream = _tcpProcess.StandardInput.BaseStream;

            using var owner = MemoryPool<byte>.Shared.Rent(N * sizeof(int));

            return ReadFromStream(tcpReadStream, triggerStream, owner.Memory);
        }

        private Memory<byte> ReadFromStream(Stream readStream, Stream writeStream, Memory<byte> buffer)
        {
            // trigger
            var Nbuffer = BitConverter.GetBytes(N);
            writeStream.Write(Nbuffer);
            writeStream.Flush();

            // receive data
            var remaining = N * sizeof(int);
            var offset = 0;

            while (remaining > 0)
            {
                var span = buffer.Slice(offset, remaining).Span;
                var readBytes = readStream.Read(span);

                if (readBytes == 0)
                    throw new Exception("The child process terminated early.");

                remaining -= readBytes;
                offset += readBytes;
            }

            var intBuffer = MemoryMarshal.Cast<byte, int>(buffer.Span);

            // validate first 3 values
            for (int i = 0; i < Math.Min(N, 3); i++)
            {
                if (intBuffer[i] != i)
                    throw new Exception($"Invalid data received. Data is {intBuffer[i]}, index = {i}.");
            }

            return buffer;
        }
    }
}
