14

I need to read data quickly from the console's standard input stream. The input consists of 100,000 rows of 20 characters each (2 million characters); the user pastes it from the clipboard. My procedure takes about 3 minutes (far too slow; the target is 10 seconds). It looks like this:

var inputData = new string[100000]; // 100.000 rows with 20 chars
for (int i = 0; i < 100000; i++) // Cycle duration is about 3 minutes...
{
    inputData[i] = Console.ReadLine();
}
// some processing...

What I have tried:

  1. Directly: Console.Read, Console.ReadKey - the same result

  2. Console.In: Read(), ReadLine(), ReadAsync(), ReadLineAsync(), ReadBlock() (with various block sizes), ReadBlockAsync(), ReadToEnd(), ReadToEndAsync() - the same result

  3. new StreamReader(Console.OpenStandardInput(buffer)) with various buffer and block sizes - the same result

  4. Hiding the console window when reading starts and showing it again when reading has finished - 10% faster

  5. Reading the input data from a file - this works perfectly and is fast, but I need to read from __ConsoleStream.

I noticed that while the input is being read, the conhost.exe process is using the CPU heavily.

How can I speed up the reading of input?

Update:

  1. Increasing/decreasing Console.BufferHeight and Console.BufferWidth has no effect

  2. ReadFile (MSDN) is also slow. But I noticed an interesting fact:

    ReadFile(handle, buffer, bufferSize, out bytesCount, null);
    // bufferSize may be very big, but buffer obtains no more than one row (with \r\n).
    // So it seems that data is passed into the input stream row by row, synchronously.
    
Maradik
  • `inputData = Console.ReadLine();` won't compile and how exactly does the Clipboard fit in? – H H Oct 26 '15 at 09:33
  • Reading 20 MB of text should take much less than a second. – H H Oct 26 '15 at 09:33
  • Why not read the data directly from the clipboard? http://stackoverflow.com/questions/3840080/how-to-retrieve-data-from-clipboard-as-system-string – Alex H Oct 26 '15 at 09:38
  • I wonder if playing with [BufferHeight](http://stackoverflow.com/a/1370360/11683) changes anything. – GSerg Oct 26 '15 at 09:51
  • @HenkHolterman, sorry, it should be `inputData[i] = Console.ReadLine();` – Maradik Oct 26 '15 at 10:41
  • @GSerg, BufferHeight and BufferWidth have no effect – Maradik Oct 26 '15 at 10:42
  • @AlexH, because the task is to get the data from the input stream (keyboard). – Maradik Oct 26 '15 at 10:53
  • Is it just as slow in a release build? – GSerg Oct 26 '15 at 11:29
  • @GSerg It is slow in debug and release builds. – Maradik Oct 26 '15 at 11:41
  • Are you sure the bottleneck is in Console.Read? In general I find that if changing one line of code multiple times yields zero change to the run time, it is because I am looking in the wrong place. – Aron Oct 26 '15 at 12:04
  • @Aron, yes, I'm sure, because I compared `DateTime.Now.Ticks` before and after calling `Read()`. – Maradik Oct 26 '15 at 13:01
  • As I said in my post below, Read() and ReadLine() echo the pasted text, and the act of writing 100,000 characters dooms the process to failure (Console.WriteLine calls are time killers). Using Console.ReadKey(true) prevents this echo effect and speeds things up dramatically without any other modifications. – Shannon Holsinger Aug 28 '16 at 11:53

5 Answers

2

Your main slowdown here is that Console.Read() and Console.ReadLine() both "echo" your text on the screen, and the process of writing the text slows you WAY down. What you want to use, then, is Console.ReadKey(true), which does not echo the pasted text. Here's an example that reads 100,000 characters in about 1 second. It may need some modification for your purposes, but I hope it's enough to give you the picture. Cheers!

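// Requires: using System; using System.Collections.Generic; using System.Text;
public void begin()
{
    var lines = new List<string>();
    var line = new StringBuilder(); // avoids allocating a new string per character
    Console.WriteLine("paste text to begin");
    int charCount = 0;
    DateTime beg = DateTime.Now;
    do
    {
        // ReadKey(true) intercepts the key, so the pasted text is not echoed.
        ConsoleKeyInfo key = Console.ReadKey(true);
        if (key.Key == ConsoleKey.Enter)
        {
            // End of a row: store it and start a new one.
            lines.Add(line.ToString());
            line.Clear();
        }
        else
        {
            line.Append(key.KeyChar);
            charCount++;
        }
    } while (charCount < 100000);
    Console.WriteLine("100,000 characters (" + lines.Count.ToString("N0") + " lines) in " + DateTime.Now.Subtract(beg).TotalMilliseconds.ToString("N0") + " milliseconds");
}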

I'm pasting a 5 MB file with long lines of text on a machine with all cores active doing other things (99% CPU load) and getting 100,000 characters in 1,600 lines in 1.87 seconds.

Shannon Holsinger
2

In your scenario a lot of time is wasted on attempts to display the characters being inserted. You can disable this echoing of input in Windows (I don't know how to do that on other platforms).

Unfortunately, the necessary API is not exposed by .NET (at least in 4.6.1), so you need the following native methods and constants:

using System;
using System.Runtime.InteropServices;

internal class NativeMethods
{
    [DllImport("kernel32.dll", SetLastError = true)]
    internal static extern bool SetConsoleMode(IntPtr hConsoleHandle, int mode);

    [DllImport("kernel32.dll", SetLastError = true)]
    internal static extern bool GetConsoleMode(IntPtr hConsoleHandle, out int mode);

    [DllImport("kernel32.dll", SetLastError = true)]
    internal static extern IntPtr GetStdHandle(int nStdHandle);

    internal const int STD_INPUT_HANDLE = -10;
    internal const int ENABLE_ECHO_INPUT = 0x0004;
}

and use them in the following way before receiving data from the clipboard:

var handle = NativeMethods.GetStdHandle(NativeMethods.STD_INPUT_HANDLE);
int mode; 
NativeMethods.GetConsoleMode(handle, out mode);
mode &= ~NativeMethods.ENABLE_ECHO_INPUT; // disable flag
NativeMethods.SetConsoleMode(handle, mode);

Don't forget to restore the console mode flags when you have finished receiving the clipboard data. I hope this reduces your performance problem. More information about console modes can be found in the GetConsoleMode documentation.
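The restore step can be made harder to forget with try/finally. The following is only a minimal sketch of that idea, Windows-only and not tested against the asker's workload; the class name `EchoOffReader` and the method `ReadLinesWithoutEcho` are hypothetical names introduced for illustration, and it assumes the pasted rows are read with Console.ReadLine as in the question.

```csharp
using System;
using System.Runtime.InteropServices;

internal static class EchoOffReader // hypothetical name, for illustration only
{
    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool SetConsoleMode(IntPtr hConsoleHandle, int mode);

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool GetConsoleMode(IntPtr hConsoleHandle, out int mode);

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern IntPtr GetStdHandle(int nStdHandle);

    private const int STD_INPUT_HANDLE = -10;
    private const int ENABLE_ECHO_INPUT = 0x0004;

    // Reads rowCount lines with echo disabled, then restores the original mode.
    internal static string[] ReadLinesWithoutEcho(int rowCount)
    {
        IntPtr handle = GetStdHandle(STD_INPUT_HANDLE);
        int originalMode;
        if (!GetConsoleMode(handle, out originalMode))
        {
            // Assumption: failure here means stdin is not a console
            // (for example, it is redirected), so there is no echo to disable.
            return ReadLines(rowCount);
        }

        SetConsoleMode(handle, originalMode & ~ENABLE_ECHO_INPUT);
        try
        {
            return ReadLines(rowCount);
        }
        finally
        {
            // Runs even if reading throws, so the console is never left without echo.
            SetConsoleMode(handle, originalMode);
        }
    }

    private static string[] ReadLines(int rowCount)
    {
        var lines = new string[rowCount];
        for (int i = 0; i < rowCount; i++)
        {
            lines[i] = Console.ReadLine();
        }
        return lines;
    }

    private static void Main()
    {
        string[] inputData = ReadLinesWithoutEcho(100000);
        Console.WriteLine("Read " + inputData.Length + " rows");
    }
}
```

Saving the whole original mode value, rather than setting the echo flag again afterwards, restores the console to whatever state it was in before the read, even if echo was already off.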

Further attempts to optimize can include:

  • Rewrite the console reading code without locks (.NET's implementation uses them) and ensure that no other threads work with the console at that moment. This is quite an expensive task.
  • Try to find a way to increase the stdin buffer size, though I'm not sure that is possible at all.
  • Don't forget to test a release build without the debugger attached %)
Dlinny_Lag
0

Use the native WinAPI functions:

  1. Get the input handle with GetStdHandle (MSDN)
  2. Read 22 bytes (20 characters plus the \r\n line ending) with ReadFile (MSDN) instead of ReadLine

Examples of WinAPI use in C#: http://www.pinvoke.net/

Evgeniy Mironov
  • One last idea: read all the input with a single ReadFile call into one memory buffer, and don't use a string array; use one memory buffer instead (performance may be suffering because creating the string objects takes a long time). – Evgeniy Mironov Oct 26 '15 at 18:45
0

It doesn't look like you need to preserve order. If that is so, use Parallel in combination with the Partitioner class, since you're executing small tasks:

See "When to use Partitioner class?" for an example.

This means you have to change the data type to ConcurrentBag or ConcurrentDictionary.
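As a rough illustration of the Partitioner idea, here is a minimal sketch. It parallelizes only the processing step, once the rows are already in memory, and does not attempt to parallelize the console read itself. `ChunkedProcessing`, `ProcessAll` and `ProcessRow` are hypothetical names, and `ProcessRow` merely stands in for the question's "some processing".

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

internal static class ChunkedProcessing // hypothetical name, for illustration only
{
    // Hypothetical per-row work; stands in for "some processing" in the question.
    private static int ProcessRow(string row)
    {
        return row.Length;
    }

    internal static int[] ProcessAll(string[] rows)
    {
        var results = new int[rows.Length];
        if (rows.Length == 0)
        {
            return results; // Partitioner.Create rejects an empty range
        }

        // Range partitioning hands each worker a contiguous [from, to) chunk,
        // which keeps the per-item overhead low when each task is tiny.
        Parallel.ForEach(Partitioner.Create(0, rows.Length), range =>
        {
            for (int i = range.Item1; i < range.Item2; i++)
            {
                results[i] = ProcessRow(rows[i]);
            }
        });
        return results;
    }

    private static void Main()
    {
        var rows = new string[100000];
        for (int i = 0; i < rows.Length; i++)
        {
            rows[i] = new string('x', 20);
        }
        int[] lengths = ProcessAll(rows);
        Console.WriteLine("Processed " + lengths.Length + " rows");
    }
}
```

Because each worker writes only to its own index range of a preallocated array, this particular sketch keeps the original order and does not need a ConcurrentBag; a concurrent collection would be the alternative when results cannot be addressed by index.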

CthenB
-2

Why not use Parallel.For to multi-thread the read from the console? If that doesn't help, try to pull the text straight from the clipboard using

https://msdn.microsoft.com/en-us/library/kz40084e(v=vs.110).aspx

Astro