-3

I have a C++ application which reads bytes of data in the form of messages. The messages are stored in a text file, separated by the newline character.

What would be the fastest way to pipe these bytes to the C++ app/accept the bytes from the pipeline within the C++ app?

The answer I am ideally looking for will show how the C++ application reads the piped input (my guess: via std::cin).

user997112
  • Constructive criticisms (rather than pointless down-voting) will be much appreciated.... – user997112 Feb 04 '14 at 19:36
  • I don't think I got it. You want to read the file (with cat or something) then pipe the result onto your app? – webuster Feb 04 '14 at 19:39
  • I/O performance does not matter that much (usually disk is the bottleneck). And code first a correct program, then profile and benchmark it to understand what should be optimized (asymptotic complexity is often what matters a lot). – Basile Starynkevitch Feb 04 '14 at 19:47
  • @BasileStarynkevitch no- this is not a "only optimize if needs be". I need to know the fastest mechanism for piping file byte contents in to a C++ application (appreciate you taking the time to reply though). – user997112 Feb 04 '14 at 20:27
  • @webuster A file containing data will be piped in to a C++ application which will process the data. I am under the impression there is one way of sending the data to the app (piping) but from within the app there are multiple ways of handling the pipe? – user997112 Feb 04 '14 at 20:28
  • BTW, "*piping* file contents" suggests [pipe(7)](http://man7.org/linux/man-pages/man7/pipe.7.html) which is *not* what you want. – Basile Starynkevitch Feb 04 '14 at 21:47

2 Answers

2

You can easily achieve what you want by using std::getline, which extracts characters from a stream until a delimiter character is found (a newline by default).

Using it on the standard input stream would lead to something close to the following:

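#include <iostream>
#include <string>

int main() {
  std::string data;

  // std::getline returns the stream, which tests false at EOF or on error
  while (std::getline(std::cin, data)) {
    // each message is stored in data (without the trailing newline)
    std::cout << data << '\n';  // '\n' avoids the flush that std::endl forces per line
  }

  return 0;
}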

One way to test this sample:

cat my_file | ./my_sample [...]
Halim Qarroum
  • Ok- I know about this approach but is it THE fastest way?! getline() uses strings- which have constructors, which is not good for performance..... is there any way to do this using bytes and minimal copying? – user997112 Feb 04 '14 at 20:26
  • Well, it depends on what you're planning to do exactly. The performance delivered by the above constructs should be enough for most use cases. Now if you'd like to work with a pointer to `char` directly as in `C`, you can call the `read` system call directly (e.g. `read(0, buffer, size_to_read)`) and check for the newline character yourself for each message. I'm not sure this would be far more efficient than using the standard library, which does it for you. – Halim Qarroum Feb 04 '14 at 21:02
  • Thanks for the reply- I am fairly certain it would be faster because as soon as strings are involved you start getting memory allocation/deallocation happening everywhere. – user997112 Feb 04 '14 at 23:12
0

If performance matters that much (which is really unlikely), you could avoid the small overhead of C++ streams and use Linux- or POSIX-specific system calls directly. For a plain file in a file system (i.e. not a pipe), these include mmap(2) or read(2), possibly combined with madvise(2), and even (in some other thread) the Linux-specific readahead(2). See also this answer.
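As a rough illustration of the mmap(2) route, here is a minimal sketch, assuming the messages live in a regular file whose path is known. The helper name `count_messages_mmap` is hypothetical, and the code only counts messages where a real application would process each `[p, newline)` range in place.

```cpp
#include <fcntl.h>      // open
#include <sys/mman.h>   // mmap, madvise, munmap
#include <sys/stat.h>   // fstat
#include <unistd.h>     // close
#include <cstddef>
#include <cstring>      // memchr

// Hypothetical helper: maps a regular file read-only and counts its
// newline-separated messages without copying them.
// Returns -1 if the file cannot be opened or mapped.
long count_messages_mmap(const char* path) {
    int fd = ::open(path, O_RDONLY);
    if (fd < 0) return -1;

    struct stat st;
    if (::fstat(fd, &st) != 0 || !S_ISREG(st.st_mode)) {
        ::close(fd);
        return -1;  // not a plain file (e.g. a pipe): mmap is not applicable
    }
    if (st.st_size == 0) {
        ::close(fd);
        return 0;   // mmap rejects a zero-length mapping
    }

    std::size_t size = static_cast<std::size_t>(st.st_size);
    void* map = ::mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
    ::close(fd);    // the mapping stays valid after the descriptor is closed
    if (map == MAP_FAILED) return -1;

    // Advisory hint that the file is scanned front to back; errors are ignored.
    ::madvise(map, size, MADV_SEQUENTIAL);

    const char* p = static_cast<const char*>(map);
    const char* end = p + size;
    long count = 0;
    while (p < end) {
        const void* hit = std::memchr(p, '\n', static_cast<std::size_t>(end - p));
        ++count;            // one message: [p, hit), or [p, end) if no newline remains
        if (!hit) break;
        p = static_cast<const char*>(hit) + 1;
    }

    ::munmap(map, size);
    return count;
}
```

Whether this beats a plain read(2) loop depends on the file size and the system, so it is worth measuring both.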

If you use read(2) (e.g. because you are reading from some pipe(7) rather than a disk file), you probably want quite big blocks or buffers, e.g. reading at least 64 kilobytes at once.
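A minimal sketch of such a read(2) loop follows, assuming newline-separated messages arriving on a file descriptor such as standard input. The name `for_each_message` and its handler signature are made up for illustration. The handler receives a pointer and a length into the buffer, so no std::string is constructed per message; the pointer is only valid during the call.

```cpp
#include <unistd.h>   // read
#include <cerrno>
#include <cstddef>
#include <cstring>    // memchr, memmove
#include <vector>

// Hypothetical helper: reads fd in large blocks and calls
// handler(const char* msg, std::size_t len) once per newline-separated message.
// Returns the number of messages, or -1 on a read error.
template <typename Handler>
long for_each_message(int fd, Handler handler, std::size_t buf_size = 64 * 1024) {
    std::vector<char> buf(buf_size);
    std::size_t used = 0;  // bytes of an incomplete message kept at the front
    long count = 0;

    for (;;) {
        // A single message longer than the buffer: grow it.
        if (used == buf.size()) buf.resize(buf.size() * 2);

        ssize_t n = ::read(fd, buf.data() + used, buf.size() - used);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted by a signal: retry
            return -1;
        }
        if (n == 0) break;                 // end of input
        used += static_cast<std::size_t>(n);

        char* start = buf.data();
        char* end = start + used;
        while (void* hit = std::memchr(start, '\n',
                                       static_cast<std::size_t>(end - start))) {
            char* nl = static_cast<char*>(hit);
            handler(start, static_cast<std::size_t>(nl - start));
            ++count;
            start = nl + 1;
        }

        // Keep the partial trailing message for the next read.
        used = static_cast<std::size_t>(end - start);
        if (used > 0 && start != buf.data()) std::memmove(buf.data(), start, used);
    }

    if (used > 0) {  // last message without a terminating newline
        handler(buf.data(), used);
        ++count;
    }
    return count;
}
```

Calling `for_each_message(STDIN_FILENO, handler)` from `main` would then serve both `./my_sample < my_file` and `cat my_file | ./my_sample`. The only copying left is the memmove of a message that straddles two reads.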

However, the hardware (most likely the disk) is probably the bottleneck. Consider buying more RAM (to increase the filesystem cache) and an SSD. See also linuxatemyram ...

You may want to use several (e.g. 2 to 6) threads (with pthreads) and/or asynchronous I/O; see aio(7). I don't think it is worth the effort. If you do, avoid race conditions through proper synchronization.

Study also the architecture and implementation of high-performance SMTP free software mail transfer agents (MTA) like exim or postfix... They are dealing quite well with an issue similar to yours.

You will need to benchmark your application to tune parameters.

Basile Starynkevitch