MY UNDERSTANDING OF STREAMS:

From what I can tell, the OS provides preconnected input, output, and error streams to all programs; they just have to be accessed/called. Within C, we have stdio.h which is a header file containing various functions and variables working with these streams; and establishes connections between the streams and our program. The streams were always there, but the header file stdio.h gives our program the functions it needs to use them. Is all of this correct? How were these streams established before, when we had to establish them manually?

WHAT CONFUSES ME:

1 from wikipedia - "One of Unix's several groundbreaking advances was abstract devices, which removed the need for a program to know or care what kind of devices it was communicating with" "In most operating systems predating Unix, programs had to explicitly connect to the appropriate input and output devices...Older operating systems forced upon the programmer a record structure and frequently non-orthogonal data semantics and device control."

What does the above entail?

2 also from wikipedia - "Another Unix breakthrough was to automatically associate input and output to terminal keyboard and terminal display, respectively, by default — the program (and programmer) did absolutely nothing to establish input and output for a typical input-process-output program (unless it chose a different paradigm). In contrast, previous operating systems usually required some—often complex—job control language to establish connections, or the equivalent burden had to be orchestrated by the program."

Could you expand on this?

TLDR- Is my understanding of streams correct? Can you explain more in depth how streams used to be established before Unix? And finally What resources should someone with these questions go through to further their understandings of streams?

John
  • Unfortunately, this is too broad a question. You could start with [IBM’s JCL (Job Control Language)](https://en.wikipedia.org/wiki/Job_Control_Language). But some of us do not want to relive the dark ages. Specifying your own block sizes, disk cylinders,… shudder. I doubt a person can really appreciate (suffer) these things without having had to live them. Requesting the operator mount a certain tape from the archive… – Eric Postpischil Jan 16 '18 at 01:25
  • Which Wikipedia page are you referring to exactly? Automatic I/O association existed before Unix, and it looks like there is some confusion between what is handled at the OS level, what is done automatically by a language, and what a programmer has to do... Other OSes not derived from Unix also provided default associations in the 70s (T1600 and its successor SOLAR 16, CDC NOS, Multics, ...) – Serge Ballesta Jan 16 '18 at 07:19
  • The Wikipedia page I'm referencing is https://en.wikipedia.org/wiki/Standard_streams – John Jan 17 '18 at 00:38

1 Answer

I'm not quite sure what you are really asking.

If you are really interested in STREAMS, their type, history, etc., I recommend this article from Linux Journal: LiS: Linux STREAMS.

we have stdio.h which is a header file containing various functions and variables working with these streams; and establishes connections between the streams and our program.

Not quite. stdio.h is a header file that declares structs (like FILE), functions (like fprintf), global variables (like stdin), etc.; see man 0p stdio.h for the full documentation of this header. Opening, reading, writing, and closing are all done by the program, not by the header file. The functions themselves are implemented in your standard C library, which on Linux is usually glibc. When you build a program, you normally link it against glibc, which provides all of this functionality and more.

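Take a look at this simple program (it passes stdin to fprintf where stdout was surely intended; the disassembly below was generated from the code as written, which is why it references stdin):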

#include <stdio.h>

int main(void)
{
    fprintf(stdin, "Hello world\n");
    return 0;
}

In C, the entry point of a program is the main function; however, it is not the first code to run when the OS starts a new program. The objdump output shows:

$ objdump -S printf

printf:     file format elf64-x86-64


Disassembly of section .init:

0000000000000550 <_init>:
 550:   48 83 ec 08             sub    $0x8,%rsp
 554:   48 8b 05 85 0a 20 00    mov    0x200a85(%rip),%rax        # 200fe0 <__gmon_start__>
 55b:   48 85 c0                test   %rax,%rax
 55e:   74 02                   je     562 <_init+0x12>
 560:   ff d0                   callq  *%rax
 562:   48 83 c4 08             add    $0x8,%rsp
 566:   c3                      retq   

Disassembly of section .plt:

0000000000000570 <.plt>:
 570:   ff 35 92 0a 20 00       pushq  0x200a92(%rip)        # 201008 <_GLOBAL_OFFSET_TABLE_+0x8>
 576:   ff 25 94 0a 20 00       jmpq   *0x200a94(%rip)        # 201010 <_GLOBAL_OFFSET_TABLE_+0x10>
 57c:   0f 1f 40 00             nopl   0x0(%rax)

0000000000000580 <fwrite@plt>:
 580:   ff 25 92 0a 20 00       jmpq   *0x200a92(%rip)        # 201018 <fwrite@GLIBC_2.2.5>
 586:   68 00 00 00 00          pushq  $0x0
 58b:   e9 e0 ff ff ff          jmpq   570 <.plt>

Disassembly of section .plt.got:

0000000000000590 <__cxa_finalize@plt>:
 590:   ff 25 62 0a 20 00       jmpq   *0x200a62(%rip)        # 200ff8 <__cxa_finalize@GLIBC_2.2.5>
 596:   66 90                   xchg   %ax,%ax

Disassembly of section .text:

00000000000005a0 <_start>:
 5a0:   31 ed                   xor    %ebp,%ebp
 5a2:   49 89 d1                mov    %rdx,%r9
 5a5:   5e                      pop    %rsi
 5a6:   48 89 e2                mov    %rsp,%rdx
 5a9:   48 83 e4 f0             and    $0xfffffffffffffff0,%rsp
 5ad:   50                      push   %rax
 5ae:   54                      push   %rsp
 5af:   4c 8d 05 ba 01 00 00    lea    0x1ba(%rip),%r8        # 770 <__libc_csu_fini>
 5b6:   48 8d 0d 43 01 00 00    lea    0x143(%rip),%rcx        # 700 <__libc_csu_init>
 5bd:   48 8d 3d 0c 01 00 00    lea    0x10c(%rip),%rdi        # 6d0 <main>
 5c4:   ff 15 0e 0a 20 00       callq  *0x200a0e(%rip)        # 200fd8 <__libc_start_main@GLIBC_2.2.5>
 5ca:   f4                      hlt    
 5cb:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
 5d0:   48 8d 3d 59 0a 20 00    lea    0x200a59(%rip),%rdi        # 201030 <stdin@@GLIBC_2.2.5>
 5d7:   48 8d 05 59 0a 20 00    lea    0x200a59(%rip),%rax        # 201037 <__TMC_END__+0x7>
 5de:   55                      push   %rbp
 5df:   48 29 f8                sub    %rdi,%rax
 5e2:   48 89 e5                mov    %rsp,%rbp
 5e5:   48 83 f8 0e             cmp    $0xe,%rax
 5e9:   76 15                   jbe    600 <_start+0x60>
 5eb:   48 8b 05 de 09 20 00    mov    0x2009de(%rip),%rax        # 200fd0 <_ITM_deregisterTMCloneTable>
 5f2:   48 85 c0                test   %rax,%rax
 5f5:   74 09                   je     600 <_start+0x60>
 5f7:   5d                      pop    %rbp
 5f8:   ff e0                   jmpq   *%rax
 5fa:   66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)
 600:   5d                      pop    %rbp
 601:   c3                      retq   
 602:   0f 1f 40 00             nopl   0x0(%rax)
 606:   66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
 60d:   00 00 00 
 610:   48 8d 3d 19 0a 20 00    lea    0x200a19(%rip),%rdi        # 201030 <stdin@@GLIBC_2.2.5>
 617:   48 8d 35 12 0a 20 00    lea    0x200a12(%rip),%rsi        # 201030 <stdin@@GLIBC_2.2.5>
 61e:   55                      push   %rbp
 61f:   48 29 fe                sub    %rdi,%rsi
 622:   48 89 e5                mov    %rsp,%rbp
 625:   48 c1 fe 03             sar    $0x3,%rsi
 629:   48 89 f0                mov    %rsi,%rax
 62c:   48 c1 e8 3f             shr    $0x3f,%rax
 630:   48 01 c6                add    %rax,%rsi
 633:   48 d1 fe                sar    %rsi
 636:   74 18                   je     650 <_start+0xb0>
 638:   48 8b 05 b1 09 20 00    mov    0x2009b1(%rip),%rax        # 200ff0 <_ITM_registerTMCloneTable>
 63f:   48 85 c0                test   %rax,%rax
 642:   74 0c                   je     650 <_start+0xb0>
 644:   5d                      pop    %rbp
 645:   ff e0                   jmpq   *%rax
 647:   66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
 64e:   00 00 
 650:   5d                      pop    %rbp
 651:   c3                      retq   
 652:   0f 1f 40 00             nopl   0x0(%rax)
 656:   66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
 65d:   00 00 00 
 660:   80 3d d1 09 20 00 00    cmpb   $0x0,0x2009d1(%rip)        # 201038 <__TMC_END__+0x8>
 667:   75 27                   jne    690 <_start+0xf0>
 669:   48 83 3d 87 09 20 00    cmpq   $0x0,0x200987(%rip)        # 200ff8 <__cxa_finalize@GLIBC_2.2.5>
 670:   00 
 671:   55                      push   %rbp
 672:   48 89 e5                mov    %rsp,%rbp
 675:   74 0c                   je     683 <_start+0xe3>
 677:   48 8b 3d aa 09 20 00    mov    0x2009aa(%rip),%rdi        # 201028 <__dso_handle>
 67e:   e8 0d ff ff ff          callq  590 <__cxa_finalize@plt>
 683:   e8 48 ff ff ff          callq  5d0 <_start+0x30>
 688:   5d                      pop    %rbp
 689:   c6 05 a8 09 20 00 01    movb   $0x1,0x2009a8(%rip)        # 201038 <__TMC_END__+0x8>
 690:   f3 c3                   repz retq 
 692:   0f 1f 40 00             nopl   0x0(%rax)
 696:   66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
 69d:   00 00 00 
 6a0:   48 8d 3d 41 07 20 00    lea    0x200741(%rip),%rdi        # 200de8 <__init_array_end+0x8>
 6a7:   48 83 3f 00             cmpq   $0x0,(%rdi)
 6ab:   75 0b                   jne    6b8 <_start+0x118>
 6ad:   e9 5e ff ff ff          jmpq   610 <_start+0x70>
 6b2:   66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)
 6b8:   48 8b 05 29 09 20 00    mov    0x200929(%rip),%rax        # 200fe8 <_Jv_RegisterClasses>
 6bf:   48 85 c0                test   %rax,%rax
 6c2:   74 e9                   je     6ad <_start+0x10d>
 6c4:   55                      push   %rbp
 6c5:   48 89 e5                mov    %rsp,%rbp
 6c8:   ff d0                   callq  *%rax
 6ca:   5d                      pop    %rbp
 6cb:   e9 40 ff ff ff          jmpq   610 <_start+0x70>

00000000000006d0 <main>:
#include <stdio.h>

int main(void)
{
 6d0:   55                      push   %rbp
 6d1:   48 89 e5                mov    %rsp,%rbp
    fprintf(stdin, "Hello world\n");
 6d4:   48 8b 05 55 09 20 00    mov    0x200955(%rip),%rax        # 201030 <stdin@@GLIBC_2.2.5>
 6db:   48 89 c1                mov    %rax,%rcx
 6de:   ba 0c 00 00 00          mov    $0xc,%edx
 6e3:   be 01 00 00 00          mov    $0x1,%esi
 6e8:   48 8d 3d 95 00 00 00    lea    0x95(%rip),%rdi        # 784 <_IO_stdin_used+0x4>
 6ef:   e8 8c fe ff ff          callq  580 <fwrite@plt>
    return 0;
 6f4:   b8 00 00 00 00          mov    $0x0,%eax
}
 6f9:   5d                      pop    %rbp
 6fa:   c3                      retq   
 6fb:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)

0000000000000700 <__libc_csu_init>:
 700:   41 57                   push   %r15
 702:   41 56                   push   %r14
 704:   41 89 ff                mov    %edi,%r15d
 707:   41 55                   push   %r13
 709:   41 54                   push   %r12
 70b:   4c 8d 25 c6 06 20 00    lea    0x2006c6(%rip),%r12        # 200dd8 <__init_array_start>
 712:   55                      push   %rbp
 713:   48 8d 2d c6 06 20 00    lea    0x2006c6(%rip),%rbp        # 200de0 <__init_array_end>
 71a:   53                      push   %rbx
 71b:   49 89 f6                mov    %rsi,%r14
 71e:   49 89 d5                mov    %rdx,%r13
 721:   4c 29 e5                sub    %r12,%rbp
 724:   48 83 ec 08             sub    $0x8,%rsp
 728:   48 c1 fd 03             sar    $0x3,%rbp
 72c:   e8 1f fe ff ff          callq  550 <_init>
 731:   48 85 ed                test   %rbp,%rbp
 734:   74 20                   je     756 <__libc_csu_init+0x56>
 736:   31 db                   xor    %ebx,%ebx
 738:   0f 1f 84 00 00 00 00    nopl   0x0(%rax,%rax,1)
 73f:   00 
 740:   4c 89 ea                mov    %r13,%rdx
 743:   4c 89 f6                mov    %r14,%rsi
 746:   44 89 ff                mov    %r15d,%edi
 749:   41 ff 14 dc             callq  *(%r12,%rbx,8)
 74d:   48 83 c3 01             add    $0x1,%rbx
 751:   48 39 dd                cmp    %rbx,%rbp
 754:   75 ea                   jne    740 <__libc_csu_init+0x40>
 756:   48 83 c4 08             add    $0x8,%rsp
 75a:   5b                      pop    %rbx
 75b:   5d                      pop    %rbp
 75c:   41 5c                   pop    %r12
 75e:   41 5d                   pop    %r13
 760:   41 5e                   pop    %r14
 762:   41 5f                   pop    %r15
 764:   c3                      retq   
 765:   90                      nop
 766:   66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
 76d:   00 00 00 

0000000000000770 <__libc_csu_fini>:
 770:   f3 c3                   repz retq 

Disassembly of section .fini:

0000000000000774 <_fini>:
 774:   48 83 ec 08             sub    $0x8,%rsp
 778:   48 83 c4 08             add    $0x8,%rsp
 77c:   c3                      retq   

There are some functions that run before main (the entry point from the programmer's perspective), like _start (the entry point from the OS's perspective). These functions are responsible for initializing the things needed before main. Where exactly streams like stdin are initialized, I don't know; I can't find any reliable sources about this.

In this answer there is a good link explaining this in more detail: How the heck do we get to main()?.


from wikipedia - "One of Unix's several groundbreaking advances was abstract devices, which removed the need for a program to know or care what kind of devices it was communicating with" "In most operating systems predating Unix, programs had to explicitly connect to the appropriate input and output devices...Older operating systems forced upon the programmer a record structure and frequently non-orthogonal data semantics and device control."

I think what the article is trying to say is that on Unix-like systems, as a programmer you don't have to worry about the device behind the communication, because in Unix "everything is a file". You can do:

fopen("/some/file.txt", "r");
fopen("/dev/sda1", "r"); // if you have reading access
fopen("/dev/some_device", "r");

Your program doesn't need to know that /dev/sda1 is the first partition of a hard drive on a SCSI bus, or that /dev/some_device is in reality a PCI device.

It seems that prior to Unix, as a programmer you had to know which kind of device you were talking to. But I'm too young to relate to that era (I wasn't even born yet), so I might be entirely mistaken.

also from wikipedia - "Another Unix breakthrough was to automatically associate input and output to terminal keyboard and terminal display, respectively, by default — the program (and programmer) did absolutely nothing to establish input and output for a typical input-process-output program (unless it chose a different paradigm). In contrast, previous operating systems usually required some—often complex—job control language to establish connections, or the equivalent burden had to be orchestrated by the program."

This is what I was talking about at the beginning of my answer.


And finally What resources should someone with these questions go through to further their understandings of streams?

I always value knowing the technologies of the past; it's certainly not a bad idea to know how stuff used to work. If you are really interested in what happens behind the scenes, I think the LiS: Linux STREAMS article is very interesting.

From a programmer's point of view, all you need to know is that stdin, stdout, and stderr are provided for you at the beginning of your program and that you use the functions declared in stdio.h to access them.

Pablo