13

While learning C, I made some mistakes and printed elements of a character array that were uninitialized.

If I make the array quite large, say 1 million elements, and then print its contents, the output is not always unreadable garbage; some of it appears to be runtime information.

Consider the following code:

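#include <stdio.h>

int main(void) {

        char s[1000000];  /* automatic storage: contents start out indeterminate */
        int c, i;

        printf("Enter input string:\n");
        /* stop at newline, at end of input, or when the array is full */
        for (i = 0; i < 1000000 && (c = getchar()) != '\n' && c != EOF; i++) {
                s[i] = c;
        }

        printf("Contents of input string:\n");
        /* the mistake in question: this prints far past the bytes that
           were actually stored, dumping uninitialized memory */
        for (i = 0; i < 999999; i++) {
                putchar(s[i]);
        }
        printf("\n");

        return 0;
}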

Just scrolling through the output, I find things such as:

???l????????_dyldVersionNumber_dyldVersionString_dyld_all_image_infos_dyld_fatal_error_dyld_shared_cache_ranges_error_string__mh_dylinker_header_stub_binding_helper_dyld_func_lookup_offset_to_dyld_all_image_infos__dyld_start__ZN13dyldbootstrapL30randomizeExecutableLoadAddressEPK12macho_headerPPKcPm__ZN13dyldbootstrap5startEPK12macho_headeriPPKcl__ZN4dyldL17setNewProgramVarsERK11ProgramVars__ZN4dyld17getExecutablePathEv__ZN4dyld22mainExecutablePreboundEv__ZN4dyld14mainExecutableEv__ZN4dyld21findImageByMachHeaderEPK11mach_header__ZN4dyld26findImageContainingAddressEPKv

and also,

Apple Inc.1&0$U ?0?*?H??ot CA0?"0ple Certification Authority10U ?䑩 ??GP??^y?-?6?WLU????Kl??"0?>?P ?A?????f?$kУ????z ?G?[?73??M?i??r?]?_???d5#KY?????P??XPg? ?ˬ, op??0??C??=?+I(??ε??^??=?:??? ?b??q?GSU?/A????p??LE~LkP?A??tb
?!.t?< ?A?3???0X?Z2?h???es?g^e?I?v?3e?w??-??z0?v0U?0U?0?0U+?iG?v ??k?.@??GM^0U#0?+?iG?v ??k?.@??GM^0?U 0?0? ?H??cd0??0+https://www.apple.com/appleca/0?+0????Reliance on this certificate by any party assumes acceptance of the then applicable standard terms and conditions of use, certificate poli?\6?L-x?팛??w??v?w0O????=G7?@?,Ա?ؾ?s???d?yO4آ>?x?k??}9??S ?8ı??O 01?H??[d?c3w?:,V??!ںsO??6?U٧??2B???q?~?R??B$*??M?^c?K?P????????7?uu!0?0??0

I believe one time my $PATH environment variable was even printed out.

Can the contents of an uninitialized variable ever pose a security risk?

Update 1

Motivator

Update 2

So it seems clear from the answers that this is indeed a security risk. This surprises me.

Is there no way for a program to declare its memory content protected to allow the OS to restrict any access to it other than the program that initialized that memory?

EMiller
  • 7
    The content of an uninitialized variable can do ANYTHING. One day, the world will end just because of one of those. – Raveline Aug 23 '12 at 15:31
  • Using uninitialized variables leads to undefined behavior, so yes, anything can happen: a security risk, a crashed program, or anything else imaginable. But rarely would anything like that ever happen. – Alok Save Aug 23 '12 at 15:31
  • Yes there are often special crypto versions of memory handling that ensure values are cleared after use or are never written to a swap file. Using calloc() will pre-zero memory and on most unix platforms malloc() will do the same. – Martin Beckett Aug 23 '12 at 16:14
  • @MartinBeckett most? unix platforms, The 3 I use regularly don't zeroise data on malloc. They do however generally clean memory pages before they hand them over to you. – Tom Tanner Aug 23 '12 at 16:58
  • @TomTanner, I thought the gcc/linux stdlib did, but apparently it's more complex than that http://stackoverflow.com/questions/8029584/why-does-malloc-initialize-the-values-to-0-in-gcc – Martin Beckett Aug 23 '12 at 17:05
  • To answer your question in Update 2: Yes! Any memory a process is allocated is restricted to that process and not accessible by others. If sensitive data is handled by this process, the memory should be wiped *after* processing of said data is complete. Initializing memory on allocation does nothing for your process' security in this sense. – Ioan Aug 24 '12 at 12:09

6 Answers

12

Most C programs use malloc to allocate memory. A common misunderstanding is that malloc zeroes the memory it returns. It does not.
As a result, because memory chunks are "recycled", it is quite possible to get one that still holds information of value.
An example of this vulnerability was the tar program on Solaris, which emitted contents of /etc/passwd. The root cause was that the buffer tar allocated to read a block from disk was not initialized, and before obtaining that buffer the tar utility had made a system call that read /etc/passwd. Because the memory was recycled and tar did not initialize the chunk, fragments of /etc/passwd were printed to logs. This was solved by replacing malloc with calloc.
This is a real example of the security implications of not explicitly and properly initializing memory.
So yes, do initialize your memory properly.
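To illustrate the malloc/calloc difference described above, here is a minimal sketch. The helper names alloc_block_zeroed and count_nonzero are made up for this example and are not taken from tar or any library; the only standard call relied on is calloc, which returns zero-filled memory.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: allocate a block buffer whose bytes are all zero.
 * calloc() zero-fills the allocation; malloc() makes no such promise and
 * may hand back a recycled chunk that still holds old data. */
unsigned char *alloc_block_zeroed(size_t block_size)
{
    return calloc(1, block_size);   /* NULL on failure */
}

/* Hypothetical helper: count the non-zero bytes in a buffer, e.g. to
 * check what a freshly allocated block actually contains. */
size_t count_nonzero(const unsigned char *buf, size_t len)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0)
            n++;
    }
    return n;
}
```

Calling count_nonzero on a calloc'd block always yields 0. Doing the same on a malloc'd block would read indeterminate bytes, which is exactly the mistake under discussion, so the sketch deliberately does not do that.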

Update:

Is there no way for a program to declare its memory content protected to allow the OS to restrict any access to it other than the program that initialized that memory?

The answer is yes (see the end) and no.
I think you are looking at it the wrong way. A more appropriate question would be, for example: why doesn't malloc initialize the memory on request, or clear it on release, instead of simply recycling it?
The answer is that the designers of the API explicitly decided not to initialize (or clear) memory, because doing this for large blocks 1) would impact performance and 2) is not always necessary (for example, your application, or parts of it, may handle data whose exposure you do not care about). So the designers decided not to do it, as it would needlessly impact performance, and left the decision to the programmer.
Carrying this over to the OS: why should it be the OS's responsibility to clear the pages? You expect your OS to hand you memory in a timely manner, but security is up to the programmer.

Having said that, some mechanisms are provided to make sure that sensitive data is not stored in swap, such as mlock on Linux.

mlock() and mlockall() respectively lock part or all of the calling process's virtual address space into RAM, preventing that memory from being paged to the swap area. munlock() and munlockall() perform the converse operation, respectively unlocking part or all of the calling process's virtual address space, so that pages in the specified virtual address range may once more to be swapped out if required by the kernel memory manager. Memory locking and unlocking are performed in units of whole pages.
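As a rough sketch of how the calls quoted above might be used, the following wraps mlock() and munlock() around a caller-supplied buffer. The wrapper names lock_buffer and unlock_buffer are hypothetical. Note the assumptions: the kernel may refuse the lock (for example when the RLIMIT_MEMLOCK limit is too low), and some systems require the address to be page-aligned, although Linux rounds it down for you.

```c
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>

/* Hypothetical wrapper: ask the kernel to keep the pages covering
 * [buf, buf + len) in RAM so they are not written to swap.
 * Returns 0 on success, -1 if the lock was refused. */
int lock_buffer(void *buf, size_t len)
{
    if (mlock(buf, len) != 0) {
        perror("mlock");   /* e.g. ENOMEM or EPERM when over the limit */
        return -1;
    }
    return 0;
}

/* Hypothetical wrapper: release the lock once the sensitive data has
 * been wiped, so the pages can be swapped again. */
int unlock_buffer(void *buf, size_t len)
{
    return munlock(buf, len);
}
```

Locking only keeps the data out of swap. It does not clear anything, so the buffer should still be overwritten before it is unlocked or freed.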

Jens
Cratylus
  • Security *is* the OS's responsibility. Modern OSes do clear memory pages before they are handed to applications. However, malloc doesn't necessarily request fresh memory pages from the OS; it will frequently reuse memory that has previously been free()-ed by the process. Any non-zero values in uninitialised memory come from your own process, and not from another process (unless your system is configured with something like CONFIG_MMAP_ALLOW_UNINITIALIZED). – Lie Ryan Sep 22 '18 at 07:22
8

Yes, at least on systems where the data may be transmitted to outside users.

There have been a whole series of attacks on web servers (and even iPods) in which an attacker gets the target to dump the contents of memory from other processes, revealing details such as the type and version of the OS, data in other apps, and even things like password tables.

Martin Beckett
  • 1
    The world is full of crappy OSes! – Martin Beckett Aug 23 '12 at 15:40
  • 1
    Memory allocated to a running process is not accessible to another process, but once released, it may be allocated to another process just fine. This is why the original process is responsible for clearing it before release, if needed. – Ioan Aug 23 '12 at 16:09
  • 1
    @loan esp. page/swap files are a notorious source of data leaks. – Martin Beckett Aug 23 '12 at 16:12
  • @MartinBeckett That implies any random process having access to that file during normal OS operation, for which I can't think of a legitimate reason/case and should not be allowed. – Ioan Aug 23 '12 at 17:22
  • @loan if your process is given uncleared memory from a swap file then it may contain whatever data the previous app left there. This shouldn't happen in any OS with the least idea of security - but is the reason that you don't edit Top Secret documents on Windows! – Martin Beckett Aug 23 '12 at 18:23
  • @MartinBeckett You are correct, but that applies to any area of memory, it's not specific to swap files. Are you implying that a portion of RAM may be swapped in/out to different locations in the swap file over the course of an allocation's lifetime? That would certainly pose a problem. – Ioan Aug 23 '12 at 19:04
  • @loan, No the concern is more that a previous process could have written any, possibly secret, data to a swap file which wasn't wiped when the process exited or the machine restarted. Your new process gets some swapped memory which potentially contains the old data. – Martin Beckett Aug 24 '12 at 03:19
  • Does Linux zero out pages of dead processes before reusing them for new processes? – Ciro Santilli OurBigBook.com Apr 19 '16 at 17:27
4

It's quite possible to perform some sensitive work in an area of memory, and not clear that buffer.

A future invocation can then retrieve that uncleared work via a call to malloc() or by reading the stack (via an uninitialised buffer/array declaration). It could inspect it (maliciously) or inadvertently copy it. If you're doing anything sensitive, it thus makes sense to clear that memory before binning it (memset() or similar), and perhaps before using/copying it.
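One caveat with a plain memset() right before free() is that an optimizing compiler may treat the stores as dead and remove them. A minimal sketch of one common workaround, writing through a volatile pointer, is shown below. The names secure_wipe and secure_free are made up for this example. Where available, explicit_bzero() (glibc, the BSDs) or the optional C11 memset_s() serve the same purpose.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: overwrite a buffer with zeros. Writing through a
 * volatile pointer asks the compiler not to drop the stores, even though
 * the buffer is never read again afterwards. */
void secure_wipe(void *buf, size_t len)
{
    volatile unsigned char *p = buf;
    while (len--)
        *p++ = 0;
}

/* Hypothetical helper: wipe a heap block and then release it, so the
 * next malloc() that recycles the chunk sees only zeros. */
void secure_free(void *buf, size_t len)
{
    if (buf == NULL)
        return;
    secure_wipe(buf, len);
    free(buf);
}
```

The caller has to pass the length because free() does not expose the size of the allocation.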

Brian Agnew
  • This is the best solution and approximation of the problem. The only other thing I see is a process which terminated abnormally and was unable to clear the contents. In that case, the OS releases the memory as is and could pose a risk. – Ioan Aug 23 '12 at 16:06
1

From the C standard:

6.7.8 Initialization

"If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate."

indeterminate value is defined as:

 either an unspecified value or a trap representation.

Trap representation is defined as:

Certain object representations need not represent a value of the object type. If the stored value of an object has such a representation and is read by an lvalue expression that does not have character type, the behavior is undefined. If such a representation is produced by a side effect that modifies all or any part of the object by an lvalue expression that does not have character type, the behavior is undefined. Such a representation is called a trap representation.

Accessing such a value leads to undefined behaviour and can pose security threats.

The paper "Attacks on uninitialized variables" gives some insight into how they can be used to exploit a system.

P.P
0

If you are concerned about security, the safest way is to always initialize every variable you're going to use. It may even help you find some bugs. There may be some good reasons for not initializing memory, but in most cases initializing every variable and every block of memory is a good thing.
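Applied to a line-reading loop like the one in the question, that advice could look roughly like the sketch below. The function name read_line is hypothetical. It zeroes the whole buffer before use, stops at the buffer's capacity, and reports how many bytes were actually stored, so the caller never prints memory it did not write.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from `in` into `buf` (capacity
 * `cap`). Every byte of `buf` gets a known value first, and at most
 * cap - 1 characters are stored so the result is always a terminated
 * string. Returns the number of characters stored. */
size_t read_line(FILE *in, char *buf, size_t cap)
{
    size_t i = 0;
    int c;

    if (cap == 0)
        return 0;
    memset(buf, 0, cap);                 /* no indeterminate bytes remain */
    while (i + 1 < cap && (c = getc(in)) != EOF && c != '\n')
        buf[i++] = (char)c;
    return i;
}
```

A caller would then print only the first read_line(...) bytes, or treat buf as an ordinary C string, instead of walking the whole array.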

Jan Spurny
0

Reading uninitialized memory leads to undefined behavior. Bear in mind that what it means to be initialized depends on the invariant of a particular type. For example, a pointer may be required to be non-null, an enum to lie within a valid range, or a parameter to be a power of two. Compound structures complicate the situation further. An arbitrary sequence of bytes may not represent a valid object. This is why zeroing memory is not enough. If the expected invariant is broken, some code path relying on it will behave in an undefined manner and may pose a security issue.
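As a small illustration of that point, consider a made-up struct ring whose invariant is "data is non-null and capacity is a non-zero power of two". Both the type and the checker ring_is_valid are hypothetical and exist only for this sketch. A zeroed instance holds perfectly well-defined bytes, yet it still violates the invariant.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical type with an invariant: data must be non-null and
 * capacity must be a non-zero power of two. */
struct ring {
    unsigned char *data;
    size_t capacity;
};

/* Check the invariant explicitly instead of assuming that "zeroed"
 * means "initialized". */
bool ring_is_valid(const struct ring *r)
{
    return r->data != NULL
        && r->capacity != 0
        && (r->capacity & (r->capacity - 1)) == 0;   /* power-of-two test */
}
```

Code that computes an index as i & (capacity - 1) relies on this invariant. With a zeroed struct the mask becomes all ones and data is a null pointer, so the first access goes wrong even though no uninitialized byte was ever read.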

marski