79

I know C# gives the programmer the ability to access and use pointers in an unsafe context. But when is this needed?

Under what circumstances does using pointers become inevitable?

Is it only for performance reasons?

Also, why does C# expose this functionality through an unsafe context, and remove all of the managed advantages from it? Is it possible, theoretically, to use pointers without losing any advantages of the managed environment?

Joan Venge

5 Answers

96

When is this needed? Under what circumstances does using pointers become inevitable?

When the net cost of a managed, safe solution is unacceptable but the net cost of an unsafe solution is acceptable. You can determine the net cost or net benefit by subtracting the total benefits from the total costs. The benefits of an unsafe solution are things like "no time wasted on unnecessary runtime checks to ensure correctness"; the costs are (1) having to write code that is safe even with the managed safety system turned off, and (2) having to deal with potentially making the garbage collector less efficient, because it cannot move around memory that has an unmanaged pointer into it.

Or, if you are the person writing the marshalling layer.

Is it only for performance reasons?

It seems perverse to use pointers in a managed language for reasons other than performance.

You can use the methods in the Marshal class to deal with interoperating with unmanaged code in the vast majority of cases. (There might be a few cases in which it is difficult or impossible to use the marshalling gear to solve an interop problem, but I don't know of any.)

Of course, as I said, if you are the person writing the Marshal class then obviously you don't get to use the marshalling layer to solve your problem. In that case you'd need to implement it using pointers.

Why does C# expose this functionality through an unsafe context, and remove all of the managed advantages from it?

Those managed advantages come with performance costs. For example, every time you ask an array for its tenth element, the runtime needs to do a check to see if there is a tenth element, and throw an exception if there isn't. With pointers that runtime cost is eliminated.
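To make the bounds-check trade-off concrete, here is a minimal sketch that sums an array twice, once through the checked indexer and once through a pinned pointer. The class and method names are made up for illustration, the code must be compiled with unsafe code enabled (for example `<AllowUnsafeBlocks>true</AllowUnsafeBlocks>`), and the JIT can often remove the check in a simple loop like this one, so the difference should be measured rather than assumed.

```csharp
using System;

// Hypothetical names, for illustration only.
public static class BoundsCheckSketch
{
    // Safe version: every values[i] is subject to a runtime bounds check,
    // and an out-of-range index throws IndexOutOfRangeException.
    public static long SumSafe(int[] values)
    {
        long total = 0;
        for (int i = 0; i < values.Length; i++)
        {
            total += values[i];
        }
        return total;
    }

    // Unsafe version: pointer reads are not bounds-checked. Reading
    // p[values.Length] would not throw; it would read whatever memory
    // happens to follow the array.
    public static unsafe long SumUnsafe(int[] values)
    {
        long total = 0;
        fixed (int* p = values) // pin the array so the GC cannot move it
        {
            for (int i = 0; i < values.Length; i++)
            {
                total += p[i];
            }
        }
        return total;
    }

    public static void Main()
    {
        int[] data = { 1, 2, 3, 4 };
        Console.WriteLine(SumSafe(data));   // prints 10
        Console.WriteLine(SumUnsafe(data)); // prints 10
    }
}
```

Both methods return the same result for valid input; they differ only in what happens when the index is wrong.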

The corresponding developer cost is that if you do it wrong then you get to deal with memory corruption bugs that format your hard disk and crash your process an hour later, rather than dealing with a nice clean exception at the point of the error.

Is it possible to use pointers without losing any advantages of managed environment, theoretically?

By "advantages" I assume you mean advantages like garbage collection, type safety and referential integrity. Thus your question is essentially "is it in theory possible to turn off the safety system but still get the benefits of the safety system being turned on?" No, clearly it is not. If you turn off that safety system because you don't like how expensive it is then you don't get the benefits of it being on!

Eric Lippert
  • Thanks Eric for replying. Can you please tell me what "referential integrity" means? Is it the use of references instead of pointers? – Joan Venge Mar 02 '11 at 19:55
  • 9
    @Joan: That every reference actually refers to *something valid* or is *null*. Pointers do not have that property; a pointer can be referring to memory that isn't any good at all. But managed references have that property; if you have a reference to a string, that thing is *always* either null or a valid string; you are guaranteed to not be in a situation where you have a non-null reference to something that isn't a valid string. – Eric Lippert Mar 02 '11 at 20:45
  • Thanks Eric, I understand it now. – Joan Venge Mar 02 '11 at 20:49
  • Old answer, but it helped me today still. Thanks. – Tipx May 25 '13 at 21:46
  • `if you do it wrong then you get to deal with memory corruption bugs that formats your hard disk` Really? I thought .NET would abort your program if it tried to access memory outside its dedicated scope. – Masoud Keshavarz Jan 22 '18 at 07:00
  • 2
    @masoudkeshavarz: Nope. With managed pointers it's impossible to forge a pointer to arbitrary memory. With unmanaged pointers in unsafe code, well, let's just say they are called "unmanaged" and "unsafe" **for a reason**. You can do anything you like with an unmanaged pointer in unsafe code, including corrupting the .NET runtime data structures. – Eric Lippert Jan 22 '18 at 14:35
  • 2
    holy hell, been searching for a clear no-bs answer for an hour and this is amazing. thank you! – krivar Dec 04 '18 at 08:57
20

Pointers are an inherent contradiction to the managed, garbage-collected, environment.
Once you start messing with raw pointers, the GC has no clue what's going on.

Specifically, it cannot tell whether objects are reachable, since it doesn't know where your pointers are.
It also cannot move objects around in memory, since that would break your pointers.

All of this would be solved by GC-tracked pointers; that's what references are.

You should only use pointers in messy advanced interop scenarios or for highly sophisticated optimization.
If you have to ask, you probably shouldn't.

SLaks
  • 8
    +1 for *If you have to ask, you probably shouldn't*. Excellent advice :-) – Darin Dimitrov Mar 02 '11 at 18:42
  • 1
    Your conclusion is right, but most of your explanation is wrong. Pointers and references are no different from the perspective of the garbage collector. What breaks the GC is when a pointer or reference is stored in an untyped memory area, because the GC no longer knows whether it's just a numeric value, or the address of a managed object. – Ben Voigt Mar 02 '11 at 18:46
  • @Ben: That's not true. There is nothing preventing the GC from knowing what an `Object*` means, or updating its value when the object moves. However, it doesn't know whether you'll write `*(a + 7)`. This is why you cannot do reference arithmetic – SLaks Mar 02 '11 at 18:46
  • 1
    @SLaks: I didn't say that references and pointers are no different, I said they are not different **from the perspective of the garbage collector**. The GC couldn't care less whether you took the address of an array element, or started with a pointer to a different element and did arithmetic to find the one you're pointing at now. – Ben Voigt Mar 02 '11 at 18:49
  • 1
    @SLaks: Even in native C and C++, pointer arithmetic is only allowed within the confines of a single object/allocation (e.g. an array). The garbage collector moves entire objects together anyway, pointers would not break. – Ben Voigt Mar 02 '11 at 18:50
  • @Ben: What about `int* d = &b - &a;`, followed by `*(a + d)`? – SLaks Mar 02 '11 at 18:52
  • 1
    @SLaks: Equally (il)legal in managed and native code. For example, the C++0x draft says "When two pointers to elements of the same array object are subtracted, the result is the difference of the subscripts of the two array elements. ... Unless both pointers point to elements of the same array object, or one past the last element of the array object, the behavior is undefined." – Ben Voigt Mar 02 '11 at 18:54
  • 3
    @SLaks: Considerably. BTW, your hypothetical GC-tracked pointer does exist in other .NET languages (albeit with some restrictions -- it can only be an automatic variable), and it does support arithmetic: [`interior_ptr`](http://msdn.microsoft.com/en-us/library/y0fh545k.aspx) – Ben Voigt Mar 02 '11 at 19:00
  • @Ben: I've never heard of that. Thanks! – SLaks Mar 02 '11 at 19:01
  • @Ben: Do you know which popular .NET languages have that GC-tracked pointer? – Joan Venge Mar 02 '11 at 19:30
  • 1
    @Joan: I would have thought the MSDN page would have said, but it isn't obvious. `interior_ptr` is C++/CLI. You can surely get the same effect with MSIL (decompile some C++/CLI code using it and see what the IL equivalent is). I'm not aware if any other languages have it. – Ben Voigt Mar 02 '11 at 19:35
5

The GC can move objects around in memory. Unmanaged memory that you reach through pointers is outside the GC's control, so it is never moved. "fixed" pins a managed object in place, but still lets the GC manage its memory.

By definition, if you have a pointer to the address of an object, and the GC moves it, your pointer is no longer valid.

As to why you need pointers: the primary reason is to work with unmanaged DLLs, e.g. those written in C++.

Also note, when you pin variables and use pointers, you're more susceptible to heap fragmentation.


Edit

You've touched on the core issue of managed vs. unmanaged code... how does the memory get released?

You can mix code for performance as you describe, you just can't cross managed/unmanaged boundaries with pointers (i.e. you can't use pointers outside of the 'unsafe' context).

As for how they get cleaned... You have to manage your own memory; objects that your pointers point to were created/allocated (usually within the C++ DLL) using (hopefully) CoTaskMemAlloc(), and you have to release that memory in the same manner, calling CoTaskMemFree(), or you'll have a memory leak. Note that only memory allocated with CoTaskMemAlloc() can be freed with CoTaskMemFree().

The other alternative is to expose a method from your native C++ dll that takes a pointer and frees it... this lets the DLL decide how to free the memory, which works best if it used some other method to allocate memory. Most native dlls you work with are third-party dlls that you can't modify, and they don't usually have (that I've seen) such functions to call.

An example of freeing memory, taken from here:

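    // `test` is the native function from the linked example; its P/Invoke
    // declaration is not shown here.
    string[] array = new string[2];
    array[0] = "hello";
    array[1] = "world";
    IntPtr ptr = test(array);                      // result assumed to be allocated with CoTaskMemAlloc()
    string result = Marshal.PtrToStringAuto(ptr);  // copy the native string into a managed string
    Marshal.FreeCoTaskMem(ptr);                    // release it with the matching deallocator
    System.Console.WriteLine(result);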


Some more reading material:

C# deallocate memory referenced by IntPtr: the second answer down explains the different allocation/deallocation methods.

How to free IntPtr in C#?: reinforces the need to deallocate memory in the same manner it was allocated.

http://msdn.microsoft.com/en-us/library/aa366533%28VS.85%29.aspx Official MSDN documentation on the various ways to allocate and deallocate memory.

In short... you need to know how the memory was allocated in order to free it.
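As a rough illustration of pairing each allocator with its own deallocator, the sketch below uses only `System.Runtime.InteropServices.Marshal` calls and no native DLL. The class and method names are hypothetical; the point is just that COM task memory goes back through `FreeCoTaskMem` and `AllocHGlobal` memory goes back through `FreeHGlobal`.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical names, for illustration only.
public static class MatchedFreeSketch
{
    // Copies a string into COM task memory and back again.
    public static string RoundTripCoTaskMem(string text)
    {
        // StringToCoTaskMemUni allocates with the COM task allocator...
        IntPtr ptr = Marshal.StringToCoTaskMemUni(text);
        try
        {
            return Marshal.PtrToStringUni(ptr);
        }
        finally
        {
            // ...so the block must be released with FreeCoTaskMem.
            Marshal.FreeCoTaskMem(ptr);
        }
    }

    // Writes an int into memory from AllocHGlobal and reads it back.
    public static int RoundTripHGlobal(int value)
    {
        IntPtr ptr = Marshal.AllocHGlobal(sizeof(int));
        try
        {
            Marshal.WriteInt32(ptr, value);
            return Marshal.ReadInt32(ptr);
        }
        finally
        {
            // AllocHGlobal memory must be released with FreeHGlobal.
            Marshal.FreeHGlobal(ptr);
        }
    }

    public static void Main()
    {
        Console.WriteLine(RoundTripCoTaskMem("hello world")); // prints hello world
        Console.WriteLine(RoundTripHGlobal(42));              // prints 42
    }
}
```

The `try`/`finally` blocks are there so the memory is released even if the read throws.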


Edit: If I understand your question correctly, the short answer is yes, you can hand the data off to unmanaged pointers, work with it in an unsafe context, and have the data available once you exit the unsafe context.

The key is that you have to pin the managed object you're referencing with a fixed block. This prevents the memory you're referencing from being moved by the GC while in the unsafe block. There are a number of subtleties involved here, e.g. you can't reassign a pointer initialized in a fixed block... you should read up on unsafe and fixed statements if you're really set on managing your own code.
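A minimal sketch of that round trip, assuming unsafe code is enabled at compile time; the names are invented for the example. The array is pinned, modified through a pointer, and is an ordinary managed array again once the `fixed` block ends. It also shows the read-only restriction: the pointer declared by `fixed` cannot be reassigned, so a copy is used to walk the data.

```csharp
using System;

// Hypothetical names, for illustration only.
public static class FixedBlockSketch
{
    public static unsafe void DoubleInPlace(int[] values)
    {
        fixed (int* start = values) // the array cannot move while pinned
        {
            // 'start' is read-only inside the block; 'start++' would not
            // compile, so iterate with a separate pointer.
            int* p = start;
            int* end = start + values.Length;
            while (p < end)
            {
                *p *= 2;
                p++;
            }
        } // the pin is released here; the GC may move the array again
    }

    public static void Main()
    {
        int[] data = { 1, 2, 3 };
        DoubleInPlace(data);
        // Back in safe code, the changes are visible through the array.
        Console.WriteLine(string.Join(", ", data)); // prints 2, 4, 6
    }
}
```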

All that said, the benefits of managing your own objects and using pointers in the manner you describe may not buy you as much of a performance increase as you might think. Reasons why not:

  1. C# is very optimized and very fast
  2. Your pointer code is still generated as IL, which has to be jitted (at which point further optimizations come into play)
  3. You're not turning the Garbage Collector off... you're just keeping the objects you're working with out of the GC's purview. So whenever the GC decides to run a collection, it still interrupts your code and executes its functions for all the other variables in your managed code.

HTH,
James

James King
  • Thanks, but when you use pointers, how are they gonna get "cleaned" after you are done? Is it possible to use them in perf-critic situations and then switch back to managed code? – Joan Venge Mar 02 '11 at 18:47
  • Thanks James for additional info. – Joan Venge Mar 02 '11 at 19:38
  • 2
    @Joan: Sure. But *you* are responsible for ensuring that everything is cleaned up, that there are no stray pointers to movable memory lying around, and so on. If you want the benefits of turning off the safety system then you have to take on the costs of doing what the safety system normally does for you. – Eric Lippert Mar 02 '11 at 19:39
  • Thanks Eric, that makes sense. But in cases of performance optimizations via pointers, one will still get the data back into managed world once he/she's done, right? Like managed data -> unmanaged data -> some fast operations on this data -> create managed data from this unmanaged data -> clean unmanaged memory -> back to managed world? – Joan Venge Mar 02 '11 at 19:50
  • 1
    As a further note, you can explicitly notify the garbage collection of memory pressure from unmanaged memory using [`GC.AddMemoryPressure`](http://msdn.microsoft.com/en-us/library/system.gc.addmemorypressure.aspx) and [`GC.RemoveMemoryPressure`](http://msdn.microsoft.com/en-us/library/system.gc.removememorypressure.aspx). You'll still have to release the memory yourself, but this way the garbage collector will take unmanaged memory into account when making scheduling decisions. – Brian Mar 03 '11 at 17:26
  • Joan> Added some more info to my response to answer your question – James King Mar 03 '11 at 17:39
  • Brian> Good methods. To clarify, you would use these functions if your managed object were, say, very small, but allocated a large block of memory that you don't free until your managed object's `Dispose()` method. The GC would normally look at your very small object and possibly pass it by, because it's not worth collecting at that time. `AddMemoryPressure()` lets the GC know, "Hey, free me and you get an additional 64MB of memory freed" – James King Mar 03 '11 at 17:45
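The memory-pressure idea from the comments above could look roughly like this: a small managed wrapper that owns a block of unmanaged memory and tells the GC about it. `NativeBuffer` is a hypothetical type written for this sketch, and it has no finalizer, so the memory leaks if `Dispose()` is never called.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper type, for illustration only.
public sealed class NativeBuffer : IDisposable
{
    private IntPtr _memory;
    private readonly long _size;

    public NativeBuffer(int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        _memory = Marshal.AllocHGlobal(size);
        _size = size;
        // Tell the GC that this small object keeps '_size' unmanaged bytes alive.
        GC.AddMemoryPressure(_size);
    }

    public bool IsAllocated => _memory != IntPtr.Zero;

    public void Dispose()
    {
        if (_memory != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_memory);   // we still free the memory ourselves
            _memory = IntPtr.Zero;
            GC.RemoveMemoryPressure(_size); // must match the amount added
        }
    }
}

public static class NativeBufferDemo
{
    public static void Main()
    {
        using (var buffer = new NativeBuffer(1024 * 1024))
        {
            Console.WriteLine(buffer.IsAllocated); // prints True
        }
    }
}
```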
3

The most common reasons to use pointers explicitly in C#:

  • doing low-level work (like string manipulation) that is very performance sensitive,
  • interfacing with unmanaged APIs.

The reason why the syntax associated with pointers was removed from C# (according to my knowledge and viewpoint — Jon Skeet would answer better B-)) was it turned out to be superfluous in most situations.

From the language design perspective, once you manage memory by a garbage collector you have to introduce severe constraints on what is and what is not possible to do with pointers. For example, using a pointer to point into the middle of an object can cause severe problems to the GC. Hence, once the restrictions are in place, you can just omit the extra syntax and end up with “automatic” references.

Also, the ultra-benevolent approach found in C/C++ is a common source of errors. For most situations, where micro-performance doesn't matter at all, it is better to offer tighter rules and constrain the developer in favor of fewer bugs that would be very hard to discover. Thus for common business applications the so-called “managed” environments like .NET and Java are better suited than languages that presume to work against the bare-metal machine.

Ondrej Tucny
  • 1
    Pointers aren't removed from C#. Maybe you're thinking about Java? – Ben Voigt Mar 02 '11 at 18:46
  • I don't mean *pointers* were removed but the extra syntax was removed, i.e. don't have to write `obj->Property`, `obj.Property` works instead. Will clarify my answer. – Ondrej Tucny Mar 02 '11 at 18:49
  • @Ondrej: [That wasn't removed either.](http://msdn.microsoft.com/en-us/library/s8bz4d5h.aspx) – Ben Voigt Mar 02 '11 at 18:52
  • 1
    Ben is right; you most certainly do have to use arrows (and stars) when dereferencing pointers in C#. Don't confuse *pointers* with *references*; C# supports both. – Eric Lippert Mar 02 '11 at 19:09
  • @Eric Lippert Heh, yeah. However, thinking about a reference as a subset of a pointer, I chose the word 'pointer' as the more general-purpose variant to explain the evolution of a reference—and a “pointer-less” language (its 'safe' portion to be correct)—from the plain old pointer. – Ondrej Tucny Mar 02 '11 at 20:13
  • @Ben Voigt OMG! The syntax was removed from the **safe** portion of the language. The OP is asking an apparently beginner, introductory question. Hence the tone of my answer is rather explaining, without the ambition to cite the C# language specification. – Ondrej Tucny Mar 02 '11 at 20:18
  • @Ondrej: Seems the OP is specifically asking about using pointers and unsafe. An answer that claims to address pointers, but is relevant to neither pointers nor unsafe is not much of an answer. Plus, it has nothing to do with safe or managed. In C# and ISO C and ISO C++ and C++/CLI, pointers use `p->member`. In C# and ISO C++ and C++/CLI, references use `r.member`. – Ben Voigt Mar 02 '11 at 20:41
2

Say you want to communicate between two applications using IPC (shared memory). You can marshal the data to memory and pass the pointer to that data to the other application via Windows messaging or something similar. The receiving application can then read the data back.

This is also useful for transferring data from .NET to legacy VB6 apps: you marshal the data to memory, pass the pointer to the VB6 app using Windows messaging, and use VB6's CopyMemory() to copy the data from that memory into the VB6 app's unmanaged memory space.
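The marshalling step described above might be sketched as follows. The `Message` struct and the method names are assumptions for the example, and only the in-process part is shown: the struct is copied into unmanaged memory and read back. The transport is left out, and note that an address from `AllocHGlobal` is only meaningful inside the allocating process, so a real cross-process exchange would need a shared-memory region or a mechanism that copies the bytes for you.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical payload type, for illustration only.
[StructLayout(LayoutKind.Sequential)]
public struct Message
{
    public int Id;
    public double Value;
}

public static class MarshalSketch
{
    // Copies a struct into unmanaged memory and reads it back, which is
    // what the sender and receiver would each do on their side.
    public static Message RoundTrip(Message message)
    {
        IntPtr ptr = Marshal.AllocHGlobal(Marshal.SizeOf<Message>());
        try
        {
            // Sender side: managed struct -> unmanaged bytes.
            Marshal.StructureToPtr(message, ptr, false);

            // Receiver side: unmanaged bytes -> managed struct.
            return Marshal.PtrToStructure<Message>(ptr);
        }
        finally
        {
            Marshal.FreeHGlobal(ptr);
        }
    }

    public static void Main()
    {
        Message copy = RoundTrip(new Message { Id = 7, Value = 1.5 });
        Console.WriteLine(copy.Id);    // prints 7
        Console.WriteLine(copy.Value); // prints 1.5 (decimal separator depends on culture)
    }
}
```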

variable