Thursday, 23 October 2008

Explanation of Managed Code


Managed code is code that has its execution managed by the .NET Framework Common Language Runtime. It refers to a contract of cooperation between natively executing code and the runtime. This contract specifies that at any point of execution, the runtime may stop an executing CPU and retrieve information specific to the current CPU instruction address. Information that must be query-able generally pertains to runtime state, such as register or stack memory contents.

The necessary information is encoded in an Intermediate Language (IL) and associated metadata, or symbolic information that describes all of the entry points and the constructs exposed in the IL (e.g., methods, properties) and their characteristics. The Common Language Infrastructure (CLI) Standard (which the CLR is the primary commercial implementation) describes how the information is to be encoded, and programming languages that target the runtime emit the correct encoding. All a developer has to know is that any of the languages that target the runtime produce managed code emitted as PE files that contain IL and metadata. And there are many such languages to choose from, since there are nearly 20 different languages provided by third parties – everything from COBOL to Camel – in addition to C#, J#, VB .Net, Jscript .Net, and C++ from Microsoft.

Before the code is run, the IL is compiled into native executable code. And, since this compilation happens by the managed execution environment (or, more correctly, by a runtime-aware compiler that knows how to target the managed execution environment), the managed execution environment can make guarantees about what the code is going to do. It can insert traps and appropriate garbage collection hooks, exception handling, type safety, array bounds and index checking, and so forth. For example, such a compiler makes sure to lay out stack frames and everything just right so that the garbage collector can run in the background on a separate thread, constantly walking the active call stack, finding all the roots, chasing down all the live objects. In addition because the IL has a notion of type safety the execution engine will maintain the guarantee of type safety eliminating a whole class of programming mistakes that often lead to security holes.

Contrast this to the unmanaged world: Unmanaged executable files are basically a binary image, x86 code, loaded into memory. The program counter gets put there and that’s the last the OS knows. There are protections in place around memory management and port I/O and so forth, but the system doesn’t actually know what the application is doing. Therefore, it can’t make any guarantees about what happens when the application runs.

No comments: