How does CLR work?
Within the domain of CLR are executables (consisting of code, data, and metadata), assemblies (consisting of a manifest and zero or more modules), and the Common Type System (CTS) convention set. When programmers write code in their favorite languages, that code is translated into IL prior to being compiled into a portable executable (PE).
The main difference between a Windows PE and a .NET PE is that the Windows PE is executed by the operating system, but .NET PEs are turned over to the .NET Framework's CLR. Recognition of a PE as being .NET or Windows occurs because of the Common Object File Format (COFF) used by Windows operating systems. The COFF specifies two parts of any file: the file data itself and a bunch of header data describing the contents of the data portion. Note: To allow all Microsoft platforms to handle COFF modifications that enable .NET PEs, Microsoft has released new loaders for all of .NET's supported systems (98, 2000, and Me).
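The header-based recognition described above can be sketched in a few lines. This is a minimal illustration, not the Windows loader's actual logic: it assumes the published PE layout, in which a managed image carries a non-empty entry in slot 14 of the optional header's data directory (the CLR runtime header). The names `is_dotnet_pe` and `make_fake_pe` are hypothetical, and the synthetic image holds only enough bytes to exercise the check.

```python
import struct

CLR_DIRECTORY_INDEX = 14   # data-directory slot for the CLR runtime header
COFF_HEADER_SIZE = 20      # fixed size of the COFF file header


def is_dotnet_pe(image: bytes) -> bool:
    """Return True if the bytes look like a PE with a CLR runtime header."""
    if len(image) < 0x40 or image[:2] != b"MZ":
        return False
    # The DOS header stores the offset of the "PE\0\0" signature at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        return False
    opt = pe_offset + 4 + COFF_HEADER_SIZE   # start of the optional header
    if len(image) < opt + 2:
        return False
    (magic,) = struct.unpack_from("<H", image, opt)
    if magic == 0x10B:        # PE32
        dirs = opt + 96
    elif magic == 0x20B:      # PE32+
        dirs = opt + 112
    else:
        return False
    entry = dirs + 8 * CLR_DIRECTORY_INDEX
    if len(image) < entry + 8:
        return False
    # The directory count sits immediately before the directory table.
    (count,) = struct.unpack_from("<I", image, dirs - 4)
    if count <= CLR_DIRECTORY_INDEX:
        return False
    rva, size = struct.unpack_from("<II", image, entry)
    return rva != 0 and size != 0


def make_fake_pe(clr_rva: int = 0, clr_size: int = 0) -> bytes:
    """Build a header-only PE32 image for demonstration (hypothetical helper)."""
    pe_offset = 0x40
    opt = pe_offset + 4 + COFF_HEADER_SIZE
    image = bytearray(opt + 96 + 16 * 8)
    image[0:2] = b"MZ"
    struct.pack_into("<I", image, 0x3C, pe_offset)
    image[pe_offset:pe_offset + 4] = b"PE\0\0"
    struct.pack_into("<H", image, opt, 0x10B)     # PE32 magic
    struct.pack_into("<I", image, opt + 92, 16)   # number of data directories
    struct.pack_into("<II", image, opt + 96 + 8 * CLR_DIRECTORY_INDEX,
                     clr_rva, clr_size)
    return bytes(image)
```

Calling `is_dotnet_pe(make_fake_pe(0x2008, 72))` returns `True`, while `is_dotnet_pe(make_fake_pe())` returns `False` because the CLR slot is empty. A real file could be checked by passing the bytes read from disk.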
Metadata is information about a PE. In COM, metadata is communicated through nonstandardized type libraries. In .NET, this data is contained in the header portion of a COFF-compliant PE and follows certain guidelines; it contains information such as the assembly's name, version, language (spoken, not computer; a.k.a. "culture"), what external types are referenced, what internal types are exposed, methods, properties, classes, and much more.
The CLR uses metadata for a number of specific purposes. Security is managed through a public key in the PE's header. Information about classes, modules, and so forth allows the CLR to know in advance what structures are necessary.
The class loader component of the CLR uses metadata to locate specific classes within assemblies, either locally or across networks. Just-in-time (JIT) compilers use the metadata to turn IL into executable code.
Other programs take advantage of metadata as well. A common example is placing a Microsoft Word document on a Windows 2000 desktop. If the document file has completed comments, author, title, or other Properties metadata, the text is displayed as a tool tip when a user hovers the mouse over the document on the desktop. You can use the Ildasm.exe utility to view the metadata in a PE. Literally, this tool is an IL disassembler.
Different types of files are handled by two different virtual systems in Windows and .NET. If a Windows executable is to interoperate with the .NET Framework, it interfaces with a COM wrapper for the desired .NET functionality, instead of accessing the functionality directly. Similarly, if a .NET application utilizes Windows (COM) objects, it needs a set of classes that expose the functionality, instead of accessing it directly. This communication between .NET and Windows is called "interoperability". Included in the .NET SDK are two sets of two tools each. One set is for .NET-to-COM operations, and the other is for COM-to-.NET operations.
The first pair of tools consists of Regasm.exe and Tlbexp.exe. Regasm.exe registers a .NET Assembly in the Windows registry. Once this is done, the assembly is exposed as a COM object to the Windows OS. Developers who wish to access .NET Assemblies as COM objects in their own applications can use the Tlbexp.exe utility to export a type library (TLB) file to be referenced by their applications. The properties and methods of the .NET Assembly are then available, just as with any other COM object.
The second pair of tools consists of TlbImp.exe and Xsd.exe. TlbImp.exe is run against a TLB file to create a .NET Assembly in the form of a dynamic-link library (DLL) file.
Custom types in .NET are described through XML Schema Definitions (XSDs). When you run the Xsd.exe utility against an existing XSD file with the "/c" switch, the schema is converted to a C# class definition. As a sidenote, Xsd.exe also generates an XSD file from a .NET Assembly (using metadata in the COFF headers) when run against a .NET PE.
A .NET Assembly carries all the metadata about its modules, types, and other elements in the form of a "manifest". Assemblies suit the CLR's multilanguage design because different programming languages are well suited to different kinds of applications. For example, COBOL stands for Common Business-Oriented Language because it's tailor-made for creating business apps. However, it's not much good for creating drafting programs. Regardless of what language you used to create your modules, they can all work together within one Portable Executable Assembly.
There's a hierarchy to the structure of .NET code. That hierarchy is "Assembly -> Module -> Type -> Method". Let's say you want your computer to calculate your mortgage. You'd create a method called "fnCalculateMortgage" that returns an amortization table.
You could create a whole stand-alone application for this purpose, or you could make it one method of a larger collection of functions (called a "type") that you name "libFinancialFunctions". This library of financial functions could include real-time functions to transfer funds between accounts and other financial functions. The type, in turn, is contained within a module of IL code called "MyAccounting" that contains all the financial and accounting functions your business uses. Finally, the MyAccounting module could be one of several in the final assembly, called "MyMIS", which contains all your business management and operations functions.
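The containment hierarchy in this example can be modeled with a few plain data classes. This is only a sketch of the "Assembly -> Module -> Type -> Method" nesting, not how the CLR stores metadata; the class names `TypeDef`, `Module`, and `Assembly`, the `find_method` helper, and the `fnTransferFunds` method name are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TypeDef:
    """A type: a named collection of methods."""
    name: str
    methods: List[str] = field(default_factory=list)


@dataclass
class Module:
    """A module of IL code holding one or more types."""
    name: str
    types: Dict[str, TypeDef] = field(default_factory=dict)


@dataclass
class Assembly:
    """An assembly holding one or more modules."""
    name: str
    modules: Dict[str, Module] = field(default_factory=dict)

    def find_method(self, method: str) -> List[str]:
        """Return every containment path that leads to the given method name."""
        paths = []
        for module in self.modules.values():
            for typedef in module.types.values():
                if method in typedef.methods:
                    paths.append(" -> ".join(
                        [self.name, module.name, typedef.name, method]))
        return paths


# Build the mortgage example from the text, innermost level first.
financial = TypeDef("libFinancialFunctions",
                    ["fnCalculateMortgage", "fnTransferFunds"])
accounting = Module("MyAccounting", {financial.name: financial})
mis = Assembly("MyMIS", {accounting.name: accounting})
```

Here `mis.find_method("fnCalculateMortgage")` returns a single path, `MyMIS -> MyAccounting -> libFinancialFunctions -> fnCalculateMortgage`, which mirrors the four levels named above.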
Assemblies are made up of IL code modules and the metadata that describes them. Although programs may be compiled via an IDE or the command line, in fact, they are simply translated into IL, not machine code. The actual machine code is not generated until the function that requires it is called. This is the just-in-time, or JIT, compilation feature of .NET.
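The compile-on-first-call behavior can be imitated with a small cache. This is an analogy rather than a description of the CLR's JIT: Python source strings stand in for IL, `eval` stands in for the compiler, and the `LazyCompiler` class and its `compile_count` counter are hypothetical names used only to make the timing visible.

```python
from typing import Callable, Dict


class LazyCompiler:
    """Toy stand-in for a JIT: 'compiles' a method the first time it is called."""

    def __init__(self, il_bodies: Dict[str, str]) -> None:
        self._il = il_bodies                      # method name -> text standing in for IL
        self._native: Dict[str, Callable] = {}    # cache of already-compiled methods
        self.compile_count = 0                    # how many methods have been compiled

    def call(self, name: str, *args):
        if name not in self._native:
            # First call only: turn the stored body into something executable.
            self._native[name] = eval(self._il[name])
            self.compile_count += 1
        # Every call, first or later, runs the cached compiled form.
        return self._native[name](*args)


jit = LazyCompiler({
    "add": "lambda a, b: a + b",
    "square": "lambda x: x * x",
})
```

Nothing is compiled when `jit` is created. The first `jit.call("add", 2, 3)` compiles `add` and returns 5; a second call to `add` reuses the cached version, and `square` stays uncompiled until something calls it.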
JIT compilation happens at runtime for a variety of reasons, one of the most ambitious being Microsoft's desire for cross-platform .NET adoption. If a CLR is built for another operating system (UNIX or Mac), the same assemblies will run there as well as on the Microsoft platforms. The hope is that .NET assemblies will be write-once-run-anywhere applications. This .NET feature works behind the scenes, ensuring that developers are not limited to writing applications for a single line of products. No one has yet demonstrated whether this promise will truly materialize.
The MSIL Instruction Set Specification is included with the .NET SDK, along with the IL Assembly Language Programmers Reference. If a developer wants to write custom .NET programming languages, these are the necessary specifications and syntax. The CTS and the Common Language Specification (CLS) define the types and syntaxes that every .NET language needs to embrace. An application may not expose these features, but it must consider them when communicating through IL.
A big idea in the works
In this article, I've covered various aspects of the .NET platform. This includes how the CLR works to allow programs written in any language to be packaged into assemblies that are executable on any system supporting the CLR. The CLS and the CTS, together with the IL Assembly Language Programmers Reference and MSIL Instruction Set Specification, define how the IL in an assembly is generated as well as the process it will go through to be rendered usable by the CLR. Adoption of these standards will determine whether .NET becomes the industrywide default Microsoft hopes for.