
C Sharp

From Wikipedia, the free encyclopedia

The correct title of this article is C#. The substitution or omission of a # sign is due to technical restrictions.
This article is about the programming language. For the musical note, see musical notation.
C#
Paradigm: structured, imperative, object-oriented
Appeared in: 2001 (last revised 2005)
Designed by: Microsoft Corporation
Typing discipline: static, strong, both safe and unsafe, nominative
Major implementations: .NET Framework, Mono
Dialects: 1.0, 1.5, 2.0 (ECMA)
Influenced by: Delphi, C++, Java, Eiffel
Influenced: Nemerle

C# (see section on naming, pronunciation) is an object-oriented programming language developed by Microsoft as part of its .NET initiative, and later approved as a standard by ECMA and ISO. C# has a procedural, object-oriented syntax based on C++ that includes aspects of several other programming languages (most notably Delphi, Visual Basic, and Java), with a particular emphasis on simplification (fewer symbolic requirements than C++, fewer decorative requirements than Java).

This article describes the language as defined in the ECMA and ISO standards, and avoids description of Microsoft's implementation. For a description of Microsoft's implementation, see Microsoft Visual C#.


Language design goals

The ECMA standard lists these design goals for C#:

  • C# is intended to be a simple, modern, general-purpose, object-oriented programming language.
  • The language, and implementations thereof, should provide support for software engineering principles such as strong type checking, array bounds checking, detection of attempts to use uninitialized variables, and automatic garbage collection. Software robustness, durability, and programmer productivity are important.
  • The language is intended for use in developing software components suitable for deployment in distributed environments.
  • Source code portability is very important, as is programmer portability, especially for those programmers already familiar with C and C++.
  • Support for internationalization is very important.
  • C# is intended to be suitable for writing applications for both hosted and embedded systems, ranging from the very large that use sophisticated operating systems, down to the very small having dedicated functions.
  • Although C# applications are intended to be economical with regards to memory and processing power requirements, the language was not intended to compete directly on performance and size with C or assembly language.

Architectural history

C#'s principal designer and lead architect at Microsoft is Anders Hejlsberg. His previous experience in programming language and framework design (Visual J++, Borland Delphi, Turbo Pascal) can be readily seen in the syntax of the C# language, as well as throughout the core of the Common Language Runtime (CLR). In interviews and technical papers he has stated that flaws in most major programming languages (for example C++, Java, Delphi, and Smalltalk) drove the fundamentals of the CLR, which in turn drove the design of the C# programming language itself. Some argue that C# shares roots with other languages, as suggested by programming language history charts.

Language features

C# is, in some senses, the programming language which most directly reflects the underlying Common Language Infrastructure (CLI). It was designed specifically to take advantage of the features that the CLI provides. Most of C#'s intrinsic types correspond to value-types implemented by the CLI framework. However, the C# language specification does not state the code generation requirements of the compiler: that is, it does not state that a C# compiler must target a common language runtime, or generate Microsoft Intermediate Language (MSIL), or generate any other specific format. Theoretically, a C# compiler could generate machine code like traditional compilers of C++ or FORTRAN. That said, the Microsoft implementation of C# is by far the predominant one, and this article describes its characteristics and behavior, unless noted otherwise.

Compared to C and C++, the language is restricted or enhanced in a number of ways, including but not limited to the following:

  • There are no global variables. All methods and members must be declared as part of a class.
  • A local variable cannot reuse the name of a variable declared in an enclosing block, unlike in C and C++. C++ texts often describe such shadowing as a potential cause of confusion and ambiguity; C# simply disallows it.
  • Instead of functions being visible globally, such as the printf() function in C, all functions must be declared in classes. Classes are almost always organized into namespaces in order to prevent naming conflicts.
  • Namespaces are hierarchical. Namespaces may also be declared in other namespaces.
  • All types, including primitives such as integers, derive from the object class, and so all inherit the properties and methods of object. For example, every type has a ToString() method.
  • C# supports a boolean type, bool. Statements that take conditions, such as while and if, require an expression of a boolean type. While C and C++ do have a boolean type, it can be freely converted to and from integers, and expressions such as if(a) require only that a is convertible to bool, allowing a to be an int or a pointer. C# disallows this 'integer meaning true or false' approach on the grounds that forcing programmers to use expressions that return exactly bool helps prevent certain types of programming mistakes.
  • Restricted support for pointers. Pointers can only be used within unsafe scopes, and only programs with appropriate permissions can execute code marked as unsafe. Most object access is done through safe references, which cannot be made invalid. Arithmetic can be checked for overflow in checked contexts. Unsafe pointers can point to value types that contain no references, and to the characters of a string pinned with the fixed statement. Safe code can store and manipulate (but not necessarily use) pointers through the System.IntPtr type.
  • Managed memory cannot be explicitly freed, but instead is automatically garbage collected when no references to the memory exist. Garbage collection addresses a common programming mistake of allocating memory and not releasing it, known as a memory leak. C# also provides explicit control of unmanaged resources, such as database connections, through the IDisposable interface and the using statement, which together are an explicit form of "Resource Acquisition Is Initialization" (RAII).
  • Multiple inheritance is not supported, although a class can implement any number of interfaces. This was a design decision by the language's lead architect (Anders Hejlsberg) to avoid complication, avoid "dependency hell," and simplify architectural requirements throughout CLI.
  • C# is more typesafe than C++. The only implicit conversions by default are safe conversions, such as widening of integers and conversion from a derived type to a base type. This is enforced at compile-time, during JIT, and, in some dynamic cases, at runtime. There are no implicit conversions between booleans and integers, or between enumeration members and integers (apart from the literal 0, which converts implicitly to any enumeration type). Any user-defined implicit conversion must be explicitly marked as such, unlike C++ converting constructors, which are implicit unless declared explicit.
  • Enumeration members are placed in the scope of their own enumeration type rather than in the enclosing scope.
  • Accessors called properties can be used to modify an object with syntax that resembles C++ member field access. In C++, declaring a member public enables both reading and writing to that member. In C#, properties allow you to control the access to a member and validate the data.
  • Full type reflection and discovery is available.
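
Several of the points above can be seen together in a short program. The following is a minimal sketch rather than a definitive reference; the names TraceScope, Thermostat and FeatureDemo are hypothetical and exist only for this illustration. It shows a condition that must be a bool, overflow checking requested with checked, deterministic cleanup through IDisposable and using, and a property that validates writes.

```csharp
using System;

// Hypothetical resource type, used only for this illustration.
public class TraceScope : IDisposable
{
    public static int OpenCount = 0;

    public TraceScope() { OpenCount++; }

    // Called automatically when a using block is left.
    public void Dispose() { OpenCount--; }
}

// Hypothetical class showing a property that validates its input.
public class Thermostat
{
    private int target = 20;

    // Callers write t.Target = 25 as if it were a field.
    public int Target
    {
        get { return target; }
        set
        {
            if (value < 5 || value > 30)
                throw new ArgumentOutOfRangeException("value");
            target = value;
        }
    }
}

public static class FeatureDemo
{
    // Returns true if adding 1 to int.MaxValue is reported as an overflow.
    public static bool OverflowIsDetected()
    {
        int big = int.MaxValue;
        try
        {
            big = checked(big + 1); // overflow checking requested explicitly
            return false;
        }
        catch (OverflowException)
        {
            return true;
        }
    }

    public static void Main()
    {
        int count = 3;
        // if (count) { }   would not compile: an int is not a bool
        if (count != 0)
            Console.WriteLine(count.ToString());     // prints 3

        using (new TraceScope())
        {
            Console.WriteLine(TraceScope.OpenCount); // prints 1
        }
        Console.WriteLine(TraceScope.OpenCount);     // prints 0

        Console.WriteLine(OverflowIsDetected());     // prints True
    }
}
```

The using block calls Dispose even if the body throws, which is why it is often compared to RAII.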

C# 2.0 new language features

New features in C# for the .NET SDK 2.0 (corresponding to the 3rd edition of the ECMA-334 standard) are:

  • Partial classes allow a class to be implemented across more than one file. This permits breaking down very large classes, and is useful if some parts of a class are machine generated. Unlike in VB, the partial keyword must appear in every part of the class declaration.

file1:

   public partial class MyClass1
   {
       public MyClass1()
       {
           // implementation here
       }
   }

file2:

   public partial class MyClass1
   {
       public void Method1()
       {
           // implementation here
       }
   }
  • Generics or parameterized types. This is a .NET 2.0 feature supported by C#. Unlike C++ templates, .NET parameterized types are specialized at runtime rather than by the compiler; hence they can be used across languages, whereas C++ templates cannot. They support some features not supported directly by C++ templates, such as type constraints on generic parameters by use of interfaces, although similar checks can be implemented in C++. On the other hand, unlike C++ templates, they cannot take expressions as generic parameters. They also differ from Java generics in that parameterized types are first-class objects in the virtual machine, which allows for optimizations and preservation of the type information. Generics were initially designed and implemented by Microsoft Research, Cambridge. A template language feature for value types also exists.
  • Static classes, which represent a concept close to VB.NET modules; they cannot be instantiated from code and allow only static members.
  • A new form of iterator that employs coroutines via a functional-style yield keyword similar to yield in Python.
  • Anonymous delegates providing closure functionality.
  • Covariance and contravariance for signatures of delegates
  • The visibility of property get and set accessors can be set independently. Example:
string status = string.Empty;
public string Status
{
    get { return status; }     
    internal set { status = value; }
}
  • Nullable value types (denoted by a question mark, e.g. int? i = null;), allowing improved interaction with SQL databases.

Nullable types received an eleventh-hour change at the end of August 2005 (only weeks before the official launch) to fix their boxing behaviour. A nullable variable that is assigned null is not actually a null reference (it is a value type), so before the change, boxing such a value produced a non-null reference. After the change, boxing it yields a null reference. The following code distinguishes the two behaviours:

int? i = null;
object o = i;
if (o == null)
  Console.WriteLine("Correct behaviour - you are running a version from Sept 05 or later");
else
  Console.WriteLine("Incorrect behaviour, prior to Sept 05 releases");

The lateness of this fix caused some controversy, since it required core CLR changes affecting not only .NET 2.0, but all dependent technologies (including C#, VB, SQL Server 2005 and Visual Studio 2005).

  • Coalesce operator (??): returns the first of its operands that is not null (the left operand if it is non-null, otherwise the right operand):
object nullObj = null; 
object obj = new Object(); 
return nullObj ?? obj; //returns obj

The primary use of this operator is to assign the value of a nullable type to a variable of a non-nullable type with an easy syntax:

int? i = null;
int j = i ?? default(int); //can't assign null to int
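
The generics, iterator and anonymous delegate features listed above have no sample of their own, so the following sketch combines them. It is an illustration under stated assumptions, not code from the specification: Csharp2Demo, Max, Squares and SumOfSquares are hypothetical names, while IComparable<T>, IEnumerable<T> and Action<T> are standard library types.

```csharp
using System;
using System.Collections.Generic;

// Static class: cannot be instantiated and holds only static members.
public static class Csharp2Demo
{
    // Generic method with an interface constraint on the type parameter.
    public static T Max<T>(T a, T b) where T : IComparable<T>
    {
        return a.CompareTo(b) >= 0 ? a : b;
    }

    // Iterator: each yield return hands one value to the caller lazily.
    public static IEnumerable<int> Squares(int count)
    {
        for (int i = 1; i <= count; i++)
            yield return i * i;
    }

    // Anonymous delegate that captures the local variable total (a closure).
    public static int SumOfSquares(int count)
    {
        int total = 0;
        Action<int> add = delegate(int n) { total += n; };
        foreach (int square in Squares(count))
            add(square);
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(Max<int>(3, 7));   // prints 7
        Console.WriteLine(SumOfSquares(3));  // prints 14 (1 + 4 + 9)
    }
}
```

Because the constraint names IComparable<T>, a call such as Max with a type that does not implement that interface is rejected at compile time.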

C# 3.0 new language features

In C# 3.0 there will be new features, driven largely by the introduction of the Language Integrated Query (LINQ) pattern:

  • The "from", "where" and "select" keywords, which allow queries over SQL, XML, collections, and more (Language Integrated Query [1])
  • Object initializers: Customer c = new Customer(); c.Name = "James"; can be written Customer c = new Customer { Name="James" };
  • Lambda expressions: listOfFoo.Where(delegate(Foo x) { return x.size > 10;}) can be written listOfFoo.Where(x => x.size > 10);
  • Compiler-inferred translation of Lambda expressions to either strongly-typed function delegates or strongly-typed expression trees
  • Anonymous types: var x = new { Name = "James" }
  • Local variable type inference: var x = "hello"; is interchangeable with string x = "hello";. Aside from allowing this syntactic sugar -- which can be of great use when dealing with complex generic types -- it's required to allow the declaration of anonymously-typed variables (see above) because the true name of the type is known only to the compiler at compile time.
  • Extension methods (adding methods to classes by including the this keyword in the first parameter of a method on another static class):
   public static class IntExtensions 
   {
       public static void PrintPlusOne(this int x) { Console.WriteLine(x + 1); }
   }
   int foo = 0;
   foo.PrintPlusOne();

C# 3.0 was unveiled at the PDC 2005. A preview with specifications is available from the Visual C# site at Microsoft.

Microsoft has emphasized that the new language features of C# 3.0 will be available without any changes to the .NET runtime. As a result, C# 2.0 and 3.0 will both be bytecode-compatible with the .NET Framework 2.0. (The .NET Framework 3.0 does not include C# 3.0.)

Although the new features may only slightly change simple in-memory queries, such as List.FindAll or List.RemoveAll, the pattern used by LINQ allows for significant extension points to enable queries over different forms of data, both local and remote.
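
As a rough sketch of how these features fit together, the example below runs an in-memory query. It assumes the System.Linq namespace under which the query operators eventually shipped (preview releases used a different namespace), and Customer, Orders, LinqDemo and FrequentBuyers are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical data type for this illustration.
public class Customer
{
    public string Name;
    public int Orders;
}

public static class LinqDemo
{
    // Query expression; the compiler translates it into Where/Select calls.
    public static List<string> FrequentBuyers(IEnumerable<Customer> customers, int minOrders)
    {
        var query = from c in customers
                    where c.Orders >= minOrders
                    select c.Name;
        return query.ToList();
    }

    public static void Main()
    {
        // Object initializers set the fields as part of construction.
        Customer[] customers = new Customer[]
        {
            new Customer { Name = "James", Orders = 12 },
            new Customer { Name = "Ana", Orders = 3 },
            new Customer { Name = "Li", Orders = 20 }
        };

        foreach (string name in FrequentBuyers(customers, 10))
            Console.WriteLine(name);                          // prints James, then Li

        // The same filter written with a lambda expression.
        int frequent = customers.Where(c => c.Orders >= 10).Count();
        Console.WriteLine(frequent);                          // prints 2

        // Anonymous type: var is required because the type has no usable name.
        var summary = new { Total = customers.Length, Frequent = frequent };
        Console.WriteLine(summary.Total - summary.Frequent);  // prints 1
    }
}
```

Where and Count are themselves extension methods, so the lambda form relies on the extension method feature described above.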

See also Language Integrated Query.

Code libraries

The ECMA C# specification details a minimum set of types and class libraries that the compiler expects to have available; these define the basics required. In practice, most implementations ship with a larger set of libraries.

The .NET Framework is a class library which can be used from a .NET language to perform tasks from simple data representation and string manipulation to generating dynamic web pages (ASP.NET), XML parsing, Web Services/Remoting (SOAP) and reflection. The code is organized into a set of namespaces which group together classes with a similar function, e.g. System.Drawing for graphics, System.Collections for data structures and System.Windows.Forms for the Windows Forms system.

A further level of organisation is provided by the concept of an assembly. An assembly can be a single file or multiple files linked together (through al.exe) which may contain many namespaces and objects. Programs needing classes to perform a particular function might reference assemblies such as System.Drawing.dll and System.Windows.Forms.dll as well as the core library (known as mscorlib.dll in Microsoft's implementation).
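
The relationship between a type, its namespace and its assembly can be inspected with reflection. The sketch below is only an illustration; LibraryDemo and NamespaceOf are hypothetical names, and the assembly name it prints depends on the implementation in use.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class LibraryDemo
{
    // Returns the namespace that groups the given type.
    public static string NamespaceOf(Type type)
    {
        return type.Namespace;
    }

    public static void Main()
    {
        Type listType = typeof(List<int>);

        Console.WriteLine(NamespaceOf(listType));   // prints System.Collections.Generic

        // The assembly is the deployment unit that contains the type.
        // Its name varies between implementations, so no output is shown.
        Assembly assembly = listType.Assembly;
        Console.WriteLine(assembly.GetName().Name);
    }
}
```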

Hello world example

The following is a very simple C# program, a version of the classic "Hello world" example.

public class ExampleClass
{
    public static void Main()
    {
        System.Console.WriteLine("Hello, world!");
    }
}

The effect is to write the text Hello, world! to the output console. Each line serves a specific purpose, as follows:

public class ExampleClass

This is a class definition. It is public, meaning code in other projects can freely use this class. All the information between the following braces describes this class.

public static void Main()

This is the entry point where the program begins execution. It could be called from other code using the syntax ExampleClass.Main(). (The public static void portion is a subject for a slightly more advanced discussion.)

System.Console.WriteLine("Hello, world!");

This line performs the actual task of writing the output. Console is a class in the System namespace that represents a command-line console where a program can read and write text. The program calls the Console method WriteLine, which causes the string passed to it to be displayed on the console.

Standardization

In August 2000, Microsoft Corporation, Hewlett-Packard and Intel Corporation co-sponsored the submission of specifications for C# as well as the Common Language Infrastructure (CLI) to the international standardization organization ECMA. In December 2001, ECMA released ECMA-334 C# Language Specification. C# became an ISO standard in 2003 (ISO/IEC 23270). ECMA had previously adopted equivalent specifications as the 2nd edition of C# in December 2002.

In June 2005, ECMA approved edition 3 of the C# specification, and updated ECMA-334. Additions included partial classes, anonymous methods, nullable types, and generics (similar to C++ templates and Java generics). In July 2005, ECMA submitted the standards and related TRs to ISO/IEC JTC 1 via the latter's Fast-Track process. This process usually takes 6 to 9 months.

ECMA specification 334 covers only the C# language. Programs written in C# commonly use the .NET framework, which is partly described by other specifications and is partly proprietary to Microsoft.

Microsoft released support for the 3rd edition of C# in the .NET SDK 2.0 and Visual Studio 2005, in November 2005.

Microsoft has made it clear that C#, as well as the other .NET languages, is an important part of its software strategy for both internal use and external consumption. The company takes an active role in marketing the language as part of its overall business strategies.

Implementations

There are four known C# compilers:

  • The de facto standard implementation of the C# language is Microsoft's Visual C#.
  • Microsoft's Rotor project (currently called Shared Source Common Language Infrastructure) provides a shared source implementation of the CLR runtime and a C# compiler.
  • The Mono project provides a CLR runtime, an implementation of the .NET libraries, and a C# compiler.
  • The DotGNU project provides a CLR runtime, an implementation of the .NET libraries, and a C# compiler.

Language name

According to the ECMA-334 C# Language Specification, section 6, Acronyms and abbreviations [2] the name of the language is written "C#" ("LATIN CAPITAL LETTER C (U+0043) followed by the NUMBER SIGN # (U+0023)") and pronounced "C Sharp".

C sharp musical note

Due to technical limitations of display (fonts, browsers, etc.) and the fact that the sharp symbol (♯, U+266F, MUSIC SHARP SIGN, see graphic at right if the symbol is not visible) is not present on the standard keyboard, the number sign (#) was chosen to represent the sharp symbol in the written name of the language. So, although the symbol in "C#" represents the sharp symbol, it is actually the number sign ("#"). Although Microsoft's C# FAQ refers to the sharp symbol in the language name, Microsoft clarifies the language name as follows:

"The spoken name of the language is "C sharp" in reference to the musical "sharp" sign, which increases a tone denoted by a letter (between A and G) by half a tone. However, for ease of typing it was decided to represent the sharp sign by a pound symbol [3] (which is on any keyboard) rather than the "musically correct" Unicode sharp sign. The Microsoft and ECMA 334 representation symbols thus agree: the # in C# is the pound sign, but it represents a sharp sign. Think of it in the same way as the <= glyph in C languages which is a less than sign and an equals sign, but represents a less-than-or-equals sign.", Microsoft Online Customer Service

The choice to represent the sharp symbol (♯) with the number sign (#) has led to confusion regarding the name of the language. For example, although most printed literature uses the correct number sign [4], some incorrectly use the sharp symbol.

The "sharp" suffix has been used by a number of other .NET languages that are variants of existing languages, including J# (Microsoft's implementation of Java), A# (from Ada), and F# (presumably from System F, the type system used by the ML family). The suffix is also sometimes used for libraries, such as Gtk# (a .NET wrapper for GTK+ and other GNOME libraries) and Cocoa# (a wrapper for Cocoa).

One interpretation of the name C# is that it denotes an improved version of C, by analogy with the musical note, which is half a step above the C note. This is similar to the play on words used by the language name C++; "++" is a C operator that increases a variable by one.

Another interpretation is that the sharp symbol represents four plus signs joined together, that is, C++ with another ++ on top.

References

  1. ^ LINQ
  2. ^ http://www.ecma-international.org/publications/standards/Ecma-334.htm
  3. ^ The "pound" symbol is known in most English-speaking countries outside North America as the "hash" or number sign
  4. ^ http://www.microsoft.com/MSPress/books/imgt/5029.gif


Preceding:
Subsequent: Polyphonic C#, , Spec#