Wednesday, May 20, 2009

A Comparative Overview of C#

This article focuses on the new ways of programming that C# offers, and on how it intends to improve upon its two closest neighbors, Java and C++. C# improves on C++ in many of the same ways Java does, so I'm not going to re-explain things like the benefits of a single-rooted object hierarchy. The article begins with a brief summary of the similarities between C# and Java, and then explores the new C# features.

Background: In June 2000, Microsoft announced both the .NET platform and a new programming language called C#. C# is a strongly-typed object-oriented language designed to give the optimum blend of simplicity, expressiveness, and performance. The .NET platform is centered around a Common Language Runtime (similar to a JVM) and a set of libraries. A wide variety of languages can use these libraries and work together, because they all compile to an intermediate language (IL). C# and .NET are somewhat symbiotic: some features of C# are there to work well with .NET, and some features of .NET are there to work well with C# (though .NET aims to work well with many languages). This article is mostly concerned with C#, but sometimes it is useful to discuss .NET too. The C# language was built with the hindsight of many languages, most notably Java and C++. It was co-authored by Anders Hejlsberg (who is famous for the design of the Delphi language) and Scott Wiltamuth.

1. C# and Java

Below is a list of features C# and Java share, which are intended to improve on C++. These features are not the focus of this article, but it is very important to be aware of the similarities.

  • Compiles into machine-independent, language-independent code which runs in a managed execution environment
  • Garbage Collection coupled with the elimination of pointers (in C# restricted use is permitted within code marked unsafe)
  • Powerful reflection capabilities
  • No header files, all code scoped to packages or assemblies, no problems declaring one class before another with circular dependencies
  • Classes all descend from object and must be allocated on the heap with the new keyword
  • Thread support by putting a lock on objects when entering code marked as locked/synchronized
  • Interfaces, with multiple-inheritance of interfaces, single inheritance of implementations
  • Inner classes
  • No concept of inheriting a class with a specified access level
  • No global functions or constants, everything belongs to a class
  • Arrays and strings with lengths built-in and bounds checking
  • The '.' operator is always used, no more ->, :: operators
  • null and boolean/bool are keywords
  • All values are initialized before use
  • Can't use integers to govern if statements
  • try blocks can have a finally clause

2. Properties

Properties will be a familiar concept to Delphi and Visual Basic users. The motivation is for the language to formalize the concept of getter/setter methods, which is an extensively used pattern, particularly in RAD (Rapid Application Development) tools.

This is typical code you might write in Java or C++:

foo.setSize (foo.getSize () + 1);
label.getFont().setBold (true);

In C# you would write the same code like this:

foo.size++;
label.font.bold = true;

The C# code is immediately more readable by those who are using foo and label. There is similar simplicity when implementing properties:

Java/C++:

public int getSize() {
    return size;
}

public void setSize (int value) {
    size = value;
}

C#:

public int Size {
    get {
        return size;
    }
    set {
        size = value;
    }
}

Particularly for read/write properties, C# provides a cleaner way of handling this concept. The relationship between a get and set method is inherent in C#, while in Java or C++ it has to be maintained by hand. There are many benefits to this approach. It encourages programmers to think in terms of properties, and to consider whether a property is most natural as read/write or read-only, or whether it really shouldn't be a property at all. If you wish to change the name of your property, you only have one place to look (I've seen getters and setters several hundred lines away from each other). Comments only have to be made once, and won't get out of sync with each other. It is feasible that an IDE could help out here (and in fact I suggest they do), but one should remember that an essential principle in programming is to make abstractions which model our problem space well. A language which supports properties will reap the benefits of that better abstraction.

One possible argument against this being a benefit is that you don't know whether you're manipulating a field or a property with this syntax. However, almost all classes with any real complexity designed in Java (and certainly in C#) do not have public fields anyway. Fields typically have a reduced access level (private/protected/default) and are only exposed through getters/setters, which means one may as well have the nicer syntax. It is also entirely feasible for an IDE to parse the code and highlight properties in a different color, or to provide code completion information indicating whether a member is a property or not. It should also be noted that if a class is designed well, then a user of that class should only worry about the specification of that class, and not its implementation. Another possible argument is that properties are less efficient. As a matter of fact, good compilers can in-line the default getter which merely returns a field, making it just as fast as a field. Finally, even if using a field is more efficient than a getter/setter, it is a good thing to be able to change the field to a property later without breaking the source code which relies on it.
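The last point can be sketched as follows. This is only an illustration under stated assumptions: the classes BoxV1 and BoxV2 are hypothetical names (not from the original article) standing for two versions of the same class, the first exposing a public field and the second replacing it with a validating property.

```csharp
using System;

// Hypothetical version 1: Size is a plain public field.
public class BoxV1
{
    public int Size;
}

// Hypothetical version 2: Size is now a property backed by a private field.
// The setter rejects negative values, which a bare field cannot do.
public class BoxV2
{
    private int size;

    public int Size
    {
        get { return size; }
        set
        {
            if (value < 0)
            {
                throw new ArgumentOutOfRangeException("value", "Size cannot be negative");
            }
            size = value;
        }
    }
}

public static class PropertyDemo
{
    public static void Main()
    {
        BoxV1 oldBox = new BoxV1();
        oldBox.Size = 3;
        oldBox.Size++;      // direct field access

        BoxV2 newBox = new BoxV2();
        newBox.Size = 3;
        newBox.Size++;      // same syntax, now calls get and then set

        Console.WriteLine(oldBox.Size);
        Console.WriteLine(newBox.Size);
    }
}
```

The calling code is written the same way against both versions, so callers need recompiling but not editing when the field becomes a property.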

3. Indexers

C# provides indexers, which allow objects to be treated like arrays, except that, like properties, each element is exposed through a get and/or set method.

public class Skyscraper
{
    Story[] stories;
    public Story this [int index] {
        get {
            return stories [index];
        }
        set {
            if (value != null) {
                stories [index] = value;
            }
        }
    }
    ...
}

Skyscraper empireState = new Skyscraper (...);
empireState [102] = new Story ("The Top One", ...);

4. Delegates

A delegate can be thought of as a type-safe, object-oriented function pointer that is able to hold multiple methods rather than just one. Delegates handle problems which would be solved with function pointers in C++, and with interfaces in Java. They improve on the function pointer approach by being type safe and being able to hold multiple methods. They improve on the interface approach by allowing a method to be invoked without the need for inner-class adapters or extra code to handle multiple-method invocations. The most important use of delegates is event handling, which the full article covers in its next section (with an example of how delegates are used).
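As a minimal sketch of the idea (not the event-handling example from the full article), the code below declares a delegate type and attaches two methods to one delegate instance. The names Notifier, Log, WriteUpper, and WriteLength are hypothetical and chosen only for illustration.

```csharp
using System;

// A delegate type: any method taking a string and returning void matches it.
public delegate void Notifier(string message);

public class Log
{
    private string lines = "";

    // Read-only view of everything the handlers recorded, in call order.
    public string Lines
    {
        get { return lines; }
    }

    public void WriteUpper(string message)
    {
        lines += message.ToUpper() + ";";
    }

    public void WriteLength(string message)
    {
        lines += message.Length + ";";
    }
}

public static class DelegateDemo
{
    public static void Main()
    {
        Log log = new Log();

        // Start with one method, then add a second to the same delegate.
        Notifier notify = new Notifier(log.WriteUpper);
        notify += new Notifier(log.WriteLength);

        // One invocation calls both methods, in the order they were added.
        notify("hello");

        Console.WriteLine(log.Lines);   // prints "HELLO;5;"
    }
}
```

The compiler checks that each attached method matches the delegate's signature, which is the type safety a raw C++ function pointer cast does not give you. No adapter class is needed to forward the call, as it would be with a Java listener interface.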

Interoperability

I thought it would be useful to group interoperability into three divisions: Language interoperability, Platform interoperability, and Standards interoperability. While Java has its defining strength in platform interoperability, C# has its strength in language interoperability. Both have strengths and weaknesses in standards interoperability.

Language Interoperability: This is the level and ease of integration with other languages. Both the Java Virtual Machine and the Common Language Runtime allow you to write code in many different languages, so long as they compile to byte code or IL code respectively. However, the .NET platform has done much more than just allow other languages to be compiled to IL code. .NET allows multiple languages to freely share and extend each other's libraries to a great extent. For instance, an Eiffel or Visual Basic programmer could import a C# class and override a virtual method of that class, and the C# object would now use the Visual Basic method (polymorphism). In case you were wondering, VB.NET has been massively upgraded (at the expense of compatibility with VB6) to have modern object-oriented features.

Languages written for .NET will generally plug into the Visual Studio.NET environment and use the same RAD frameworks if needed, thus overcoming the "second rate citizen" effect of using another language.

C# provides P/Invoke, which is a much simpler way to interact with C code than Java's JNI, since there is no wrapper DLL to write. This feature is very similar to J/Direct, which is a feature of Microsoft Visual J++.
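A minimal sketch of what a P/Invoke declaration looks like, assuming a Windows machine: it declares the Win32 MessageBox function from user32.dll and calls it directly. The class names NativeMethods and PInvokeDemo are hypothetical; the call itself will only work where user32.dll is available.

```csharp
using System;
using System.Runtime.InteropServices;

public static class NativeMethods
{
    // Declares the Win32 MessageBox function exported by user32.dll.
    // The runtime loads the DLL and marshals the arguments when it is called.
    [DllImport("user32.dll", CharSet = CharSet.Auto)]
    public static extern int MessageBox(IntPtr hWnd, string text, string caption, uint type);
}

public static class PInvokeDemo
{
    public static void Main()
    {
        // Windows only: shows a message box with a single OK button (type 0).
        NativeMethods.MessageBox(IntPtr.Zero, "Hello from C#", "P/Invoke", 0);
    }
}
```

The declaration is all the glue there is: no C source has to be written or compiled on the C# side, which is the contrast with JNI drawn above.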

Platform Interoperability: Generally this means OS interoperability, but over the last few years the internet browser has emerged as a platform in itself.

C# code runs in a managed execution environment, which is the most important technological step to making C# run on different operating systems. However, some of the .NET libraries are based on Windows, particularly the WinForms library, which depends on the nitty-gritty details of the Windows API. There is a project to port the Windows API to Unix systems, but it is not available yet and Microsoft has not given any firm indication of its intentions in this area.

However, Microsoft hasn't ignored platform interoperability. The .NET libraries provide extensive capabilities to write HTML/DHTML solutions. For solutions which can be implemented with an HTML/DHTML client, C#/.NET is a good choice. For cross-platform projects which require a more complex client interface, Java is a good choice. Kylix, a version of Delphi which allows the same code to compile for both Windows and Linux, may also be a good choice for rich cross-platform solutions in the future.

Microsoft has submitted the C# specification as well as parts of the .NET specification to the ECMA standards body.

Standards Interoperability: These are all the standards that the language can access, such as database systems, graphics libraries, internet protocols, and object communication standards like COM and CORBA. Since Microsoft owns or plays a big role in defining many of these standards, it is in a very good position to support them. It of course has business motivations (I'm not saying they are or are not justified) to provide less support for standards which compete with its own: for instance, CORBA competes with COM and OpenGL competes with DirectX. Similarly, Sun's business motivations (again I'm not saying they are or are not justified) mean Java doesn't provide as good support for Microsoft standards as it could.

C# objects, since they are implemented as .NET objects, are automatically exposed as COM objects. C# thus has the ability to expose COM objects as well as to use COM objects. This will allow the huge base of COM code to be integrated with C# projects. .NET is a framework which can eventually replace COM, but there is so much deployed COM code that by the time this happens I'm sure .NET will be replaced by the next wave of technology. Anyway, expect .NET to have a long and interesting history!

Conclusion

I hope this has given you a feel for where C# stands in relation to Java and C++. Overall, I believe C# provides greater expressiveness and is more suited to writing performance-critical code than Java, while sharing Java's elegance and simplicity, which makes both much more appealing than C++.

Full article: Comparative C#