Warning: C# is not C++   2010-07-15 00:19:22

I just spent the last two or three hours learning an important lesson about classes, structs, POD types & argument passing in C#. They have some nasty fundamental differences from their C++ counterparts, and the language is not to be approached on the assumption that "hey, I know object orientation, what can possibly go wrong?"

Call me old-fashioned, but when programming, I like to know with some certainty whether or not a copy will be taken when I pass an argument into a function. I also like to be able to delete objects and know that they're gone - if the program crashes later on because something else still wants that object, it's my fault and I'll fix it. So it was with some trepidation that I started playing with C# this week, and so far my fears have proved well founded.

I thought that the whole point of garbage collection was to make that sort of thing easier, so there was much wailing and gnashing of teeth when I discovered that C# has different behaviours for classes versus POD types ("value types" in C# terms, a category that also includes structs), and that, as with seemingly all such languages, it's never quite clear which is which when you're actually looking at code. By default, an instance of a value type - like an integer - is passed by value; if you want a function to modify such an argument, you have to explicitly pass it by reference with the ref keyword. Objects (class instances), however, are passed by reference - or, rather, when you construct an object what you get is a reference, and the reference is passed by value.

Let's consider that last sentence again, as it's a little confusing at first sight. When you construct an object, you get a variable of type "reference to object". By default, this variable - as with POD types - is passed by value, so when you pass it into a function, that function gets a local copy of the variable, but the value - i.e. which object it refers to - is the same. So when you use the variable to manipulate the referenced object, the changes in object state are visible outside the function; but if you change the variable to refer to a different object, that change is purely local. You can of course pass "reference to object" variables explicitly by reference, in which case if a function changes the variable, the passed-in variable now points to the new object as well - so you can do stuff like class factories.
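A minimal sketch of the three cases described above, for anyone who wants to try it. The Widget class and the method names are made up for illustration; only the ref keyword and the passing behaviour are C# itself.

```csharp
using System;

// Hypothetical class, used only for illustration.
class Widget
{
    public string Name;
    public Widget(string name) { Name = name; }
}

static class ReferenceDemo
{
    // The reference is copied, but both copies refer to the same Widget,
    // so a change to the object's state is visible to the caller.
    public static void Rename(Widget w)
    {
        w.Name = "renamed";
    }

    // Reassigning the local copy of the reference changes nothing outside.
    public static void Replace(Widget w)
    {
        w = new Widget("replacement");
    }

    // With ref, the caller's own variable is reassigned.
    public static void ReplaceByRef(ref Widget w)
    {
        w = new Widget("replacement");
    }

    static void Main()
    {
        Widget a = new Widget("original");

        Rename(a);
        Console.WriteLine(a.Name);   // renamed

        Replace(a);
        Console.WriteLine(a.Name);   // renamed (a still refers to the same object)

        ReplaceByRef(ref a);
        Console.WriteLine(a.Name);   // replacement
    }
}
```

Note that ref has to appear both in the method declaration and at the call site, which is at least one place where the code tells you what is going on.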

What is really confusing, though, is that when you instantiate a struct, you don't get a variable of type "reference to struct" - you just get the struct. Pass it into a function by value, and changes to its properties and fields are not visible outside the function, because you're operating on a copy - but you can, of course, have fields of type "reference to object" in the struct, and use those to make visible changes to the referenced objects.
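Here is a small sketch of the struct-copy behaviour, assuming a hypothetical Point struct (the names are mine, not from any library). The same method body either does nothing useful or does what you expect, depending only on whether the parameter is marked ref.

```csharp
using System;

// Hypothetical struct, used only for illustration.
struct Point
{
    public int X;
    public int Y;
}

static class StructDemo
{
    // p is a copy of the caller's struct; the assignment is lost on return.
    public static void MoveCopy(Point p)
    {
        p.X = 100;
    }

    // p is an alias for the caller's struct; the assignment sticks.
    public static void MoveByRef(ref Point p)
    {
        p.X = 100;
    }

    static void Main()
    {
        Point p = new Point { X = 1, Y = 2 };

        MoveCopy(p);
        Console.WriteLine(p.X);   // 1

        MoveByRef(ref p);
        Console.WriteLine(p.X);   // 100
    }
}
```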

So... what happens if you're a newbie to the language, and without realising what you're doing, you create a struct, pass it (by value) into a function, then have that function change a "reference to object" field of the struct to refer to a new object? The change to the field value isn't visible outside the function, because the struct was passed by value, so the function has just wasted its time. With the speed of modern computers, it will have wasted an immeasurably small amount of time; you, on the other hand, will waste vast swathes of time figuring out WHAT THE FUCK IS HAPPENING.
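For the record, the trap can be reproduced in a few lines. This is only a sketch: the Inventory struct and both methods are hypothetical names invented for the example, not the code I was actually fighting with.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical struct holding a "reference to object" field.
struct Inventory
{
    public List<string> Items;
}

static class TrapDemo
{
    // The struct is copied, but the copy's Items field refers to the same
    // list, so mutating the list is visible to the caller.
    public static void AddItem(Inventory inv)
    {
        inv.Items.Add("sword");
    }

    // The field is reassigned on the local copy only; the caller's struct
    // still refers to the old list.
    public static void Reset(Inventory inv)
    {
        inv.Items = new List<string>();
    }

    static void Main()
    {
        Inventory inv = new Inventory { Items = new List<string>() };

        AddItem(inv);
        Console.WriteLine(inv.Items.Count);   // 1

        Reset(inv);
        Console.WriteLine(inv.Items.Count);   // 1 (the reset was silently lost)
    }
}
```

Declaring the parameter as ref Inventory, or making Inventory a class instead of a struct, would make Reset behave the way a newcomer expects.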


1 Stu   2010-07-18 21:59:15

This was very educational. However, I mostly use VB, which kinda uses variable references like:

"Oh what a GOOD BOY you declared a variable! WOW! Maybe someday you'll write a FUNCTION!"

2 Stavron   2010-08-15 14:59:07

Ah, yes, I remember that confusion with Java, where int, float, etc are "POD", but Strings are objects, so you can easily find yourself accidentally modifying someone else's strings, as it were.

This got me thinking, and the thoughts expanded into a blog post of their own, so I guess I'll have to pass it by reference ;)
