Thursday, January 31, 2008

Objects vs SoC Objects

Object Oriented Programming rocks the house. The reason why it is good can be summed up with a single word: Abstraction.

I had an awesome professor in college who said that a person is only so smart. They can only keep so many things in their minds at one time and can only fully understand some of those already limited things. The only way, then, that a person can do anything complicated is to abstract out the details. This frees up more space in their limited heads for other things.

This is true. And because it's so enormously true I'm going to provide some examples of abstractions people use all the time.
  1. Language
  2. Money
  3. Time, Dates
  4. Math
  5. Physics
Language is certainly the most powerful example. I say "Dog" and you know what I mean. I could also say "Golden Retriever." Or I could try not to be so abstract and say "Four legged animal with ears, eyes, tail, nose with extremely sensitive smell, hair all over, and isn't a cat." Of course, now I'm wrong, because there are dogs that have no hair... So, obviously, language is a very powerful abstraction.

In computer programming, we have programming languages. "Modern" programming languages are typically considered modern because they allow more powerful abstractions. Object Orientation is one such abstraction. Though I'm not sure it really qualifies as "modern" anymore...

It's important that programming languages have good abstraction techniques. Software development is complicated. So for a person to do software development they must be able to abstract things. It's not enough to just abstract things on paper either. Things have to be abstracted in code. What are these things? Algorithms and data. That's about all there is in programming. You have data that has some meaning and you have algorithms that manipulate that data in meaningful ways.

Objects allow us to abstract both data and algorithms. Typically the algorithms are still quite detailed, but we can abstract the details into an object so we don't have to think about them. (Aside: Functional programming allows you to make your algorithms more abstract with fewer details.)

Objects let us abstract things by grouping a set of data and algorithms together into a single named container. And then it goes one step further and lets us create many instances of those containers. So now I can have a Dog object. And I can create many of them. And they can have names and colors and weights and heights and any other data they need. They can also bark and whine and soil the carpet and any other actions (algorithms) they need.
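Here's a minimal sketch of that Dog container, written in Python just for illustration. The attribute and method names (name, color, weight_kg, bark, soil_carpet) are my own assumptions, not anything official.

```python
class Dog:
    """A named container that groups a dog's data with its actions."""

    def __init__(self, name, color, weight_kg):
        # Data: each instance carries its own copy of these values.
        self.name = name
        self.color = color
        self.weight_kg = weight_kg

    # Algorithms: actions that operate on this instance's data.
    def bark(self):
        return f"{self.name} says Woof!"

    def soil_carpet(self):
        return f"{self.name} soiled the carpet."


# Many instances of the same container, each with different data.
rex = Dog("Rex", "golden", 30.0)
fido = Dog("Fido", "black", 12.5)
print(rex.bark())
print(fido.soil_carpet())
```

The caller never has to think about how barking works; it just asks a particular Dog to do it.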

Objects also let us represent commonalities and relationships between things through fun words like inheritance and polymorphism. That just makes the abstraction that much more powerful.

So that's why Object Oriented programming rocks the house: Abstraction. But everyone (who is reading this) knows that. So why did I waste all my time writing it and your time reading it? Because I couldn't stop myself.

What I actually want to talk about is what these objects should actually abstract. As far as I can tell there are two schools of thought. I'm pretty sure the one has already mercilessly beaten the other and kicked it to the curb. But in my schooling I wasn't taught about either, so I'm gonna teach myself about them both here.

The two schools:
  1. Objects should represent "the thing" and everything about it
  2. Objects should do exactly one "thing"
Or if you'd prefer,
  1. Objects represent real world objects
  2. Objects manage "concerns" and concerns are separated between objects
In the first you have one big object that knows everything about dogs and manages everything about dogs. Wanna draw a dog on the computer screen? Dog.Draw(); Want to make puppies? Dog.MateWith( Dog );

In the second you have lots of much smaller objects which together, collectively, know all about dogs. DogDisplayer.Draw( Dog ); DogMating.Mate( Dog, Dog ); Note however that this doesn't preclude you from hiding all the various classes behind the Dog class so that a user of the Dog object could still say "Dog.MateWith( Dog )".
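A rough Python sketch of the second school, with the small classes hidden behind Dog, might look like this. DogDisplayer and DogMating come from the calls above; the method bodies, the text "drawing", and the puppy naming are placeholders I made up to keep it runnable.

```python
class DogDisplayer:
    """Knows only how to render a dog (here, as a plain string)."""

    def draw(self, dog):
        return f"[picture of {dog.name}]"


class DogMating:
    """Knows only how to produce a puppy from two dogs."""

    def mate(self, mother, father):
        return Dog(f"Puppy of {mother.name} and {father.name}")


class Dog:
    """Thin front for callers; the real work lives in the helpers."""

    def __init__(self, name, displayer=None, mating=None):
        self.name = name
        # Default helpers, but a caller may pass in different ones.
        self._displayer = displayer or DogDisplayer()
        self._mating = mating or DogMating()

    def draw(self):
        return self._displayer.draw(self)

    def mate_with(self, other):
        return self._mating.mate(self, other)


lady = Dog("Lady")
tramp = Dog("Tramp")
puppy = lady.mate_with(tramp)
print(lady.draw())
print(puppy.name)
```

A user of Dog still writes lady.mate_with(tramp), but the drawing code and the mating code live in separate classes and never touch each other.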

As far as I can tell, current industry thinking is that Separation of Concern Objects are the way to go. However, when object oriented programming is taught it inevitably starts with the Real World Objects because that approach is just easier to understand.

MVC is an example of three Separation of Concern Objects. Just about every Design Pattern depends on Separation of Concern Objects (Observer, Strategy, State, Factory). And the point of just about every Design Principle I've learned is that you should write Separation of Concern Objects; the principles are just defining what Separation of Concern actually means (Single-Responsibility, Open/Closed, Liskov Substitution, Dependency-Inversion, Interface Segregation)!

There are a couple things you have to get over when you start creating these kinds of objects though. First, having lots of objects is not a bad thing. Second, having simple objects that only do one thing is a Good Thing. This often feels like overkill before you do it. After you do it you realize it has the potential to make your life seriously easier.

Okay, but why create many objects? Why not just put all the code in the same object? How is this going to make my life so seriously easier? The biggest reason is that if you put all that code in the same object it will get tangled. That is, two algorithms that really have nothing to do with each other (except that they apply to the same object) will start to affect each other. They'll use the same variables, but in different ways, and they'll make assumptions that the others will break. This will become a problem when you need to change one of them and you suddenly and accidentally break the other.
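To make the tangling concrete, here's a deliberately bad Python sketch. TangledDog and its shared count field are hypothetical; the point is only that two unrelated behaviors lean on the same variable.

```python
class TangledDog:
    """Hypothetical all-in-one dog: both behaviors share one counter."""

    def __init__(self):
        # Originally meant as "barks so far", later reused for accidents.
        self.count = 0

    def bark(self):
        self.count += 1
        # Assumes count only ever tracks barks: hoarse after three of them.
        return "Woof!" if self.count <= 3 else "(hoarse) wuf"

    def soil_carpet(self):
        # Reuses the same field to number the accidents.
        self.count += 1
        return f"accident #{self.count}"


dog = TangledDog()
dog.soil_carpet()
dog.soil_carpet()
dog.soil_carpet()
# The dog has not barked once, yet count is already 3, so the very
# first bark pushes it to 4 and comes out hoarse.
first_bark = dog.bark()
print(first_bark)
```

Neither method is wrong on its own. They only break each other because they live in the same object and share its state.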

The second reason is that you may want to allow for reuse. If everything is all packaged into one object, you can't reuse only a portion of it without the other parts. The final reason is that you may want to swap out the details of how something behaves for different details.

What would this mean for our Dog? Well, we can still have a Dog. But when we tell him to bark, he'll delegate the actual details of how to bark to a DogBark object. And when you tell him to soil the carpet, same thing.

Now if you decide that he should always bark when he soils the carpet, you can do that. And when you later realize that it's bad enough that he's soiling the carpet, but now he's waking you up in the middle of the night barking, you can change it again so he doesn't bark when he soils the carpet. And you won't have to worry that soiling the carpet and barking got all tangled.
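Here's one way that delegation could look in Python. DogBark is the object named above; BarkingSoiling, QuietSoiling, and the soiling attribute are names I invented for the sketch, so treat this as an illustration under those assumptions rather than the one right design.

```python
class DogBark:
    """Owns the details of how to bark."""

    def bark(self):
        return "Woof!"


class QuietSoiling:
    """Soils the carpet and nothing else."""

    def soil(self):
        return ["carpet soiled"]


class BarkingSoiling:
    """Soils the carpet and barks about it, reusing a bark object."""

    def __init__(self, bark):
        self._bark = bark

    def soil(self):
        return ["carpet soiled", self._bark.bark()]


class Dog:
    """Delegates each action to the object that owns that concern."""

    def __init__(self, bark, soiling):
        self._bark = bark
        self.soiling = soiling  # swappable at any time

    def bark(self):
        return self._bark.bark()

    def soil_carpet(self):
        return self.soiling.soil()


bark = DogBark()
dog = Dog(bark, BarkingSoiling(bark))
noisy = dog.soil_carpet()      # barks while soiling the carpet

dog.soiling = QuietSoiling()   # changed our minds: no more night barking
quiet = dog.soil_carpet()
print(noisy, quiet)
```

Switching between the two behaviors is a one-line change, and DogBark itself never had to be touched.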

Moral of the story: SoC Objects add more abstraction to our abstraction. So when I told you that Unit Testing would improve your code design, this is why.

2 comments:

  1. Another solid post that I wholeheartedly agree with, but I would draw your attention to the blurry line when it comes to objects as real world entities.

    I still think there is a debate in terms of how to represent raw data in object form. Say, for example, you're getting a bunch of "dogs" from a database. Well, there are two schools of thought that I know of.

    1) Puts the dogs into individual rows in an object like a dataset, which keeps a very data-driven view of how it works. I mean a dataset is generic, looks and feels like rows in a database.

    2) Puts the dogs into dog objects. So you have properties like Dog.Name, Dog.Age, etc. And if you have a whole bunch of dogs you pass along a collection of them or maybe even make a Litter object (implements ICollection :-) .) I think with this approach you still get a bit of the "old school" objects as real world things view.

    Right now my work uses both. Our webservices architecture prefers the dataset approach. Our batch architecture uses more of a data object approach. The PHP work I've done recently has favored data objects because PHP doesn't really have anything like a dataset.

  2. Interesting point Josh. That does come into it in certain situations.

    .NET 2.0 was all about the datasets.
    .NET 3.5 is all about the custom data objects.

    Personally I'm a fan of the objects. It's better abstraction. You can do all the same stuff with a Data Table, but you end up passing rows around, and converting data types, and indexing into columns with strings... Objects just make life easier if you're writing lots of code against the data.

    On the other hand, if all you need to do is get the data, bind it, and update/save it then it doesn't matter how you store the data. In that case, a dataset is probably the simplest way to go.

    Still, good point. Databases do have a tendency to shake things up.
