Object-Oriented Programming is Bad

2023-09-05

https://www.youtube.com/watch?v=QM1iUe6IofM

Brian Will

2016-01-18

I’m telling you definitively, no, object-oriented programming doesn’t fit any problem and you shouldn’t take it seriously.

What am I not complaining about?

Classes

Classes are not the problem per se.

Where everything goes wrong is when you try and shove every function of your code, every behavior, into an association with a datatype. That leads to disaster.

Performance

Performance is not the problem. Recommended talk, Mike Acton’s “Data-Oriented Design and C++”

Abstraction

Abstraction is actually a worthy goal. In practice, most abstractions aren’t good, it takes a long long time to develop good ones, and as I’ll explain, a major problem with object-oriented programming is it does tend to produce abstractions that aren’t any good.

Aesthetics

Recommended talk, Abner Coimbre’s “What Programming is Never About”

I think his arguments are valid, but his thesis isn’t what his arguments support.

I think that, when really pressed, he would admit that elegance, simplicity, flexibility, readability, maintainability, and structure (all the things you might file under code aesthetics) actually do matter. The more accurate way to spin his point is that these surface-level virtues of code are good and important, but object-oriented programming, and abstraction-heavy programming in general, fails to deliver them. In fact, it provides just the illusion of these things.

Procedural code is better (even when not functional)

I’m pushing procedural programming, not necessarily functional programming. Functional is the future, and we should expect it to be the default for higher-level programming 10 years from now. But it has efficiency problems that make it not viable in certain domains.

So my message is, whether your code ends up functional or imperative, that’s a separate matter. Regardless, your code should be procedural rather than object-oriented.

What are the competing paradigms we’re talking about?

- Procedural and imperative (the default)
  - procedural: no explicit association between data types and functions/behaviors
  - imperative: we don’t have any special handling of shared state
  - it’s like a recipe, it’s the obvious way
  - but shared state creates problems in the long-term, so we saw other paradigms arise to manage shared state
- Procedural and functional (minimize state)
  - all or most functions should be pure
- Object-oriented and imperative (segregate state)
  - “we take the state that makes up a program and instead of sharing it promiscuously, we try and divide and conquer the problem.”
- Object-oriented and functional (do both)
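
To make the first two combinations concrete, here is a minimal sketch in JavaScript. It is not from the talk; the names `cart`, `addItemImperative`, and `addItemPure` are hypothetical and chosen only for illustration.

```javascript
// Procedural + imperative: a plain function that freely mutates shared state.
const cart = { items: [], total: 0 }; // hypothetical shared state

function addItemImperative(name, price) {
    cart.items.push(name); // changes state that lives outside the function
    cart.total += price;
}

// Procedural + functional: the same step as a pure function.
// It reads only its parameters and returns a new value.
function addItemPure(currentCart, name, price) {
    return {
        items: [...currentCart.items, name],
        total: currentCart.total + price,
    };
}

addItemImperative("tea", 3);

const before = { items: [], total: 0 };
const after = addItemPure(before, "tea", 3); // `before` is left untouched
```

In both versions the functions stay separate from the data types, which is what makes them procedural; they differ only in how they treat state.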

Inheritance is irrelevant

Nobody defends it anymore.

Polymorphism is not (exclusively) OOP

Encapsulation does not work

Technically, the point of this talk is that encapsulation does not work at fine-grained levels of code. And that’s the problem with OOP–OOP asserts that everything should be encapsulated all the way down.

Why does OOP dominate the industry?

[8:00]

It’s been theorized that OOP was imposed by management for interchangeability of developers, given that OOP came with promises of code reusability and compartmentalization. Except OOP hasn’t delivered on these promises, and management doesn’t insert themselves in technical decisions that often. Maybe, once OOP was already established, interchangeability of programmers and candidate supply select for OOP, but then that’s effect not cause.

I’m much more inclined to think that object-oriented programming is something that programmers did to themselves, and the question is, then, well, why?

When Java came along in the ’90s, Java seemed simple. It’s way easier and higher-level than the kind of stuff you’d be writing in C++ to directly interface with Windows APIs (especially with the transition from Win16 to Win32).

Java was also more accessible in terms of naming conventions: FileInputStream is far easier to read at a glance than the ioctl/LPCTSTR stuff you would see in Linux or Windows.

Java didn’t require header files.

Memory management and garbage collection were a big relief as compared with C/C++.

Exceptions might be a pain, but they’re alluring as compared to what we were used to from C: an in-band error value, or saving to a global and checking the global.

People also seemed to appreciate subject.verb(object).

I prefer consistency, and I think the distinction between subject and object gets very murky in many, many cases.

But this syntax has made it possible for IDEs to offer to people something they’re now addicted to: for this data type, what can I do with it. With this affordance, you can browse or grope your way there through autocompletion. This may also explain why people say object-oriented APIs are easier to use.

The ’90s were also the heyday of GUI-oriented programming, and visual components in a GUI program are built up hierarchically in the same way OOP says objects would be.

What is OOP really?

[15:51]

As software systems got larger and larger, people tried to find units of functionality larger than individual functions and data types. Much like in biology, you would summarize things in terms of organs instead of cells.

The simplest unit above functions and data types was just to allow people to group functions together with data types.

OOP also seemed to present a unit of abstraction and a set of guidelines where we could accrete larger and larger systems.

This line of thinking is what led us to patterns, and then the so-called SOLID principles, and dependency injection and test-driven development and all this stuff, which has subsequently been piled on by people who insist that this is now the one true way to do object-oriented programming. But to me all these best practices represent band-aids. They are compensation for the fact that the original vision of object-oriented programming has never panned out. And every few years, there’s a new ideology in town about how we actually do object-oriented programming, for reals this time.

It’s very easy to miss this dynamic. I know I did for several years. Because I think within all of these addendums to object-oriented programming, there’s lots of mystical speech dancing around genuine insights. But it’s not quite cohesive. Object-oriented programming feels like this circle which we’ve been trying to square for over a generation now.

Why does OOP not work?

[18:08]

An object is a bundle of encapsulated state. We don’t interact with that state directly, all interactions from the outside world come through messages. There is a defined set of messages it can receive, that defined set makes up its public interface. When an object receives a message, it may in turn send messages to other objects, so we can conceive of an object-oriented program as a graph of objects sending messages to one another.

In the original conception, a message sends only a copy of some state, never a reference. Put another way, a message sends and returns information about state, not state itself. And since objects themselves are state, messages can’t carry object references. But OOP as practiced in Java and C++ doesn’t observe this rule at all.

But suppose we wanted to observe that rule: if object A is to send a message to object B, B can’t be handed to A as part of a message, so A must hold a private reference to B from its creation.

Sending messages may (indirectly) read and modify state. So if object B and object C both send messages to object A, then they both can modify the state of A. What do we have here? We have shared state. A might as well be a global variable.
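
A small JavaScript sketch of that situation follows. The classes `Counter`, `Reporter`, and `Janitor` are hypothetical stand-ins for objects A, B, and C; they are not from the talk.

```javascript
// Hypothetical classes, for illustration only.
class Counter { // object A
    #count = 0; // private: no outside code can touch it directly
    increment() { this.#count += 1; }
    reset() { this.#count = 0; }
    value() { return this.#count; }
}

class Reporter { // object B
    constructor(counter) { this.counter = counter; }
    record() { this.counter.increment(); }
}

class Janitor { // object C
    constructor(counter) { this.counter = counter; }
    cleanUp() { this.counter.reset(); }
}

const a = new Counter();
const b = new Reporter(a);
const c = new Janitor(a);

b.record();
b.record();
c.cleanUp(); // C wipes out what B recorded, and B is never told
b.record();
```

The field is private, yet B and C both change it through messages, so from their point of view A’s state is shared.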

Where, in a system of ten objects all sharing this state, is the real coordination? The answer is, there isn’t any.

If we take encapsulation seriously, then the only way to structure our program is not as a freeform graph but as a hierarchy, where the root of the program tree is a god object.

In such a hierarchy, what about cross-cutting concerns? For example, what if some object needs to send a message to another object and their only common ancestor is the root? It can’t do so directly; everything has to go through their common ancestor.
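
As a rough sketch of what strict routing through a common ancestor looks like, consider the JavaScript below. The classes `App`, `Parser`, and `Logger` are hypothetical, and this is only one possible reading of the rule.

```javascript
// Hypothetical classes, for illustration only.
class Logger {
    #lines = [];
    log(text) { this.#lines.push(text); }
    lines() { return [...this.#lines]; } // hands out a copy, not the state itself
}

class Parser {
    // Parser holds no reference to Logger; it reports back to its parent instead.
    parse(text) {
        const words = text.split(" ");
        return { words, note: `parsed ${words.length} words` };
    }
}

class App { // the root of the hierarchy
    #logger = new Logger();
    #parser = new Parser();

    run(text) {
        const result = this.#parser.parse(text);
        this.#logger.log(result.note); // the root relays the message
        return result.words;
    }

    logLines() { return this.#logger.lines(); }
}

const app = new App();
const words = app.run("one two three");
```

Every new interaction between `Parser` and `Logger` means another relay method on `App`, which is how the root grows into a god object.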

No one writes programs this way.

What about partial encapsulation? Say we start with the freeform graph, identify cliques/clusters, and form them into subsystems. Within these subsystems, we establish local encapsulation via little ‘deity’ objects. Fine, maybe workable, but now we’ve invested in the current dependency structure. If a new cross-cutting concern between subsystems is introduced, we have to start both over from scratch.

Whether you follow the rules strictly or loosely, you’re in a bad place. If you follow the rules strictly, most things you do end up being very unobviously structured and very indirect, and the number of defined entities in your codebase proliferates with no end in sight. The nature of these entities tends to be very abstract and nebulous. But alternately, if you follow the rules loosely, what are you even getting, why are you bothering, what is the point?

When I look at your object-oriented codebase, what I’m going to encounter is either this over-engineered giant tower of abstractions, or I’m gonna be looking at this inconsistently architected pile of objects that are all probably tangled together like Christmas lights. You’ll have all these objects giving you this warm fuzzy feeling of encapsulation, but you’re not going to have any real encapsulation of any significance.

[25:41]

What people tend to create when they design object-oriented programs are overly-architected buildings where the walls have been prematurely erected, before we figure out what the needs of the floorplan are. So what happens is, down the line, turns out, “no wait, we need to get from this room over here to that room over there, but oh wait, we’ve erected barriers in between”. So we end up busting holes through all the walls, like the Kool-Aid guy. And the resulting pattern is really not organized at all, it’s just swiss cheese. We thought we were being disciplined–neatly modularizing all the state–but then the requirements changed or we just didn’t anticipate certain details of the implementation and we end up with a mess.

The lesson: absence of structure is more structured than bad structure. Structure takes time to create, and bad structure hinders change by implying one thing that’s different from what’s really going on.

In the object-oriented world, we have to think about all these graphs. We have to think about an inheritance hierarchy, we have to think about a composition graph, we have to think about data flows between the objects, and also we’re thinking about a call graph. The liberating thing about procedural code is there’s just the call graph.

With procedural, you still have to think about your data and your functions, but you aren’t imposing extra constraints of grouping or modularizing them.

Working with OO, you look at the data types that are obvious for the problem you’re modelling and the behaviors that are obvious to implement, and you associate them. But inevitably some behaviors don’t map cleanly to any of those types, or map to several of them, and now you have to create artificial additional types that are containers of behavior but represent no data.

In fact, as object-oriented programs get larger, these unobvious, unnatural data types tend to predominate. You end up with a majority of so-called data types that aren’t there because they represent data; they exist simply as a tax paid to conform to this ideology about code modularization. Very quickly we end up in what Steve Yegge calls the Kingdom of Nouns, where every aspect of our program has to be reconceptualized not as mere standalone verbs (you know, functions) but as nouns, things that represent a set of behaviors. And so what we get in our object-oriented codebases are all these Service classes and Manager classes and other what I call Do-er classes: very nebulous and abstract entities.
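
A tiny JavaScript illustration of a Do-er class next to the verb it wraps. Both `InvoiceTotalCalculatorService` and `invoiceTotal` are made-up names, and the invoice shape is an assumption for the example.

```javascript
// OO style: a hypothetical "do-er" class that holds no data of its own.
class InvoiceTotalCalculatorService {
    calculate(invoice) {
        return invoice.lines.reduce((sum, line) => sum + line.qty * line.price, 0);
    }
}

// Procedural style: the same verb as a plain function.
function invoiceTotal(invoice) {
    return invoice.lines.reduce((sum, line) => sum + line.qty * line.price, 0);
}

// Assumed data shape, for illustration only.
const invoice = { lines: [{ qty: 2, price: 5 }, { qty: 1, price: 3 }] };

const viaService = new InvoiceTotalCalculatorService().calculate(invoice);
const viaFunction = invoiceTotal(invoice);
```

The class adds a noun, an instantiation, and a method name, but no data and no behavior beyond what the function already had.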

The matchmaking game of pairing behaviors with data types constantly presents us with these obnoxious philosophical dilemmas. In object-oriented analysis and design we constantly have to ask ourselves stupid questions like “should a message send() itself?”. Because maybe instead we should have some Sender object which send()s messages. Or, wait a minute, maybe there should be a Receiver object which receive()s messages. Or a Connection object which transmit()s messages.

Very quickly the real-world modelling which object-oriented programming promises becomes a fool’s game, where there aren’t any good answers.

Object-Oriented Programming is generally sold to students on the basis of these trivial examples that neatly model real-world taxonomy. But what we get in practice from object-oriented analysis and design is a lot of very abstract excess structure with no obvious real world analogues.

Note here that programmers have their own peculiar definition of abstract. When programmers talk about abstraction, they’re generally talking about a simplified interface over complex inner workings. What’s odd about this is that, in more general usage, abstract has a connotation of being hard to understand. Something which is abstract has no resemblance to the things of common daily life.

And it turns out that most things that programs do are abstract in this sense. And so it shouldn’t be surprising that we have great difficulty conceptualizing the components of the typical program in terms of neatly self-contained modules, particularly modules which have any real-world analogue. When we pollute our code with generic entities like Managers and Factories and Services, we’re not making anything easier to understand, we’re just putting a happy face on the underlying abstract business, and for every excess layer of abstraction we’re getting more abstractness.

In attempting to neatly modularize and label every little fiddly bit that our program does, we’re actually just making our program harder to understand.

[30:50]

One obvious exercise is to take some piece of user-visible functionality, and try to figure out from file names/class names where that user-visible functionality is implemented.

Object-oriented design tends to take what could otherwise be relatively self-contained code and split it up into many separate methods across many separate classes, often in many separate files, for God’s sake. This fracturing is accepted because of an ideology about encapsulation, and this notion of classes and methods properly having so-called single responsibilities.

Sure, it’s a trade-off: smaller functions are easier to get right, easier to test. But here we increase the surface-area of our code.

When all your methods are really really short, you end up having to jump all around your code to find any line of logic. A lot of business that otherwise could be neatly sequentially expressed in longer methods gets artificially split up. So it feels like you’ve taken a neatly sorted deck of cards and thrown them into the air so you can play 52-card pickup.

How should we write code without OOP?

[33:18]

You don’t need to avoid classes entirely. Sometimes an association between data types and functions is very strong, so go for it.

The moment you start hemming and hawing about whether a function actually belongs with a data type, just make it a plain function.
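
Applied to the earlier “should a message send() itself?” dilemma, a plain function might look like the sketch below. The data shapes and the `send` function are hypothetical, and nothing here touches a real network.

```javascript
// Plain data; all names and fields here are hypothetical.
const connection = { host: "example.test", outbox: [] };
const message = { to: "bob", body: "hello" };

// A standalone verb: it belongs to neither data type.
function send(conn, msg) {
    conn.outbox.push({ ...msg, via: conn.host });
    return conn.outbox.length; // number of messages queued so far
}

const queued = send(connection, message);
```

Nobody has to decide whether the message, the connection, or some Sender owns the behavior; the function simply takes both as parameters.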

But what about shared state?

When in doubt, parameterize. Pass data as explicit parameters to functions, make data flow through the call graph.
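
A minimal before-and-after sketch of parameterizing, using hypothetical names (`taxRate`, `priceWithTaxGlobal`, `priceWithTax`):

```javascript
// Before: the function silently depends on a global.
let taxRate = 0.25; // hypothetical global

function priceWithTaxGlobal(net) {
    return net * (1 + taxRate);
}

// After: the dependency is visible in the signature
// and flows in through the call graph.
function priceWithTax(net, rate) {
    return net * (1 + rate);
}

const viaGlobal = priceWithTaxGlobal(100);
const viaParameter = priceWithTax(100, 0.25);
```

Both calls give the same result, but only the second one tells the reader what the result depends on.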

Try to bundle globals into structs/records/classes, even if these types end up with only one instance. This can help your code feel more organized. If you do a good job of it, you can eventually pass just a subset of these into each function.
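
For instance, a sketch along these lines, where `config` and its fields are invented for the example:

```javascript
// Hypothetical settings that would otherwise be three separate globals.
const config = {
    retries: 3,
    timeoutMs: 500,
    verbose: false,
};

// This function needs only one field, so it declares just that subset.
function describeRetries({ retries }) {
    return `will retry ${retries} times`;
}

// This one needs a different subset.
function describeTimeout({ timeoutMs }) {
    return `gives up after ${timeoutMs} ms`;
}

const retryText = describeRetries(config);
const timeoutText = describeTimeout(config);
```

Destructuring in the parameter list is one way to document which part of the bundle each function actually reads.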

Favor pure functions.

The brilliant thing about pure functions is that they’re the only truly self-contained unit of code.

Encapsulate (only loosely) at the level of namespaces/packages/modules.

When a new cross-cutting concern is discovered in a project with coarse-grained encapsulation, the sunk cost of the initial setup is not wasted, and restructuring or violating the encapsulation is either way less of a big deal.

Don’t be afraid of long functions.

It’s in vogue to split functions, like this:

function myFunc() {
    doStuff();
    doMoreStuff();
    thenThis();
    thenThat();
}

If you do this too much, a logical sequence of steps is split up and out of order in your codebase. Sometimes it’s worthwhile, when this code is called from multiple places, but in many cases it’s not.

If you want high-level comments, consider instead ‘section comments’:

function myFunc() {
    // doStuff
    ...
    // doMoreStuff
    ...
    // then this
    ...
    // then that
    ...
}

This way the sequence of logic is totally clear, and the rest of the codebase has less clutter (fewer ‘free variables’). We are freed from deciding on function names, but this code meanwhile is free to be described in depth in English.

The next best thing is a private function or a closure.

There are still good reasons you would definitely want to extract to separate functions: when there is logical complexity that can be extracted; when the indenting is getting too deep; when the code is moderately indented for a long section of code.

You also want to constrain the scope of local variables so readers have lower cognitive load. In most curly brace languages you can just introduce a new scope.
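
In JavaScript, for example, a bare block gives `const` and `let` declarations their own scope. The function `report` and its data are hypothetical:

```javascript
function report(orders) {
    let total = 0;
    {
        // `prices` is visible only inside this block.
        const prices = orders.map((order) => order.price);
        for (const price of prices) {
            total += price;
        }
    }

    let label;
    {
        // `count` is visible only inside this block.
        const count = orders.length;
        label = `${count} orders`;
    }

    // Neither `prices` nor `count` exists here, so a reader need not track them.
    return `${label}, total ${total}`;
}

const summary = report([{ price: 2 }, { price: 3 }]);
```

Only the variables declared before a block (`total`, `label`) survive it, which makes the outputs of each section explicit.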

Defining then calling a closure also works: they provide a local scope, while inheriting access to the enclosing function’s variables. As a minor plus, you can safely use return statements without triggering an early return, which makes each closure within a function slightly more self-contained/less dependent.
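
A sketch of the define-then-call pattern in JavaScript, with a hypothetical `describeOrder` function and made-up shipping rules:

```javascript
function describeOrder(order) {
    // Defined and called immediately; `return` exits only the closure.
    const shipping = (() => {
        if (order.total >= 50) return 0; // free shipping
        if (order.express) return 15;
        return 5;
    })();

    const label = (() => {
        if (shipping === 0) return "free shipping";
        return `shipping: ${shipping}`;
    })();

    return `${order.id}: ${label}`;
}

const large = describeOrder({ id: "A1", total: 60, express: false });
const rushed = describeOrder({ id: "B2", total: 10, express: true });
```

Each closure reads `order` from the enclosing function but keeps its own early returns, so the sections stay in sequence without becoming separate named functions.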

The ideal seems to be something like an inline anonymous function which sees nothing of its enclosing scope except specific allowlisted values, e.g. the way that PHP has function ($arg1, $arg2) use ($varFromEnclosingScope) {...}. But additionally it should only be allowed to use copies of the values from the enclosing scope.
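
JavaScript cannot enforce this: an inline function still sees everything in its enclosing scope. The sketch below only approximates the idea by convention, passing copies of the allowlisted values as arguments. The function `checkout` and its variables are hypothetical.

```javascript
function checkout(items, discount) {
    const auditLog = []; // not passed in, so by convention the inline function leaves it alone

    // Only a copy of `items` and the value of `discount` are handed over.
    const total = ((itemsCopy, rate) => {
        itemsCopy.sort((a, b) => a - b); // reorders the copy only
        const sum = itemsCopy.reduce((acc, x) => acc + x, 0);
        return sum * (1 - rate);
    })([...items], discount);

    auditLog.push(`total=${total}`);
    return { total, auditLog };
}

const prices = [30, 10, 20];
const result = checkout(prices, 0.5);
```

The spread copy is shallow, so this protects an array of numbers but would not protect nested objects; a language-level feature would be needed for a real guarantee.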

In conclusion

If you’ve ever felt any of the paralysis that I felt attempting to do object-oriented programming properly, to square the circle, I think you’ll find abandoning all those ideas and just reverting to procedural code to be a liberating experience.

I can tell you from personal experience, having read these books [GoF’s “Design Patterns”; Fowler’s “Refactoring”; Bob Martin’s “Clean Code”; “TDD by Example”; “Growing Object-Oriented Software, Guided by Tests”], that you don’t need to read them. They don’t have answers, they’re not going to square the circle, and you’re going to waste productive years of your life trying to live up to their ideals.

Part of the problem is that kernels of good ideas have been taken to holistic extremes, in ways that have been disastrous for the industry, and certainly for programming education. There are very few solid, holistic answers about how we should write code. We’d all be better off if we stopped chasing that chimera.