Coherence Penalty for Humans
Amdahl’s Law
In 1967, Gene Amdahl presented a case against multiprocessing computers. He argued that the maximum speed increase for a task would be limited because only a portion of the task could be split up and parallelized.
If you break a job up into a ‘parallel fraction’ and a ‘serial fraction’, the maximum speedup is basically 1 over the serial fraction, which shows up on a chart as an asymptote.
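In symbols (a standard statement of the law; the notation here is mine): with serial fraction s and N processors,

```latex
% Amdahl's Law: speedup of a job on N processors when a fraction s
% of the work must run serially.
S(N) = \frac{1}{s + \frac{1 - s}{N}}
% As N grows, the speedup approaches the asymptote 1/s:
\lim_{N \to \infty} S(N) = \frac{1}{s}
```

Even with a serial fraction as small as 5%, no number of processors can get you past a 20x speedup.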
From Amdahl to USL
The thing about Amdahl’s Law is that, when Gene made his argument, people weren’t actually building very many multiprocessing computers. His formula was based on first principles: if the serial fraction of a job is exactly zero, then it’s not a job but several.
Neil Gunther extended Amdahl’s Law based on observations of performance measurements from many machines. He arrived at the Universal Scalability Law. It uses two parameters, one for contention (which is similar to the serial fraction) and one for incoherence: the time spent restoring a common view of the world across different processors.
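In Gunther’s notation, with σ for contention and κ for incoherence, the relative capacity at N processors is:

```latex
% Universal Scalability Law: throughput at N processors, relative to one.
% sigma = contention (cf. Amdahl's serial fraction)
% kappa = incoherence (cost of restoring a common view of the world)
C(N) = \frac{N}{1 + \sigma (N - 1) + \kappa N (N - 1)}
```

With κ = 0 this reduces to Amdahl’s Law. The κN(N−1) term is what’s new, and it grows with the number of pairs that must stay in sync.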
In a single CPU, incoherence penalties arise from caching.
Across database nodes, incoherence penalties arise from consistency and agreement algorithms.
Effect of USL
When you graph the USL as a function of the number of processors, you get a curve shaped like a downward-facing parabola: throughput climbs, peaks, and then falls, because at some point your coherence cost exceeds the benefit of further parallelization.
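Setting the derivative dC/dN of the formula above to zero locates that peak:

```latex
% The throughput peak of the USL: solve dC/dN = 0 for N.
N^{*} = \sqrt{\frac{1 - \sigma}{\kappa}}
```

Past N*, each additional processor costs more in coherence than it contributes in work.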
We’d often like to increase the number of processors and get more throughput. There are exactly two ways to do that (the sketch after this list shows the effect of each):
- Reduce the serial fraction
- Reduce the incoherence penalty
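Here’s a minimal sketch of both levers. The parameter values are illustrative assumptions of mine, not measurements:

```python
import math


def usl_throughput(n: int, sigma: float, kappa: float) -> float:
    """Relative throughput of n workers under the Universal Scalability Law.

    sigma: contention (the serial-fraction analogue)
    kappa: incoherence penalty (pairwise cost of re-syncing world views)
    """
    return n / (1 + sigma * (n - 1) + kappa * n * (n - 1))


def peak_size(sigma: float, kappa: float) -> float:
    """Size at which throughput peaks: sqrt((1 - sigma) / kappa)."""
    return math.sqrt((1 - sigma) / kappa)


# Illustrative numbers only: 5% contention, 2% pairwise coherence cost.
sigma, kappa = 0.05, 0.02

for n in (1, 2, 4, 7, 10, 15, 20):
    print(f"{n:2d} processors -> {usl_throughput(n, sigma, kappa):.2f}x one")

print(f"peak near {peak_size(sigma, kappa):.1f}")

# The two levers from the list above:
print(f"halve contention:  peak {peak_size(sigma / 2, kappa):.1f}")
print(f"halve incoherence: peak {peak_size(sigma, kappa / 2):.1f}")
```

With these numbers, throughput peaks near 7 and declines from there; halving κ pushes the peak out much further than halving σ does.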
USL in Teams?
Think of a project as the “job” and people on the project as “processors”.
Whatever time the team members spend re-establishing a common view of the universe is the incoherence penalty.
For a half-dozen people in a single room, that penalty might be really small. Just a whiteboard session once a week or so.
For a large team across multiple time zones, it could be large and formal.
Sometimes tools and languages can change the incoherence penalty. One of the arguments for static typing is that it helps communicate across the team. In essence, types in code are the mechanism for broadcasting changes in the model of the world.
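As a hypothetical illustration (the names here are mine), a type change acts like a broadcast: every call site still holding the old model fails the type check instead of failing in production.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PLACED = "placed"
    SHIPPED = "shipped"


@dataclass
class Order:
    order_id: str
    status: Status  # was `status: str` before the model changed


def shipping_label(order: Order) -> str:
    # Code written against the old model, e.g.
    #   order.status.lower()
    # no longer type-checks (Status has no .lower()), so the model
    # change announces itself at every stale call site.
    return f"{order.order_id}: {order.status.value}"
```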
All of these techniques are aimed at the incoherence penalty. Recall that overscaling reduces throughput: with a high coherence penalty and too many people, the team as a whole moves slower. I’ve certainly experienced teams where it felt like we could cut half the people and move twice as fast. The USL and the incoherence penalty now help me understand why that was true. It’s not just about getting rid of deadwood; it’s about reducing the overhead of sharing mental models.
In The Fear Cycle I alluded to codebases where people knew large-scale changes were needed, but were afraid of inadvertent harm. That suggests a team that was overscaled and never achieved coherence. Once coherence is lost, it seems to be really hard to re-establish.
USL and Microservices
By the way, I think that the USL explains some of the interest in microservices.
Part of the premise for microservices is that they avoid the integration work, integration testing, and delays of synchronized deployments.
I think you could regard interface changes between microservices as requiring re-coherence across teams. Too much of that and you won’t get the desired benefit of microservices.