Bane's Rule

You don’t understand a distributed computing problem until you get it to fit on a single machine first.

There are three broad ways to speed up a computation: high (scaling vertically, e.g. more RAM and a faster CPU or storage), wide (distributing the work across machines), and deep (refactoring the code itself).

I saw this happen at work: an engineer rewrote a Spark job that was distributed over many machines so that it ran on a single large machine using Unix commands and pipes. It produced the same output, and it was faster than the distributed version.
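
To make the single-machine idea concrete, here is a minimal sketch of the kind of streaming aggregation a pipeline such as `cut | sort | uniq -c` performs. It is not the engineer's actual job, which is not described here. The function names `read_lines` and `count_by_key`, the tab-separated input, and the choice of counting by the first column are all assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, Iterator


def read_lines(path: str) -> Iterator[str]:
    """Yield lines one at a time so the whole file never has to fit in memory."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            yield line.rstrip("\n")


def count_by_key(lines: Iterable[str], field: int = 0, sep: str = "\t") -> Counter:
    """Count how often each value of one column appears (hypothetical helper).

    Only the distinct keys and their counts are held in memory, not the rows.
    """
    counts: Counter = Counter()
    for line in lines:
        if not line:
            continue  # skip blank lines
        columns = line.split(sep)
        if field < len(columns):
            counts[columns[field]] += 1
    return counts


if __name__ == "__main__":
    # Illustrative in-memory sample; a real run would pass read_lines(some_path).
    sample = ["alice\t3", "bob\t5", "alice\t7"]
    for key, n in count_by_key(sample).most_common():
        print(key, n)
```

Because the input is consumed as a stream, memory use grows with the number of distinct keys rather than the number of rows, which is often what lets a job like this fit on one machine.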

Read the source comment on HN