Google SRE Book
Google describes how it tries to clarify the relationship between product development teams (who are usually focused on marketplace success) and the people who work on reliability. In most companies these groups' incentives are misaligned, which ends up being a source of conflict.
Vendor-client contractual relationships have long had the concept of an SLA: the parties spell out their ongoing obligations and lay out formulas for the damages to be paid when obligations go unmet. SLOs are basically that, transposed to conflicts of interest within a single company. Because both sides sit under the same P&L sheet, such conflicts have to be resolved socially/politically rather than contractually, “within the family”.
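To make the SLA/SLO contrast concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `Window` type, the credit tiers, and the 99.9% target are invented, not taken from the book. The same availability measurement feeds both functions; only the consequence differs.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Window:
    """Request counts for one measurement window (hypothetical data)."""
    total: int
    failed: int

    @property
    def availability(self) -> float:
        # Fraction of requests that succeeded; an empty window counts as fully available.
        return 1.0 if self.total == 0 else (self.total - self.failed) / self.total


def sla_credit(window: Window, tiers: list[tuple[float, float]]) -> float:
    """Vendor-client style: return the fraction of the fee to refund.

    `tiers` holds (availability_floor, credit) pairs. The credit for the
    highest floor that the window still meets is returned.
    """
    for floor, credit in sorted(tiers, reverse=True):
        if window.availability >= floor:
            return credit
    # Below every floor: fall back to the largest credit listed.
    return max(credit for _, credit in tiers)


def slo_met(window: Window, target: float) -> bool:
    """Same-company style: no money changes hands, only a yes/no signal."""
    return window.availability >= target


# Invented tiers: no refund at >= 99.9%, 10% at >= 99%, 25% otherwise.
TIERS = [(0.999, 0.0), (0.99, 0.10), (0.0, 0.25)]

good = Window(total=1_000_000, failed=500)
bad = Window(total=1_000_000, failed=5_000)

print(sla_credit(good, TIERS), slo_met(good, 0.999))
print(sla_credit(bad, TIERS), slo_met(bad, 0.999))
```

With an SLA, a missed target turns into a refund. With an SLO, a missed target is only a signal, and what happens next is whatever the two teams have agreed to between themselves.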
That material makes up only about 20% of the book by volume, so it’s actually a little sketchy/underspecified (i.e., not specified well enough to implement as-is in your own medium-to-large business). Filling in those gaps seems to be a big focus of the follow-up SRE Workbook.
By volume, the majority of the book covers scaling and supporting several major/typical system types.
My notes run to about 50 pages, roughly 1/8 the length of the book itself.