Can someone expand on why the book is so good? I found it very unremarkable but it's recommended so much around here that I suspect that I've missed something important.
Did you find it unremarkable because you already know the material, or because the material isn't relevant to what you do?
Anyway, some reasons:
* Draws together a wealth of material on databases and distributed systems that wasn't explained in a systematic, accessible way anywhere else. It provides a map to someone coming to this hard-to-navigate area for the first time.
* Great balance between being conversant with the academic research (without being too abstract) and being practically applicable (without being too tied to details of particular technologies)
* Shows underlying unity and concepts of very different data technologies, e.g. why classic relational database write-ahead logs and replication are very similar to streaming platforms like Kafka
* Subjective, but it is very clear, accessible, and well-written. This is very very hard to do and quite rare in technical books.
It's my favorite technical book of the decade and my first recommendation to anyone who asks me how to really "level up" as a senior engineer.
While I agree with all your points (I also think the book is a good one), I don't agree with the "level up" part. I don't see how any software engineer can "level up" just by carefully reading the book. If you don't have practical experience implementing (or dealing with, or maintaining) some of the scenarios the book talks about, then there is no way you can "level up" by reading alone.
Ah, yes, I should have said “first _reading_ recommendation” for how to level up! (And added: primarily for web backend or data engineers.)
Yes, of course you can’t level up just by reading a book; experience is the only way. The key thing - as you know, if you like the book - is that this book provides a coherent framework for thinking about data systems that engineers can fit their particular experience into. That “weird race condition” becomes less mysterious when it’s framed in terms of concepts like write skew or phantoms; joining two streams together becomes a problem about time and ordering. And so on. In fact the reason it’s such a great book is that you can read it with not much experience or background beyond basic database knowledge, absorb a lot of the ideas, and then keep coming back to it as your experience grows and you encounter new kinds of scenarios in your day to day work.
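To make the write skew framing concrete, here is a minimal Python sketch. It is not taken from the book's text; the roster, the names `alice` and `bob`, and the function `run_with_snapshots` are all hypothetical, and the "transactions" are simulated in plain sequential code rather than run against a real database. Each transaction checks an invariant against its own snapshot and then writes a different row, so neither sees the other's change.

```python
# Hypothetical on-call roster; names and the rule are illustrative only.
def run_with_snapshots(on_call):
    """Simulate two transactions that each read a snapshot, then write."""
    snapshot_a = dict(on_call)  # transaction A's consistent read
    snapshot_b = dict(on_call)  # transaction B's consistent read

    def may_leave(snapshot):
        # Each transaction checks that someone else would remain on call.
        return sum(snapshot.values()) >= 2

    # Both checks pass because both snapshots predate either write.
    if may_leave(snapshot_a):
        on_call["alice"] = False  # A updates only its own row
    if may_leave(snapshot_b):
        on_call["bob"] = False    # B updates only its own row
    return on_call


result = run_with_snapshots({"alice": True, "bob": True})
print(sum(result.values()))  # number of doctors still on call
```

Neither write conflicts with the other, yet together they break the rule that at least one person stays on call. Running the two checks one after another against the live roster, instead of against stale snapshots, would let only the first one through.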
The book provides tons of references. While reading it, I experimented with much of the software it describes. The author does a good job of comparing various technologies and putting them in context. This is very hard to do by just following a web of tutorials online.
It seems many people haven't otherwise been exposed to a systematic overview of the topic, and the book works great at providing that. At least that'd be my guess as to where your experience differs.
It's a practical overview of a difficult and modern topic. That in itself would make the book good, but for me, it is that the book goes deeper into the research and algorithms than most O'Reilly books, and it does this without becoming a science textbook.
It is easy to tell which people working in big data haven't read the book, just from the mistakes they make.
Unremarkable if you're in that field with experience. If you're one of the majority of developers working around the edges, or new to the subject, it is quite enlightening.