I work as a DevOps engineer and have been learning distributed systems over the past months. From my search I found some resources for learning the concepts; the most important ones are
the book “Distributed Systems for Fun and Profit” and a course by Chris Colohan on YouTube.
I have grasped these concepts so far:
FLP impossibility result: deterministic consensus can’t be guaranteed in an asynchronous system where even one node can crash.
The strong consistency model, and how the CAP theorem forces you to choose between CP and AP in the event of a partition.
Time and vector clocks, and how they can be used for conflict resolution.
Why consensus is needed for fault tolerance, and how Paxos and Raft implement it.
Partial quorums in systems like Dynamo, and different sharding techniques, including hash-based sharding and consistent hashing.
Gossip protocols for replica synchronisation and cluster membership, including the SWIM protocol.
The Google File System paper.
Caching and different caching techniques.
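
To make the vector clock item concrete, here is a minimal sketch of how conflict detection with vector clocks can look. The dict-of-counters representation and the function names (`tick`, `compare`, `merge`) are assumptions made for illustration; they are not taken from either resource.

```python
def tick(clock, node):
    """Return a copy of the clock with this node's counter incremented."""
    new_clock = dict(clock)
    new_clock[node] = new_clock.get(node, 0) + 1
    return new_clock


def compare(a, b):
    """Classify clock a relative to b: 'before', 'after', 'equal' or 'concurrent'."""
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"  # neither dominates: the two versions conflict


def merge(a, b):
    """Element-wise maximum, used once a conflict has been reconciled."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


# Hypothetical nodes "A", "B" and "C" writing to the same key.
v1 = tick({}, "A")   # first write, on A
v2 = tick(v1, "B")   # B updates after seeing v1
v3 = tick(v1, "C")   # C also updates after seeing v1, unaware of B

print(compare(v1, v2))
print(compare(v2, v3))
print(merge(v2, v3))
```

Two clocks that compare as concurrent only signal the conflict; resolving it is left to the application or to a rule such as last-writer-wins.
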
My question: I have not built any distributed system yet, and I am not even sure I have covered most of the important topics. What other topics would you recommend I study? I want to know where the gaps are.