System Design

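CAP theorem

In a distributed system, you can only support two of the following three guarantees:

Consistency

Availability

Partition Tolerance

Because network partitions cannot be ruled out, the practical choice is between consistency and availability. The finance industry usually favors consistency; social networks usually favor availability.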

Consistency patterns

Weak consistency

After a write, reads may or may not see it. A best-effort approach is taken. Examples: VoIP, video chat, real-time multiplayer games.

Eventual consistency

After a write, reads will eventually see it. Data is replicated asynchronously. Examples: DNS and email.

Strong consistency

After a write, reads will see it. Data is replicated synchronously. Example: database systems.
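The difference between synchronous and asynchronous replication can be sketched with a toy primary and replica. This is a minimal illustration, not a real replication protocol: the `Primary`, `Replica`, and `flush` names are hypothetical, and everything runs in one process.

```python
class Replica:
    """A follower that holds a copy of the data."""

    def __init__(self):
        self.data = {}


class Primary:
    """Hypothetical primary that replicates writes to a single replica."""

    def __init__(self, replica, synchronous):
        self.data = {}
        self.replica = replica
        self.synchronous = synchronous
        self.pending = []  # writes not yet applied on the replica

    def write(self, key, value):
        self.data[key] = value
        if self.synchronous:
            # Strong consistency: replicate before acknowledging the write.
            self.replica.data[key] = value
        else:
            # Eventual consistency: queue the write and replicate later.
            self.pending.append((key, value))

    def flush(self):
        """Apply queued writes to the replica (the asynchronous step)."""
        for key, value in self.pending:
            self.replica.data[key] = value
        self.pending.clear()


strong_replica = Replica()
strong = Primary(strong_replica, synchronous=True)
strong.write("balance", 100)
strong_read = strong_replica.data.get("balance")  # visible immediately

eventual_replica = Replica()
eventual = Primary(eventual_replica, synchronous=False)
eventual.write("balance", 100)
stale_read = eventual_replica.data.get("balance")  # not replicated yet
eventual.flush()
fresh_read = eventual_replica.data.get("balance")  # visible after replication
```

A read from the synchronous replica sees the write right away, while the asynchronous replica only sees it after `flush` has run.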

Availability patterns

Fail-over

replication

  • Active-passive: With active-passive fail-over, heartbeats are sent between the active and the passive server on standby. If the heartbeat is interrupted, the passive server takes over the active’s IP address and resumes service.

  • Active-active: In active-active, both servers are managing traffic, spreading the load between them.
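The heartbeat check in active-passive fail-over can be sketched as follows. This is a simplified model under stated assumptions: the `PassiveServer` class and its timeout are hypothetical, timestamps are passed in explicitly to keep the example deterministic, and the IP address takeover itself is reduced to setting a flag.

```python
class PassiveServer:
    """Hypothetical standby that promotes itself when heartbeats stop."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.last_heartbeat = None
        self.active = False

    def receive_heartbeat(self, now):
        """Record the time of the latest heartbeat from the active server."""
        self.last_heartbeat = now

    def check(self, now):
        """Take over if no heartbeat arrived within the timeout."""
        if self.last_heartbeat is None:
            return self.active
        if now - self.last_heartbeat > self.timeout_seconds:
            # In a real setup this is where the standby would claim
            # the active server's IP address and start serving traffic.
            self.active = True
        return self.active


standby = PassiveServer(timeout_seconds=3.0)
standby.receive_heartbeat(now=0.0)
standby.receive_heartbeat(now=1.0)
before = standby.check(now=2.0)  # last heartbeat is 1 second old
after = standby.check(now=5.0)   # last heartbeat is 4 seconds old
```

The timeout is the main tuning knob: a short one fails over quickly but risks a false takeover during a brief network hiccup.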

Cons of fail-over

  • Fail-over adds more hardware and additional complexity.
  • There is a potential for loss of data if the active system fails before any newly written data can be replicated to the passive.

CDN (Content delivery network)

A content delivery network is a globally distributed network of proxy servers, serving content from locations closer to the user.

Pull CDNs

Pull CDNs grab new content from your server when the first user requests the content.
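The pull behavior amounts to a cache that fetches from the origin on a miss and keeps the result until a TTL expires. The sketch below assumes a hypothetical `PullCDNEdge` class and an `origin` callable standing in for your server; time is passed in explicitly so the example is deterministic.

```python
class PullCDNEdge:
    """Hypothetical edge node that pulls content from the origin on a miss."""

    def __init__(self, fetch_from_origin, ttl_seconds):
        self.fetch_from_origin = fetch_from_origin
        self.ttl_seconds = ttl_seconds
        self.cache = {}  # path -> (content, expires_at)

    def get(self, path, now):
        entry = self.cache.get(path)
        if entry is not None and now < entry[1]:
            return entry[0]  # cache hit, origin is not contacted
        # Cache miss or expired entry: pull from the origin and store it.
        content = self.fetch_from_origin(path)
        self.cache[path] = (content, now + self.ttl_seconds)
        return content


origin_hits = []


def origin(path):
    """Stand-in for the origin server; records every request it receives."""
    origin_hits.append(path)
    return f"content of {path}"


edge = PullCDNEdge(origin, ttl_seconds=60)
first = edge.get("/logo.png", now=0)    # first request pulls from the origin
second = edge.get("/logo.png", now=30)  # served from the edge cache
third = edge.get("/logo.png", now=61)   # TTL expired, pulled again
```

Only the first and third requests reach the origin, which is why the first user to request a piece of content sees a slower response.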

Push CDNs

Push CDNs receive new content whenever changes occur on your server.

Cons

  • CDN costs could be significant depending on traffic, although this should be weighed against the additional costs you would incur without a CDN.
  • Content might be stale if it is updated before the TTL expires, as with pull CDNs.

Load balancer

Load balancers distribute incoming client requests to computing resources such as application servers and databases.
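One common way to distribute requests is round robin, where each new request goes to the next server in the list. The sketch below is a minimal illustration; the `RoundRobinBalancer` class and the server names are hypothetical, and real load balancers also account for health checks and server load.

```python
import itertools


class RoundRobinBalancer:
    """Hypothetical balancer that hands out servers in rotation."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        """Return the server that should handle the next request."""
        return next(self._cycle)


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
picks = [balancer.pick() for _ in range(5)]
```

After the third request the rotation wraps around, so the fourth request goes back to `app-1`.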

Layer 4 load balancing

TCP, UDP

Layer 7 load balancing

HTTP, HTTPS. HAProxy and Nginx both support layer 4 and layer 7; Nginx uses its stream module for layer 4.
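The practical difference between the two layers is what information the balancer looks at. A layer 4 balancer decides from connection data such as the source address and port, while a layer 7 balancer can inspect the request itself, for example the URL path. The sketch below illustrates this with two hypothetical functions; the hashing scheme, pool names, and routes are assumptions for illustration only.

```python
import zlib


def pick_layer4(client_ip, client_port, servers):
    """Layer 4: choose a server from connection info only, not the payload."""
    key = f"{client_ip}:{client_port}".encode()
    return servers[zlib.crc32(key) % len(servers)]


def pick_layer7(path, routes, default_pool):
    """Layer 7: choose a server pool by inspecting the HTTP request path."""
    for prefix, pool in routes:
        if path.startswith(prefix):
            return pool
    return default_pool


servers = ["app-1", "app-2"]
same_conn_a = pick_layer4("203.0.113.7", 51000, servers)
same_conn_b = pick_layer4("203.0.113.7", 51000, servers)

routes = [("/video/", "video-pool"), ("/billing/", "billing-pool")]
video = pick_layer7("/video/123", routes, "web-pool")
other = pick_layer7("/about", routes, "web-pool")
```

The layer 4 function always maps the same connection to the same server, while the layer 7 function sends video requests to a dedicated pool and everything else to the default pool.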

Cons

  • The load balancer can become a performance bottleneck if it does not have enough resources.
  • Introducing a load balancer increases complexity.

Reverse proxy

A reverse proxy is a web server that centralizes internal services and provides a unified interface to the public.

Additional benefits include:

  • Increased security
  • Increased scalability and flexibility
  • SSL termination
  • Caching
  • Compression
  • Static content

Application layer

The single responsibility principle advocates for small and autonomous services that work together.

Microservices

Examples: separate order, user, and search services.

Service Discovery

Systems such as Consul, etcd, and ZooKeeper help services find each other by keeping track of registered names, addresses, and ports.
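The core idea of service discovery can be sketched as an in-memory registry that maps a service name to its known instances. This is not the API of Consul, etcd, or ZooKeeper; the `ServiceRegistry` class and its methods are hypothetical, and real systems add health checks, leases, and replication on top.

```python
class ServiceRegistry:
    """Hypothetical in-memory registry of service instances."""

    def __init__(self):
        self._instances = {}  # service name -> set of (host, port)

    def register(self, name, host, port):
        """Called by a service instance when it starts."""
        self._instances.setdefault(name, set()).add((host, port))

    def deregister(self, name, host, port):
        """Called when an instance shuts down or is found unhealthy."""
        self._instances.get(name, set()).discard((host, port))

    def lookup(self, name):
        """Return the known instances of a service, sorted for stable output."""
        return sorted(self._instances.get(name, set()))


registry = ServiceRegistry()
registry.register("order", "10.0.0.5", 8080)
registry.register("order", "10.0.0.6", 8080)
registry.register("search", "10.0.0.9", 9200)
registry.deregister("order", "10.0.0.5", 8080)

order_instances = registry.lookup("order")
unknown_instances = registry.lookup("payments")
```

A client would call `lookup` before each request (or cache the result briefly) instead of hard-coding addresses.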

Cons

Microservices can add complexity in terms of deployments and operation.

Database

ACID is a set of properties of relational database transactions.

  • Atomicity - Each transaction is all or nothing
  • Consistency - Any transaction will bring the database from one valid state to another
  • Isolation - Executing transactions concurrently has the same results as if the transactions were executed serially
  • Durability - Once a transaction has been committed, it will remain so
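Atomicity can be demonstrated with Python's built-in `sqlite3` module. In this sketch, the `accounts` table, the `transfer` function, and the balances are made up for illustration. A `CHECK` constraint rejects a negative balance, and because both updates run in one transaction, the failed debit also undoes the credit that preceded it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts; all or nothing."""
    try:
        # Using the connection as a context manager commits on success
        # and rolls back if an exception is raised inside the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


# Alice only has 100, so the debit violates the CHECK constraint.
ok = transfer(conn, "alice", "bob", 150)
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

Bob's credit is applied first and then rolled back, so neither balance changes when the transfer fails.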

Master-slave replication

The master serves reads and writes, replicating writes to one or more slaves, which serve only reads. Slaves can also replicate to additional slaves in a tree-like fashion. If the master goes offline, the system can continue to operate in read-only mode until a slave is promoted to a master or a new master is provisioned.
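On the application side, this setup is often paired with read/write splitting: writes go to the master and reads are spread across the slaves. The sketch below is one possible approach, not a feature of any particular database driver; the `ReplicationRouter` class and node names are hypothetical, and detecting reads by a `SELECT` prefix is a deliberate simplification.

```python
import itertools


class ReplicationRouter:
    """Hypothetical router that splits reads and writes across nodes."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = list(slaves)
        self._next_slave = itertools.cycle(self.slaves)

    def route(self, statement):
        """Return the node that should execute the given SQL statement."""
        is_read = statement.lstrip().upper().startswith("SELECT")
        if is_read and self.slaves:
            return next(self._next_slave)  # spread reads across slaves
        return self.master  # writes always go to the master


router = ReplicationRouter("master", ["slave-1", "slave-2"])
targets = [
    router.route("SELECT * FROM users"),
    router.route("INSERT INTO users VALUES (1)"),
    router.route("select name from users"),
    router.route("UPDATE users SET name = 'a'"),
]
```

Because replication is asynchronous in many setups, a read routed to a slave right after a write may not see that write yet.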

Cons

Additional logic is needed to promote a slave to a master.

Master-master replication

Both masters serve reads and writes and coordinate with each other on writes. If either master goes down, the system can continue to operate with both reads and writes.

Cons

  • You’ll need a load balancer or you’ll need to make changes to your application logic to determine where to write.
  • Most master-master systems are either loosely consistent or have increased write latency due to synchronization.

Cons of replication

  • There is a potential for loss of data if the master fails before any newly written data can be replicated to other nodes.
  • Writes are replayed to the read replicas. If there are a lot of writes, the read replicas can get bogged down with replaying writes and cannot serve as many reads.
  • The more read slaves, the more you have to replicate, which leads to greater replication lag.
  • On some systems, writing to the master can use multiple threads for parallel writing, while read replicas only support sequential writing with a single thread.
  • Replication adds more hardware and additional complexity.

Federation

Sharding

Denormalization

SQL tuning

NoSQL

SQL or NoSQL

Reasons for SQL:

  • Structured data
  • Strict schema
  • Relational data
  • Need for complex joins
  • Transactions
  • Clear patterns for scaling

Reasons for NoSQL:

  • Semi-structured data
  • Dynamic or flexible schema
  • Non-relational data
  • No need for complex joins
  • Need to store many TB (or PB) of data

Cache

Caching improves page load times and can reduce the load on your servers and databases.
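The effect on backend load can be sketched with Python's standard `functools.lru_cache`. In this illustration the `load_user_name` function and the `database_reads` list are hypothetical stand-ins for a slow database query and a way to count how often it actually runs.

```python
from functools import lru_cache

database_reads = []  # records every time the "database" is actually queried


@lru_cache(maxsize=128)
def load_user_name(user_id):
    """Stand-in for a slow database lookup, cached in process memory."""
    database_reads.append(user_id)
    return f"user-{user_id}"


first = load_user_name(42)   # cache miss: the lookup runs
second = load_user_name(42)  # cache hit: served from memory
third = load_user_name(7)    # different key, so the lookup runs again
```

Two distinct keys were requested, so the stand-in database is queried twice even though the function is called three times.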

Client caching

Caches can be located on the client side, such as in the browser.

CDN caching

CDNs are considered a type of cache.

Web server caching

Web servers can also cache requests, returning responses without having to contact application servers.

Database caching

Application caching