Architecture
HomeTechnologyLeadership
  • Software Architecture
  • System Design
    • About System Design
    • Non Functional System Characterstics
    • Back of the envelope Calculations
    • DNS
    • Load Balancers
    • Databases
    • Key Value Store
    • CDN
    • Sequencer
    • Distributed Monitoring
      • Monitoring Server Side Errors
      • Monitoring Client Side Errors
    • Distributed Cache
      • Introduction
      • High-level design
    • Distributed Messaging Queues
    • Pub-sub System
    • Rate Limiter
    • Blob store
    • Distributed Search
    • Distributed Logging
    • Distributed Task Scheduler
    • Shared Counter
  • Design Problems
    • Design YouTube
    • Design Quora
    • Design Google Maps
    • Design Yelp
    • Design Uber
    • Design Twitter
    • Design Newsfeed System
    • Design Instagram
    • Design Tiny URL
    • Design Web Crawler
    • Design WhatsApp
    • Design Typeahead
    • Design Google Docs
  • Design Patterns
    • Monads
    • Singleton
  • Topics
    • CAP Theorem
    • ACID
    • Consistency
    • Failure Models
  • *⃣ Miscellaneous
    • Preparing for SDI
    • 📯Public APIs
    • 👾Glossary
Powered by GitBook
On this page
  • Fail-stop
  • Crash
  • Omission Failures
  • Temporal Failures
  • Byzantine Failures

Was this helpful?

  1. Topics

Failure Models

List of failure models in order of increasing difficulty to deal with it.

  • Fail-stop

  • Crash

  • Omission

  • Temporal

  • Byzantine

Fail-stop

A node in the distributed systems halts permanently. However, the other nodes can still detect that node by communicating with it.

Crash

In this type of failure, a node in the distributed system halts silently, and the other nodes can't detect that the node has stopped working.

Omission Failures

In omission failures, the node fails to send or receive messages.

  • Send Omission Failures: The node fails to respond to the incoming request.

  • Receive Omission Failure: The nod fails to receive the request and thus can't acknowledge it.

Temporal Failures

The node generates correct results, but is too late to be useful. The failure could be due to bad algorithms, a bad design strategy or a loss of synchronization between the processor clocks.

Byzantine Failures

The node exhibits random behaviour like transmitting arbitrary messages at arbitrary times, producing wrong results, or stopping midway. This mostly happens due to an attack by a malicious entity or a software bug.

PreviousConsistencyNextPreparing for SDI

Last updated 1 year ago

Was this helpful?