In a distributed system with hundreds of servers, machines fail constantly. Hard drives crash, network cables get disconnected, and processes run out of memory. The system must quickly detect these failures and route traffic away from dead nodes.
A Heartbeat is a periodic signal sent between nodes to indicate that they are still alive and functioning.
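As a rough illustration of how heartbeats can be used to detect failures, here is a minimal sketch. It assumes a node is treated as dead once no heartbeat has arrived within a fixed timeout. The class name HeartbeatMonitor, the 3-second default timeout, and the injectable clock are hypothetical choices made for this example, not something the text prescribes.

```python
import time
from typing import Callable, Dict, List


class HeartbeatMonitor:
    """Tracks the last heartbeat seen from each node (hypothetical helper)."""

    def __init__(self, timeout_s: float = 3.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        # timeout_s is an assumed threshold; real systems tune it to their network.
        self.timeout_s = timeout_s
        self._clock = clock
        self._last_seen: Dict[str, float] = {}

    def record(self, node_id: str) -> None:
        """Call this whenever a heartbeat arrives from node_id."""
        self._last_seen[node_id] = self._clock()

    def dead_nodes(self) -> List[str]:
        """Return nodes whose most recent heartbeat is older than the timeout."""
        now = self._clock()
        return sorted(node for node, seen in self._last_seen.items()
                      if now - seen > self.timeout_s)
```

A caller would invoke record() for every received heartbeat and poll dead_nodes() periodically to decide which nodes to stop routing traffic to. A monotonic clock is used by default so that wall-clock adjustments do not produce false timeouts.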
A Health Check is a more sophisticated mechanism, typically used by load balancers and container orchestrators such as Kubernetes.
Liveness probe: Answers the question: "Is the process running?" Example: an HTTP endpoint such as /healthz that returns 200 OK.
Readiness probe: Answers the question: "Is the process ready to accept traffic?"
Startup probe: Answers the question: "Has the process finished its initial startup?"
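The three questions above (is the process running, is it ready for traffic, has it finished starting) can each be mapped to an HTTP endpoint. The sketch below is one possible way to do that with Python's standard library. Only /healthz comes from the text; the paths /readyz and /startupz, the AppState flags, and the function name probe_status are assumptions made for illustration.

```python
from dataclasses import dataclass
from http.server import BaseHTTPRequestHandler
from typing import Tuple


@dataclass
class AppState:
    """Hypothetical flags the application flips as it starts up."""
    started: bool = False  # initial startup (e.g. loading config) has finished
    ready: bool = False    # dependencies are reachable, traffic can be served


STATE = AppState()


def probe_status(path: str, state: AppState) -> Tuple[int, str]:
    """Map a probe path to an HTTP status code and a short body."""
    if path == "/healthz":   # is the process running? Answering at all means yes
        return 200, "ok"
    if path == "/startupz":  # assumed path: has initial startup finished?
        return (200, "started") if state.started else (503, "starting")
    if path == "/readyz":    # assumed path: is it safe to send traffic?
        ok = state.started and state.ready
        return (200, "ready") if ok else (503, "not ready")
    return 404, "not found"


class ProbeHandler(BaseHTTPRequestHandler):
    """Serves the probe endpoints over plain HTTP."""

    def do_GET(self) -> None:
        status, body = probe_status(self.path, STATE)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)
```

To try it, the handler could be served with http.server.ThreadingHTTPServer(("", 8080), ProbeHandler).serve_forever(), where port 8080 is an arbitrary choice. Keeping the decision logic in a plain function separate from the handler makes it easy to test without opening a socket.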