
Lodestar Run-book

This document outlines procedures and steps to address common alerts and issues.

individual_validator_losing_balance

Description: Validator balance is unexpectedly decreasing.

Action:

  1. Ensure the Execution Layer (EL) and Consensus Layer (CL) clients attached to the Validator Client (VC) are fully synced.
  2. Restart whichever client is not synced.
  3. Monitor the alert; if it does not auto-resolve, consider replacing the Beacon Node (BN) machine.
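
The sync check in step 1 can be sketched as follows. This is a minimal illustration, not part of the run-book tooling: it assumes the CL exposes the standard Beacon API endpoint /eth/v1/node/syncing and the EL answers the eth_syncing JSON-RPC method. The helper names (cl_is_synced, el_is_synced, fetch_json) are hypothetical.

```python
import json
import urllib.request
from typing import Optional


def cl_is_synced(syncing_response: dict) -> bool:
    """Interpret the JSON body of GET /eth/v1/node/syncing (standard Beacon API)."""
    data = syncing_response["data"]
    return (
        not data["is_syncing"]
        and not data.get("is_optimistic", False)
        and not data.get("el_offline", False)
    )


def el_is_synced(eth_syncing_response: dict) -> bool:
    """Interpret an eth_syncing JSON-RPC reply: `result` is False once the EL is synced."""
    return eth_syncing_response.get("result") is False


def fetch_json(url: str, payload: Optional[dict] = None, timeout: float = 5.0) -> dict:
    """GET the URL, or POST `payload` as JSON when one is given."""
    body = json.dumps(payload).encode() if payload is not None else None
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

For example, cl_is_synced(fetch_json("http://localhost:9596/eth/v1/node/syncing")) checks a BN on an assumed local REST address. The EL check would POST {"jsonrpc": "2.0", "id": 1, "method": "eth_syncing", "params": []} to the EL's JSON-RPC URL; adjust both addresses to the actual deployment.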

LowExitMeassagesLeft

Description: The number of pre-signed exit messages in the current validator_ejector_exit_messages_subdirectory of the validator-ejector instance with label validator_ejector_node_size="small" is zero.

Action:

  1. Increase the value of the validator_ejector_exit_messages_subdirectory variable by one.
  2. Run make start-validator-ejector HOSTS=aws-lido-prod-ejector-small to load the exit messages from the new directory.

NoExitMessagesLeft

Description: The number of pre-signed exit messages on the validator-ejector instance with label validator_ejector_node_size="large" is zero.

Action:

  1. Add new pre-signed exit messages to both validator-ejector instances (small and large) to ensure smooth operation.
  2. Run make start-validator-ejector HOSTS=lido_prod_ejector to load the new exit messages.
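
Before reloading, it can help to confirm that the target directory actually contains messages. The sketch below assumes each pre-signed exit message is stored as one .json file directly inside the messages directory. That layout and the function name count_exit_messages are assumptions, not something this run-book specifies.

```python
from pathlib import Path


def count_exit_messages(messages_dir: str) -> int:
    """Count the .json files located directly inside messages_dir (assumed layout)."""
    return sum(1 for path in Path(messages_dir).glob("*.json") if path.is_file())
```

A result of zero for the directory the ejector is about to load would suggest the new messages were not copied to the expected location.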

missed_attestations_in_mass

Description: A significant number of attestations are being missed by validators attached to the Beacon Node (BN).

Action:

  1. Wait about 10 minutes; this alert usually auto-resolves. If it is still firing after 10 minutes, proceed to the next step.
  2. Redirect the Validator Client (VC) to a backup Beacon Node (BN).
  3. Investigate and resolve issues with the primary BN before reverting the VC.

StuckBeaconNode

Description: A Beacon Node (BN) is unresponsive or stuck syncing.

Action:

  1. Redirect Validator Clients (VCs) to a backup BN.
  2. Restart the primary BN's Consensus Layer (CL) container.
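
Before redirecting VCs, it is worth confirming that the backup BN is itself healthy. A minimal sketch, assuming the backup exposes the standard Beacon API endpoint /eth/v1/node/health (200 means ready, 206 means syncing, 503 means not initialized); the function names and status labels are hypothetical.

```python
import urllib.error
import urllib.request


def classify_health(status_code: int) -> str:
    """Map a /eth/v1/node/health status code to a coarse label."""
    if status_code == 200:
        return "ready"
    if status_code == 206:
        return "syncing"
    return "unavailable"


def probe_health(base_url: str, timeout: float = 5.0) -> str:
    """Query the health endpoint of the BN at base_url and classify the answer."""
    url = f"{base_url}/eth/v1/node/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return classify_health(resp.status)
    except urllib.error.HTTPError as err:
        # Non-2xx answers (for example 503) arrive as HTTPError.
        return classify_health(err.code)
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout.
        return "unreachable"
```

Only a "ready" backup is a sensible redirect target; a "syncing" backup may cause the same missed duties as the stuck primary.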

ValidatorMissedBlock

Description: A validator failed to propose a scheduled block.

Action:

  1. Verify all related services are operational.
  2. Collect and share logs from Execution Layer (EL), Consensus Layer (CL), and Validator Client (VC) with the development team for investigation.
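
Log collection in step 2 could be scripted along these lines. The sketch assumes the EL, CL, and VC run as Docker containers and that the docker CLI is available on the host. The container names are placeholders and must be replaced with the real ones.

```python
import subprocess
from pathlib import Path
from typing import Dict, List


def build_log_command(container: str, since: str = "2h") -> List[str]:
    """Build a `docker logs` invocation limited to the recent time window."""
    return ["docker", "logs", "--since", since, "--timestamps", container]


def collect_logs(containers: Dict[str, str], out_dir: str, since: str = "2h") -> List[Path]:
    """Write one <layer>.log file per container into out_dir and return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for layer, container in containers.items():
        target = out / f"{layer}.log"
        with target.open("wb") as fh:
            # docker logs replays the container's stdout and stderr; capture both.
            subprocess.run(
                build_log_command(container, since),
                stdout=fh,
                stderr=subprocess.STDOUT,
                check=False,
            )
        written.append(target)
    return written
```

For example, collect_logs({"el": "my-el", "cl": "my-bn", "vc": "my-vc"}, "./missed-block-logs") uses hypothetical container names. Choose a `since` window that covers the slot of the missed proposal.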

BeaconNodeMemoryLeakDetected

Description: A memory leak has been detected in the Beacon Node process.

Action:

  1. Monitor the situation closely.
  2. Restart the Beacon Node process to mitigate immediate memory concerns.
  3. Inform the Lodestar development team of the issue for further investigation.
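
For step 1, one rough way to tell steady growth from normal fluctuation is to fit a line through memory samples taken at a fixed interval. The sketch below computes the least-squares slope; how the samples are obtained (for example from existing metrics) and the function name are assumptions.

```python
from typing import Sequence


def growth_per_sample(samples: Sequence[float]) -> float:
    """Least-squares slope of evenly spaced memory samples (units per sample interval)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    numerator = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator
```

A slope that stays clearly positive over many samples, including after a restart, is useful evidence to attach when reporting the issue to the development team.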
