ChainSafe Infrastructure Documentation
Welcome to ChainSafe's internal infrastructure engineering documentation. This handbook covers what you need to understand, operate, and maintain our production infrastructure.
Quick Links
📐 Architecture
- Architecture Overview - Network, compute, security, and HA architecture
- Network Architecture - Network design and topology
- Security Architecture - Security policies and practices
🛠️ DevOps Tools
- Ansible - Configuration management
- Terraform - Infrastructure as code
- Docker - Containerization
- GitHub Actions - CI/CD workflows
📊 Observability & Monitoring
- Monitoring Overview - Complete monitoring architecture
- Alerting & On-Call Guide - PagerDuty, alerts, and on-call procedures
- Monitoring Inventory - All monitoring components
- Logging - How to send logs to Loki
- Metrics - How to send metrics to Prometheus
🚀 Projects
- Lodestar - Ethereum consensus client infrastructure
- Filecoin - Filecoin node infrastructure
- Polkadot - Polkadot validator infrastructure
- Orbitor Gateway (IPFS) - IPFS HTTP gateway infrastructure
- Celestia - Celestia validator and bridge node infrastructure
- Aztec - Aztec node infrastructure
- Canton - Canton validator infrastructure
- Namada - Namada infrastructure (coming soon)
Getting Started
New to the infrastructure team? Start here:
- Read the Architecture Overview to understand our infrastructure design
- Review the Monitoring Overview to understand our observability stack
- Check project-specific documentation for the services you'll be working with
- Bookmark the Alerting & On-Call Guide for when you're on call
Key Resources
- Alertmanager: https://alertmanager.chainsafe.dev
- Prometheus: https://prometheus.chainsafe.dev
- Grafana Cloud: https://chainsafe.grafana.net
- PagerDuty: https://chainsafe.pagerduty.com
This documentation is maintained by the ChainSafe Infrastructure team. For questions or updates, please open an issue or PR in the infrastructure-documentation repository.