What to do in case disaster strikes? A directory authority perspective

Posted on Oct 3, 2025

Availability

risks detection reaction
1 consensus every hour and clients use it for 24h (it’s only fresh for that time) consensus-health list right now and Prometheus, so relay ops could raise the alarm (after 6 consecutive fails)
every night at midnight UTC: shared random value is generated; falls back to a shared value consensus-health list right now and Prometheus, so relay ops could raise the alarm
bugs that crash relays/dir-auths (maybe possible to get induced remotely)
sybil attack (running 100000 relays)
dir-auths end up on a block list
dir-auths network reachability failures Prometheus; consensus-health web page
Dirauths being DoS’ed client bootstrap failures; consensus-health web page
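
To make the "alarm after 6 consecutive fails" idea concrete, here is a minimal sketch. It assumes one consensus attempt per hour and a 24h window during which clients keep using the last good consensus, both taken from the notes above rather than from the Tor specification. All function and constant names are hypothetical and not part of any existing tool.

```python
from datetime import datetime, timedelta, timezone

# Assumptions taken from the notes above, not from the Tor specification:
ALARM_THRESHOLD = 6                      # consecutive failed hourly consensus runs
CLIENT_USABLE_FOR = timedelta(hours=24)  # how long clients keep using a consensus


def trailing_failures(outcomes):
    """Count failed runs at the end of `outcomes` (oldest first, True = success)."""
    count = 0
    for ok in reversed(outcomes):
        if ok:
            break
        count += 1
    return count


def should_alarm(outcomes, threshold=ALARM_THRESHOLD):
    """True once the most recent `threshold` hourly runs have all failed."""
    return trailing_failures(outcomes) >= threshold


def time_left_for_clients(last_valid_after, now):
    """Remaining time before clients stop using the last good consensus."""
    return max(last_valid_after + CLIENT_USABLE_FOR - now, timedelta(0))
```

With these assumptions, six failed hours in a row raise the alarm while roughly 18 hours remain before clients run out of a usable consensus, which is the time window available for reacting.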

Integrity

risks detection reaction
5 keys get busted/compromised human; canary; mail to email list; ping signal group
LEAs issue a subpoena

Security

risks detection reaction
Debian RNG bug

Debugging

Ideas for debugging issues:

  • share logs (real time) with some people (in UTC)
  • metrics port being exposed to
  • expose how long it takes to upload votes (as a task)
  • maybe MetricsPort could have a sensitive/non-sensitive version for decentralized debugging
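
As a rough illustration of the vote-upload timing idea, the sketch below measures how long a block of work takes and renders the result in the Prometheus text format. The metric name, the `target` label and the helper names are made up for this example. Nothing here reflects what MetricsPort exposes today.

```python
import time
from contextlib import contextmanager

# Hypothetical metric name; nothing here is an existing MetricsPort metric.
METRIC = "dirauth_vote_upload_duration_seconds"


@contextmanager
def timed(durations, label):
    """Record the wall-clock duration of the wrapped block under `label`."""
    start = time.monotonic()
    try:
        yield
    finally:
        durations[label] = time.monotonic() - start


def render(durations, metric=METRIC):
    """Render durations as Prometheus text-format gauge lines."""
    lines = [f"# TYPE {metric} gauge"]
    for label, seconds in sorted(durations.items()):
        lines.append(f'{metric}{{target="{label}"}} {seconds:.3f}')
    return "\n".join(lines) + "\n"
```

A caller would wrap the upload in `with timed(durations, "some-authority"):` and serve `render(durations)` from whatever endpoint is considered non-sensitive.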

How/Where to reach out

  • contact points
    • dirauth operators
      • email list/alias
      • individuals/spouses
        • emails
        • signal
        • IRC
    • ICE contingency
    • TPI
      • network health
    • AFK calendar that dir-auth folks could subscribe to
      • maybe just shared at the (bi-)monthly dirauth meeting

Next steps

  • ctor ticket for SRV brittleness (GeKo)
  • share with dirauth operators (GeKo)
  • $dirauth do their thing (GeKo + dirauths)
  • share with tor-relays@ what we got (GeKo)