Roles and Staffing

Security Specialist, Operations & Strategy, DevOps

How you staff incident response depends on your team size and structure. This document covers options from small teams to larger organizations.

See Incident-Response-Policy for role definitions.


Core Roles (All Team Sizes)

Every incident needs these roles filled, even if one person wears multiple hats:

| Role | Responsibility |
| Incident Leader | Coordinates response, assigns tasks, makes decisions |
| Scribe | Documents everything in Incident Log |
| Responders | Execute fixes, investigate, implement mitigations |

For larger incidents, add:

  • Communication Manager - Handles internal/external comms
  • Subject Matter Experts - Specialists for specific domains

Small Teams (2-5 people)

Approach

Everyone knows the whole system, so anyone can respond. Just make sure someone is always reachable.

Whether you need formal on-call depends more on your commitments (SLAs, assets held, user expectations) than team size alone. Small teams with high-value assets may still need structured coverage.

Structure

  • Designate 1-2 people as default Incident Leaders (only one leads any given incident)
  • Everyone else responds as needed
  • Leader also serves as Scribe for minor incidents
  • Separate Scribe for P1/P2 incidents
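
The Scribe rule above can be sketched as a small helper. This is a minimal illustration, not a prescribed implementation: the `assign_roles` function, the `RoleAssignment` type, and the use of a label such as "P3" for a minor incident are assumptions made for the example. The source only names P1 and P2.

```python
from dataclasses import dataclass


@dataclass
class RoleAssignment:
    incident_leader: str
    scribe: str


def assign_roles(severity: str, leader: str, available: list[str]) -> RoleAssignment:
    """Pick a Scribe: a separate person for P1/P2, the leader otherwise."""
    if severity in {"P1", "P2"}:
        # P1/P2 incidents get a Scribe who is not the Incident Leader.
        others = [person for person in available if person != leader]
        if not others:
            raise ValueError("P1/P2 incidents need a Scribe separate from the leader")
        return RoleAssignment(incident_leader=leader, scribe=others[0])
    # Minor incident: the leader also keeps the Incident Log.
    return RoleAssignment(incident_leader=leader, scribe=leader)
```

For example, `assign_roles("P1", "alice", ["alice", "bob"])` names bob as Scribe, while a minor incident leaves alice in both roles.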

Expectations

  • Keep a shared contact list (see Contacts)
  • Establish one communication channel for incidents
  • Someone should always be reachable (informal coverage)

What You Might Not Need

  • Formal on-call rotation (unless your commitments require it)
  • Separate First Responder program
  • Multiple communication managers

Medium Teams (5-15 people)

Approach

Define subject matter experts. Consider a simple on-call rotation.

Structure

Subject Matter Experts (SMEs)
| Domain | Primary | Backup |
| Smart Contracts | | |
| Infrastructure | | |
| Frontend | | |
| Security | | |

On-Call Options

Option A: Informal

  • No formal schedule, but SMEs are expected to be reachable during their working hours
  • Clear escalation for after-hours: who to call first

Option B: Simple Rotation

  • Weekly rotation among willing team members
  • One person on-call, responsible for initial triage
  • They pull in SMEs as needed
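
A weekly rotation like Option B can be generated with a few lines of code. The sketch below is one possible approach, assuming a fixed, ordered list of volunteers and shifts that start on the same weekday each week. The function name `weekly_rotation`, the member names, and the start date are hypothetical.

```python
from datetime import date, timedelta


def weekly_rotation(members: list[str], start: date, weeks: int) -> list[tuple[date, str]]:
    """Return (shift start date, on-call person) pairs, cycling through members."""
    if not members:
        raise ValueError("rotation needs at least one member")
    schedule = []
    for week in range(weeks):
        shift_start = start + timedelta(weeks=week)
        # Round-robin: wrap back to the first member after the last one.
        schedule.append((shift_start, members[week % len(members)]))
    return schedule


if __name__ == "__main__":
    for shift_start, person in weekly_rotation(["ana", "ben", "caro"], date(2025, 1, 6), 4):
        print(shift_start.isoformat(), person)
```

Swaps and holidays are not handled here. A real schedule would usually live in the alerting tool rather than in a script.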

Expectations

  • SMEs respond quickly when paged for their domain
  • On-call person handles initial assessment and escalation
  • Separate Scribe and Incident Leader for P1/P2 incidents
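
The SME table and the escalation expectations above amount to a lookup from domain to a primary and a backup contact. A minimal sketch follows, assuming the roster is kept as a simple mapping. The `SME_ROSTER` contents, the person names, and the `sme_to_page` helper are placeholders for illustration only.

```python
from typing import Optional

# Hypothetical roster: domain -> (primary, backup). Names are placeholders.
SME_ROSTER: dict[str, tuple[str, str]] = {
    "Smart Contracts": ("alice", "bob"),
    "Infrastructure": ("carol", "dave"),
    "Frontend": ("erin", "frank"),
    "Security": ("grace", "heidi"),
}


def sme_to_page(domain: str, unreachable: frozenset[str] = frozenset()) -> Optional[str]:
    """Return the primary SME for a domain, falling back to the backup.

    Returns None when the domain is unknown or both SMEs are unreachable,
    which the caller should treat as a signal to escalate further.
    """
    for person in SME_ROSTER.get(domain, ()):
        if person not in unreachable:
            return person
    return None
```

The on-call person would call `sme_to_page("Frontend")` during triage and pass the set of people who did not answer to reach the backup.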

Larger Teams (15+ people)

Approach

Formal First Responder program with trained personnel and scheduled on-call.

First Responder Program

What First Responders Do:
  • Initial triage when an incident is detected
  • Assess severity
  • Kick off the incident response process
  • Pull in the right people
  • Hand off to Incident Leader

What First Responders Don't Do:
  • Fix the issue themselves (unless they're also the SME)
  • Make major decisions without escalation

Why This Model:
  • Distributes knowledge across the organization
  • Reduces burden on any single team
  • Ensures someone is always ready to start the process
  • Doesn't require deep expertise in all domains

On-Call Structure

Consider parallel schedules for different domains:

| Schedule | Coverage | Rotation |
| Infrastructure | 24/7 | Weekly among 6-8 people |
| Smart Contracts | 24/7 | Weekly among 6-8 people |

For example, with 8 people per rotation, each person is on call one week in every eight, or roughly once every two months.
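
The rotation arithmetic is easy to check for other team sizes. The helper below is a small sketch under the assumption of one-week shifts, one person on call per schedule, and a 52-week year. `on_call_load` is a hypothetical name.

```python
def on_call_load(people: int, weeks_per_year: int = 52) -> tuple[int, float]:
    """Return (weeks per full cycle, on-call weeks per person per year)."""
    if people < 1:
        raise ValueError("rotation needs at least one person")
    # With weekly shifts, each person is on call once per cycle of `people` weeks.
    return people, weeks_per_year / people


if __name__ == "__main__":
    for size in (6, 8):
        cycle, per_year = on_call_load(size)
        print(f"{size} people: one week in every {cycle}, about {per_year:.1f} weeks per year")
```

With 8 people this gives one week in every eight, or 6.5 on-call weeks per person per year. With 6 people the load rises to about 8.7 weeks per year.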

First Responder Training

Before going on-call, complete:

  • Review Incident-Response-Policy
  • Review Incident Log and Post-Mortem templates
  • Read 2-3 past post-mortems
  • Understand basic architecture (infra and smart contracts)
  • Know how to reach SMEs and Decision Makers
  • Test alerting system access

On-Call Expectations

During your shift:
  • Keep alerting device accessible
  • Respond to pages within 15 minutes
  • Triage and escalate appropriately
  • You don't need to fix everything. Get the right people involved

Off-shift:
  • Stay current on documentation
  • Review new post-mortems
  • Participate in tabletop exercises
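
The 15-minute response expectation can be checked against page and acknowledgment timestamps, for instance when reviewing on-call performance after an incident. This is a minimal sketch. The `acknowledged_in_time` function and the way timestamps are obtained are assumptions, since the source does not specify how acknowledgments are recorded.

```python
from datetime import datetime, timedelta

# Target taken from the on-call expectation: respond to pages within 15 minutes.
ACK_TARGET = timedelta(minutes=15)


def acknowledged_in_time(paged_at: datetime, acknowledged_at: datetime) -> bool:
    """True if the page was acknowledged within the response target."""
    if acknowledged_at < paged_at:
        raise ValueError("acknowledgment cannot precede the page")
    return acknowledged_at - paged_at <= ACK_TARGET
```

Both timestamps should use the same timezone convention (both naive or both timezone-aware), since Python refuses to compare or subtract a naive datetime and an aware one.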

Decision Makers

Regardless of team size, define who can make high-stakes decisions during P1 incidents:

| Role | Name | Contact |

These people should be reachable 24/7 for critical incidents. Consider:

  • Founders / C-level
  • Security Lead
  • Engineering Lead
  • Legal (for incidents with legal implications)

Tools Checklist

Ensure your on-call personnel have access to:

  • Alerting system (PagerDuty, etc.)
  • Communication platform (Slack, Discord, etc.)
  • Video conferencing
  • Monitoring dashboards
  • On-call schedule
  • Contacts list

Choosing Your Model

| Team Size | Recommended Approach |
| 2-5 | Informal coverage, designated leaders |
| 5-10 | SME structure, optional simple rotation |
| 10-15 | Simple rotation with SMEs |
| 15+ | First Responder program with parallel schedules |
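
The table above can be expressed as a simple lookup. The size bands share their boundary values (5, 10, and 15 each appear in two rows), so the sketch below assumes the lower bound of each band is inclusive, which matches the "15+" row. That tie-breaking rule and the `recommended_model` name are assumptions, not part of the original guidance.

```python
def recommended_model(team_size: int) -> str:
    """Map a team size to the recommended staffing approach from the table."""
    if team_size < 2:
        raise ValueError("the guidance starts at teams of two")
    # Assumption: a boundary size (5, 10, 15) falls into the larger band.
    if team_size < 5:
        return "Informal coverage, designated leaders"
    if team_size < 10:
        return "SME structure, optional simple rotation"
    if team_size < 15:
        return "Simple rotation with SMEs"
    return "First Responder program with parallel schedules"
```

Team size is only a starting point. Commitments such as SLAs or high-value assets may justify more structure than the lookup suggests.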

Start simple and add structure as you grow. A lightweight process that people follow beats a heavyweight process that gets ignored.


See Incident-Response-Policy for how these roles work during an actual incident.