Escalation Triggers

When and how to escalate test healing attempts

Escalation Triggers - When to Ask vs Act

This document defines when Claude Code should proceed autonomously versus when to stop and ask the user for guidance.

Core Principle

Act like a competent test developer who knows when they need input.

A skilled developer doesn't ask about every decision, but also doesn't make assumptions about business logic or take destructive actions without confirmation.


Autonomy Matrix

PROCEED AUTONOMOUSLY ✅

Claude Code should proceed without asking when:

1. Standard Exploration

| Situation | Action |
| --- | --- |
| Finding test files | Search standard locations, read files |
| Finding Page Objects | Search standard locations, read methods |
| Finding credentials | Search standard locations (`.env`, fixtures, config) |
| Understanding test structure | Read test files, infer patterns |
| Finding helpers/utilities | Search standard locations, read implementations |

2. Standard Operations

| Situation | Action |
| --- | --- |
| Opening a URL | Navigate via Playwright MCP |
| Taking screenshots | Capture for analysis |
| Reading page state | Use `browser_snapshot` |
| Clicking visible elements | Interact to test flow |
| Filling form fields | Enter test data |
| Running existing tests | Execute via Bash |
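
The table above describes actions taken through the Playwright MCP tools or the shell; when the same steps end up in test code, they map onto ordinary Playwright calls. A minimal sketch, with a placeholder URL, labels, and credentials rather than values from any real project:

```typescript
import { test, expect } from '@playwright/test';

// Illustrative only: a plain-Playwright version of the "standard operations" above.
test('walk through a login form', async ({ page }) => {
  await page.goto('https://example.com/login');                // Opening a URL
  await page.screenshot({ path: 'login-before.png' });         // Taking screenshots
  await page.getByLabel('Email').fill('user@example.com');     // Filling form fields
  await page.getByLabel('Password').fill('not-a-real-password');
  await page.getByRole('button', { name: 'Sign In' }).click(); // Clicking visible elements
  await expect(page).toHaveURL(/dashboard/);                   // Reading resulting page state
});
```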

3. Clear Fixes

| Situation | Action |
| --- | --- |
| Locator typo | Fix the typo |
| Missing `:visible` filter | Add it |
| Missing `.first()` for strict mode | Add it |
| Obvious timeout too short | Use `getTimeout()` or increase reasonably |
| Missing `await` | Add it |
| Import missing | Add import statement |
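
These fixes are mechanical and safe to apply without asking. A hedged sketch of what a few of them look like in practice (the selectors are hypothetical):

```typescript
import { test, expect } from '@playwright/test';

test('submit the form', async ({ page }) => {
  // Before: page.locator('.btn-sumbit').click()
  // After:  fix the typo, add the missing await, filter to visible elements,
  //         and take .first() so strict mode tolerates multiple matches.
  await page.locator('.btn-submit:visible').first().click();

  // Before: expect(page.locator('.toast')).toBeVisible()  // missing await
  await expect(page.locator('.toast')).toBeVisible();
});
```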

4. Pattern Following

| Situation | Action |
| --- | --- |
| Project uses Page Objects | Create/use Page Objects for new tests |
| Project has helper functions | Use existing helpers instead of inline code |
| Project has specific naming conventions | Follow them |
| Project has specific locator patterns | Follow them |
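
When the project already has Page Objects, new tests should go through them instead of inline locators. A minimal sketch of the usual shape, assuming a hypothetical `LoginPage`; the real class names, locators, and file locations should mirror the project's existing ones:

```typescript
import type { Locator, Page } from '@playwright/test';

// Hypothetical Page Object -- mirror the project's real naming and structure.
export class LoginPage {
  readonly emailInput: Locator;
  readonly passwordInput: Locator;
  readonly submitButton: Locator;

  constructor(readonly page: Page) {
    this.emailInput = page.getByLabel('Email');
    this.passwordInput = page.getByLabel('Password');
    this.submitButton = page.getByRole('button', { name: 'Sign In' });
  }

  async login(email: string, password: string): Promise<void> {
    await this.emailInput.fill(email);
    await this.passwordInput.fill(password);
    await this.submitButton.click();
  }
}
```

A test would then construct `new LoginPage(page)` and call `login(...)` rather than repeating the locators inline.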

ASK FIRST 🛑

Claude Code should stop and ask when:

1. Business Logic Questions

| Situation | Why Ask |
| --- | --- |
| What should this feature do? | Claude can see UI but not intent |
| Is this behavior correct? | Only user knows expected behavior |
| Which flow is the happy path? | Business decision |
| What data is valid for this field? | Domain knowledge required |
| Should this test exist? | Product decision |

Example prompt:

I can see the form has these fields: [list].

What's the expected flow for submitting this form?
What validation rules should I test?

2. Multiple Valid Approaches

| Situation | Why Ask |
| --- | --- |
| Two patterns exist in codebase | Which is preferred? |
| New pattern vs existing pattern | Consistency vs improvement |
| Test granularity (unit vs e2e) | Architecture decision |
| Where to place new test file | Organization preference |

Example prompt:

I see two patterns for authentication in this codebase:
A) Using storageState from global setup
B) Login helper function per test

Which should I use for this new test?
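
For reference, the two options usually look roughly like the following; the paths, labels, and credentials are placeholders, not this project's actual setup:

```typescript
import { test, expect, type Page } from '@playwright/test';

// Option A: reuse authenticated state written by a global setup project.
// Assumes setup has already saved playwright/.auth/user.json (path is illustrative).
// Relative URLs assume baseURL is configured in playwright.config.
test.describe('with stored auth state', () => {
  test.use({ storageState: 'playwright/.auth/user.json' });

  test('dashboard loads for a signed-in user', async ({ page }) => {
    await page.goto('/dashboard');
    await expect(page.getByRole('heading', { name: 'Dashboard' })).toBeVisible();
  });
});

// Option B: log in explicitly through a helper in every test that needs it.
async function login(page: Page, email: string, password: string): Promise<void> {
  await page.goto('/login');
  await page.getByLabel('Email').fill(email);
  await page.getByLabel('Password').fill(password);
  await page.getByRole('button', { name: 'Sign In' }).click();
}

test('dashboard loads after explicit login', async ({ page }) => {
  await login(page, 'user@example.com', 'not-a-real-password');
  await expect(page.getByRole('heading', { name: 'Dashboard' })).toBeVisible();
});
```

Option A is faster per test; Option B keeps each test self-contained. Which fits better depends on the project, which is exactly why the question is worth asking.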

3. Non-Standard Situations

| Situation | Why Ask |
| --- | --- |
| Project structure doesn't match standard patterns | Need to understand custom setup |
| Unusual dependencies or tools | May have special requirements |
| No Page Objects in a large project | Intentional or technical debt? |
| Tests mixed with production code | Unusual, need context |

Example prompt:

This project doesn't have a page-objects/ directory, but has 50+ test files.

Should I:
A) Create a page-objects/ directory and start using Page Objects
B) Follow the existing pattern of inline locators
C) Something else

4. Destructive Actions

| Situation | Why Ask |
| --- | --- |
| Deleting test files | Irreversible |
| Removing test cases | May lose coverage |
| Modifying shared fixtures | Affects other tests |
| Changing global config | Affects all tests |
| Modifying authentication setup | Could break all authenticated tests |
| Running tests against production | Potentially dangerous |

Example prompt:

This test file appears to duplicate another test file.

Should I delete this one, or is there a reason both exist?

5. Uncertainty Above Threshold

| Confidence Level | Action |
| --- | --- |
| >90% confident | Proceed |
| 70-90% confident | Proceed, mention assumption |
| 50-70% confident | Ask first |
| <50% confident | Definitely ask |

Example prompt:

I'm about to update the locator for the submit button.

Current: `page.locator('.btn-submit')`
Proposed: `page.getByRole('button', { name: 'Submit' })`

I'm 75% confident this is correct based on the screenshot,
but the button text might change based on form state.

Should I proceed or verify first?

Escalation Patterns by Workflow

During Debug Workflow

| Phase | Autonomous | Ask First |
| --- | --- | --- |
| Reading failing test | ✅ | |
| Opening browser | ✅ | |
| Navigating to page | ✅ | |
| Observing current state | ✅ | |
| Identifying locator issue | ✅ | |
| Fixing obvious locator bug | ✅ | |
| Changing test logic | | 🛑 User should confirm |
| Removing assertions | | 🛑 May reduce coverage |
| Changing expected values | | 🛑 May indicate bug in app |

During Explore Workflow

| Phase | Autonomous | Ask First |
| --- | --- | --- |
| Searching Quoth for patterns | ✅ | |
| Opening browser | ✅ | |
| Navigating to feature | ✅ | |
| Clicking through flow | ✅ | |
| Observing states | ✅ | |
| Writing test file | ✅ (following patterns) | |
| Deciding what to assert | | 🛑 Business logic |
| Deciding test scope | | 🛑 Coverage decision |
| Creating new Page Object | ✅ (if pattern exists) | 🛑 (if no pattern) |

During Bootstrap Workflow

| Phase | Autonomous | Ask First |
| --- | --- | --- |
| Scanning directories | ✅ | |
| Reading config files | ✅ | |
| Identifying Page Objects | ✅ | |
| Identifying helpers | ✅ | |
| Finding credentials location | ✅ | |
| Generating documentation | ✅ | |
| Uploading to Quoth | | 🛑 Confirm before upload |
| Modifying existing Quoth docs | | 🛑 May override manual edits |

How to Ask Well

Good Question Format

[Context]: What I observed/found
[Options]: Clear choices if applicable
[Recommendation]: My suggestion if I have one
[Question]: Specific question to answer

Examples

Good:

I found two timeout patterns in the codebase:

1. Hardcoded values: `await page.waitForTimeout(5000)`
2. Timeout manager: `await page.waitForTimeout(getTimeout('medium'))`

Pattern 2 appears in newer tests. Should I:
A) Update all tests to use the timeout manager (recommended)
B) Only use it for new tests
C) Keep using hardcoded values in this test to match surrounding code

Bad:

What timeout should I use?

Good:

The login test is failing because the submit button locator changed.

Current: `page.locator('#submit-btn')`
Actual button on page: `<button class="btn-primary" type="submit">Sign In</button>`

I recommend changing to: `page.getByRole('button', { name: 'Sign In' })`

This follows the project's preference for role-based selectors. Should I proceed?

Bad:

The locator is wrong. Can I fix it?
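
For reference, the timeout manager mentioned in the first good example (`getTimeout('medium')`) is usually just a small lookup helper. A hypothetical sketch; the real tier names and values belong to the project's utilities:

```typescript
// Hypothetical timeout manager -- real tiers and values live in the project.
const TIMEOUTS = {
  short: 2_000,
  medium: 5_000,
  long: 15_000,
} as const;

export function getTimeout(tier: keyof typeof TIMEOUTS): number {
  // Optional global slowdown for CI, if the project defines TIMEOUT_FACTOR.
  const factor = Number(process.env.TIMEOUT_FACTOR ?? '1');
  return TIMEOUTS[tier] * factor;
}

// Usage, matching pattern 2 above:
//   await page.waitForTimeout(getTimeout('medium'));
```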

Confidence Calibration

Signs of High Confidence (proceed)

  • Pattern matches documented standard
  • Multiple examples in codebase follow same pattern
  • Fix is mechanical (typo, missing import, syntax)
  • No business logic involved
  • Reversible action

Signs of Low Confidence (ask)

  • First time seeing this pattern
  • Conflicting patterns in codebase
  • Involves understanding user intent
  • Could affect other tests
  • Irreversible or hard to reverse
  • Involves production systems

Emergency Stops

ALWAYS stop and ask before:

  1. Running tests against production URL
  2. Deleting files
  3. Modifying environment variables or secrets
  4. Changing CI/CD configuration
  5. Modifying shared authentication state
  6. Making breaking changes to test utilities used by many tests
  7. Disabling tests without understanding why they exist
  8. Committing or pushing changes

Recovery from Mistakes

If Claude Code takes an action that turns out to be wrong:

  1. Acknowledge the mistake
  2. Explain what was assumed incorrectly
  3. Revert if possible
  4. Learn - update this document or Quoth if the pattern should be documented
  5. Ask about similar situations going forward