Testing Reference
This document describes all testing commands and skills available in this workspace — when to use each, how they compare, and recommended testing workflows for common situations.
Quick Comparison
All entries below are skills (the only mechanism we use — Anthropic merged custom commands into skills, and this workspace was never built on .claude/commands/). The Agents column shows how many parallel agents the skill orchestrates.
| Command | Focus | Agents | Output |
|---|---|---|---|
| /test-counsel | Persona-based: would Henk/Fatima/Sem/… succeed? | 8 | {APP}/test-results/ |
| /test-app | Perspective-based: functional, UX, accessibility, performance, security, API | 1 / 6 | {APP}/test-results/README.md |
| /test-functional | Feature correctness (GIVEN/WHEN/THEN) | 1 | Chat + optional evidence |
| /test-api | REST API endpoints | 1 | Chat + API report |
| /test-accessibility | WCAG 2.1 AA (axe-core) | 1 | Chat + a11y report |
| /test-performance | Load times, API response | 1 | Chat |
| /test-security | OWASP Top 10, Nextcloud roles | 1 | Chat |
| /test-regression | Cross-feature regression | 1 | Chat |
| /test-persona-henk | Henk's perspective only | 1 | Chat |
| /test-persona-fatima | Fatima's perspective only | 1 | Chat |
| /test-persona-sem | Sem's perspective only | 1 | Chat |
| /test-persona-noor | Noor's perspective only | 1 | Chat |
| /test-persona-annemarie | Annemarie's perspective only | 1 | Chat |
| /test-persona-mark | Mark's perspective only | 1 | Chat |
| /test-persona-priya | Priya's perspective only | 1 | Chat |
| /test-persona-janwillem | Jan-Willem's perspective only | 1 | Chat |
Typical Testing Workflows
After implementing a feature (pre-PR)
The standard validation flow before raising a pull request:
/opsx-verify # Confirms implementation matches specs (reads code + artifacts, no browser)
/test-functional # Verifies the feature behaves as specced step by step
/test-counsel # User acceptance from all 8 personas
/create-pr
For a quicker pass when you're confident in the implementation:
/opsx-verify
/test-app # Quick mode: single-agent smoke test across all perspectives
/create-pr
When reviewing someone else's PR
When a colleague or CI agent hands you a PR to review:
/review-pr <PR#> # fetches diff, asks strictness, posts 🔴/🟡/🟢 inline comments, submits APPROVE or REQUEST_CHANGES
Review multiple PRs in parallel by passing more than one:
/review-pr 123 456 # both reviewed simultaneously
The skill detects re-reviews (skips if nothing has changed since your last pass), auto-suggests Strict mode on security-sensitive code (auth, RBAC, CI), and blocks APPROVE if required CI checks are failing.
Full regression sweep (before a release or major merge)
When you want comprehensive coverage — correctness, user experience, and technical quality:
/test-regression # Verify no cross-feature breakage first
/test-counsel # All 8 persona perspectives
/test-app # Full mode: 6 agents (functional, UX, accessibility, performance, security, API)
Run /test-regression first — if existing flows are already broken, there's no point running the broader sweeps.
Quick smoke test
When you just want to confirm the app is up and the main flows work:
/test-app # Choose Quick mode when prompted (1 agent)
Or a single persona for a faster targeted check:
/test-persona-sem # Sem (digital native) flow only — fastest meaningful check
Focused area testing
Use single-agent commands when you need to target a specific quality dimension:
| Goal | Command |
|---|---|
| Verify a specific feature works end-to-end | /test-functional |
| Validate REST API endpoints | /test-api |
| Audit accessibility compliance | /test-accessibility |
| Measure page and API speed | /test-performance |
| Security and role checks | /test-security |
| Check nothing unrelated broke | /test-regression |
| One persona's full journey | /test-persona-* |
Feature design review (before implementation)
Use /feature-counsel to get persona feedback on specs before building:
/opsx-ff # Generate all spec artifacts
/feature-counsel # 8 personas analyze specs, suggest missing features
# [review and refine specs]
/opsx-apply # Only then implement
This is the only testing-adjacent command that runs before implementation. It reads specs, not the live app.
Skills (Multi-Agent)
/test-counsel — Persona-Based Testing
Lens: User experience. "Would Henk, Fatima, Sem, Noor, Annemarie, Mark, Priya, or Jan-Willem succeed and be satisfied?"
Agents: 8 (one per persona) — run in parallel.
Use when: You want feedback from realistic user perspectives. Each persona represents a different role, technical level, and set of priorities (citizen, developer, municipal officer, etc.). Output includes in-character findings and verdicts per persona. Best used after /test-functional confirms the feature works — this answers whether users would succeed, not just whether the spec was met.
Cap impact: Very high — 8 parallel agents. Open a fresh Claude window before running. See parallel-agents.md.
See: .claude/skills/test-counsel/SKILL.md
/test-app — Perspective-Based Testing
Lens: Technical quality. "Does everything work from functional, UX, accessibility, performance, security, and API angles?"
Agents: 1 (Quick mode) or 6 (Full mode) — one per perspective.
Use when: You want a structured technical sweep. Each perspective has a specific checklist. Quick mode is low-cost and fine to run regularly. Full mode is for thorough pre-release validation.
Output: {APP}/test-results/README.md — summary with PASS/PARTIAL/FAIL/CANNOT_TEST per perspective.
Cap impact: Low (Quick) to Very high (Full). See parallel-agents.md.
See: .claude/skills/test-app/SKILL.md
/test-counsel vs /test-app
| Aspect | /test-counsel | /test-app |
|---|---|---|
| Lens | Persona — user goals and experience | Perspective — technical correctness |
| Question | "Would Henk/Fatima/Priya/… complete their tasks?" | "Do features work, perform, and meet standards?" |
| Agents | 8 (one per persona) | 1 / 6 (Quick or Full mode) |
| Output style | In-character findings, persona verdicts | PASS/PARTIAL/FAIL, technical notes |
| Best for | User acceptance, UX feedback | Quality gates, regression, coverage |
Both cover the full app. Use both for thorough validation: /test-counsel for user perspective, /test-app for technical perspective.
Commands (Single-Agent)
/test-functional
Feature correctness via browser. Executes GIVEN/WHEN/THEN scenarios from specs or acceptance criteria against the live app.
Use when: You've implemented something specific and want to confirm it behaves exactly as specced, step by step. More targeted than /test-counsel — it follows the spec, not a persona narrative. Good as the first test after implementation before running broader sweeps.
/test-api
REST API testing. Checks endpoints, authentication, pagination, and error responses for Nextcloud app APIs.
Use when: You've added or changed API behaviour — new endpoints, modified responses, or changed data structures. Also useful when Priya's or Annemarie's persona test surfaces API concerns worth investigating further.
/test-accessibility
WCAG 2.1 AA compliance using axe-core. Injects axe, runs automated checks, reports violations. Adds manual verification for keyboard navigation and focus management.
Use when: You've added or changed UI components, forms, or navigation. Should be run before archiving any change that touches the frontend. /test-app Full mode includes an accessibility perspective, but this command goes deeper.
/test-performance
Load times, API response times, network requests. Uses browser timing APIs and sequential API calls to measure real-world performance.
Use when: You've added new pages, heavy queries, or API-heavy features. Also use when /test-app Full mode flags a performance concern that needs more detail.
/test-security
OWASP Top 10, Nextcloud roles, authorization. Checks XSS, CSRF, sensitive data exposure, and role-based access control.
Use when: You've changed authentication, permission logic, or added any form or user input handler. Also use before archiving changes that touch admin interfaces or user data.
/test-regression
Cross-feature regression. Tests unrelated flows to verify a change hasn't broken anything outside its scope. Broader than /test-functional.
Use when: You've made structural changes — database schema, core service updates, shared utilities — where side-effects are plausible. Run this before the persona sweeps in a full regression cycle; if something is already broken it'll surface here first.
/test-persona-*
Single-persona deep dive. Use when you want one persona's full assessment without launching all eight:
| Command | Persona | Role |
|---|---|---|
| /test-persona-henk | Henk Bakker | Elderly citizen, low digital literacy |
| /test-persona-fatima | Fatima El-Amrani | Low-literate migrant citizen |
| /test-persona-sem | Sem de Jong | Young digital native |
| /test-persona-noor | Noor Yilmaz | Municipal CISO / functional admin |
| /test-persona-annemarie | Annemarie de Vries | VNG standards architect |
| /test-persona-mark | Mark Visser | MKB software vendor |
| /test-persona-priya | Priya Ganpat | ZZP developer / integrator |
| /test-persona-janwillem | Jan-Willem van der Berg | Small business owner |
Use when: You know which persona is most affected by the change, or when you've already run /test-counsel and want a deeper single-perspective follow-up. One agent instead of eight — lower cap cost than /test-counsel.
Test Scenarios
Test scenarios ({APP}/test-scenarios/TS-NNN-slug.md) are reusable, Gherkin-style flows that the test commands pick up automatically. When scenarios exist, the commands below either offer to include them before launching agents or run them directly:
| Command | Behaviour |
|---|---|
| /test-app | Offers to include all active scenarios before launching agents. Agents execute scenario steps before free exploration. |
| /test-counsel | Offers to include scenarios, grouped by persona. Each persona agent receives only the scenarios tagged with their slug. |
| /test-persona-* | Scans for scenarios matching that persona's slug. Asks to run them before free exploration. |
| /test-scenario-run | Runs scenarios directly (by ID, tag, persona, or all) |
/test-scenario-create
Guided wizard for creating a well-structured test scenario for a Nextcloud app.
Usage:
/test-scenario-create
/test-scenario-create openregister
What it does:
- Determines the next ID (TS-NNN) by scanning existing scenarios
- Asks for title, goal, category (functional/api/security/accessibility/performance/ux/integration), and priority
- Shows relevant personas and asks which this scenario targets
- Suggests which test commands should automatically include it
- Auto-suggests tags based on category and title
- Guides through Gherkin steps (Given/When/Then), test data, and acceptance criteria
- Generates persona-specific notes for each linked persona
- Saves to {APP}/test-scenarios/TS-NNN-slug.md
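The ID-scanning step can be pictured with a minimal Python sketch. It is an illustration only, not the skill's actual implementation: the function name `next_scenario_id` is hypothetical, and the three-digit zero padding is an assumption read off the NNN placeholder in the file pattern.

```python
import re
from pathlib import Path

# Assumed filename convention, taken from the documented pattern TS-NNN-slug.md.
SCENARIO_FILE = re.compile(r"^TS-(\d+)-.+\.md$")


def next_scenario_id(scenario_dir: Path) -> str:
    """Return the next free ID, one above the highest TS number found on disk."""
    highest = 0
    if scenario_dir.is_dir():
        for path in scenario_dir.iterdir():
            match = SCENARIO_FILE.match(path.name)
            if match:
                highest = max(highest, int(match.group(1)))
    # Zero-padding to three digits is an assumption based on the NNN placeholder.
    return f"TS-{highest + 1:03d}"
```

Taking the maximum rather than counting files means gaps left by deleted scenarios are never reused, so an ID that appears in old test results cannot end up pointing at a different scenario.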
Scenario categories and suggested personas:
| Category | Suggested personas |
|---|---|
| functional | Mark Visser, Sem de Jong |
| api | Priya Ganpat, Annemarie de Vries |
| security | Noor Yilmaz |
| accessibility | Henk Bakker, Fatima El-Amrani |
| ux | Henk Bakker, Jan-Willem van der Berg, Mark Visser |
| performance | Sem de Jong, Priya Ganpat |
| integration | Priya Ganpat, Annemarie de Vries |
/test-scenario-run
Execute one or more test scenarios against the live Nextcloud environment using a browser agent.
Usage:
/test-scenario-run # list and choose
/test-scenario-run TS-001 # run specific scenario
/test-scenario-run openregister TS-001 # run from specific app
/test-scenario-run --tag smoke # run all smoke-tagged scenarios
/test-scenario-run --all openregister # run all scenarios for an app
/test-scenario-run --persona priya-ganpat # run all Priya's scenarios
What it does:
- Discovers scenario files in {APP}/test-scenarios/
- Filters by tag, persona, or ID as specified
- Asks which environment to test against (local or custom URL)
- Asks whether to use Haiku (default, cost-efficient) or Sonnet (for complex flows)
- Launches a browser agent per scenario (parallelised up to 5 for multiple)
- Agent verifies preconditions, follows Given-When-Then steps, checks each acceptance criterion
- Writes results to {APP}/test-results/scenarios/
- Synthesises a summary report for multiple runs
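The filtering and the cap of 5 parallel agents described above could look roughly like the following Python sketch. Everything named here is hypothetical: the `Scenario` dataclass stands in for whatever metadata the scenario files carry, and `run_one` is a placeholder for launching a browser agent. Only the filter dimensions (ID, tag, persona) and the limit of 5 come from this document.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, Iterable, Optional

MAX_PARALLEL = 5  # the parallelisation cap stated in the list above


@dataclass
class Scenario:
    # Hypothetical in-memory form of a scenario file's metadata.
    id: str
    tags: set[str] = field(default_factory=set)
    personas: set[str] = field(default_factory=set)


def select(
    scenarios: Iterable[Scenario],
    scenario_id: Optional[str] = None,
    tag: Optional[str] = None,
    persona: Optional[str] = None,
) -> list[Scenario]:
    """Keep scenarios matching every filter that was given; no filters keeps all."""
    return [
        s
        for s in scenarios
        if (scenario_id is None or s.id == scenario_id)
        and (tag is None or tag in s.tags)
        and (persona is None or persona in s.personas)
    ]


def run_all(
    scenarios: list[Scenario], run_one: Callable[[Scenario], str]
) -> dict[str, str]:
    """Run scenarios with at most MAX_PARALLEL in flight; returns {id: status}."""
    with ThreadPoolExecutor(max_workers=MAX_PARALLEL) as pool:
        statuses = list(pool.map(run_one, scenarios))  # map preserves input order
    return {s.id: status for s, status in zip(scenarios, statuses)}
```

A bounded worker pool keeps the sixth and later scenarios queued until a slot frees up, which is one simple way to honour a fixed parallelism limit.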
Model: Asked at run time. Haiku (default) — fast, cost-efficient. Sonnet — for complex multi-step flows or ambiguous UI states where Haiku may misread the interface. Cap cost scales with the number of scenarios run in parallel.
Cap impact: Low for single scenario; medium for multiple. See parallel-agents.md.
Result statuses: PASS / FAIL / PARTIAL / BLOCKED
/test-scenario-edit
Edit an existing test scenario — update any field (metadata or content) interactively.
Usage:
/test-scenario-edit # list all scenarios, pick one
/test-scenario-edit TS-001 # open specific scenario
/test-scenario-edit openregister TS-001 # open from specific app
What it does:
- Locates the scenario file
- Shows a summary of current values (status, priority, category, personas, tags, spec refs)
- Asks what scope to edit: metadata only / content only / both / status only / tags only
- Walks through each field in scope, showing the current value and asking for the new one
- Supports +tag/-tag syntax for incremental tag changes, same for personas
- Regenerates persona notes if the personas list changed
- Optionally renames the file if the title changed
- Writes the updated file and shows a diff-style summary
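To make the +tag/-tag syntax concrete, here is a small Python sketch of how such incremental edits might be applied to an existing list. The function name `apply_tag_edits` is hypothetical, and separating tokens by whitespace is an assumption; the document only states that the +tag/-tag form is supported.

```python
def apply_tag_edits(current: list[str], edits: str) -> list[str]:
    """Apply whitespace-separated '+tag' / '-tag' tokens, preserving tag order."""
    tags = list(current)  # work on a copy so the caller's list is untouched
    for token in edits.split():
        op, name = token[0], token[1:]
        if op == "+" and name and name not in tags:
            tags.append(name)
        elif op == "-" and name in tags:
            tags.remove(name)
        elif op not in "+-" or not name:
            # Anything without a leading + or -, or with an empty name, is rejected.
            raise ValueError(f"expected +tag or -tag, got {token!r}")
    return tags
```

In this sketch, adding a tag that is already present or removing one that is absent is a silent no-op, so the same edit string can be applied twice without changing the result. The same approach would work for the personas list.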
How existing test commands use scenarios
| Command | Behaviour when scenarios exist |
|---|---|
| /test-app | Asks to include active scenarios before launching agents. Agents execute scenario steps before free exploration. |
| /test-counsel | Asks to include scenarios, grouped by persona. Each persona agent receives only the scenarios tagged with their slug. |
| /test-persona-* | Scans for scenarios matching that persona's slug. Asks to run them before free exploration in Step 2. |
Environment
All testing uses:
- Base URL: http://localhost:8080 (local env) or http://localhost:3000 (dev)
- Credentials: admin/admin (local env)
- Browser: Playwright MCP (see Browser Usage below)
Ensure Docker is running and Nextcloud is accessible before testing. See docker.md for environment setup.
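One way to confirm that Nextcloud is accessible before starting a test run is to query its status.php endpoint, which reports `installed` and `maintenance` flags as JSON. The Python sketch below is not part of any skill in this workspace. The helper names are hypothetical, and the default URL is the local base URL listed above.

```python
import json
from urllib.error import URLError
from urllib.request import urlopen


def is_ready(status: dict) -> bool:
    """True when Nextcloud reports itself installed and not in maintenance mode."""
    return bool(status.get("installed")) and not status.get("maintenance", False)


def check_nextcloud(base_url: str = "http://localhost:8080", timeout: float = 5.0) -> bool:
    """Fetch /status.php; an unreachable server or non-JSON reply counts as not ready."""
    try:
        with urlopen(f"{base_url}/status.php", timeout=timeout) as response:
            return is_ready(json.load(response))
    except (URLError, ValueError, OSError):
        return False
```

Calling `check_nextcloud()` with no arguments probes the local environment and returns False instead of raising when the container is not up yet.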
Browser Usage
| Context | Browser | When |
|---|---|---|
| Single-agent commands | browser-1 | /test-functional, /test-persona-*, /test-* — one agent at a time |
| test-counsel (8 parallel) | browser-2–browser-5, browser-7 + overflow | One browser per persona |
| test-counsel (1 persona) | browser-1 | When testing a single persona only |
| test-app Quick | browser-1 | Single agent smoke test |
| test-app Full (6 parallel) | browser-2–browser-5, browser-7 + 1 | Each perspective gets a distinct browser |
| User observation | browser-6 | Headed browser for watching tests live |
Rule: Single agent = browser-1. Parallel agents = each gets a distinct browser (browser-2, browser-3, browser-4, …) to avoid session conflicts.
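The allocation rule can be expressed as a short Python sketch. The names `assign_browsers` and `overflow` are hypothetical. Because the table does not name the overflow browsers, the sketch takes them as a caller-supplied argument rather than guessing.

```python
from typing import Sequence

SINGLE_AGENT_BROWSER = "browser-1"
OBSERVER_BROWSER = "browser-6"  # headed browser reserved for user observation
PARALLEL_POOL = ["browser-2", "browser-3", "browser-4", "browser-5", "browser-7"]


def assign_browsers(agents: Sequence[str], overflow: Sequence[str] = ()) -> dict[str, str]:
    """Map each agent to a distinct browser: one agent gets browser-1, several share the pool."""
    if len(agents) == 1:
        return {agents[0]: SINGLE_AGENT_BROWSER}
    pool = [*PARALLEL_POOL, *overflow]
    if OBSERVER_BROWSER in pool or SINGLE_AGENT_BROWSER in pool:
        raise ValueError("overflow must not reuse browser-1 or browser-6")
    if len(agents) > len(pool):
        missing = len(agents) - len(pool)
        raise ValueError(f"{len(agents)} agents need {missing} more overflow browser(s)")
    return dict(zip(agents, pool))
```

For the six perspectives of /test-app Full mode, the five pooled browsers plus one overflow entry are enough. The eight personas of /test-counsel need three overflow entries.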
For browser pool configuration, verification steps, and .mcp.json setup, see playwright-setup.md.