Test observability · AI failure clustering

Tame the flaky test.

Self-hosted observability for your E2E tests. Test health is a leading indicator of product health.

curl -fsSL https://brittle.dev/install.sh | sh

Playwright
WebdriverIO
Vitest
Jest

Try the live demo · View on GitHub

Apache 2.0 · One Postgres, one container · Runs in your VPC

Per-env matrix · main · 7 envs

Columns (envs): chromium, firefox, webkit, edge, safari-mac, safari-ios, android
Rows: search, article, history, i18n, visual
Legend: pass, fail, flaky

adapters: playwright · wdio · vitest · jest

triage: new · known · regressed

ai prediction: ranks flakes before reruns

deploy: one container, one Postgres

How it works

One reporter. One Postgres. Your tests stay yours.

01 · Pick your adapter

Drop in the reporter for Playwright, WebdriverIO, Vitest or Jest. Run tests like you do today.

02 · Run tests anywhere

CI, laptop, container. Wherever your framework already runs. No agent, no daemon.

03 · Triage from the dashboard

Open the run. Failures are tagged NEW, KNOWN, or REGRESSED so you triage what changed, not what you already know.
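The tagging rule itself is not spelled out here, so the following is only a minimal sketch of how NEW, KNOWN, and REGRESSED could be derived from a test's prior outcomes. The `triageTag` function, the `Outcome` type, and the three rules are hypothetical illustrations, not Brittle's actual logic.

```typescript
type Outcome = 'pass' | 'fail';
type TriageTag = 'NEW' | 'KNOWN' | 'REGRESSED';

// Hypothetical rule set (an assumption, not the product's real rules).
// `history` holds prior outcomes for the same test, oldest first.
// The current run is assumed to have failed.
function triageTag(history: Outcome[]): TriageTag {
  const failedBefore = history.some((o) => o === 'fail');

  // Never failed before: this failure is new.
  if (!failedBefore) return 'NEW';

  // Also failed on the most recent prior run: already known.
  if (history[history.length - 1] === 'fail') return 'KNOWN';

  // Failed at some point, passed since, and fails again: regression.
  return 'REGRESSED';
}
```

Under these assumed rules, `triageTag(['pass', 'pass'])` yields `'NEW'`, `triageTag(['pass', 'fail'])` yields `'KNOWN'`, and `triageTag(['fail', 'pass'])` yields `'REGRESSED'`.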

~/app · pnpm test

$ pnpm test

Running 24 tests using 3 workers

✓ search box accepts query input (1.2s)
✓ article infobox renders metadata (0.9s)
✗ search results echo query [webkit] (3.1s)
✓ history page renders revisions (1.4s)

[@brittlehq/playwright-reporter]

→ uploaded 24 sessions · 1 failed

→ ai prediction: 1 likely flake · 0 regressions

→ brittle.dev/p/web/runs/rh-9j2

1 failed · 23 passed (12.4s)

What's different

Every signal you need to read a test suite.

brittle.dev/p/web/runs/rh-9j2

AI-ranked failures · 3 failing · main

tests/search

search results echo query · webkit · flake 12% · REGRESSED
autocomplete dropdown appears · chromium · flake 84% · KNOWN

tests/visual

article first-paint · chrome, firefox · flake 31% · NEW

ai failure prediction

Know which failure is the flake before you rerun.

Each failure gets a flake-likelihood score from prior history, per-env signals, and stack-trace similarity, so the rerun list is already prioritised.
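To make the idea concrete, here is a rough sketch of how the three inputs named above could be combined into one score and used to order the rerun list. Everything in it is an assumption for illustration: the `FailureSignals` shape, the weights, the token-overlap similarity measure, and the function names. It is not the model Brittle ships.

```typescript
// All three signals are assumed to be normalised to the range 0..1.
interface FailureSignals {
  historicalFlakeRate: number; // how often this test has flaked in past runs
  envFlakeRate: number;        // the same rate, restricted to the failing env
  traceSimilarity: number;     // similarity to stack traces of past flakes
}

// Hypothetical weights; they sum to 1 so the score stays within 0..1.
const WEIGHTS = { history: 0.5, env: 0.25, trace: 0.25 };

const clamp01 = (x: number): number => Math.min(1, Math.max(0, x));

// Weighted sum of the three signals.
function flakeLikelihood(s: FailureSignals): number {
  return (
    WEIGHTS.history * clamp01(s.historicalFlakeRate) +
    WEIGHTS.env * clamp01(s.envFlakeRate) +
    WEIGHTS.trace * clamp01(s.traceSimilarity)
  );
}

// Jaccard overlap of whitespace-separated tokens, used here as a simple
// stand-in for stack-trace similarity.
function traceSimilarity(a: string, b: string): number {
  const uniq = (t: string): string[] =>
    t.split(/\s+/).filter((tok, i, all) => tok !== '' && all.indexOf(tok) === i);
  const ta = uniq(a);
  const tb = uniq(b);
  const shared = ta.filter((tok) => tb.indexOf(tok) !== -1).length;
  const union = ta.length + tb.length - shared;
  return union === 0 ? 0 : shared / union;
}

// Returns a new array with the most flake-like failures first.
function rankByFlakeLikelihood<T extends { signals: FailureSignals }>(failures: T[]): T[] {
  return failures
    .slice()
    .sort((x, y) => flakeLikelihood(y.signals) - flakeLikelihood(x.signals));
}
```

A linear weighted sum is the simplest choice that keeps each signal's contribution visible; a real predictor would more likely be fitted to labelled history.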

brittle.dev/p/web/sessions/9j2-rh

search results echo query

chromium-130 · tests/search/search.spec.ts:36

Failed

Video: ready · Trace: ready · Screenshot: ready

Timeline

00:00.0 page.goto · /search
00:01.2 expect.toHaveText · "results"
00:03.4 TimeoutError · locator not visible

session replay

Every session, replayable.

Video, trace, HAR, console log, and command log come back with every session your runner captures. Scrub through any session in the browser without leaving the dashboard.

brittle.dev/p/web/sessions/9j2-rh/test-results

search results echo query

Failed 3.1s

page.goto · /wiki/Albert_Einstein 0.8s
expect.toBeVisible · h1 0.3s
expect.toHaveText · "Einstein" 2.0s

test replay

Watch the failure happen.

Scrub the recording, jump to the failing step, open the trace.

brittle.dev/p/web/tests/wiki-search

article infobox renders metadata · 3 envs · main

chromium-x/desktop: STABLE
webkit-x/desktop: FLAKY
firefox-x/desktop: STABLE

per-env signals

One test, every browser, every classification.

Small multiples per env. Stable on chromium and flaky on webkit reads as exactly that.
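As an illustration of classifying per env instead of per test, the sketch below groups one test's recent results by env and labels each env separately. The `EnvResult` shape, the `classifyPerEnv` name, and the extra `FAILING` label are assumptions made for this example; only STABLE and FLAKY appear in the panel above.

```typescript
// FAILING is a hypothetical third label for an env that never passes.
type EnvStatus = 'STABLE' | 'FLAKY' | 'FAILING';

interface EnvResult {
  env: string;     // e.g. 'chromium' or 'webkit'
  passed: boolean; // outcome of one run of the test in that env
}

// Classifies each env on its own, so a flaky env never hides a stable one.
function classifyPerEnv(results: EnvResult[]): Record<string, EnvStatus> {
  const counts: Record<string, { pass: number; fail: number }> = {};
  for (const r of results) {
    const c = counts[r.env] || (counts[r.env] = { pass: 0, fail: 0 });
    if (r.passed) c.pass++;
    else c.fail++;
  }

  const status: Record<string, EnvStatus> = {};
  for (const env of Object.keys(counts)) {
    const c = counts[env];
    // No failures: stable. No passes: failing. A mix of both: flaky.
    status[env] = c.fail === 0 ? 'STABLE' : c.pass === 0 ? 'FAILING' : 'FLAKY';
  }
  return status;
}
```

With three passing chromium runs and a mix of passing and failing webkit runs, this returns `STABLE` for chromium and `FLAKY` for webkit, which matches the reading described above.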

Integration

Four lines.
The rest is your test code.

Sits alongside your existing reporters. Forwards results, traces, and screenshots.

@brittlehq/playwright-reporter
@brittlehq/wdio-reporter
@brittlehq/vitest-reporter
@brittlehq/jest-reporter

playwright.config.ts

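import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  testDir: './tests',

  // drop in the brittle reporter
  reporter: [
    ['@brittlehq/playwright-reporter', {
      url: process.env.BRITTLE_URL,
      token: process.env.BRITTLE_TOKEN,
    }],
  ],

  // a project with only a name runs the default browser, so set `use` per project
  projects: [
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
    { name: 'firefox',  use: { ...devices['Desktop Firefox'] } },
    { name: 'webkit',   use: { ...devices['Desktop Safari'] } },
  ],
});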