Features
Everything Traceback can do.
Every capability is designed to reduce friction and give your team more time to build.
Autonomous mode
AI operates a real browser with a virtual cursor and keyboard. No selectors, no test code.
Plain English prompts
Write what you want tested the same way you would explain it to a person. Traceback reads the prompt, understands the intent, and works out every step on its own. There is no test syntax to learn and no structure to follow. You describe an outcome and the agent figures out how to reach it, including the navigation, the interactions, the form inputs, and the verification at the end.
Computer use, not selectors
Autonomous mode is built on computer use. The agent sees the rendered screen as a visual surface and controls a virtual cursor and keyboard to interact with it. There are no CSS selectors, no XPath queries, no DOM traversal, and no dependency on how the interface is structured in code. The agent clicks where it sees a button and types where it sees a field. This makes autonomous tests fundamentally resilient to UI changes because they are not coupled to implementation details.
Model selection
Choose which AI model powers each test run. Faster models are well suited for high-frequency runs in CI where speed matters most. More capable reasoning models are better for complex flows that require multi-step decision making or nuanced visual understanding. You can configure model selection at the test level so each run uses the right tradeoff for its specific purpose.
Live viewer and run replay
Every autonomous run is observable in real time. Open the live viewer to watch the virtual cursor navigate your product as the agent works through the test. When the run finishes, the full session is recorded and available for replay. Each step is stored with the rendered browser state at that moment, so you can scrub through the session and see exactly what the agent encountered at any point during the run. Failure analysis is grounded in what was actually on screen.
Parallel and dry runs
Autonomous tests can run in parallel across multiple environments. If you have a staging environment and a preview deployment that both need coverage before a merge, Traceback runs against both at the same time. Dry runs let the agent plan and preview what it intends to do without executing the full test. This is useful for reviewing agent behavior before committing to a run or for debugging prompt phrasing.
Cost calculation and reporting
Before a run begins, Traceback shows an estimated cost based on the model, the number of steps, and the expected session length. After the run completes, the actual cost is recorded. Every run produces a structured report that includes step-by-step reasoning notes from the agent, the pass or fail status of each interaction, screenshots at each decision point, and a plain-language failure summary when something goes wrong.
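The estimate described above can be pictured as a simple function of model rate, step count, and session length. This is a minimal sketch under assumed inputs; the per-model rates and the formula itself are hypothetical, not Traceback's actual pricing.

```python
# Illustrative pre-run cost estimate. Rates and formula are hypothetical,
# not Traceback's actual pricing model.
MODEL_RATES = {
    "fast": 0.002,       # assumed cost per step, in dollars
    "reasoning": 0.012,  # assumed cost per step for a reasoning model
}

def estimate_run_cost(model: str, steps: int, session_minutes: float,
                      per_minute: float = 0.005) -> float:
    """Estimate cost from model rate, step count, and session length."""
    return round(MODEL_RATES[model] * steps + per_minute * session_minutes, 4)

print(estimate_run_cost("fast", steps=30, session_minutes=5))       # 0.085
print(estimate_run_cost("reasoning", steps=30, session_minutes=5))  # 0.385
```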
Deterministic mode
Define test steps explicitly and Traceback runs them via Playwright with self-healing.
Structured step editor
Build tests by composing an ordered list of explicit steps. Each step specifies an action type, a target, and any associated input. The editor provides a structured interface for constructing this sequence without writing code. Because execution follows the defined order exactly, deterministic tests are predictable and easy to reason about. When a run fails, you know precisely which step failed and what the state of the browser was at that point.
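The ordered-step model above can be sketched as a plain list of step records driven by an executor that stops at the first failure. The field names and record shape are illustrative assumptions, not Traceback's actual test schema.

```python
# Hypothetical shape of a deterministic test: an ordered list of explicit
# steps, each with an action type, a target, and optional input.
steps = [
    {"action": "goto",   "target": "/login"},
    {"action": "fill",   "target": "email field",    "input": "qa@example.com"},
    {"action": "fill",   "target": "password field", "input": "hunter2"},
    {"action": "click",  "target": "sign-in button"},
    {"action": "expect", "target": "dashboard heading"},
]

def run(steps, execute):
    """Execute steps in order; report exactly which step failed."""
    for i, step in enumerate(steps, start=1):
        try:
            execute(step)
        except Exception as err:
            return {"status": "failed", "step": i, "error": str(err)}
    return {"status": "passed", "steps": len(steps)}

# A stub executor that accepts every step:
print(run(steps, execute=lambda step: None))
```

Because the executor walks the list in order, a failure report always points at one specific step index, which is what makes deterministic runs easy to reason about.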
Playwright execution with self-healing
Under the hood, deterministic mode translates your defined steps into Playwright operations and runs them against a real browser. When a selector stops working because a component was refactored or a class name changed, Traceback detects the failure and attempts to resolve it by finding the correct target in the updated UI. If it succeeds, the test is updated automatically and the run continues. This keeps deterministic tests maintainable over time without constant manual intervention.
Scan view
The scan view is a sequential record of the test run. For each step, it shows the rendered browser state before the action, the action that was taken, and the outcome. You can step through the entire run visually and understand what happened at each stage. This makes debugging fast because you are not reading log output. You are watching the run play out frame by frame.
Parallel and dry runs
Deterministic tests support parallel execution across environments. You can run the same test against your main branch, a feature branch preview, and a staging environment at the same time. Dry runs validate that the step sequence is structurally correct and that Playwright can locate the necessary targets before you execute the full test. This prevents wasted runs caused by configuration mistakes.
Cost calculation
Every deterministic run includes a cost breakdown. You can see the cost of each individual run and track aggregate usage over time across your team and across projects. This gives engineering and platform teams the visibility they need to plan capacity and understand how testing costs scale as the suite grows.
GitHub integration
Tests run automatically on every pull request and results post directly to the PR.
PR detection and diff analysis
When a pull request is opened or updated, Traceback reads the diff and builds a picture of what changed. It uses this to generate tests that are scoped to the affected surfaces rather than rerunning the entire test suite for every change. A PR that touches the checkout flow gets checkout tests. A PR that updates the settings page gets settings tests. Coverage stays proportional to risk without requiring any manual test selection.
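The scoping idea above can be sketched as a mapping from tests to the source paths they cover, with the diff deciding which tests run. The mapping format and test names are hypothetical, not Traceback's internal representation.

```python
# Illustrative diff-to-test scoping: select only the tests whose path
# patterns match a file changed in the PR.
from fnmatch import fnmatch

TEST_SCOPES = {
    "checkout-happy-path": ["src/checkout/*", "src/cart/*"],
    "settings-update":     ["src/settings/*"],
    "login-flow":          ["src/auth/*"],
}

def select_tests(changed_files: list[str]) -> list[str]:
    """Return the tests whose scope patterns match any changed file."""
    return sorted(
        name for name, patterns in TEST_SCOPES.items()
        if any(fnmatch(f, p) for f in changed_files for p in patterns)
    )

print(select_tests(["src/checkout/payment.ts", "README.md"]))
# ['checkout-happy-path']
```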
Auto-run on push
Once a repository is connected, tests run automatically on every push with no additional configuration. There is no YAML to maintain and no CI step to wire up manually. Traceback handles the trigger, the execution environment, and the result reporting. If you stop needing a test, you remove it from Traceback. The GitHub side takes care of itself.
Results in your PR and web app
Test results appear in the pull request as a status check and as inline comments on the relevant diff lines when a failure is tied to a specific change. Engineers reviewing the PR see the test status alongside the code without switching contexts. The full run detail, including replay and step-by-step breakdown, is available in the Traceback web app for anyone who needs to dig deeper.
Staging environment workflows
Point Traceback at a staging URL and tests run against it before the PR merges. This catches the category of bugs that only appear in a deployed environment where real network calls, real sessions, and real data are involved. The same test definitions that run in production coverage run against staging, so there is no divergence between your staging and production test logic.
Linear integration
Failed runs create Linear issues automatically. They close on their own when tests pass.
Issue creation from runs
When a run fails, you can create a Linear issue from within Traceback with a single action. The issue is pre-populated with the test name, the failure description, the steps that led to the failure, and a direct link to the run replay. Nothing needs to be copied or summarized manually. The person who picks up the issue in Linear has everything they need to reproduce and fix the problem.
Bidirectional tracking
Every issue created from a run stays linked to that run. When Traceback detects that the previously failing test now passes, the linked issue is closed automatically. The engineering team does not need to remember to close issues when they fix a bug that was caught by a test. The loop closes on its own. If the test regresses again later, a new issue is created and the cycle begins again.
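The open/close loop described above behaves like a small state machine over test results. This sketch simplifies it to reopening a single linked issue on regression (rather than creating a fresh one) and does not reflect Linear's actual API.

```python
# Minimal sketch of the bidirectional loop: each run result transitions
# the linked issue between open and closed.
def next_issue_state(issue_state: str, test_passed: bool) -> str:
    if issue_state == "open" and test_passed:
        return "closed"           # fix confirmed by a passing run
    if issue_state in ("closed", "none") and not test_passed:
        return "open"             # first failure or a later regression
    return issue_state            # no transition needed

state = "none"
for passed in [False, False, True, True, False]:
    state = next_issue_state(state, passed)
    print(passed, "->", state)
```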
Live issue creation
You can create a Linear issue at any point during a run, not only after it fails. If you are watching a run live and notice something worth tracking, you open the issue panel from the run view and create it without leaving the page. The failure context from that moment in the run is attached automatically. This is useful for capturing observations about product behavior that fall outside the strict pass or fail result of the test.
CLI
Run tests locally against your development server before anything is pushed.
Local execution for both modes
The Traceback CLI runs both autonomous and deterministic tests against a local server. Point it at localhost and it executes the same test definitions that run in your CI pipeline and in production. This means engineers can verify their changes before pushing rather than waiting for the CI run to finish. The local environment is treated as a first-class execution target with the same capabilities as any remote environment.
Synced results
Every run triggered from the CLI syncs to the Traceback web app automatically. Results are stored alongside runs triggered by GitHub events or manual runs from the web interface. Your team sees a complete history of all runs regardless of where they were triggered. There is no separate local results view and no need to share logs manually when debugging a failure that was first caught locally.
Agent tools
Supply tests with real credentials, inboxes, and phone numbers at execution time.
Credential and auth management
Store credentials in Traceback and inject them into runs at execution time. The agent handles the login flow, manages session state across steps, and navigates to authenticated areas of the product without any special test setup. Credentials are never exposed in test definitions or in logs. You define them once and they are available to any test that needs them, across both autonomous and deterministic modes.
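The injection-without-exposure pattern can be sketched as resolving placeholders twice: once with the real secret for execution, once with a mask for logging. The placeholder syntax and the dict standing in for the credential store are illustrative assumptions.

```python
# Sketch of runtime credential injection that keeps secrets out of logs.
# A plain dict stands in for the credential store; the {{NAME}} placeholder
# syntax is hypothetical.
CREDENTIAL_STORE = {"APP_PASSWORD": "s3cr3t-value"}

def render_step(template: str) -> tuple[str, str]:
    """Resolve {{NAME}} placeholders; return (executed, loggable) forms."""
    executed, logged = template, template
    for name, value in CREDENTIAL_STORE.items():
        executed = executed.replace("{{%s}}" % name, value)
        logged = logged.replace("{{%s}}" % name, "*" * 8)
    return executed, logged

executed, logged = render_step("fill password field with {{APP_PASSWORD}}")
print(executed)  # fill password field with s3cr3t-value
print(logged)    # fill password field with ********
```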
Virtual mailbox
Each test run can be provisioned with a real, functioning email inbox. The agent can trigger flows that send email, wait for the message to arrive, read its contents, and interact with it. Verification emails, password reset links, onboarding sequences, and transactional notifications are all testable end to end. The inbox is isolated per run so there is no interference between concurrent test executions.
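The wait-for-arrival pattern the agent uses can be sketched as polling the inbox until the expected message appears, then extracting the verification link from the body. The inbox here is a stub and the polling interface is a hypothetical stand-in for the provisioned mailbox.

```python
# Sketch of the wait-and-read pattern for a per-run inbox: poll until the
# message arrives, then pull the verification link out of the body.
import re
import time

def wait_for_email(fetch, subject: str, timeout: float = 30.0,
                   interval: float = 0.01) -> dict:
    """Poll fetch() until a message with the subject appears, or time out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        for msg in fetch():
            if msg["subject"] == subject:
                return msg
        time.sleep(interval)
    raise TimeoutError(f"no email with subject {subject!r}")

def extract_link(body: str) -> str:
    """Return the first URL found in the message body."""
    return re.search(r"https?://\S+", body).group(0)

inbox = [{"subject": "Verify your account",
          "body": "Click https://app.example.com/verify?token=abc123 to finish."}]
msg = wait_for_email(lambda: inbox, "Verify your account")
print(extract_link(msg["body"]))  # https://app.example.com/verify?token=abc123
```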
Virtual phone
Each test run can be provisioned with a real phone number that can receive SMS messages. This enables testing of flows that involve two-factor authentication, phone number verification, or any product feature that communicates with users over SMS. The agent can receive the message, extract the relevant content, and continue the test flow. Phone numbers are isolated per run the same way inboxes are.
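Continuing a 2FA flow mostly comes down to pulling the one-time code out of the incoming message. A minimal sketch, assuming the code is a plain 4-8 digit number somewhere in the SMS body:

```python
# Sketch of extracting a one-time code from an incoming SMS so the agent
# can continue a 2FA flow. The message format is an assumption.
import re

def extract_otp(sms_body: str) -> str:
    """Return the first 4-8 digit code found in the message."""
    match = re.search(r"\b\d{4,8}\b", sms_body)
    if not match:
        raise ValueError("no OTP code found in SMS")
    return match.group(0)

print(extract_otp("Your verification code is 482913."))  # 482913
```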
MCP
Expose your internal tools and APIs to Traceback agents during test execution.
Tool access for agents
MCP is the mechanism by which Traceback agents access systems beyond the browser surface. During a run, an agent can call internal APIs, query databases, trigger background jobs, or read from external services if those tools are registered and made available. This allows tests to cover the full behavior of a product feature rather than just the visible UI. An agent can verify that a submitted form actually wrote the correct data, not just that the form submitted without an error.
Extensible function layer
Register custom functions and data sources to extend what agents can access during test execution. The function layer is designed to be open-ended. You define what to expose, how to call it, and what it returns. Agents use this layer to reach into the parts of your system that matter for correctness but are not represented in the UI. This is how tests move from surface-level UI checks to meaningful verification of product behavior.
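A register-then-invoke layer like the one described can be sketched as a named registry of callables. The decorator-based registry and the tool name below are illustrative, not Traceback's actual API.

```python
# Minimal sketch of an open-ended function layer: register callables by
# name, then let the agent invoke them during a run.
TOOLS: dict = {}

def tool(name: str):
    """Register a function so agents can call it by name."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register

@tool("orders.lookup")
def lookup_order(order_id: str) -> dict:
    # Stand-in for a real internal API call.
    return {"id": order_id, "status": "shipped"}

def call_tool(name: str, **kwargs):
    """Invoke a registered tool by name with keyword arguments."""
    return TOOLS[name](**kwargs)

print(call_tool("orders.lookup", order_id="ord_42"))
# {'id': 'ord_42', 'status': 'shipped'}
```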
Self-healing
Tests repair themselves when flows change. The suite stays healthy without manual upkeep.
Automatic retries
When a step fails due to timing, a slow network response, or a transient rendering delay, Traceback retries the step before reporting a failure. The retry logic is tuned to distinguish real failures from noise. Flaky failures caused by infrastructure variability do not produce false alerts. Engineers only see a failed run when something in the product actually broke, which keeps the signal clean and maintains trust in the test suite.
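The retry-only-transient-failures behavior can be sketched as a small wrapper: timing and network errors get a few attempts with backoff, while anything else fails immediately. The classification of which exceptions count as transient is an illustrative assumption.

```python
# Sketch of retrying transient failures only: a step gets a few attempts
# when the error looks like timing or network noise, and the last failing
# attempt is reported as a real failure.
import time

TRANSIENT = (TimeoutError, ConnectionError)

def run_step(step, attempts: int = 3, backoff: float = 0.01):
    for attempt in range(1, attempts + 1):
        try:
            return step()
        except TRANSIENT:
            if attempt == attempts:
                raise                      # still failing: report it as real
            time.sleep(backoff * attempt)  # brief backoff before retrying

# A step that is flaky twice, then succeeds:
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("render not ready")
    return "ok"

print(run_step(flaky))  # ok
```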
Failure detection and resolution
When a flow changes in a way that breaks a test, Traceback identifies the nature of the change and attempts to resolve it. For deterministic tests, it finds the updated target in the current UI and updates the test definition to match. The resolution is logged so engineers can review what changed and confirm the update was correct. The goal is that routine product changes do not generate test maintenance work. Engineers focus on building features, not keeping tests synchronized with them.
Continuous feedback loop
Every run produces data that feeds back into the system. Over time, Traceback builds a model of which parts of your product are stable and which change frequently, which test patterns tend to produce false failures and which are reliable, and how the product behaves across different environments and conditions. Tests that participate in this loop become more stable with each run. The suite improves passively as a byproduct of normal use.
Integrations
Send results and trigger runs through Slack, Notion, and Figma.
Slack
Connect Traceback to Slack to receive run results and failure alerts in the channels where your team already communicates. Failure notifications include the test name, a summary of what went wrong, and a link to the full run in the web app. You can also trigger test runs directly from Slack without opening a browser. This is useful for running tests on demand before a release or sharing a run result with a teammate without leaving the conversation.
Notion
Sync run summaries and test reports to Notion automatically. Teams that maintain living documentation of their testing coverage, release status, or QA processes can connect Traceback and have run data flow into the relevant pages without manual updates. Reports include pass rates, failure summaries, and links back to individual runs. The Notion integration is particularly useful for teams that share test visibility with non-engineering stakeholders.
Figma
Reference Figma files during test creation to compare design specifications against the rendered state of the product. When a component is built, the Figma integration allows Traceback to use the design as a reference for what the implementation should look like. Visual discrepancies between the design and the coded output are surfaced as part of the test result. This closes the loop between design and engineering without requiring a manual review step from the design team.
See it in action.
A 20-minute demo is enough to understand how Traceback fits your workflow.