How to Use AI in QA Testing Without Losing Human Control


Croatia

Split
Dračevac 3D

Zagreb
Radnička cesta 39

Contact

+385 91 395 9711

info@profico.hr

Canada

Toronto
30 Commercial Road


AI in QA Testing: How I Built a Local AI Workflow That Speeds Up Grunt Work

Jul 3, 2026

Development


Author

Alen Suša, QA Engineer

In this article, I want to share how AI fits into my day-to-day QA work, what it does, and the limits I keep it within.

If you want the short version, I use AI as a drafting and research assistant inside a controlled process.

It helps me get through the routine prep work faster, like searching through project notes for background details and putting together initial drafts so I do not have to start from scratch.

That saves me a lot of time on the daily writing and digging, but at the end of the day AI does not decide what passes, what ships, or what gets filed.

I read, edit, and own every single piece of output before it ever reaches a developer or gets turned into an official report.

Here is how that looks.

My ground rules for using AI in QA

Human sign-off on everything

The AI might draft my test plans, bug reports, and review comments, but I read, edit, and take full responsibility for every single word before it ever reaches a developer or gets logged in Jira.

Testing early at the review stage

Instead of waiting for new changes to land in a shared testing environment, I test them right away on the developer's branch while the code is still under review.

That way, any issues get caught and fixed immediately before they ever merge into the main development line.

Verify before claiming a bug

Before I write up a report, I check the actual code to confirm the problem is real rather than just assuming.

If it turns out to be expected behaviour or a known limitation, I drop it and nothing gets filed.

Local environments only

All automated browser testing runs strictly on my own machine against a local server. The tools are blocked from touching shared team environments or live production systems.

Real customer data is off-limits

I generate dummy test data inside my local database directly through the backend. I never let the tool click around in a shared environment to create records.

Plain text under my control

Everything the AI outputs gets saved as simple text files inside my personal note-taking app. That keeps all my testing notes human-readable, easy to review, and completely in my hands.

My local AI testing setup

Claude Code running inside VS Code. The assistant works in the project repository and my notes vault, not in any live company system.

A set of custom commands I wrote (plain-text definition files I own and can edit). Each one is scoped to a single QA task with its own rules and guardrails baked in. See the table below.

An Obsidian vault as my system of record: one test plan per ticket and a glossary of platform terms.

A local backend (a copy of the platform running on my own machine) for staging test data and reproducing issues safely.

A command-line browser driver for live runs. When a command needs to open the running app, whether to walk through a test case from a plan or to reproduce and verify an issue, it drives the browser with playwright-cli (a terminal tool), not the heavier Playwright MCP integration. It runs against my local server only, under the same limits as everything else.

Command	What the AI does	My checkpoint
/qa-review	Reads the branch changes, the pull request, and the Jira ticket, then writes a single test plan per ticket (analysis plus ready-to-run test cases) into the vault.	I run the manual testing pass myself. The AI writes the plan; it does not decide pass or fail.
/seed-env	Stages ticket-specific test data in my local database through the backend, verifies it through the real API, and saves it as a reusable scenario plus a snapshot.	I decide what data the scenario needs. Runs locally only, never against shared environments.
/live-run	Drives the riskiest one to three cases from the test plan against the running local app with `playwright-cli`, captures evidence screenshots (spinner, error, empty, disabled states), and reports observed vs expected.	A quick pre-pass before my full manual run, local only. It flags what looks off; it never files anything or makes the pass/fail call. I still do the real testing pass.
/bug-report	Confirms the bug is real by reading the code first, then drafts a Jira report in my house format.	I set the priority, attach my own screenshots/recordings, and file it. No report is written for a non-bug.
/pr-comment	Drafts a review comment for a developer's pull request.	Optional. Commenting on the PR for a regression is mandatory; I reach for this only when the issue is hard to put in simple terms. I review and paste it into GitHub myself and attach the visual evidence.
/pw-test	Drafts an automated end-to-end test matching the existing suite's conventions.	Only when I ask. I run the test myself and review the code.
/playground	Builds a local sandbox project (sample data) for exploratory testing.	Local only, no ticket, throwaway data.

How the work stays organised

A central dashboard for every test plan

Every test plan feeds into a single tracking board with one row per ticket.

I group them by feature area, but I can also filter by update types like bug fixes, new features, and refactors, or sort by what was tested most recently.

Nothing gets scattered across random folders, and the entire testing history stays searchable in one place.
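To make the one-row-per-ticket idea concrete, here is a minimal Python sketch that reads metadata from the top of each test plan and turns it into grouped, sortable rows. The field names (`ticket`, `area`, `type`, `tested`), the `---` metadata block, and the sample tickets are assumptions made up for this example; the actual vault layout may look quite different.

```python
from collections import defaultdict
from datetime import date
from typing import Optional


def parse_plan(text: str) -> dict:
    """Read simple `key: value` lines between the first two '---' markers."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def build_board(plans: list, change_type: Optional[str] = None) -> dict:
    """Group plans by feature area, most recently tested first in each group."""
    board = defaultdict(list)
    for text in plans:
        meta = parse_plan(text)
        if not meta:
            continue  # not a test plan, ignore it
        if change_type and meta.get("type") != change_type:
            continue  # filtered out by update type
        board[meta.get("area", "unsorted")].append(meta)
    for rows in board.values():
        rows.sort(
            key=lambda m: date.fromisoformat(m.get("tested", "0001-01-01")),
            reverse=True,
        )
    return dict(board)


# Hypothetical plans, inlined here instead of being read from files.
PLANS = [
    "---\nticket: ABC-101\narea: billing\ntype: bugfix\ntested: 2026-06-20\n---\n# Plan",
    "---\nticket: ABC-102\narea: billing\ntype: feature\ntested: 2026-06-28\n---\n# Plan",
    "---\nticket: ABC-103\narea: search\ntype: refactor\ntested: 2026-06-25\n---\n# Plan",
]

board = build_board(PLANS)
for area, rows in board.items():
    print(area, [row["ticket"] for row in rows])
```

Calling `build_board(PLANS, change_type="refactor")` would narrow the same board down to refactor tickets only.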

A shared glossary to keep terminology accurate

To stop the AI from making up generic wording, I keep a dedicated glossary of platform terms inside my workspace.

Whenever a command runs, it checks those definitions so every test plan and bug report matches the language our engineering team actually uses.
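A glossary check like this can also be approximated outside the assistant. The sketch below flags generic wording in a draft and suggests the preferred platform term, leaving the final wording to the human reviewer. The glossary entries and the `find_off_glossary_terms` helper are hypothetical, invented for illustration.

```python
import re

# Hypothetical glossary: preferred platform term -> generic wordings to avoid.
GLOSSARY = {
    "workspace": ["project area", "team space"],
    "booking": ["reservation", "appointment"],
}


def find_off_glossary_terms(draft: str, glossary: dict) -> list:
    """Return (found wording, preferred term) pairs for a human to review."""
    findings = []
    for preferred, avoid_list in glossary.items():
        for wording in avoid_list:
            # Whole-word, case-insensitive match.
            pattern = rf"\b{re.escape(wording)}\b"
            if re.search(pattern, draft, flags=re.IGNORECASE):
                findings.append((wording, preferred))
    return findings


draft = "The Reservation list is empty after switching team space."
findings = find_off_glossary_terms(draft, GLOSSARY)
for wording, preferred in findings:
    print(f'"{wording}" -> consider "{preferred}"')
```

The function only reports findings; it does not rewrite the draft, which keeps the edit decision with the tester.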

Flowchart showing a human-led AI QA workflow, mapping out the step-by-step process from ticket arrival and local AI test drafting to manual pass or fail testing and bug reporting.

Let's look at a real ticket from start to finish

This is the typical path a ticket takes through my workflow, showing exactly where AI helps and where I take over.

1. A ticket lands in QA

I check out the developer's branch and run /qa-review.

The AI reads the code changes, the pull request discussion, and the acceptance criteria from Jira, then writes a test plan into my vault: what changed in plain language, where to test it, and a prioritised list of test cases. I read it and adjust it before I trust it.

2. I stage the data

If the plan needs specific conditions, I run /seed-env to set that up in my local database and snapshot it.

The setup is saved as a reusable recipe (one file per ticket) that I can re-run straight from the terminal whenever I need that data again, without going back through the assistant. Nothing leaves my machine.
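As an illustration of what a re-runnable recipe could look like, the following Python sketch builds dummy records from a fixed random seed so that every run produces the same data, then serialises them as a snapshot. It deliberately stops short of touching a database, because that call is project-specific. The function names, the ticket key `ABC-101`, and the record fields are assumptions for the example.

```python
import json
import random


def build_scenario(ticket: str, user_count: int, seed: int = 0) -> dict:
    """Build dummy records; the fixed seed makes every re-run identical."""
    rng = random.Random(seed)
    users = [
        {
            "name": f"Test User {i}",
            # .test is a reserved top-level domain, so this is never a real address.
            "email": f"user{i}@example.test",
            "credits": rng.randint(0, 100),
        }
        for i in range(1, user_count + 1)
    ]
    return {"ticket": ticket, "users": users}


def snapshot(scenario: dict) -> str:
    """Serialise the scenario as stable, diff-friendly JSON."""
    return json.dumps(scenario, indent=2, sort_keys=True)


first = snapshot(build_scenario("ABC-101", user_count=3, seed=42))
second = snapshot(build_scenario("ABC-101", user_count=3, seed=42))
print(first == second)  # same seed, same data
```

A script along these lines could be run straight from the terminal, with the snapshot written somewhere outside the repository so test data is never committed.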

3. I run a quick live pre-pass

I run /live-run to drive the riskiest one to three cases through the running local app and capture evidence screenshots of the states most likely to break (loading spinner, error, empty, disabled).

It reports observed versus expected and leaves me a labelled folder of screenshots.

This is a sanity sweep, not the verdict: it surfaces obvious breakage early but never decides pass or fail, and it runs against my local server only.
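The "flag, do not judge" idea can be expressed in a small data structure. This Python sketch records observed versus expected per case and marks mismatches for attention, without ever producing a pass or fail value. The class name, fields, and sample cases are hypothetical and not taken from the actual command.

```python
from dataclasses import dataclass


@dataclass
class CaseObservation:
    case_id: str
    state: str        # e.g. "spinner", "error", "empty", "disabled"
    expected: str
    observed: str
    screenshot: str   # path to the evidence file

    @property
    def looks_off(self) -> bool:
        """A flag for human attention, deliberately not a pass/fail verdict."""
        return self.expected.strip().lower() != self.observed.strip().lower()


def summarise(observations: list) -> str:
    """One line per case: what was expected, what was seen, where the proof is."""
    lines = []
    for o in observations:
        marker = "CHECK" if o.looks_off else "ok"
        lines.append(
            f"[{marker}] {o.case_id} ({o.state}): "
            f"expected {o.expected!r}, observed {o.observed!r} -> {o.screenshot}"
        )
    return "\n".join(lines)


# Hypothetical results from a pre-pass over two cases.
observations = [
    CaseObservation("TC-1", "empty", "No results yet", "No results yet", "shots/tc1-empty.png"),
    CaseObservation("TC-2", "error", "Could not load orders", "(blank page)", "shots/tc2-error.png"),
]
report = summarise(observations)
print(report)
```

A `CHECK` marker only means the case deserves a closer look during the manual pass.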

4. I test, by hand

I work through the cases in the plan against my local environment or on the PR Preview URL. This is human testing.

Unlike the pre-pass in the previous step, the AI is not driving the browser here, and it is never grading its own work. If a case is worth automating, I can run /pw-test to draft an end-to-end test for it, then run and review that test myself.

5. If I find something broken

I first check whether it reproduces on the shared QA and/or production environment. If it does, it is a pre-existing bug.

Before filing, I check my test plans to see whether the issue is already covered or filed; if it is, it is a known issue and I do not raise a duplicate.

If it is new, I run /bug-report: the AI confirms the behaviour against the code and drafts a Jira report, and I set the priority, attach my evidence, and file it in Jira myself.

If it does not reproduce on the shared environments, it is a regression or new issue that this branch introduced, so I leave a comment on the pull request flagging it. That comment is mandatory; the /pr-comment command is not.

Putting the problem in clear, simple terms is the hard part, so I sometimes use /pr-comment to draft the wording, but either way I post it on the pull request myself.
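The branching in this step can be summarised as a small decision function. This is a sketch of my reading of the steps above, with hypothetical names; the two inputs are facts a human has already established, so the code only encodes the routing rule.

```python
from enum import Enum


class Action(Enum):
    SKIP_DUPLICATE = "known issue, do not raise a duplicate"
    FILE_BUG_REPORT = "pre-existing bug, draft with /bug-report and file in Jira"
    COMMENT_ON_PR = "regression or new issue, comment on the pull request"


def triage(reproduces_on_shared: bool, already_known: bool) -> Action:
    """Route a confirmed problem; both inputs come from the tester's own checks."""
    if not reproduces_on_shared:
        # Only this branch shows the problem, so the branch introduced it.
        return Action.COMMENT_ON_PR
    if already_known:
        # Pre-existing and already covered or filed.
        return Action.SKIP_DUPLICATE
    return Action.FILE_BUG_REPORT


print(triage(reproduces_on_shared=False, already_known=False).value)
print(triage(reproduces_on_shared=True, already_known=True).value)
print(triage(reproduces_on_shared=True, already_known=False).value)
```

In every branch the final action (posting the comment or filing the report) remains a manual step.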

What stays entirely human

Deciding whether something is actually a defect versus expected behaviour.
Judging severity and assigning priority on bug reports.
The actual manual testing pass and the final pass/fail call.
Sign-off before anything reaches a developer, a pull request, or Jira.
Choosing what to automate and reviewing the test code.

Data and environment handling

Browser automation is restricted to http://localhost:3000. It is barred from the shared DEV and QA environments and from production by the rules written into every command. If the local environment is not running, the tools stop rather than reach for a remote URL.

Test data is created through the backend on my local machine only. It is never created by automating a shared environment, and it is never committed to the repository.

The AI writes files only where it is told to: test plans go in one folder and screenshots go to a temporary location, so nothing lands in the vault by accident.
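In the workflow described here, these limits live as written rules inside each command. The same rule could also be backed by a small code guard, sketched below in Python under that assumption. The helper names are hypothetical; only the `http://localhost:3000` target comes from the text above.

```python
import socket
from urllib.parse import urlsplit

ALLOWED_HOST = "localhost"
ALLOWED_PORT = 3000


def is_allowed(url: str) -> bool:
    """True only for plain http://localhost:3000 targets."""
    try:
        parts = urlsplit(url)
        return (
            parts.scheme == "http"
            and parts.hostname == ALLOWED_HOST
            and parts.port == ALLOWED_PORT
        )
    except ValueError:
        # Malformed URLs and ports (e.g. "localhost:3000.evil.example") are rejected.
        return False


def local_server_is_up(timeout: float = 1.0) -> bool:
    """Check whether anything is listening on the local port."""
    try:
        with socket.create_connection((ALLOWED_HOST, ALLOWED_PORT), timeout=timeout):
            return True
    except OSError:
        return False


def guard(url: str, server_up: bool) -> str:
    """Return the URL if it is safe to automate, otherwise stop with an error."""
    if not is_allowed(url):
        raise PermissionError(f"Refusing non-local target: {url}")
    if not server_up:
        raise RuntimeError("Local environment is not running; stopping, no fallback.")
    return url


print(is_allowed("http://localhost:3000/orders"))
print(is_allowed("https://qa.example.com/orders"))
```

A browser-driving script would call `guard(url, server_up=local_server_is_up())` before opening any page, so a stopped local server ends the run instead of redirecting it elsewhere.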

Why this works

The commands are not a black box. Each one is a short, readable text file describing what the AI may and may not do, which I wrote and can change at any time.

The output is plain Markdown I review before use.

The net effect is that I spend less time on the mechanical parts (reading diffs, drafting boilerplate reports, setting up data) and more on the judgement that actually needs a tester: deciding what to test, what is broken, and how much it matters.


