Why I Started Building AgenticQA

I've been in QA for more than a decade. Amazon, Honeywell, NCR. Led teams, built frameworks, caught bugs nobody else would have found.

And honestly… it's crazy how hard it has been for QA teams to adopt innovation.

We're still doing the same thing we did in 2010: read a Jira ticket, write test cases, run them, file bugs with screenshots and steps to reproduce. Retest, reopen, all over again. Every sprint. Every ticket.

Everything else is now automated: coding, deployments, infrastructure, even code reviews. But QA? We're still copy-pasting into Jira.

Going Deep on AI Agents

Around the same time, I started testing agentic systems and chatbots, and decided to go deep on learning how AI agents work. What they can actually do. How they fail. How you orchestrate them into something useful. And I thought: why not point that at the problem I know best?

So I started building. On my own time. No sprint, no ticket, no one asking for it.

The Pipeline

I'm calling it AgenticQA. It's an 8-step AI pipeline where each step is its own agent, running sequentially like a real QA workflow:

  1. Reads the Jira ticket and acceptance criteria
  2. Explores the target URL in a real browser: navigates pages, takes snapshots, reads content
  3. Enriches the criteria using the ticket, Figma designs, and the explorer's findings
  4. Generates test cases, including happy paths and edge cases
  5. Executes them in a real browser using Playwright and captures failure screenshots
  6. Writes bug reports in Jira format, including steps to reproduce, severity, and environment
  7. Runs WCAG 2.1 accessibility checks with axe-core
  8. Outputs a reusable Playwright test script
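
To make the shape concrete, here's a minimal sketch of a sequential agent pipeline in TypeScript. It illustrates the pattern and is not AgenticQA's actual code: the PipelineContext and Agent interfaces, runPipeline, and the two stub agents are hypothetical names. Real agents would call Jira, a browser, and the Claude API instead of returning canned data.

```typescript
// Hypothetical shared state handed from one agent to the next.
interface PipelineContext {
  ticketId: string;
  acceptanceCriteria: string[];
  testCases: string[];
  completedSteps: string[];
}

// Each step is its own agent: it takes the context and returns an updated copy.
interface Agent {
  name: string;
  run(ctx: PipelineContext): Promise<PipelineContext>;
}

// Run the agents strictly in order, so each one sees the previous agent's output.
async function runPipeline(
  agents: Agent[],
  initial: PipelineContext
): Promise<PipelineContext> {
  let ctx = initial;
  for (const agent of agents) {
    ctx = await agent.run(ctx);
    ctx = { ...ctx, completedSteps: [...ctx.completedSteps, agent.name] };
  }
  return ctx;
}

// Stub standing in for step 1 (a real agent would fetch the ticket from Jira).
const ticketReader: Agent = {
  name: "ticket-reader",
  async run(ctx) {
    return {
      ...ctx,
      acceptanceCriteria: ["User can log in with valid credentials"],
    };
  },
};

// Stub standing in for step 4 (a real agent would ask a model for test cases).
const testCaseGenerator: Agent = {
  name: "test-case-generator",
  async run(ctx) {
    const testCases = ctx.acceptanceCriteria.map((c) => `Verify: ${c}`);
    return { ...ctx, testCases };
  },
};

runPipeline([ticketReader, testCaseGenerator], {
  ticketId: "DEMO-1",
  acceptanceCriteria: [],
  testCases: [],
  completedSteps: [],
}).then((result) => console.log(result.completedSteps.join(" -> ")));
```

Returning a fresh context from each agent, rather than mutating a shared one, keeps every step's input and output easy to inspect when something goes wrong mid-pipeline.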

Built with: TypeScript, React, Express, Playwright, Claude API, axe-core, Railway

Real-time updates stream to a dashboard so you can watch each agent work. Deployed on Railway as a single service.
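
One common way to push updates like these to a browser is Server-Sent Events. The sketch below assumes that transport purely for illustration; this post doesn't say which one AgenticQA uses. The StepUpdate shape and formatSseEvent helper are hypothetical names.

```typescript
// Hypothetical shape of one progress update; the field names are assumptions.
interface StepUpdate {
  step: number;
  agent: string;
  status: "started" | "completed" | "failed";
}

// Serialize an update in the Server-Sent Events wire format:
// an "event:" line, a "data:" line, and a blank line that ends the message.
function formatSseEvent(update: StepUpdate): string {
  return `event: step\ndata: ${JSON.stringify(update)}\n\n`;
}

const sseMessage = formatSseEvent({
  step: 5,
  agent: "executor",
  status: "started",
});
console.log(sseMessage);
```

On the server, a string like this would be written to a response whose Content-Type is text/event-stream, and the dashboard would read it with the browser's EventSource API.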

What's Next

It's not done. I'm still building, still breaking things, still learning how agentic systems fail in ways I didn't expect. But it already reads real tickets, explores real pages, runs a real browser, and files real bugs.

I'm not trying to replace QA engineers. I'm trying to stop wasting their time on work that a machine should've been doing years ago.

This is post one of a series. I'll share what I'm building as I go, what's working, what's not, and what I'm still figuring out along the way.