AI in QA - Issue #11

Tune in live on Monday mornings @ 7:00AM CT on YouTube or LinkedIn Live

Prioritize Your Learning With a Skill

I came across a skill recently called learning-opportunities that I think is worth sharing. The idea: you install it into your repo, run /orient, and it generates an orientation.md with the information the learning skill needs to teach you about your own codebase. From there you call /learning-opportunities orient and it offers you two lessons that walk you through the core features of the repo by asking you questions rather than dumping a wall of explanation on you. The orientation approach draws on empirical research on program comprehension and codebase navigation.

One cool thing about this skill is you can wire it up (via the learning-opportunities-auto companion) so that after every git commit, Claude considers offering a short 10–15 minute exercise: prediction, trace-the-path, debug-this, teach-it-back. If you're brand new to a codebase and want to gain higher-level understanding via Q&A instead of reading every file, this is a useful tool. It tends to surface things that aren't obvious the first time you look at the code. It's also useful as a continued learning effort while you're still actively building and generating code.

The skill is built to counteract a specific risk of AI-assisted coding: losing the generation effect, the learning-science finding that you retain material better when you produce it yourself. When you accept generated code and write less of your own, you can skip the active processing that builds real understanding. The exercises are grounded in well-established learning science, and the approach deliberately interrupts the default fast, fluent agentic flow to give you space to reflect on what you're actually shipping. For those of us doing QA work across unfamiliar repos, frameworks, and architectural patterns, that's a low-cost way to keep your own expertise growing alongside the code.

The repo is here: https://github.com/DrCatHicks/learning-opportunities/ Happy learning!

Headlines & Launches

Two Months Writing Automation Solely with AI
Todd Lemmonds (LinkedIn)
Todd Lemmonds spent two months writing automation solely with AI and declares the 'learn to code' era officially over for QA. Argues the value of QE shifts from syntax to strategy: defining business risk, architecting validation strategies, orchestrating diverse AI testing services, and realizing true shift-left through CoEs. Key quote: 'Stop writing scripts. Start architecting quality.' He works for an agentic AI testing company, which gives him visibility into how top companies are engineering with AI.

Building an Agent Was Easy. Deciding What It Should See Wasn't.
Praveen Varghese (LinkedIn)
Praveen Varghese shares lessons from building an AI agent: the hard part wasn't the agent itself, it was deciding what context it should see. Three approaches tried: (1) giving it source code — led to hard-to-trace failures plus IP concerns, (2) structured documentation — suffered from drift and information overload, (3) MCP — agent pulls only what it needs via deliberately exposed tools/knowledge. Key insight: give too much info and it gets confused, give fragments and it's confidently wrong, give nothing and it hallucinates. The context layer IS the agent.

Tools & Frameworks

AI-Driven SDLC Harness for Claude Code
Mostafa Ashraf (LinkedIn)
Mostafa Ashraf shares a key lesson from building an AI-driven SDLC harness: your Reviewer agent must have zero write access. If it can edit code, it will silently fix things instead of flagging them — defeating the purpose of review. His harness uses structured comment types (R for Developer, T for Tester, S for spec compliance) routed by an orchestrator, so fixes always go back to the original author. The repo supports multi-agent TDD workflows with Azure DevOps, Jira, GitHub, GitLab, and more.

MobAI-App/mobai-mcp
MobAI (GitHub)
MobAI MCP Server — a mobile device automation tool via MCP. Controls Android and iOS devices/emulators through a DSL-first interface (execute_dsl) that batches actions like tap, swipe, type, assertions, waits, and conditional branches into a single JSON script. Works with Claude Code, Cursor, Windsurf, and any MCP-compatible tool. Requires MobAI desktop app running locally. Also supports .mob test file management. 172 stars, Apache 2.0 license.

aimock — Deterministic mock infrastructure for AI apps
CopilotKit
Deterministic mock infrastructure for AI apps. Mocks everything your AI app talks to: LLMs (14 providers with streaming), MCP tools, A2A agents, vector DBs (Pinecone/Qdrant/ChromaDB), search & rerank, moderation, and multimedia APIs. Key feature: record & replay mode proxies real API calls to capture fixtures, then replays them deterministically in CI. Also does drift detection — monitors real APIs daily and ships fixture updates when response formats change. Has Vitest/Jest plugins, Docker support, chaos testing, and a comparison table showing it covers way more than MSW or other mock tools.
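
If record & replay is new to you, here is a minimal sketch of the general idea in Python. To be clear, this is not aimock's API: RecordReplayClient, fixture_key, and the request/response shapes are all hypothetical names made up for illustration. In record mode the real call goes through and the response is saved as a fixture; in replay mode the saved response comes back without touching the network, which is what makes CI runs deterministic.

```python
import hashlib
import json


def fixture_key(request: dict) -> str:
    """Build a stable key for a request (hypothetical helper, not part of aimock)."""
    # sort_keys makes the key independent of dict ordering.
    canonical = json.dumps(request, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class RecordReplayClient:
    """Illustrative record-and-replay wrapper around any callable that hits a real API."""

    def __init__(self, live_call, fixtures=None, record=False):
        self.live_call = live_call  # function that performs the real request
        self.fixtures = {} if fixtures is None else fixtures
        self.record = record

    def call(self, request: dict) -> dict:
        key = fixture_key(request)
        if self.record:
            # Record mode: hit the real API once and capture the response.
            response = self.live_call(request)
            self.fixtures[key] = response
            return response
        if key not in self.fixtures:
            # Replay mode never falls back to the network; unknown requests fail loudly.
            raise KeyError(f"no fixture recorded for request {request!r}")
        return self.fixtures[key]
```

In practice you would persist the fixtures dict to disk (JSON files checked into the repo, for example) so a CI job can load them and run in replay mode only.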

Foundations

10 Years of Exploratory Testing: The Foundation for Agentic AI
Maaret Pyhäjärvi (LinkedIn)
Maaret Pyhäjärvi reflects on 10 years of contemporary exploratory testing and why it's become the foundation for amplified results in the AI era. Six things she says were already true before AI changed the surface area.

Techniques & Tutorials

Prompt Test Personas and Templates
Andrejs Doronins (LinkedIn)
Andrejs Doronins cuts through fluffy AI personas — 'senior QA with 10 years experience' doesn't move the needle. What actually matters: specific techniques to use (EP/BVA, decision tables, CRUD testing), good/bad examples for each technique, constraints on what not to do, and output format rules. Strong prompts run 500-1500 words. He's published starter prompt templates for test generation.

Using Claude for Manual Testing Tasks — Additional Ideas
Samantha Louw (LinkedIn)
Samantha Louw shares practical ways to use Claude (with computer use/Chrome tab access) for manual testing: use it as a rubber duck before bothering developers, investigate console/network errors in real-time and pinpoint the broken code, analyze logs and traces, compare implementations against Figma designs for visual diffs, and find inconsistent bug reproduction causes in the codebase. Emphasizes it's about augmenting QA, not replacing it.

Replace Vibes with Tools
Chiron Codex (Chiron Codex Library)
From the Chiron Codex 'Patterns in AI-Augmented Software Development' book: when a task is deterministic and well-specified, don't let the AI guess at it — wrap it in a tool. LLMs waste tokens trial-and-erroring commands like test runners, often running only partial suites or creating incorrect invocations. Instead, create scripts with clear help docs, distinct return codes, and terse output; expose them via MCP servers or agent skills. Example: an agent skill that wraps xcodebuild to run all unit/integration tests correctly every time. Part of a larger pattern language for AI-assisted development.
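
To make the pattern concrete, here is a rough Python sketch of such a wrapper, assuming you already know the one correct command for running your whole suite. The exit-code values and the run_suite name are my own placeholders, not something from the book; the only requirement it states is that the codes be distinct and the output terse. The help docs and the MCP or skill wiring are left out to keep it short.

```python
import subprocess
import sys

# Hypothetical exit codes; the point is only that each outcome gets its own code.
EXIT_PASS = 0
EXIT_TEST_FAILURES = 1
EXIT_RUNNER_ERROR = 2


def run_suite(command: list[str], tail_lines: int = 5) -> tuple[int, str]:
    """Run the full test command and return (exit_code, terse_summary)."""
    try:
        result = subprocess.run(command, capture_output=True, text=True)
    except OSError as exc:
        # The runner itself could not start (missing binary, bad permissions).
        return EXIT_RUNNER_ERROR, f"RUNNER ERROR: {exc}"
    if result.returncode == 0:
        return EXIT_PASS, "PASS"
    # Keep only the last few lines so the agent is not flooded with output.
    lines = (result.stdout + result.stderr).strip().splitlines()
    tail = "\n".join(lines[-tail_lines:])
    return EXIT_TEST_FAILURES, f"FAIL (runner exit {result.returncode})\n{tail}"
```

A thin command-line entry point would call run_suite with the fixed command, print the summary, and exit with the returned code, so the agent never has to guess at the invocation.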

Self-Diagnosing CI: Auto-Creating Jira Tickets and Running a Playwright Healer Agent in GitHub Actions
Paul Yardley (Paul Yardley QA)
Paul Yardley walks through building a self-diagnosing CI pipeline: auto-creating Jira tickets on Playwright failures and running a Claude Code healer agent in GitHub Actions to attempt automatic test repair. Detailed postmortem of 5 consecutive CI failures — wrong model name, rate limits, max-turns exit codes, output buffering eating results, and a results.json race condition. The healer case study (TC-04) is the real gem: it correctly diagnosed two root causes but proposed a fragile .nth(4) fix that would silently break later. Key takeaway: healer output is a starting point for human review, not a merge-without-inspection artifact.

Research & Data

AI and Testing: Using Local Models for Testing
Jeff Nyman (Tester Stories)
Nyman walks through using a local LLM to analyze a web app, generate test cases, hunt for bugs, and produce Playwright scripts. The AI got 80% there but missed critical interaction details — a practical demonstration of AI-augmented testing's strengths and limits.

AI Isn’t Making Technical Recruiters Obsolete: It’s Raising the Bar for Curiosity
InsitePeek (InsitePeek)
Article based on a conversation with Butch Mayhew about AI's impact on technical recruiting for QA roles. Argues companies over-index on framework keywords (Playwright, Cypress) instead of engineering fundamentals — debugging, testing strategy, system thinking. As AI lowers friction between tools, transferable skills matter more than tool-specific experience. Butch's advice: recruiters should 'walk a hundred yards in an engineer's shoes'.

Hot Take 🔥

Share Your Immediate Thoughts on AI Testing
Richard Bradshaw (LinkedIn)
Richard Bradshaw shares a slide from a talk and asks for immediate reactions to its claim about testing keeping up with AI-speed development. The 80+ comment thread features strong takes that are worth reading to get a pulse of the testing community.

Quick Links

Code Coverage in Agentic Engineering
Jesse Black (jesseblack.net)
Jesse Black explores code coverage strategy in the age of AI coding agents. Chasing 100% coverage becomes a token drain — agents bloat context windows maintaining tests on untouched code. His biggest wins: switching to diff-based coverage gates (only new/changed code needs coverage), using branch + named function coverage instead of just line coverage, and relaxing requirements for UI files. Jesse built covgate, a Rust CLI for diff coverage gates. This one is more focused on unit/integration tests than UI or API tests.
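
For anyone who hasn't used a diff-based gate, the core calculation is small. The Python sketch below is an illustration of the idea only, not how covgate works internally; the function names, the input shape (file path mapped to a set of line numbers), and the 80% default threshold are all assumptions on my part. It also treats every changed line as executable, which a real tool would not.

```python
def diff_coverage(changed: dict[str, set[int]], covered: dict[str, set[int]]) -> float:
    """Fraction of changed lines that are covered by tests (hypothetical helper)."""
    total = sum(len(lines) for lines in changed.values())
    if total == 0:
        # Nothing changed, so there is nothing to gate on.
        return 1.0
    # Count only covered lines that are also part of the diff.
    hit = sum(len(lines & covered.get(path, set())) for path, lines in changed.items())
    return hit / total


def gate(changed: dict[str, set[int]], covered: dict[str, set[int]], threshold: float = 0.8) -> bool:
    """Pass when coverage of the changed lines meets the threshold."""
    return diff_coverage(changed, covered) >= threshold
```

Because untouched files never enter the calculation, an agent has no incentive to spend tokens maintaining tests for code it did not change.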

Atlassian Teamwork Graph: The context engine behind your AI—everywhere - Inside Atlassian
Atlassian Blog
Atlassian launches Teamwork Graph — a context engine that connects people, goals, code, and content across Atlassian + connected SaaS apps, now with 150B+ objects. Key releases: Teamwork Graph CLI (open beta) for piping graph context into coding agents like Claude Code and Cursor, Rovo MCP Server (open beta) so any MCP-speaking AI assistant can query and update the graph, and custom Forge connectors for bringing proprietary data in.

Warden
Sentry (Sentry)
Warden by Sentry — an AI-powered code review CLI that runs skills (defined as SKILL.md files) against your code changes. Ships with built-in security-review and code-review skills, plus you can create custom ones for API design, architecture, test coverage, auth hardening, etc. Runs locally before push or as a GitHub bot on every PR, posting findings as suggested changes you can apply with one click. Skills are just Markdown prompts — no build steps or SDKs needed.

awesome-ai-testing repo
Tugkan Boz (GitHub)
Awesome AI Testing — a comprehensive curated list of AI-powered testing tools, frameworks, and resources for QA engineers. Covers 17+ categories: test generation, MCP-based testing, self-healing frameworks, E2E platforms, mobile AI testing, visual AI testing, natural language authoring, LLM-as-judge evaluation, analytics/triage, code coverage, test data generation, mocking, performance, accessibility, API testing, LLM system testing, and browser automation for agents. Each entry marked as open source, commercial, or open core.

Claude's First Day at Dunder Mifflin (The Office Parody)
X (Twitter)
Fun one - The Office parody video imagining Claude's first day at Dunder Mifflin.

If something in this issue made you think differently about how your team approaches AI in testing, pass it along. The best conversations about AI and QA are happening in Slack channels and stand-ups, not just newsletters.

Have something worth featuring? Reply and send it my way, I read every link.

Thanks for reading,
Butch Mayhew
