Accessibility testing tools have a paradox. There are more options than ever, yet 94.8% of the top one million homepages still have detectable WCAG failures (WebAIM Million, 2025). The tools exist. Developers aren't using them effectively.
This guide breaks down the six most widely used accessibility testing tools — axe-core, Lighthouse, WAVE, Pa11y, Accessibility Insights, and the DIY Playwright + axe-core approach. You'll get honest comparisons of what each tool catches, where each one falls short, and which fits your workflow. No vendor sales pitch. Just data, code examples, and practical recommendations from someone who's tested them all.
For a broader look at the full testing lifecycle, see our developer's guide to web accessibility testing.
TL;DR: Automated tools catch roughly 57% of accessibility issues by volume (Deque, 2024). axe-core is the best free engine for CI/CD. Lighthouse is fine for quick checks but misleading for compliance. WAVE excels at visual review. Pair any automated tool with structured manual testing to reach ~85% detection coverage.
What Are Accessibility Testing Tools?
Accessibility testing tools scan web pages for barriers that prevent people with disabilities from using them. The WebAIM Million (2025) found an average of 51 errors per homepage across the top one million sites — down 10.3% from the previous year, but still unacceptably high.
These tools fall into five distinct categories. Understanding the differences matters because most developers grab whatever's closest and assume they're covered.
Linters and Static Analysis
Tools like eslint-plugin-jsx-a11y catch issues at write time. They flag missing alt attributes, invalid ARIA roles, and broken label associations before your code ever hits a browser. Fast feedback, but limited to what's visible in source code.
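As a sketch, enabling the plugin's recommended preset in `.eslintrc.json` looks like this (rule names come from eslint-plugin-jsx-a11y's documented rule set; the overrides shown are illustrative choices, not required):

```json
{
  "plugins": ["jsx-a11y"],
  "extends": ["plugin:jsx-a11y/recommended"],
  "rules": {
    "jsx-a11y/alt-text": "error",
    "jsx-a11y/aria-role": "error",
    "jsx-a11y/label-has-associated-control": "error"
  }
}
```

The recommended preset already enables most rules as warnings; promoting a few to `error` makes them fail your lint step rather than just flagging them.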
Browser Extensions
axe DevTools, WAVE, and Accessibility Insights all run as browser extensions. You click a button, the tool injects scripts into the page, and you get an overlay of issues. Great for visual review during development. Not automatable.
CLI Tools
axe-core's CLI wrapper and Pa11y both run from your terminal. They launch a headless browser, scan the URL, and output structured results. This is where CI/CD integration starts.
APIs and Scanning Services
Cloud-hosted services accept a URL and return results via API. WAVE offers a paid API. Several commercial platforms (Siteimprove, Level Access, a11yFlow) provide scan-on-demand endpoints. The tradeoff is cost versus infrastructure management.
Guided Manual Testing
Only Accessibility Insights offers structured manual testing workflows for free. Its Assessment mode walks you through 24 tests covering WCAG 2.2 AA criteria that automation can't reach — reading order, keyboard traps, cognitive clarity.
Why Does Accessibility Testing Matter in 2026?
Lawsuits under the Americans with Disabilities Act (ADA) hit 2,014 in the first half of 2025 alone — a 37% year-over-year increase (UsableNet/EcomBack, 2025). Projections put the full-year total above 5,100. Legal risk is no longer theoretical. It's statistical.
Three regulatory shifts are converging right now. If you build for the web, all three affect you.
The European Accessibility Act Is Live
The European Accessibility Act (EAA) took effect on June 28, 2025 (European Commission). Any business selling digital products or services in EU markets must meet Web Content Accessibility Guidelines (WCAG) 2.1 AA standards. Enforcement varies by member state, but the directive is binding. This isn't aspirational — it's law.
ADA Title II Hits State and Local Government
The Department of Justice's updated ADA Title II rule requires state and local government websites serving populations over 50,000 to comply with WCAG 2.1 AA by April 24, 2026 (DOJ, 2024). That deadline is two months away. Government contractors and vendors are scrambling.
Lawsuits Are Getting Easier to File
Pro se ADA lawsuits — filed without an attorney — increased 40% in 2025 (Seyfarth Shaw, 2025). Many complaints now appear to be AI-drafted, lowering the barrier to litigation. E-commerce sites account for 69% of all targets.
And here's the kicker: 22.6% of H1 2025 lawsuits targeted sites that already had overlay widgets installed (EcomBack, 2025). Overlays don't protect you.
The digital accessibility market reflects this urgency. It's valued at $1.42 billion in 2025 and projected to reach $3.24 billion by 2034, growing at an 8.6% CAGR (Straits Research, 2025).
How Do the Major Accessibility Testing Tools Compare?
axe-core dominates the open-source accessibility testing space with 18.3 million weekly npm downloads and 89 rules covering WCAG 2.0 through 2.2 (Deque/npm, 2026). It's the engine underneath most other tools — including Lighthouse and Accessibility Insights — but each wrapper makes different tradeoffs.
Having integrated all six of these tools into production workflows, I can tell you the differences that matter aren't in the rule counts. They're in false-positive rates, CI/CD ergonomics, and what happens when a scan finds 200 violations on a single page.
axe-core (Free, Open Source)
axe-core is Deque's open-source engine with 6.9k GitHub stars. Its defining feature is a zero false-positive philosophy: if axe flags something, it's a real issue. It ships 89 rules — 57 for WCAG 2.0 A/AA, 2 for WCAG 2.1, 1 for WCAG 2.2, 26 best practices, and 3 AAA checks.
Strengths: Runs anywhere JavaScript runs. Integrates with Playwright, Puppeteer, Cypress, Selenium. Community-maintained CI/CD wrappers exist for every major platform. The JSON output is clean and structured.
Weaknesses: No guided manual testing. No visual overlay (that's axe DevTools, the browser extension). You need to build your own reporting layer.
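Because you own the reporting layer, a minimal summarizer over axe's violation JSON is a common first step. This is a sketch under stated assumptions: `AxeViolation` is a simplified stand-in for the real result shape (actual axe-core results carry more fields), and `summarize` is a hypothetical helper name.

```typescript
// A minimal sketch of a custom reporting layer over axe-core's JSON
// output. AxeViolation is a simplified stand-in for the real result
// shape: each violation has an id, an impact level, and affected nodes.
type AxeViolation = {
  id: string;
  impact: 'minor' | 'moderate' | 'serious' | 'critical';
  nodes: { target: string[] }[];
};

// Count affected nodes per impact level so a pipeline can gate on them.
function summarize(violations: AxeViolation[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const v of violations) {
    counts[v.impact] = (counts[v.impact] ?? 0) + v.nodes.length;
  }
  return counts;
}

const example: AxeViolation[] = [
  { id: 'image-alt', impact: 'critical', nodes: [{ target: ['img'] }] },
  { id: 'color-contrast', impact: 'serious', nodes: [{ target: ['p'] }, { target: ['a'] }] },
];
console.log(summarize(example)); // { critical: 1, serious: 2 }
```

From here, teams typically serialize these counts into a dashboard or trend metric rather than raw violation dumps.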
Lighthouse Accessibility Audits
Lighthouse ships built into Chrome DevTools and runs 60 weighted accessibility audits. Sounds comprehensive — but there's a catch. It uses a subset of axe-core's rules, and its scoring system can be misleading.
Manuel Matuzovic demonstrated that a deliberately inaccessible site could achieve a perfect Lighthouse accessibility score of 100 (Matuzovic, 2019). The scoring weights range from 3 to 10 based on perceived user impact, but many critical issues simply aren't tested. A high Lighthouse score means "no flagged violations," not "accessible."
Strengths: Zero setup — already in your browser. lighthouse-ci works in GitHub Actions. Good for quick sanity checks.
Weaknesses: Partial WCAG 2.1/2.2 coverage. The score creates false confidence. Not suitable as your only testing tool.
WAVE
WAVE takes a different approach: visual, in-page feedback. It renders icons and indicators directly onto your page, showing exactly where issues occur. All processing happens client-side — your page data never leaves your browser.
Strengths: Designers and non-developers love it. The visual feedback makes issues tangible. The contrast checker is excellent. API available for batch processing.
Weaknesses: No CLI. No CI/CD integration. The API costs $0.025-$0.04 per credit after 100 free credits. Some false positives in complex layouts.
Pa11y
Pa11y is a Node.js CLI tool with 4.4k GitHub stars. It uses HTML_CodeSniffer by default but can run axe-core as an alternative runner. Pa11y CI adds pipeline support with threshold configuration.
In Craig Abbott's benchmark at the UK's Department for Work and Pensions, Pa11y found 20% of known issues versus axe-core's 27% (Craig Abbott/DWP, 2023). Combining both tools raised detection to 35%.
Strengths: Simple CLI interface. Pa11y CI is great for threshold-based gating. Configuration is a straightforward JSON file (.pa11yci).
Weaknesses: Lower detection rate than axe-core. HTML_CodeSniffer produces more false positives. Smaller community.
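A minimal `.pa11yci` sketch, assuming a locally served site (URLs are placeholders; `threshold` is the number of errors tolerated per URL before the run fails):

```json
{
  "defaults": {
    "timeout": 30000,
    "threshold": 0
  },
  "urls": [
    "http://localhost:3000/",
    "http://localhost:3000/pricing"
  ]
}
```

A threshold of zero fails the pipeline on any detected error; raising it per URL lets you ratchet down existing debt page by page.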
Accessibility Insights
Microsoft's Accessibility Insights is the only free tool offering structured guided manual testing. FastPass runs automated checks plus tab-stop visualization in under five minutes. Assessment mode walks you through 24 manual tests aligned to WCAG 2.2 AA.
Strengths: The guided Assessment mode is genuinely unique. FastPass is the fastest way to check keyboard navigation. Powered by axe-core's engine.
Weaknesses: Browser extension only — no CLI, no CI/CD, no API. Windows and Edge/Chrome only. Can't automate it.
DIY: Playwright + axe-core
Roll your own by combining Playwright's browser automation with @axe-core/playwright. You get full control over authentication, multi-page crawling, and custom reporting. This is what most mature engineering teams end up building.
Strengths: Total flexibility. Runs in any CI/CD system. You control the browser context, viewport, auth state, and page interactions before scanning.
Weaknesses: You own the infrastructure. Browser dependencies in CI are finicky. You'll spend days building what a managed service handles out of the box.
Feature Comparison Table
| Feature | axe-core (free) | axe DevTools Pro | Lighthouse | WAVE | Pa11y | Accessibility Insights |
|---|---|---|---|---|---|---|
| **Price** | Free (OSS) | $45/mo/user | Free | Free ext / API from $0.025/credit | Free (OSS) | Free |
| **Browser extension** | Yes | Yes | Built-in | Yes | No | Yes |
| **CLI** | Yes | Yes | Yes | No | Yes | No |
| **CI/CD** | Community wrappers | Yes | lighthouse-ci | No | Pa11y CI | No |
| **API** | Community | Yes | N/A | Yes ($) | JS API | No |
| **WCAG 2.2** | Yes | Yes | Partial (2.0/2.1) | Yes | Via axe runner | Yes |
| **Guided manual testing** | No | Yes | No | No | No | Yes (24 tests) |
| **Rule count** | 89 | 89 + AI-enhanced | 60 (subset) | Proprietary | HTMLCS + optional axe | axe-core rules |
| **False positives** | Zero policy | Zero policy | Zero (via axe) | Some | Some | Zero (via axe) |
| **Best for** | Developers, CI/CD | Teams, enterprise | Quick audits | Visual review | CLI scripting | Manual + automated |
What Can Automated Testing Actually Catch?
Automated accessibility testing catches approximately 57% of issues by volume, based on Deque's analysis of over 2,000 audits spanning 13,000+ pages (Deque, 2024). That's better than most developers expect — but it means 43% of barriers still require human judgment.
The UK's Government Digital Service ran a different benchmark using issue-count methodology. Their finding: even the best tools detect only 30-40% of known issues (GDS, 2017). The discrepancy comes from methodology — Deque weighted by frequency (how often issues appear in the wild), while GDS counted unique issue types.
What does this mean practically? Automation crushes the high-volume, pattern-matching problems. It struggles with anything requiring context.
What Automation Catches Well
The WebAIM Million (2025) identified the top six error types responsible for 96% of all detected failures:
- Low contrast text — 79.1% of homepages (detection accuracy ~98%, per TestParty)
- Missing alt text — 55.5% of homepages
- Missing form labels — 48.2% of homepages (detection accuracy ~92%)
- Empty links — 45.4% of homepages
- Empty buttons — 29.6% of homepages
- Missing document language — 15.8% of homepages
These are exactly the kinds of issues automated tools excel at finding. They're structural, pattern-based, and binary — the attribute is either there or it isn't.
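To see why these checks are binary, consider the missing-document-language case. The sketch below is purely illustrative: real tools inspect the parsed DOM, not raw markup, and `hasDocumentLanguage` is a hypothetical helper.

```typescript
// Illustrative sketch: "missing document language" is a yes/no check.
// Production scanners query the parsed DOM; this regex version only
// demonstrates the binary nature of the test.
function hasDocumentLanguage(html: string): boolean {
  // Look for a lang attribute with a non-empty quoted value on <html>.
  return /<html\b[^>]*\blang\s*=\s*["'][^"']+["']/i.test(html);
}

console.log(hasDocumentLanguage('<html lang="en"><body></body></html>')); // true
console.log(hasDocumentLanguage('<html><body></body></html>'));           // false
```

There is no judgment call involved, which is exactly why automation handles this class of issue so reliably.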
What Automation Misses
Keyboard accessibility testing sits at roughly 45% detection accuracy (TestParty). Tab order, focus management, and keyboard traps require real interaction patterns that automated scripts approximate at best.
Other gaps include reading order for screen readers, content that only makes sense visually, timeout handling, cognitive load assessment, and whether alt text is actually meaningful (not just present). These require a human tester — ideally someone who uses assistive technology daily.
The ARIA Trap
Here's a stat that surprises people: pages using ARIA average 57 errors, more than double pages without ARIA (WebAIM Million, 2025). ARIA is powerful but unforgiving. Misuse creates worse barriers than having no ARIA at all. Automated tools catch some ARIA misuse, but not the subtle semantic errors that confuse screen readers.
Citation capsule: Automated accessibility tools detect approximately 57% of issues by volume, according to Deque's study of 2,000+ audits covering 13,000 pages. The top six error types account for 96% of all failures detected across one million homepages (WebAIM Million, 2025), with low contrast text appearing on 79.1% of sites.
Which Tool Should You Choose?
The right tool depends on your team size, compliance requirements, and where accessibility testing fits in your workflow. Combined automated and manual testing reaches roughly 85% detection coverage (TestParty, aggregated research) — but you don't need every tool to get there.
Here's a decision framework based on five common scenarios.
Solo Developer or Side Project
Use: axe-core browser extension + Lighthouse quick checks.
You don't need CI/CD gating yet. Install the axe DevTools extension, run it on key pages during development, and fix what it finds. Use Lighthouse for a score sanity check, but don't treat a 100 score as proof of accessibility. It isn't.
Startup (2-15 Engineers)
Use: Playwright + @axe-core/playwright in CI/CD, plus WAVE for visual review during design.
Write accessibility tests alongside your integration tests. Gate pull requests on zero critical violations. Have designers use WAVE to catch contrast and structure issues before code review.
Enterprise or Regulated Industry
Use: axe DevTools Pro ($45/month/user) + Accessibility Insights Assessment for manual testing.
You need guided manual workflows, audit trails, and Jira integration. axe DevTools Pro adds AI-enhanced automation and enterprise reporting. Pair it with Accessibility Insights' 24-test Assessment mode for manual coverage.
CI/CD-First Team
Use: axe-core via Playwright or Pa11y CI for threshold gating.
Configure violation thresholds in your pipeline. Fail the build on any critical or serious violation. Track violation counts over time to measure progress. More on this setup in the next section.
Compliance-Focused (Legal Deadline Approaching)
Use: Multiple tools in parallel + manual audit.
Run axe-core and Pa11y together — Craig Abbott's research shows this raises detection from 27% to 35%. Add manual testing with Accessibility Insights. Consider a professional audit for the remaining gaps.
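Pa11y can run both engines in a single pass via its `runners` option. A sketch of a pa11y-ci config combining them (the URL is a placeholder, and the exact option names should be checked against the Pa11y version you run):

```json
{
  "defaults": {
    "runners": ["axe", "htmlcs"],
    "standard": "WCAG2AA"
  },
  "urls": ["http://localhost:3000/"]
}
```

Running both runners roughly doubles scan time per page, which is the tradeoff for the wider detection net.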
How Do You Add Accessibility Testing to Your CI/CD Pipeline?
Integrating accessibility checks into CI/CD catches 57% of issues before they reach production (Deque, 2024). The most practical approach pairs Playwright with @axe-core/playwright in a GitHub Actions workflow. Here's a working setup.
The Test File
Create an accessibility test that scans your key pages:

```typescript
// tests/accessibility.spec.ts
import { test, expect } from '@playwright/test';
import AxeBuilder from '@axe-core/playwright';

const pages = ['/', '/pricing', '/docs', '/blog'];

for (const path of pages) {
  test(`a11y: ${path} has no critical violations`, async ({ page }) => {
    await page.goto(`http://localhost:3000${path}`);

    const results = await new AxeBuilder({ page })
      .withTags(['wcag2a', 'wcag2aa', 'wcag21aa', 'wcag22aa'])
      .analyze();

    const critical = results.violations.filter(
      (v) => v.impact === 'critical' || v.impact === 'serious'
    );

    expect(critical).toEqual([]);
  });
}
```

The GitHub Actions Workflow
```yaml
# .github/workflows/a11y.yml
name: Accessibility Tests
on: [pull_request]

jobs:
  a11y:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - run: npx playwright install --with-deps chromium
      - run: npm run build
      - run: npm start &
      - run: npx wait-on http://localhost:3000
      - run: npx playwright test tests/accessibility.spec.ts
```

This workflow blocks any PR that introduces critical or serious accessibility violations. Start strict. You can always relax thresholds later — but tightening them after the fact is painful.
In production, we gate on zero critical/serious violations and track moderate/minor counts as metrics rather than blockers. Within three months of adopting this approach, accessibility regressions dropped dramatically — catching issues before they ever reached users.
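That gating policy can be sketched as a small helper. The impact names follow axe-core's levels, but the critical/serious versus moderate/minor split is the policy described here, not an axe API; `evaluateGate` is a hypothetical function name.

```typescript
// Sketch of the gating policy: fail the build on critical/serious
// violations, surface moderate/minor ones as metrics only.
type Impact = 'minor' | 'moderate' | 'serious' | 'critical';

interface Violation { id: string; impact: Impact; }

function evaluateGate(violations: Violation[]) {
  const blocking = violations.filter(
    (v) => v.impact === 'critical' || v.impact === 'serious'
  );
  const metrics = violations.filter(
    (v) => v.impact === 'moderate' || v.impact === 'minor'
  );
  return { pass: blocking.length === 0, blocking, metrics };
}

const result = evaluateGate([
  { id: 'color-contrast', impact: 'serious' },
  { id: 'region', impact: 'moderate' },
]);
console.log(result.pass); // false: one serious violation blocks the build
```

Keeping the metric-only bucket visible over time turns "accessibility debt" from an abstraction into a number your team can watch shrink.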
Citation capsule: Adding Playwright and axe-core to a CI/CD pipeline catches violations before deployment. A GitHub Actions workflow running axe-core with WCAG 2.0-2.2 AA tags can gate pull requests on zero critical violations, preventing accessibility regressions from reaching production.
Frequently Asked Questions
Which accessibility testing tool is best for beginners?
The axe DevTools browser extension is the best starting point. It uses Deque's open-source engine, runs with a single click, and explains every violation in plain language. No setup, no configuration, no command line required.
Can automated tools make my site fully WCAG compliant?
No. Automated scanners cover just over half of known accessibility barriers (Deque, 2024). Keyboard traps, reading order, and cognitive clarity still require human judgment. Use Accessibility Insights' Assessment mode for structured manual coverage.
Is Lighthouse good enough for accessibility testing?
Lighthouse is fine for quick checks but misleading for compliance. It runs only 60 of axe-core's 89 rules and uses weighted scoring. A deliberately inaccessible demo site achieved a perfect 100 score (Matuzovic, 2019). Don't treat Lighthouse as your compliance benchmark.
How many accessibility rules does axe-core have?
axe-core v4.11.1 ships 89 rules: 57 covering WCAG 2.0 A/AA, 2 for WCAG 2.1, 1 for WCAG 2.2, 26 best practice checks, and 3 AAA-level rules (Deque, 2026). Its zero false-positive policy means every flagged issue is a genuine violation.
Do I need both axe-core and Pa11y?
Running both tools together raises detection from 27% (axe-core alone) to 35% of known issue types (Craig Abbott/DWP, 2023). If you're under compliance pressure, the overlap is worth the extra CI time. For most teams, axe-core alone covers the critical ground.
Are accessibility overlay widgets a valid testing alternative?
No. Overlays don't fix underlying code. 22.6% of ADA lawsuits in H1 2025 targeted sites with overlay widgets installed (EcomBack, 2025). Courts have not accepted overlays as evidence of compliance. Use proper testing tools, not cosmetic patches.
What's the cost of ignoring accessibility testing?
ADA digital accessibility lawsuits are projected to exceed 5,100 in 2025 (UsableNet, 2025). E-commerce sites represent 69% of targets. Beyond litigation, inaccessible sites exclude roughly 16% of the global population. The cost of testing is negligible compared to the cost of a lawsuit.
How often should I run accessibility tests?
Run automated scans on every pull request and after every deployment. Regressions creep in with every feature — a new button without a label, a modal that traps keyboard focus. CI/CD integration catches these before users encounter them.
Key Takeaways
Accessibility testing in 2026 isn't optional. With EAA enforcement active, ADA Title II deadlines approaching, and lawsuit volumes hitting record highs, the question isn't whether to test — it's which tools to use and how to integrate them.
Here's what matters:
- axe-core is the foundation. Its zero false-positive policy and broad WCAG coverage make it the industry standard engine. Start here.
- Don't trust Lighthouse scores for compliance. Use Lighthouse for quick sanity checks, not as proof of accessibility. A perfect score does not mean an accessible site.
- Automation covers roughly half the problem. That's significant, but you still need manual testing for keyboard navigation, reading order, and cognitive clarity.
- CI/CD integration is non-negotiable. Gate your pull requests on accessibility violations. Catching issues in a pipeline costs a fraction of fixing them after deployment.
- Combine automated and manual testing. axe-core plus structured manual review with Accessibility Insights closes the majority of coverage gaps.
Pick a tool. Add it to your pipeline this week. Every scan you run catches barriers that real people face every day.
For the full testing lifecycle, see our complete developer's guide to accessibility testing.