Andreas Burkard, Monitoring Systems for Checking Websites on Accessibility

Today I attended an English-language webinar about the results of an accessibility-tool evaluation study conducted by the Competence Center on Digital Accessibility at Stuttgart Hochschule der Medien (Stuttgart Media University), Germany. The lead presenter was Andreas Burkard, and Prof. Dr. Gottfried Zimmermann assisted with Q&A; Laura Eppler and Kira Frankenfeld were also on the presentation team but did not speak. A German-language webinar on the same topic took place separately.

[Edited to add, 27 Oct 2020: HdM has now released links to PDF slides in German and English.]

Here’s the event summary:

The goal of this study was to identify a monitoring system on the market that is best suited for the needs of the university. We evaluated various factors, e.g. coverage of WCAG criteria, percentage of errors found, percentage of false positives. Also, we evaluated the usability of the systems based on a user study. The monitoring systems that were evaluated in the study are (in alphabetical order):

  • Deque: WorldSpace Comply (now called axe Monitor)
  • Pope Tech
  • Siteimprove: Accessibility
  • The Paciello Group: ARC Monitoring

The study results will be published as an open-access paper. I look forward to it: Burkard’s presentation included useful charts showing evaluation details, including comparisons to manual accessibility checks where relevant. Overall, the study found that Paciello Group’s ARC and Deque’s axe Monitor are the most accurate at identifying errors against WCAG 2.1 criteria, yet the small group of user testers (15 participants, two of whose contributions were excluded for inconsistency) found them the least interesting to use. Siteimprove scored highest in the user test and overall. The overall criteria are weighted fairly heavily towards gamification and contemporary design over accuracy of error identification and reporting. It’s useful to recall that the study was undertaken to identify the tool best suited to Stuttgart HdM’s needs.

My brief notes from Burkard’s presentation and slides follow.

Siteimprove’s gamification aspects increase motivation. It’s the only tool in this set that evaluates sites at WCAG 2.1 AAA level. It’s not very adjustable and can be slow.

Deque’s axe Monitor permits assigning issues to individuals (Jira integration), with informative links to Deque University pages; detail pages suggest how to fix a given error. Though it integrates with dev tools in a web browser, it doesn’t collect analytics data from users. It’s the only tool that can scan “processes” at the moment, via its scripting support (if I understood correctly, I would term it support for specific user flows). It also permits specifying the browser user-agent, including mobile. Its design feels old-fashioned and not very motivating; out of the box, it supports WCAG AA level only.

Paciello Group’s ARC dashboard includes a convenient overview of the top errors and the percentage of total errors they represent. It offers good drilldown, but issues are not assignable. They’re working on process/scripting support. Strengths include highlighting focus order and supporting multiple sets of rules. Though ARC found the most errors in terms of completeness, it also had a high rate of false positives; for example, it flagged some hidden content that’s intended to be hidden.

Pope Tech’s tool is powered by WAVE, with icons and indicators familiar to me from manual use of WebAIM’s tool. To me, the main value seems to be in having a site-wide rollup.

Burkard also gave an overview of the specific evaluation criteria used and how they were established: some were adopted from the existing literature, some newly devised, and then the criteria were discussed and voted on by several scholars working in accessibility.
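
For readers less familiar with the accuracy factors named in the event summary (percentage of errors found, percentage of false positives), here is a minimal sketch of how such figures could be computed by comparing a tool's findings against a manual audit. This is my own illustration, not the study's method: the function name, the issue identifiers, and the data are all made up, and the study may define these percentages differently.

```python
def accuracy_metrics(manual_issues, tool_issues):
    """Compare a tool's reported issues with a manually confirmed set.

    Returns (percent of manually confirmed issues the tool found,
             percent of the tool's reports that were false positives).
    Hypothetical helper for illustration only.
    """
    manual = set(manual_issues)
    tool = set(tool_issues)

    true_positives = manual & tool   # reported by the tool and confirmed manually
    false_positives = tool - manual  # reported by the tool but not confirmed

    found_pct = 100 * len(true_positives) / len(manual) if manual else 0.0
    false_positive_pct = 100 * len(false_positives) / len(tool) if tool else 0.0
    return found_pct, false_positive_pct


# Made-up example data: (page, WCAG success criterion) pairs.
manual = {
    ("home", "1.1.1"),
    ("home", "1.4.3"),
    ("contact", "3.3.2"),
    ("contact", "2.4.7"),
}
tool = {
    ("home", "1.1.1"),
    ("home", "1.4.3"),
    ("contact", "3.3.2"),
    ("home", "4.1.2"),  # not confirmed by the manual audit
}

found, false_pos = accuracy_metrics(manual, tool)
print(f"errors found: {found:.1f}%, false positives: {false_pos:.1f}%")
```

With this toy data the tool finds three of the four confirmed issues (75%) and one of its four reports is a false positive (25%), which shows how a tool can score well on completeness and still produce noise.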