Semantic document outlines and heading structures

Last year, we had a big focus on accessibility – it’s vital that we are a diverse and inclusive organisation and it’s equally important that everyone can experience the Museum and what we offer. This is part of our strategic goal to engage and involve the widest possible audience. To meet this goal, we need to ensure our digital experiences are accessible to everyone.

In March 2020, we tested our website with real assistive technology users, funded by the National Lottery Heritage Fund as part of the Urban Nature Project. Following this, we released 25 separate changes to the main nhm.ac.uk website between May and September 2020, each one helping us get a step closer to meeting WCAG AA accessibility standards. You can read more about our work on digital accessibility here.

We could see from accessibility testing that the majority of our pages have the correct structure, and that where they don’t, there are only minor difficulties in navigating the pages. This led me to review the document structure/outline of some of our pages on nhm.ac.uk, as we knew they were not as optimised for accessibility – i.e. for screen reader and keyboard-only users – as we would like.

The problem

Our heading structures in their current form are helpful in most cases. However, on a CMS-driven site where content authors are free to place components wherever they like within certain page templates, it can be difficult to maintain a semantically correct document structure. Limits or rules on component usage within a page are quite hard to enforce – and on the other hand, you could argue that restricting components to specific templates is too heavy-handed.
Additionally, we know that headings are sometimes incorrectly used for emphasis/visual purposes rather than semantic/structural purposes. We also wondered if the use of landmark elements such as <aside> or <section> on our pages could be improved – but either way, it would be an opportunity to review and learn more about how best to use them.

There were various issues to address:

  1. How can we test/audit the document structure we have now, and in the future?
  2. How should we manage headings at component level across different page templates?
  3. Does it matter if there is more than one H1 on a page?
  4. Are we using landmark elements correctly? And if we wrapped components in landmark elements, would that resolve the heading structure issues?

The goal

I set myself a goal to define a standard approach for using heading and landmark elements across the main NHM site, to ensure a consistent and usable experience (not just for screen reader and keyboard-only users) and also one that gives clarity for developers and designers about which approach to take when creating new components/pages or working on existing ones.

The topic of how to manage semantic heading and document structures in a Content Management System-driven site has always baffled me so I relished the challenge of trying to find an answer. This was a bit of a journey but I think we ended up in a good place and have a bit more clarity now. This comment in an article I read about heading structures made me chuckle and I thought was quite appropriate to kick us off:

“…accessibility is 2 quarters science and 3 quarters art, (no quarters mathematics).”

I would go as far as to say I followed Gartner’s Hype Cycle through this journey:

[Figure: a diagram illustrating Gartner’s Hype Cycle]

Innovation Trigger

And so, I began on my quest to find answers via the tubes of the Interweb. I had a couple of assumptions and gut feelings which at least gave me a starting point for my research – I was open to all of this changing as I genuinely didn’t know if I was right:

Assumption #1: a component-based approach is the way to go

When deciding which heading elements to use, we should take a component-based approach as opposed to a page-based approach. A component-based approach puts more responsibility on the developers to make a decision about the semantic heading structure of a component at the time it is created, whereas a page-based approach would be less robust because it leaves more risk of error when creating a page – it might work at the time, but as soon as a component is switched or changed, the document structure of a particular page could become inaccessible again.

Assumption #2: it would be great if we could “self-contain” our components

If the components can be self-contained and have their own heading structures, then it doesn’t matter which page they are used on and where they are used on that page – each component would be semantically correct in itself, the browser would know what to do and the assistive technologies would know what to do.

My initial reading took me to various sites on the Web, listed here for reference:

Peak of Inflated Expectations

I also tweeted my questions, but I don’t really have much of a Twittersphere and I didn’t know who to tweet them to, so it felt like I was talking in an empty room: after a week or so I hadn’t had any response at all (no offence to my 200-odd followers). I then joined and posted in the web-a11y Slack group and had some great conversations in there – thank you to the Web A11y members for your advice – you set me on the path I have long been looking for!

I thought I would share my findings below by answering my initial questions.

1. How can we test/audit the document structure we have now, and in the future?

The HeadingsMap browser extension (for Google Chrome and Mozilla Firefox) appears to be quite popular and was recommended in the web-a11y Slack group – the extension generates a document map, or index, of any web document structured with headings and it also shows the HTML5 outline. Its output is similar to the headings list that the NVDA or JAWS screen readers provide. Dafydd from AbilityNet also shared a couple of bookmarklets: one for exposing headings and the other for exposing landmarks on a page. I’ve since found these to be really quick and simple ways of demonstrating an idea or problem related to headings or landmarks.
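
For anyone who wants to script a similar check, here is a minimal sketch of the same idea using only Python’s standard library. It is not how HeadingsMap or the bookmarklets work internally; the class and function names are my own for illustration, and it only looks at native h1–h6 elements (it ignores ARIA headings and hidden content).

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collects (level, text) pairs for h1-h6 elements in document order."""

    def __init__(self):
        super().__init__()
        self.headings = []   # finished (level, text) pairs
        self._level = None   # level of the heading currently open, if any
        self._text = []      # text fragments seen inside that heading

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so "h1".."h6" is all we need to match.
        if tag in ("h1", "h2", "h3", "h4", "h5", "h6"):
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.headings.append((self._level, "".join(self._text).strip()))
            self._level = None


def heading_outline(html):
    """Return the headings as an indented list, one string per heading."""
    collector = HeadingCollector()
    collector.feed(html)
    return [
        f"{'  ' * (level - 1)}h{level}: {text}"
        for level, text in collector.headings
    ]


# Hypothetical page fragment, purely for illustration.
SAMPLE_PAGE = """
<h1>Visit</h1>
<h2>Opening times</h2>
<h3>Holiday closures</h3>
<h2>Getting here</h2>
"""

for line in heading_outline(SAMPLE_PAGE):
    print(line)
```

Feeding it the saved HTML of a page gives a quick, indented list of headings that can be compared between templates.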

2. How should we manage headings at component level across different page templates?

One idea suggested in the Slack group was that a component should have a default heading which can then be customised within a page if necessary – this parameterised approach could allow you to address component-level accessibility while also treating holistic accessibility at the content level across the site. The downside is that it would still leave too much freedom to the content author and could prove difficult to manage in the long term. It was at this point that the advice from web-a11y Slack was incredibly helpful and actually sent me on a different path – my assumptions were challenged and I saw the light!

“It is preferable to have a consistent heading structure between pages — e.g. a sidebar with a heading on it should ideally have the same heading level across a site, as opposed to the level being adjusted to make more sense on the individual page.” [Dafydd]

“You need to think about heading hierarchy at the page level because it is pages that your users will be concerned with. If your CMS uses components to reuse/syndicate chunks of content, it’s probably going to cause you problems for getting the heading hierarchy right.” [Léonie Watson]

Léonie suggested that, to mitigate the conflict of using a component within different page templates (i.e. different contexts), we could use role="heading" and aria-level="N", where “N” is the heading level that only screen readers will interpret. Screen readers will then treat the element as a heading of level N, while in the markup, and visually for everyone else, it remains whatever element it actually is. This is similar to the original idea suggested above in terms of having the flexibility to vary heading levels on a web page, but the configuration would be applied at page-template level rather than at component level. Really useful food for thought, there!
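
To make that concrete, here is a rough sketch of how a template might pass a level override into a component’s heading. The function name and its parameters are hypothetical (this is not our CMS’s API); the only parts taken from the advice above are the role and aria-level attributes themselves.

```python
from html import escape


def render_heading(text, visual_level, announced_level=None):
    """Render a heading, optionally overriding the level exposed to screen readers.

    visual_level    -- the hN element the component ships with
    announced_level -- hypothetical per-template override, emitted as aria-level
    """
    # This sketch restricts both values to the six native heading levels.
    for level in (visual_level, announced_level):
        if level is not None and not 1 <= level <= 6:
            raise ValueError("heading levels must be between 1 and 6")

    tag = f"h{visual_level}"
    if announced_level is None or announced_level == visual_level:
        # No override needed: plain native heading.
        return f"<{tag}>{escape(text)}</{tag}>"

    # Markup stays an hN element; assistive technology is told a different level.
    return (
        f'<{tag} role="heading" aria-level="{announced_level}">'
        f"{escape(text)}</{tag}>"
    )


print(render_heading("Related events", 2))
print(render_heading("Related events", 2, announced_level=3))
```

The component keeps a single default, and only the templates that need a different level in their hierarchy supply the override.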

3. Does it matter if there is more than one H1 on a page?

The Web A11y Slack group members pointed me to a link which talks about WCAG failures (and non-failures). It was interesting to read that WCAG requires heading markup to have an appropriate hierarchical relationship with other headings – so you could have more than one H1 on a page, as long as the visual presentation of each H1 doesn’t imply a hierarchy. I also found this comment in the article, discussing a need for pragmatism/realism about skipping heading levels, to be really helpful.

As discussed in the Slack group, and from what I’ve understood having read about the subject a bit more, skipped heading levels and multiple H1s are not deemed to be WCAG failures, but they’re not usually considered best practice – just because something isn’t a WCAG failure doesn’t mean it’s not an accessibility failure (which can ultimately impact the user experience). So it seems that, for accessibility, usability and SEO, having one H1 on a page is preferable to having several – but if you do choose to use multiple H1s on a page, the way in which you use and display them is very important.

“…it is helpful, if there is only one h1 on the page, that it is at the start of the main content area, and that it reflects the page title.” [Léonie Watson]
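
Both of these things (the number of H1s and any skipped levels) are easy to flag automatically once you have a page’s heading levels in document order. The sketch below is one way to do it; the function name and the shape of the report are my own assumptions, and anything it flags is a prompt for review rather than a WCAG failure.

```python
def audit_heading_levels(levels):
    """Report the h1 count and any skipped levels.

    levels -- heading levels in document order, e.g. [1, 2, 3, 2]
    """
    skipped = []
    previous = 0  # treat the start of the page as "level 0"
    for position, level in enumerate(levels):
        # Going deeper by more than one level (e.g. h2 -> h4) skips a level.
        # Moving back up by any amount is fine. A page whose first heading
        # is not an h1 is also reported, as a jump from level 0.
        if level > previous + 1:
            skipped.append((position, previous, level))
        previous = level
    return {"h1_count": levels.count(1), "skipped": skipped}


report = audit_heading_levels([1, 2, 4, 2, 3])
print(report)
```

Each entry in `skipped` is (position in the list, previous level, level found), so the h2 to h4 jump in the example is reported at position 2.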

4. Are we using landmark elements correctly? And if we wrapped components in landmark elements, would that resolve the heading structure issues?

The quick answer to this is: “no, we should do a bit more reading/research to understand how best to use landmark elements on our site”.

Having discussed this further in the Slack group, I learned that it usually makes sense to put a heading at the start of a landmark element. These elements are used to represent sections of content, and prefacing a section of content with a suitable heading makes it easy to scan the page to locate the part you’re interested in. This is a benefit to screen reader users and other users.

As for wrapping components in landmark elements: if only there was some kind of algorithm or automatic thing that would work out the heading levels for us…the HTML5 outline algorithm? Well, in practice it doesn’t exist: no browser or assistive technology implements it! So for now (and possibly forever), the answer appears to be: “no, wrapping wouldn’t solve the heading structure issues” (or rather, “no, but it doesn’t hurt to do it anyway in the appropriate places”).

Trough of Disillusionment

Despite all the great info and progress being made, I must admit I did start to question if I was doing the right things, asking the right questions or taking the correct approach. I got my answer from @sinabraham in the Slack group:

“This (semantic heading structures) very much deeply matters and you have my personal thanks as a screen reader user for caring about it, because it is so helpful when done right”. [sinabraham]

I wanted to highlight this just to demonstrate that a lot of things in web development aren’t easy – but they can be very much worth it for your users – and you can validate the impact by testing and learning.

Slope of Enlightenment

This validation spurred me on, and I took the opportunity to discuss my findings with some of my team. We had a really good conversation and worked out some principles we should follow and what to do next:

Heading elements should not be used for visuals/emphasis

We should not be using heading elements for styling and would need to take away the default styling on heading elements when used within components (excluding the Text component which uses the CMS Rich-Text Editor). Styles should be applied to headings with classes only. It might take some time to unpick the current typography and CSS on our site, but at least we had a rule to follow going forward.

Less flexibility in the page template is ultimately better for semantics/structure

Ultimately, we should aim to have a fixed layout in our templates, possibly with options for which components to use within them, but overall setting the page structure in the code. This gives greater consistency to the user experience and the accessibility experience.

However, this could take some time to implement, so – for now – we could have a (hidden?) H1 on each page template. Some already have this (either visible or hidden) – if necessary, we could have some logic wrapped around the NHM logo (or similar) to set role="heading" and aria-level="1" on pages that don’t have an H1 already. But the main thing to remember is that every page template needs one H1 heading element.

As a rule of thumb, start a component with an H2 heading

For the rest of our components, we think that “self-containing” these in landmark elements is fine. On the assumption that every template has an H1, as long as we start each component with an H2 we will have a better and more consistent document structure. Headings within landmarks can start at H2, i.e. they don’t need to start at H1. So this is our “safest bet” right now, and a solution that avoids us having to make major changes across the entire site.
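
A rule like this lends itself to a small automated check on each component’s rendered markup. The following is only a sketch under that assumption: the names are invented for illustration, and it checks just the two things the rule of thumb implies (no H1 inside a component, and the first heading is an H2).

```python
from html.parser import HTMLParser


class LevelCollector(HTMLParser):
    """Records the level of every h1-h6 start tag in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3", "h4", "h5", "h6"):
            self.levels.append(int(tag[1]))


def component_heading_problems(component_html):
    """Return a list of human-readable problems; empty means the rule is met."""
    collector = LevelCollector()
    collector.feed(component_html)
    levels = collector.levels

    problems = []
    if 1 in levels:
        problems.append("component contains an h1; the page template should own the h1")
    if levels and levels[0] != 2:
        problems.append(f"component starts at h{levels[0]}, expected h2")
    return problems


# Hypothetical component markup, purely for illustration.
print(component_heading_problems("<section><h2>Events</h2><h3>Today</h3></section>"))
print(component_heading_problems("<aside><h1>News</h1></aside>"))
```

Components with no headings at all pass the check, since the rule only applies when a component has a heading structure of its own.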

Plateau of Productivity (and next steps)

Overall, we know we’re not doing too badly, but as a Dev Team we’ve decided we will:

  • Double-check which page templates currently use H1 headings (visible or hidden) and if any are problematic.
  • Look at components that currently use an H1 and get a better understanding of the context in which they are used on our pages. We may consider changing them so that the page’s H1 is separate from the component’s heading structure and the component structure starts at H2. If not, then we’d need to make sure that such a component is not used multiple times on a page.

We also agreed to consolidate and document the things we have been doing – or will be doing – for the team and wider department to feed into and refer back to in one place:

  • Standardising how we test accessibility (e.g. manually in the browser, or with software, or with automated tests)
  • Ensuring accessibility as default during component development – making a checklist of things we must always consider e.g. alt text (if applicable), ARIA labels (if applicable)
  • Ensuring that our approach to using heading elements is consistent going forward
  • Reviewing usage of landmark/sectioning elements across the site (page and component-level) to ensure we are making best (and correct) use of them. For example, are we using <main> or <aside>? If not, should we be? Do we understand what all of the sectioning elements are for?
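
As a starting point for the landmark review in the last item, a simple inventory of which of these elements a page actually uses can be scripted. This is a rough sketch with invented names. It only counts tags; it does not work out which elements are really exposed as landmarks (that depends on context, for example whether a <section> has an accessible name), and the “exactly one <main>” note is an assumption on my part about what we would want to check.

```python
from collections import Counter
from html.parser import HTMLParser

# Elements this sketch treats as landmark/sectioning candidates.
LANDMARK_TAGS = ("header", "nav", "main", "aside", "section", "footer")


class LandmarkCounter(HTMLParser):
    """Counts start tags for the elements listed in LANDMARK_TAGS."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag in LANDMARK_TAGS:
            self.counts[tag] += 1


def landmark_report(html):
    """Return (tag counts, review notes) for a page's HTML."""
    counter = LandmarkCounter()
    counter.feed(html)

    notes = []
    main_count = counter.counts["main"]  # Counter returns 0 for missing tags
    if main_count != 1:
        notes.append(f"expected exactly one <main>, found {main_count}")
    return dict(counter.counts), notes


# Hypothetical page skeleton, purely for illustration.
SAMPLE = (
    "<header></header><nav></nav>"
    "<main><section><h2>A</h2></section><aside></aside></main>"
    "<footer></footer>"
)
print(landmark_report(SAMPLE))
```

Running this across a handful of templates would at least tell us where <main> or <aside> is missing before we dig into whether each use is correct.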

Summary

So, after all the digging and researching about a topic which I’ve struggled to find answers to for a long time, it feels like I’ve found a way forward. Here’s a summary:

  • When using heading and landmark elements, try to use a page-based approach, as opposed to a component-based approach i.e. think about heading hierarchy at the page level because it is pages that your users will be concerned with.
  • Use headings for structure, not for visual styling on a page.
  • Consider using role="heading" and aria-level="N", where “N” is the heading level that only screen readers will interpret, to differentiate heading levels on a page if necessary.
  • The web community is huge and diverse – it can also feel very overwhelming – but this exercise has proved that there are people out there who can help you tackle a problem and you don’t need to have a million Twitter followers to get the answers you need.
  • Always be open-minded about challenging any assumptions you’re making.

Thanks for reading!
