
A Powerful Approach to System Failure and Human Reliability Analysis

  • Writer: JD Solomon
When It Comes to Systems Thinking, Don’t Leave the Human Reliability Analysis Out

Failures have a way of forcing us into systems thinking. Unfortunately, when organizations dig into root cause analysis, they often default to the physical components—pumps, valves, servers, sensors—while giving only cursory attention to the human side of the system. That imbalance leads to incomplete conclusions and, worse, repeated failures. If we are serious about systems thinking, then human reliability analysis (HRA) must be part of the conversation from the start.

 

This article offers a practical, streamlined way to incorporate human factors into reliability and risk assessments. The goal is not to turn every engineer into a human‑factors specialist. The goal is to ensure that the human element is treated with the same rigor as the mechanical and digital elements with which it interacts every day.

 

A Real‑World Reminder

A recent water system emergency left a small town without normal water service for several days. After the crisis, the system owner commissioned a formal reliability and risk assessment to understand what went wrong and how to prevent a repeat event.

 

It’s not the first time I have been called in to assess a failed water system. A standard approach I use is a proven seven‑step reliability process aligned with the international risk standard, ISO 31000. The assessment includes equipment reliability, human factors, and system interfaces. Traditional tools and techniques, such as failure modes and effects analysis (FMEA), reliability block diagrams, fault tree analysis, and facilitated workshops, are used to evaluate the physical system.
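Of the traditional tools mentioned, FMEA is the most readily quantified: each failure mode is scored for severity, occurrence, and detection, and the product of those scores (the risk priority number, or RPN) drives the ranking. A minimal sketch, using invented failure modes and 1–10 ratings that are not from the assessment described here:

```python
# Hypothetical FMEA ranking sketch; the failure modes and the
# severity/occurrence/detection scores below are invented for illustration.

modes = [
    # (failure mode, severity, occurrence, detection)
    ("High-service pump fails to start", 9, 3, 2),
    ("Level sensor drifts out of calibration", 6, 5, 6),
    ("Operator misreads tank level", 7, 4, 7),
]

# RPN = severity x occurrence x detection; higher RPN = higher priority.
ranked = sorted(modes, key=lambda m: m[1] * m[2] * m[3], reverse=True)
for name, s, o, d in ranked:
    print(f"RPN {s * o * d:4d}  {name}")
```

Note that in this invented example the human-driven failure mode ranks highest, which is exactly the kind of signal that a purely equipment-focused review tends to miss.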

 

Our team also takes the critical step of formally examining human performance. As a default, two human‑factors methods are applied:

  • HFACS (Human Factors Analysis and Classification System) to evaluate management practices, training, communication, and organizational influences.

  • HEART (Human Error Assessment and Reduction Technique) to estimate the probability of human‑driven failure modes and to quantify the impact of error‑producing conditions.

 

The combination of techniques provided a more complete picture of how people, processes, and equipment interacted to create the crisis. More importantly, the assessment showed how improvements to the system could be prioritized.


Why Human Reliability Analysis (HRA) Matters

People make errors. Those errors are predictable, measurable, and manageable when approached systematically. Ignoring the human element produces misleading risk estimates and unreliable mitigation strategies.

 

HRA methods fall into two broad categories:

  • Qualitative methods use structured facilitation to identify contributors to human error and uncover root causes.

  • Quantitative methods use databases of human tasks and their associated error rates to estimate failure likelihood and prioritize mitigation.

 

Qualitative approaches are usually sufficient for understanding contributors to human error. Quantitative approaches are valuable for comparing risks, justifying investments, and measuring improvement over time.


Two Practical Methods: HFACS and HEART

HFACS

HFACS breaks human error into four levels:

  1. Unsafe acts

  2. Preconditions for unsafe acts

  3. Unsafe supervision

  4. Organizational influences

 

Its strength is its ability to push teams beyond operator error into the management and cultural conditions that enable failure. In most organizations, this is where the true root causes reside.
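In practice, applying HFACS amounts to binning each workshop finding into one of the four levels so that patterns above the operator become visible. A minimal sketch, where the level names come from the framework but the findings and their assignments are hypothetical:

```python
# Illustrative only: binning contributing factors into the four HFACS levels.
# The findings and their level assignments are invented examples.

HFACS_LEVELS = (
    "Unsafe acts",
    "Preconditions for unsafe acts",
    "Unsafe supervision",
    "Organizational influences",
)

findings = [
    ("Operator skipped a valve lineup check", "Unsafe acts"),
    ("Night-shift fatigue from back-to-back callouts", "Preconditions for unsafe acts"),
    ("No supervisor review of the switching order", "Unsafe supervision"),
    ("Training budget cut two years running", "Organizational influences"),
]

# Group findings by level to surface where contributors cluster.
by_level = {level: [] for level in HFACS_LEVELS}
for note, level in findings:
    by_level[level].append(note)

for level in HFACS_LEVELS:
    print(f"{level}: {len(by_level[level])} finding(s)")
```

Even this simple tally makes the point: when most findings land in the supervision and organizational bins, "operator error" is a symptom rather than a root cause.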

 

HEART

HEART provides a structured framework for estimating the probability of human error for specific tasks. It classifies tasks, identifies error‑producing conditions, and calculates the overall probability of failure. HEART’s advantage is that it produces a number, useful for prioritization and for demonstrating the value of mitigation strategies. Its limitation is that it requires significant subjective judgment, which must be calibrated through experience and team discussion.
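The calculation itself is straightforward: a nominal error probability for the task class is scaled up by each error-producing condition (EPC) according to its maximum multiplier and the assessed proportion of effect. A hedged sketch of that arithmetic (the nominal value and EPC numbers below are illustrative stand-ins, not the published HEART tables):

```python
# Sketch of the HEART human error probability (HEP) calculation.
# Nominal HEPs and EPC multipliers here are illustrative, not the
# official HEART table values.

def heart_hep(nominal_hep, epcs):
    """HEP = nominal x product over EPCs of ((multiplier - 1) * proportion + 1).

    epcs: list of (max_multiplier, assessed_proportion_of_effect) pairs,
    where proportion is the team's 0-1 judgment of how fully the EPC applies.
    """
    hep = nominal_hep
    for multiplier, proportion in epcs:
        hep *= (multiplier - 1.0) * proportion + 1.0
    return min(hep, 1.0)  # a probability cannot exceed 1

# Example: a routine task with a nominal HEP of 0.09, degraded by two EPCs
# (say, unfamiliarity at 40% effect and time pressure at 50% effect).
hep = heart_hep(0.09, [(11, 0.4), (3, 0.5)])
print(f"Estimated HEP: {hep:.2f}")  # 0.09 * 5 * 2 = 0.90
```

The assessed-proportion inputs are where the subjective judgment mentioned above enters, which is why the numbers should be calibrated in a facilitated team setting rather than assigned by one analyst.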

Using More Than One Technique

A simplified approach—using HFACS to understand contributors and HEART to quantify risk—gives organizations a practical, defensible way to bring human reliability into the systems‑thinking conversation. The result is a more complete understanding of risk and a clearer path to meaningful improvement.

 

Don’t Leave HRA Out of Your Systems Thinking

The most effective reliability and risk assessments integrate equipment, human factors, and system interfaces. Leaving out the human element creates blind spots that eventually show up as failures, outages, regulatory scrutiny, or reputational damage.

 

Human error is not an afterthought. It is part of the system. Treat it that way.



Need help getting started? JD Solomon Inc. specializes in asset management systems, reliability, and root cause analysis.

JD Solomon is the founder of JD Solomon, Inc., the creator of the FINESSE Fishbone Diagram®, and the co-creator of the SOAP criticality method©. He is the author of Communicating Reliability, Risk & Resiliency to Decision Makers: How to Get Your Boss’s Boss to Understand and Facilitating with FINESSE: A Guide to Successful Business Solutions.

 



