Regardless of your industry, if you’re an engineer then I’m pretty sure you’ve encountered the term “Functional Safety” at least once in your career. And I bet you thought at the time, or maybe still think, that functional safety is related to occupational health and safety, electrical safety or something similar. Well, not exactly. In this short post I’ll try to explain what functional safety is, using the simplest examples I can.
If you check a functional safety standard like ISO 26262 or IEC 61508, you will find roughly the same definition of functional safety:
Functional safety is the absence of unreasonable risk due to hazards caused by malfunctioning behaviour of electrical, electronic or programmable electronic systems.
Let’s dig into this definition a bit deeper, so that the idea behind the functional safety concept becomes easier to understand.
First, let’s talk about hazards. A hazard is something that may cause harm. Examples include an open fire, the sharp edges of machinery, the high kinetic energy of a vehicle and water on the floor. All these hazards have the potential to harm a person who comes into contact with them, but only in a certain context. An open fire is not harmful until a person gets too close to it. The moving parts of a machine are not harmful unless you put your arm between the rolling gears.
If we add some operational and/or environmental context describing how a hazard can harm a subject (a person), we can describe the way in which the person may get injured by the hazard. This description is called a hazardous event. Imagine that the hazard is the high temperature of a machinery part, and the context is that the machinery operator accidentally touches this hot part. The hazardous event would then read something like: the machinery operator accidentally touches a hot machinery part and gets burned.
Every hazardous event can be assessed to determine how bad it may be. The measure of “how bad” is called the risk level (or simply risk) and consists of two parts: severity and probability. Engineers quite often confuse the terms and say “risk” when they mean “hazardous event”. Properly stated, risk is a magnitude assigned to a hazardous event.
Severity reflects how bad the consequences are. Probability (or likelihood) takes the subject and the operational and environmental context into account to estimate how likely the harmful outcome is, given the hazard. Severity and probability are combined using an instrument called a “risk matrix”, which converts the severity and probability values into a risk value for the hazardous event. See an example of the risk matrix below.
Higher levels of risk are usually treated as unacceptable within a company. Hazardous events at these levels may lead to serious consequences and/or have a very high probability, so they shall be mitigated regardless of the price or effort. There are also low levels of risk, where the hazardous events happen rarely or the potential injuries are slight (like scratches). In this case the level of risk is considered acceptable by the company, which understands the potential consequences and probabilities of the hazardous event.
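To make the risk assessment a bit more tangible, here is a minimal Python sketch of a risk matrix lookup followed by an acceptability check. Everything in it is an assumption made for illustration: the 3x3 matrix, the level names and the acceptance threshold are invented, and a real matrix would come from the applicable standard or from company policy.

```python
from enum import IntEnum


class Severity(IntEnum):
    SLIGHT = 1   # e.g. scratches
    SERIOUS = 2
    FATAL = 3


class Probability(IntEnum):
    RARE = 1
    OCCASIONAL = 2
    FREQUENT = 3


class Risk(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical risk matrix: (severity, probability) -> risk level.
RISK_MATRIX = {
    (Severity.SLIGHT, Probability.RARE): Risk.LOW,
    (Severity.SLIGHT, Probability.OCCASIONAL): Risk.LOW,
    (Severity.SLIGHT, Probability.FREQUENT): Risk.MEDIUM,
    (Severity.SERIOUS, Probability.RARE): Risk.LOW,
    (Severity.SERIOUS, Probability.OCCASIONAL): Risk.MEDIUM,
    (Severity.SERIOUS, Probability.FREQUENT): Risk.HIGH,
    (Severity.FATAL, Probability.RARE): Risk.MEDIUM,
    (Severity.FATAL, Probability.OCCASIONAL): Risk.HIGH,
    (Severity.FATAL, Probability.FREQUENT): Risk.HIGH,
}

# Assumed company policy: only LOW risk is accepted without mitigation.
ACCEPTABLE_RISK = Risk.LOW


def assess(severity: Severity, probability: Probability) -> Risk:
    """Look up the risk level of a hazardous event in the matrix."""
    return RISK_MATRIX[(severity, probability)]


def is_acceptable(risk: Risk) -> bool:
    """A risk is acceptable if it does not exceed the threshold."""
    return risk <= ACCEPTABLE_RISK


if __name__ == "__main__":
    # Hazardous event: operator accidentally touches a hot machinery part.
    # The severity and probability ratings here are made up.
    risk = assess(Severity.SERIOUS, Probability.OCCASIONAL)
    print(risk.name, is_acceptable(risk))  # prints "MEDIUM False"
```

With these made-up ratings the hot-part event lands on a medium risk level, which is above the assumed threshold, so it would need mitigation.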
The last bit is to figure out what hazards caused by malfunctioning behaviour of electrical, electronic or programmable electronic systems are. Malfunctioning behaviour is when your item works in an unwanted way (for example, it doesn’t work at all or works at low performance). This may happen because of hardware or software faults and failures. If we take a programmable electronic system like a vehicle brake controller, its malfunctioning may be caused by SW bugs or HW failures (such as a short in a specific resistor) and may lead to a loss of braking ability for the whole vehicle, which is a hazard by its nature.
Now let’s summarise what we know and finally define what functional safety is.
If an item’s malfunctioning, due to internal or external faults and failures, can cause hazardous events, but the level of risk associated with each of these hazardous events is below some acceptable risk threshold, then it is fair to say that the item is functionally safe, or in other words that functional safety is achieved for the item.
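The definition above boils down to a simple check, sketched below in Python under some stated assumptions: risk is represented as an integer where a higher value is worse, the threshold is taken to be the highest risk level that is still acceptable, and the `HazardousEvent` type, the example events and their risk values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class HazardousEvent:
    description: str
    risk: int  # assessed risk level, higher is worse (assumed scale)


def is_functionally_safe(
    events: Iterable[HazardousEvent], acceptable_risk: int
) -> bool:
    """True if every hazardous event caused by the item's malfunctioning
    has a risk level that does not exceed the acceptable threshold."""
    return all(event.risk <= acceptable_risk for event in events)


if __name__ == "__main__":
    # Made-up assessment of a brake controller's malfunctions.
    brake_controller_events = [
        HazardousEvent("loss of braking while driving", risk=3),
        HazardousEvent("slightly delayed brake response", risk=1),
    ]
    print(is_functionally_safe(brake_controller_events, acceptable_risk=1))
```

In this made-up example the first event exceeds the threshold, so the check fails for the item as a whole: a single hazardous event with unacceptable risk is enough to say that functional safety is not yet achieved.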
The picture below illustrates the propagation from low-level faults to high-level hazardous events in the context of functional safety.
Please note that if the level of risk is not acceptable, then we have to decrease it. There are various techniques for pushing down the level of risk, but we will discuss them in further posts.