Michael Kelleher, Principal Consultant, DNV GL, Knowledge Management Competence Centre, UK
Tony Potts, Principal Consultant, DNV GL, Manchester, UK
Trixie Pomares, Consultant, DNV GL, Manchester, UK (1)
This article was first published at IChemE Hazards 27 Symposium in 2017
Process safety management can be greatly enhanced through the application of good practice and the management of knowledge. Improving knowledge transfer and the corporate memory can help enhance learning from incidents and prevent their recurrence, thus reducing costs.
Keywords: Knowledge Management, Process Safety Management, Learning from Events, Project Costs
David DeLong (2004) cites an example of how the loss of corporate memory contributed to an incident on the Gulf Coast:
“When an ethylene reactor exploded at a petrochemical plant on the Texas Gulf Coast an investigation found that the unit’s engineer and the operators in the control room at the time of the accident had all been in the job less than a year. Retirements and turnover had left the plant with inexperienced personnel and, not surprisingly, the explosion was attributed to operator error. Having less experienced people working in increasingly sophisticated computer-controlled production operations increases the risks of serious and costly mistakes.”
Implicit in his argument is that the organisation had failed to ensure the sufficient transfer of knowledge from experts to their less experienced colleagues. DeLong’s view is that this lack of knowledge at the point of action was a contributory factor in the incident. Such an argument is worth exploring further, and the focus of this paper is on the application of good practice in the management of knowledge to the domain of process safety controls.
DNV GL’s survey report, A New Reality: the outlook for the oil and gas industry in 2016, concludes that cost management is the top priority for 41% of senior sector players globally in the year ahead. The report also identified headcount reduction as one of the top cost control measures implemented in 2016 (31%, up from 25% in 2015). As further job losses are anticipated, companies are facing a potential crisis through the loss of knowledge. This is becoming a critical issue which cannot be ignored, as it increases the potential for incidents.
In the next section, this paper examines the concept of knowledge management before taking a deeper look at the analysis of incidents from a knowledge perspective in Section 3. The practice of knowledge management is explored in Section 4, and the final section concludes that process safety management is not solely about physical or technical factors but also about ensuring that the right knowledge is made available at the right time to the right people. This involves interventions that support connections between people and from people to content.
Ignoring Knowledge Management has its own costs
By 1964 forty per cent of the population of the USA had been born in the previous 18 years. The so-called baby boomer generation is now between the ages of 53 and 72. The United Kingdom experienced similar growth in births during the same period with over 850,000 births from the mid-1950s up until the early 1970s and a peak of over one million in the early 1960s (Office for National Statistics, 2015). Countries throughout the western world experienced similar spikes in birth trends. The consequence for organisations today is that members of this generation have become, in many cases, the experts upon whom our organisations rely. The retirement of this generation, often known as the crew change, will mean that corporate memory, at least that part that is held by individuals, will be eroded, and that the knowledge accumulated will be lost and quite possibly need to be rebuilt.
This phenomenon manifests itself in terms of insufficient competence levels due to poor learning and less investment in training and development. The latter is often reinforced during times of weak markets that result in short-term financial decisions. The oil and gas industries have not been immune to such challenges and, in fact, the dramatic decline of crude oil prices, which plummeted 40 percent in the second half of 2014, has contributed significantly to wholesale job cuts that have exacerbated this challenge. Knowledge management can make a positive contribution here through the systematic transfer of experts’ knowledge to successors and the development of a culture of long-term knowledge sharing.
The body of data, information and knowledge accumulated by an organisation is its corporate memory. The erosion of the corporate memory due to the loss of knowledge held by individuals is further intensified by paying insufficient attention to the capture of relevant content. In the case of process safety management, content can include lessons learned from incidents, case studies, and a documented understanding of hazards, their triggers and consequences. Such content could be enhanced through greater use of visual representations to enable staff to gain a richer understanding of safety related phenomena. This lack of attention is further compounded by the frequent failure to make coherent links between process safety information and the need to update operating procedures on a continuous basis. When these procedures are explicitly linked to other knowledge assets such as lessons learned, content libraries, staff expertise profiles, etc., the corporate memory not only forms a repository of knowledge but also becomes a dynamic source of improvements to ongoing operations and can accelerate incident investigations when they occur.
The Cabinet Office (2016) published a paper entitled Knowledge Principles for Government. This paper establishes seven principles and sets a mandate for the public sector to manage its knowledge effectively. The seven principles are:
- Knowledge is a valued asset;
- Knowledge needs the right environment in order to thrive;
- Knowledge is captured where necessary and possible;
- Knowledge is freely sought and shared;
- Knowledge increases in value through re-use;
- Knowledge underpins individual learning; and
- Knowledge underpins organisational learning.
The paper argues that knowledge is an asset which is fundamental to the efficient and effective delivery of public services. These principles are just as applicable to all organisations, whether they operate in the public or private sector, and DNV GL’s approach to managing knowledge invokes the same principles.
Knowledge management is a discipline that promotes an integrated approach to identifying, capturing, validating, storing, retrieving, retaining, transferring and sharing an organisation’s knowledge assets. These assets may include images, documents, policies, procedures and expertise held by individual employees. Knowledge Management is not an end in itself, but a means to an end. In the case of process safety management, the ‘end’ is fewer hazards, fewer incidents and a better understanding of why they occur and how to ensure technical or human errors are not repeated.
Common sense says that learning from successes and failures, sharing knowledge with other employees and smart application of lessons learned in the past will lead to continuous improvement of results. The “availability of knowledge and the ability to exploit it” was identified by McKinsey as the second most significant business trend in 2007. However, for many reasons these learning processes may no longer function properly in companies and need attention and support. Divisions that compete instead of collaborating, differences in culture, the pressure of daily challenges, a lack of communication tools and places to meet, poor discipline and counter-productive incentives within the company might all get in the way. These barriers create various costs of ignorance because:
- Mistakes are duplicated because earlier ones were not recorded or analysed;
- Work is redone because people are not aware of activities, projects in the past or their outcomes;
- Customer relationships are damaged because knowledge is not available at the point of action;
- Good ideas and best practices are not shared, which raises overall costs;
- One or two key employees hold crucial knowledge, creating continuity risks;
- The company learns too slowly, which results in delayed product development or missed opportunities;
- Employees are frustrated because knowledge resources are not available; and
- Business strategies are not aligned with current or future competencies.
As markets improve, some of the jobs lost will need to be filled again through recruitment and, as a report by Oxford Economics (2014) shows, this has a direct financial impact:
“the loss of an employee earning £25,000 a year or more carries an average financial impact of £30,614. These costs are split into two main components. Firstly, and most importantly, is the cost of lost output while a new worker gets up to the standard expected of them (“optimal productivity”). The second cost, which is probably more familiar, is the logistical cost of finding and absorbing a new worker. This includes the cost of advertising, using a recruitment agency, employing a temporary worker and the cost of interviewing and inducting a new employee. This costs on average £5,433.”
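The split described in the quotation can be worked through in a few lines. The sketch below uses only the two averages quoted from Oxford Economics (2014); the function names and the number of leavers are illustrative assumptions, not figures from the report or from this paper.

```python
# Averages quoted from Oxford Economics (2014), in GBP per leaver.
TOTAL_IMPACT_PER_LEAVER = 30_614
LOGISTICS_COST_PER_LEAVER = 5_433


def lost_output_per_leaver() -> int:
    """Cost of lost output: the total impact minus the logistical cost."""
    return TOTAL_IMPACT_PER_LEAVER - LOGISTICS_COST_PER_LEAVER


def lost_output_share() -> float:
    """Fraction of the total impact that is lost output (roughly 82%)."""
    return lost_output_per_leaver() / TOTAL_IMPACT_PER_LEAVER


def turnover_cost(leavers: int) -> int:
    """Average financial impact of replacing `leavers` employees."""
    if leavers < 0:
        raise ValueError("leavers must not be negative")
    return leavers * TOTAL_IMPACT_PER_LEAVER


# Hypothetical input: 360 leavers in a year (an assumption for illustration).
hypothetical_leavers = 360
print(f"Lost output per leaver: £{lost_output_per_leaver():,}")
print(f"Share of total impact:  {lost_output_share():.1%}")
print(f"Cost for {hypothetical_leavers} leavers:   £{turnover_cost(hypothetical_leavers):,}")
```

On these averages, most of the cost of each departure is the period in which the successor is not yet fully productive, which is the part that knowledge transfer is intended to shorten.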
With a workforce of around 18,000 staff, for example, and 50% of total annual spend going to projects, conservative estimates of direct financial costs include:
- Re-recruiting retirees for only half a year to overcome the loss of productivity whilst waiting for a new recruit or successor to get up to speed would cost approximately £27m over a five-year period; and
- Spending 1% of total project budget on rectifying repeated mistakes would cost over £70m over five years.
Cascio (2006) identified that the costs to an organisation of people loss alone are high, with estimates of:
- 50-60% of an employee’s annual salary being spent on recruitment; and
- 90-200% of annual salary in total costs associated with turnover.
Additional challenges highlighted in the AON Global Risk Management Survey (2015) are:
- 28% of hiring managers citing lack of experience as a key challenge; and
- 38% of hiring managers struggling to find or retain the talent they need.
These costs of ignorance can be addressed through the effective management of knowledge. The experience of DNV GL process safety experts is that there is also a clear relationship between the loss of knowledge in a company and its ability to execute projects successfully and improve process safety management. The following section explores those connections.
Process Safety Management from a knowledge perspective
It is often stated that the difference between occupational incidents and process safety incidents is that the latter occur less frequently and as such we do not recognise the warning signs as they occur. The other main difference is that process safety incidents can cause fatalities or life-changing injuries to people and cause financial and reputational damage to businesses. Process safety incidents are more likely to be affected by corporate memory loss due to their low frequency, especially if mechanisms are not enacted to ensure knowledge is captured, transferred and easily found and re-used.
From the investigations that DNV GL has completed, it can often be seen that for every major accident that occurs, several similar lower-scale incidents have occurred numerous times beforehand, eventually resulting in the larger incident. This indicates that the organisation has forgotten what is important in preventing the biggest events and, as such, is not reacting strongly enough to these weak signals to prevent their occurrence. An understanding of how these weak signals can lead to major events is built up through knowledge being retained by the organisation to continuously improve the company safety management system and to identify and inform those in the organisation who need the information.
In the HSE (2011) report into the 2005 Buncefield incident, the HSE commented that “There should be a clear understanding of major accident risks and the safety critical equipment and systems designed to control them.” The major accident risks are normally identified as part of the risk assessment process that goes into design and installation. Once recorded, the assessments are often filed away, with the information being used to update operating instructions and revised only if changes are made. The link from this major accident information to the operating instructions is not always made and, in DNV GL’s experience, is often missing, which prevents full understanding and knowledge from being transferred.
The installation, commissioning and initial operation of a site is also often where information about how the plant systems operate, and their limitations, is learned. Again, in DNV GL’s experience, too often the linkage between these assessments and the commissioning information does not find its way into the operating procedures, and thus the information can be lost or its significance forgotten over time through retirements or people moving on. Without systems in place to retain knowledge gained through experience, people can forget why controls are in place and why certain actions are taken, and thus may stop doing them. Indeed, this is particularly true among executive management teams, where staff turnover can be particularly high and corporate memory particularly short. DNV GL has seen best practice examples of companies that train executives in process safety incidents and link the lessons learned to procedures and processes within the company to maintain such understanding. Indeed, DNV GL runs major hazard awareness courses for senior managers and operations personnel at its Spadeadam facility. These allow participants to experience the impact of explosions, pool fires and jet fires that production personnel may be exposed to, as well as to understand what causes these events and their impact.
The HSE (2011) report recognises that, for the Buncefield incident, “when situations arise requiring staff to work outside the normal operating envelope, they should be recorded and reviewed by management.” Operating instructions are normally the first line of defence against process safety accidents, as the generic illustration of process safety control in Figure 1 shows. Operating procedures are designed to maintain the tank in the ‘Normal Operating Zone’. Figure 1 also shows the resulting layers of protection, which are described in more detail below. This demonstrates how process safety control is typically built into the design of systems.
Figure 1. Illustration of Process Safety Control
In Figure 1 the ‘normal operating zone’ represents the day-to-day tank operating limits (as set out in operating procedures) within which the tank is designed to be maintained by the operations staff. If these limits are exceeded, the Level Alarm High (LAH) sounds; for a tank such as those at Buncefield this is normally the high-level alarm that warns operators that the limit of normal operation has been reached and the tank is now ‘full’. A ‘process safety buffer area’ then normally exists between the high-level alarm (LAH) being reached and the next safety system being triggered, which allows the operators to act to bring the tank level back under control. This is called the ‘troubleshooting zone’. The LHH, or high-high level sensor, for the tank at Buncefield was an independent device which should have shut the inlet feed to the tank and prevented any further increase in the tank level beyond the ‘process safety buffer zone’. The ‘process safety buffer zone’ always allows for the tank level to go slightly above the high-high level limit (LHH), as excess fluids in the feed line will always lead to a slight further increase in tank level once the feed has been stopped through the action of the independent LHH switch. Should the tank level increase beyond this ‘process safety buffer zone’, the design parameters of the tank will be reached and the fluids will no longer be contained within the tank; this is called the ‘Known Unsafe or Uncertain Zone’. At this point fluids will start to overflow from the top of the tank, as tank level control has failed.
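The zones described for Figure 1 can be expressed as a simple classification of tank level against three limits. The following is a minimal sketch under stated assumptions: the set-point values are hypothetical, the treatment of a level exactly at a limit is an assumption, and the mapping of the region between LHH and the design limit to the ‘process safety buffer zone’ is one reading of the description above rather than a definitive statement of the figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TankLimits:
    """Level limits as fractions of tank height (hypothetical values)."""
    lah: float           # Level Alarm High: top of the normal operating zone
    lhh: float           # independent high-high trip that shuts the inlet feed
    design_limit: float  # level beyond which containment is no longer assured

    def __post_init__(self) -> None:
        # The layers of protection only make sense if the limits are ordered.
        if not (0 < self.lah < self.lhh < self.design_limit):
            raise ValueError("expected 0 < lah < lhh < design_limit")


def classify_level(level: float, limits: TankLimits) -> str:
    """Return the zone that a tank level falls into."""
    if level <= limits.lah:
        return "normal operating zone"
    if level <= limits.lhh:
        # LAH has sounded; operators act before the trip is reached.
        return "troubleshooting zone"
    if level <= limits.design_limit:
        # The LHH trip should have shut the feed; line contents may still drain in.
        return "process safety buffer zone"
    return "known unsafe or uncertain zone"


# Hypothetical set points, chosen only to illustrate the ordering of the zones.
limits = TankLimits(lah=0.85, lhh=0.92, design_limit=0.97)
for level in (0.60, 0.88, 0.95, 0.99):
    print(f"{level:.2f} -> {classify_level(level, limits)}")
```

Writing the limits down in one place, as operating procedures are meant to do, is what allows the response in each zone to be defined in advance rather than left to individual experience.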
Operators always strive to stay within the normal operating zone, i.e. not exceeding the tank’s parameters, including level, assuming these are known and have been written down in their operating procedures. If the limits are exceeded, there is normally a troubleshooting zone built into systems before safety systems are activated. The actions required should be detailed in operating procedures to ensure consistency; for the Buncefield tanks the safety systems were the high-level alarm and then the independent high-level trip. HSE’s comment again relates to operating procedures not documenting what needs to be done when the normal operating zone is exceeded; if undocumented, the response relies on the knowledge and experience of those involved. This acknowledges that, unless best practices are recorded and retained, corporate memory is eroded. In addition, the LHH failed because knowledge of its design and operation was not transferred from the vendor through to the operational maintenance crews.
As further evidence shows, the Buncefield site was not alone in operating this way. The HSE (2011) report also states that “The types of managerial failings revealed during the Buncefield investigation were often found at other major incidents. The report on the gas explosion at Longford, Australia in 1998 (Lessons from Longford: The Esso Gas Plant Explosion) identified factors associated with the incident which were also present at Buncefield. For example:
- Poor communications at shift handover;
- Lack of engineering expertise on site; and
- Failure to implement management of change processes.”
The lessons from the Buncefield incident do not specifically highlight the loss of corporate knowledge that, in part, led to this event and, as such, similar failings can be seen to recur in subsequent events around the world.
Eight years after the Buncefield incident, an article in Berkeley News (2013) about the ongoing Deepwater Horizon investigation quoted Bob Bea, an experienced disaster investigator. Bea’s experience included the 1988 Piper Alpha oil platform explosion in the North Sea, the 1989 Exxon Valdez oil tanker spill in Alaska’s Prince William Sound, the 2003 disintegration of Space Shuttle Columbia, and the 2005 levee failures in New Orleans. “There is one common thread to these disasters: they are system disasters. They’re caused by human and organizational malfunctions. And, too often, they occur because there is a failure to learn from past mistakes.” (2)
Failure to learn from the past, one of the key symptoms of a weak corporate memory, is a cause of major disasters. The industry needs to ensure that it improves its approaches to competence development and its management of safety incident information. To cover these points, the next section introduces two knowledge management concepts: better knowledge transfer, and better corporate memory.
Knowledge Management and Process Safety Management
Better knowledge transfer
What has previously been less clear, and less well known, is which tools best facilitate knowledge transfer and how to identify and capture critical knowledge. In DNV GL’s experience this is where companies have often struggled, as it can be a specialised area, particularly when the person moving is an expert or is very senior. Companies often use several methods, such as creating job descriptions, having extended handover periods, interviewing, or even having the people who are retiring help to update the plant operating procedures to ensure that the knowledge they have built up is retained.
Effective management of knowledge ensures that the organisation understands the knowledge critical to success, who holds that knowledge and how vulnerable the organisation might be to its potential loss. Within a broader knowledge management programme there is one component programme for the Retention Of Critical Knowledge (ROCK). DNV GL has adopted the acronym ROCK to ‘package’ the related elements of a knowledge capture and transfer process.
ROCK comprises two integrated elements: a risk assessment, and the deployment of knowledge-transfer tools and methods. Even when headcount reduction in companies cannot be averted, or as personnel naturally retire, ROCK ensures that knowledge is retained successfully and is available for reuse when needed.
As DNV GL’s entire revenue is based on project delivery, knowledge retention across its entire project portfolio is essential to success. Research into good practice in the field of knowledge risk and knowledge transfer has led to the development of a method for assessing the risk of knowledge loss, its potential impact on performance and the effort required to capture and transfer that knowledge in order to maintain the organisation’s memory of it. Figure 2 below provides an indicative assessment where key knowledge areas have been identified and subsequently assessed.
Figure 2. Indicative Knowledge Risk Assessment
Following an initial screening against four criteria (importance, documentation, uniqueness and urgency), individuals are invited to undertake the additional risk assessment in collaboration with their line manager. The risk matrix first identifies the knowledge held by the individual and then assesses the potential impact of the loss of that knowledge and the effort required to capture and transfer it.
Figure 2 indicates that two knowledge areas warrant capture and transfer exercises with the expert and his or her successors. Such mitigating exercises will depend upon the type of knowledge and the time and resources available. DNV GL’s experience with clients as well as our internal programme suggests that the best results are often achieved with one-page visual representations, which make it easy for successors to deploy that knowledge effectively and for line managers to make decisions that further reduce the impact of losing that person’s knowledge.
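The two-stage assessment described above can be sketched as follows. This is an illustration only: the 1 to 5 scales, the thresholds, the rule of multiplying impact by effort, and the example knowledge areas are all hypothetical assumptions, as the paper does not publish the scoring used in the ROCK method.

```python
from dataclasses import dataclass
from typing import Mapping

# The four screening criteria named in the text.
SCREENING_CRITERIA = ("importance", "documentation", "uniqueness", "urgency")


@dataclass(frozen=True)
class KnowledgeArea:
    name: str
    impact: int  # 1 (minor) to 5 (severe) impact if the knowledge is lost
    effort: int  # 1 (easy) to 5 (hard) effort to capture and transfer it


def passes_screening(scores: Mapping[str, int], threshold: int = 12) -> bool:
    """Initial screening: sum the four criterion scores (each assumed 1 to 5)."""
    missing = [c for c in SCREENING_CRITERIA if c not in scores]
    if missing:
        raise ValueError(f"missing screening scores: {missing}")
    return sum(scores[c] for c in SCREENING_CRITERIA) >= threshold


def warrants_transfer(area: KnowledgeArea, threshold: int = 12) -> bool:
    """Risk matrix: flag areas whose impact x effort meets the threshold."""
    return area.impact * area.effort >= threshold


# Hypothetical expert and knowledge areas, for illustration only.
screening = {"importance": 5, "documentation": 4, "uniqueness": 3, "urgency": 2}
areas = [
    KnowledgeArea("Commissioning history of tank gauging", impact=5, effort=4),
    KnowledgeArea("Vendor contact list", impact=2, effort=1),
    KnowledgeArea("Relief valve sizing rationale", impact=4, effort=3),
]

if passes_screening(screening):
    flagged = [a.name for a in areas if warrants_transfer(a)]
    print("Capture and transfer exercises warranted for:", flagged)
```

Keeping the screening separate from the matrix mirrors the description: the screening decides whose knowledge is assessed, and the matrix decides which of that person's knowledge areas justify a capture and transfer exercise.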
Finally, ROCK is fully completed when the artefacts (assets) generated are included in the corporate memory, indexed and tagged for easy search, navigation and re-use.
Better corporate memory
DNV GL works with clients in a range of industries and often helps them with process safety management. As such DNV GL has a good appreciation of what represents best practice in this area. Treating any change or transfer of personnel the same way a change to the operating plant is assessed through a Management of Change process is considered best practice here. This might be, for example, when someone is transferring their role within the company or moving into or out of the company permanently.
Whilst the process described above covers personnel movements, it does not cover the loss of critical knowledge from an organisation through a weak corporate memory, i.e. not learning from events and retaining the information in the organisation. Critical knowledge accumulated through learning from events needs to be identified and managed in a separate manner to the transfer of people.
DNV GL knows that, when investigating incidents, it is not normal to identify corporate memory loss as a causal factor; investigators often cite a lack of training and experience or a lack of process safety information (or similar) instead. However, although knowledge is built up through experience gained and is then ideally transferred into a company safety management system, that knowledge is rarely documented and made available to others through inclusion in operating procedures. This does not mean that we must cite ‘weak corporate memory’ as a causal factor, as training, experience, learning from events and process safety information are easier terms for people to understand. What we do need to ensure is that investigators understand the mechanisms and why they occur, so that they can make suitable recommendations to prevent recurrence, which include capturing and transferring the knowledge that underpins the conclusions from the investigation.
This critical knowledge is often called process safety information. Organisations use several methods to identify it, and the choice is often led by the regulatory regime under which they operate. In the UK, European and Norwegian offshore industry, Safety and Environmental Critical Elements and Procedures (Norway calls these “Barriers”) are required to be used to identify those equipment and procedural items deemed critical to preventing major accidents. This approach is becoming more established in the onshore industry. Best practice here is then to link the information on these key controls to the updating of procedures and to identify what other process safety information is needed and should be recorded. This will ensure that the critical knowledge gained through identifying these systems and their importance is retained in the organisation. Knowledge needs to be available to those who need it and kept current, and that can only come through building its use into a company safety management system, which is then maintained as current. Only through management acceptance and buy-in to these processes will they ever succeed. Retaining and maintaining critical knowledge is a key part of CCPS ‘Risk Based Process Safety Management’ and DNV GL’s ISRS System, which both set out the elements which should make a good Process Safety Management System. ISO 9001:2015 also has a new section outlining the need for identifying, retaining and making critical knowledge available.
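The linkage described above, between critical elements, the procedures that depend on them and the currency of the information, can be checked mechanically once it is recorded. The sketch below is an assumption-laden illustration: the register structure, the one-year review interval and the example entries are all hypothetical and are not taken from any regulation or from DNV GL's practice.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Tuple


@dataclass
class CriticalElement:
    """A safety critical element or procedure (all entries are hypothetical)."""
    name: str
    last_reviewed: date
    linked_procedures: List[str] = field(default_factory=list)


def find_gaps(elements: List[CriticalElement], today: date,
              max_age_days: int = 365) -> Tuple[List[str], List[str]]:
    """Return names of elements with no linked procedure, and names of
    elements whose information has not been reviewed within max_age_days."""
    unlinked = [e.name for e in elements if not e.linked_procedures]
    stale = [e.name for e in elements
             if (today - e.last_reviewed).days > max_age_days]
    return unlinked, stale


# Hypothetical register, for illustration only.
register = [
    CriticalElement("Independent high-high level trip (LHH)", date(2017, 2, 1),
                    ["Tank filling procedure"]),
    CriticalElement("High-level alarm (LAH)", date(2016, 1, 10),
                    ["Tank filling procedure"]),
    CriticalElement("Bund drain valve", date(2017, 3, 1)),
]

unlinked, stale = find_gaps(register, today=date(2017, 5, 1))
print("No linked procedure:", unlinked)
print("Overdue for review:", stale)
```

A check of this kind does not capture the knowledge itself, but it makes a missing link or an out-of-date record visible before it is discovered through an incident.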
Corporate memory loss is expensive in terms of the administrative costs associated with people change and recruitment. Its other impacts on an organisation can be perceived to be minor, such as inconvenience and the redoing of work, the costs of which are hidden yet can have a significant impact on project delays, e.g. longer procurement times, rework of design, structural integrity, etc. Corporate memory loss can also be seen to play a large part in major accidents, and the linkage here is often lost through a concentration on training and experience. This indicates that the organisation has forgotten what is important in preventing the biggest events and as such is not reacting strongly enough to these weak signals to prevent their occurrence.
Current interest in knowledge retention and transfer within organisations appears to have arisen due to the demographic pressures of an ageing workforce. The scale of potential knowledge loss is helping to inform the debate as to what organisations should do to mitigate the risks of losing knowledge. Should that debate result in a wider acceptance amongst senior managers that managing knowledge is not simply a peripheral activity but core to the success of the organisation’s purpose, then this demographic ‘time bomb’ may yet prove to have had a positive impact on organisational behaviour more generally. Even in circumstances where an ageing workforce is less of a problem (career mobility, flexible working, labour market fluctuations, mergers and acquisitions, for example), the loss of knowledge at any given time will result in reduced performance.
Learning and the accumulation of knowledge can come from a variety of sources, such as risk assessments completed, employee knowledge, lessons learned from experience at other similar sites, or incidents around the world. As an experienced incident investigator, DNV GL understands that making strong responses to weak signals is a best practice within some of the companies it works with.
Put simply, knowledge capture and transfer, as a component of knowledge management programmes, should be a normal part of management practice, embedded into management systems and not an add-on to fight fires. The fires are caused by inadequate succession planning, a lack of management commitment and short-term success measures, amongst other things. The fact that the workforce is ageing has been known for decades. It is only in recent years that specific attention has been paid to how the risks associated with the problem may be mitigated. Those organisations that have embraced the need to adopt knowledge retention and transfer practices will have done so having made a rational decision based on a logic that assumes that deploying resources and undertaking some actions makes good business sense.
What DNV GL finds, when assisting clients practically with process safety management, is that methods for identifying process safety information and linking this through to the full update of operating procedures are often missing. This situation means that the process does not work as designed and safety critical information can be lost. When this is then linked to a lack of assessment of people change, corporate memory can easily be lost.
Process safety management is not solely about technical or process triggers but also about having the right knowledge available to the right people at the right time. This paper argues that knowledge retention strategies and cultivating, nurturing and maintaining the corporate memory should be seen as good practice in process safety management and, perhaps more simply, just good management.
1-The authors are grateful to our colleagues Stuart Greenfield, Eelco Kruizinga and Mark Fisher for their comments on earlier drafts of this paper.
2-Weak corporate memories are not the exclusive domain of the oil and gas industries. Tinline (2016) broadcast a radio programme on this issue citing problems in public service and nuclear defence, and linked the recent recession to a lack of corporate memory at The Treasury.
AON Global Risk Management Survey, 2015, as published on The ONE Brief, How to Keep Hold of Your Institutional Knowledge, Source: https://www.theonebrief.com/how-to-keep-hold-of-your-institutional-knowledge/
Berkeley News, Disaster expert cites ‘failure to learn’ for Deepwater Horizon blowout, April 18, 2013
Cabinet Office, Knowledge Principles for Government, July 2016
Cascio, W.F. 2006. Managing Human Resources: Productivity, Quality of Work Life, Profits (7th ed.). Burr Ridge, IL: Irwin/McGraw-Hill.
Mitchell, T.R., Holtom, B.C., & Lee, T.W. 2001. How to keep your best employees: Developing an effective retention policy. Academy of Management Executive, 15, 96-108
David DeLong, Lost Knowledge: Confronting the threat of an aging workforce, Oxford University Press, New York, 2004
HSE, EA, SEPA, Buncefield: Why did it happen? COMAH the competent authority, 2011
Office for National Statistics, Overview of the UK Population, 2015
Oxford Economics, The Cost of Brain Drain: Understanding the financial impact of staff turnover, February, 2014
Phil Tinline, BBC, Too good to be forgotten – why institutional memory matters, March, 2016