Marketo is a marketing automation platform which, in late July, had to apologize to customers over downtime. Users could not log in to their accounts because the main domain, Marketo.com, had not been renewed. The primary domain is where users log in to the automation platform, but a number of ancillary domains and services were also adversely impacted as a follow-on effect. Marketo CEO Steve Lucas cited “human and process error” as the primary culprit. Meanwhile, hordes of frustrated users took to social media to vent, the company pointed fingers at the domain registrar, and the registrar pointed them right back.
Fortunately, no-one is going to lose their life or risk personal injury, though someone may be losing their job. In this instance, major disruption has been caused to a player in the marketing automation space, one which is increasingly under pressure from its competitors, including Salesforce, which is also one of Marketo’s major partners for integration between marketing and sales teams.
This is a good example of why you cannot afford process error, and must minimize or eliminate human error. The impact on Marketo is going to be larger than just the bottom line and the damage to its brand: the outage will be touted as a fundamental weakness of the platform, with “ours is much more robust” the likely line from the salespeople of any of Marketo’s competitors. Marketo has commissioned a detailed review of the operating procedures which led to the domain renewal miss, but this is rather akin to slamming the stable door shut after the horse has bolted.
This has got to be the nightmare every manager or supervisor must live with: the fear of missing something, anything, which in turn sets off a chain of events leading to disaster.
As I’ve said, no-one is going to lose life or limb here. However, it is the same combination of human and process error which has led to the deaths of hundreds of thousands over time.
Consider the following:
Institute of Medicine study: 44,000 to 98,000 deaths annually due to medical errors in US hospitals (which, if correct, makes going to hospital the 8th leading cause of death in the country).
Piper Alpha: an oil rig in the North Sea which exploded, killing 167 people and causing around USD 3 billion in property damage. The cause of the accident was determined to be a cascading series of human errors: in using equipment, in accident and evacuation processes, and in how evacuation could be conducted with the loss of the control room (the room melted in the fire).
Phillips 66 pipeline disaster: an explosion at the Houston Chemical Complex close to Pasadena, TX killed 23 people and injured 314 more. The cause was human error in pipeline operation during routine maintenance.
Challenger Space Shuttle Disaster: failed O-rings led to the explosion of the space shuttle in 1986. Although issues with the O-rings were known to NASA engineers and managers, “groupthink” and herd instinct overrode the obvious safety issues.
We tend to look at process management as a means by which we can achieve consistent standards, within acceptable variation limits, and replicate output at scale. In large measure, process results are obtained by minimizing human interaction and decision-making for standard work. However, this poses an inherent danger of its own, because sometimes preventing disaster requires that someone speaks up and is heard.
Typically, there is a multitude of people within the processes involved who could, and should, have yelled out that the bus they are all sitting in is being driven off the cliff. Even where someone does raise the alarm, there must be a way for it to be communicated clearly, explicitly, on a timely basis, and with the force it deserves. Perhaps in this regard, the Challenger Disaster is the most concise example of what happens in such instances.
Process management and optimization are responsible for safety improvements across the board, though it is obviously difficult to put numbers on lives saved, injuries prevented, and dollars saved. Reducing human interaction has reduced human error. However, automation must not be a substitute for human judgment, especially when it comes to making a moral judgment.
Implementing effective task management and workflow automation will improve operational performance, and it will also help to minimize error, and therefore loss. That said, no process can be created which will be able to handle every case it is applied to, because there will be edge cases which fall outside process parameters. There will be instances where the process is plainly wrong, either due to poor design or simple error in application. There will also be those cases where failure results because human warnings of impending doom are either ignored, or not communicated with high enough priority in time to those who can avert disaster.
Therefore, establishing work automation and process control is not enough to prevent disaster. BPM and universal work platforms must also include effective communication and collaboration tools to make sure everyone’s voice can be heard, along with alerts and notifications which can be used to escalate issues as high as needed within the organization.
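To make the idea of escalation concrete, here is a minimal sketch in Python of how an open task, such as a domain renewal, might be pushed further up the organization as its deadline approaches. It is an illustration only, not a description of how Marketo or any particular BPM platform works: the `Task` class, the `who_to_alert` function, the roles, and the day thresholds are all assumptions invented for this example.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical escalation chain: (days before the deadline, role to alert),
# ordered from the earliest warning to the last resort.
ESCALATION_CHAIN = [
    (60, "domain administrator"),
    (30, "IT operations manager"),
    (7, "chief information officer"),
]


@dataclass
class Task:
    name: str
    due: date
    done: bool = False


def who_to_alert(task: Task, today: date) -> list[str]:
    """Return every role that should be receiving alerts on `today`.

    An open task escalates upward: each threshold that has been crossed
    adds another role, and earlier recipients keep getting the alert too.
    A completed task alerts no one.
    """
    if task.done:
        return []
    days_left = (task.due - today).days
    return [role for threshold, role in ESCALATION_CHAIN if days_left <= threshold]


# Illustrative task and dates, chosen only for the example.
renewal = Task(name="Renew example.com", due=date(2024, 7, 31))
print(who_to_alert(renewal, date(2024, 5, 1)))   # 91 days left: nobody yet
print(who_to_alert(renewal, date(2024, 7, 10)))  # 21 days left: first two roles
print(who_to_alert(renewal, date(2024, 7, 30)))  # 1 day left: all three roles
```

The point of the design is that the alert never quietly replaces one recipient with another. If the first person misses it, the circle of people who know simply widens until someone acts, and an overdue task keeps alerting everyone in the chain.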
Ultimately, Marketo has suffered an embarrassing failure which it attributes to process and human error, but the truth is we are all susceptible to this same cause of failure. How confident are you that your processes are fool-proof against all contingencies? You cannot be that confident, and the risk multiplies when we combine even a small amount of process risk with the risk of human error. Thankfully, in the Marketo instance, disaster equates to a CEO apologizing and some red faces, but for organizations involved in matters of life and death, the impact is much more severe.