When it comes to software products or websites, having them available for use is key to their success.
For many organizations, nothing frustrates a customer more than a service that is unavailable when they need it, and an organization whose key services can’t be used loses real opportunities.
For others, any unavailability can be catastrophic, so high availability is the aim.
In this article, you will learn what 6 9s availability is, how it differs from other availability levels, and its impact on your product team.
System, database, or application availability is measured as the percentage of a period of time during which the service is available (often referred to as “uptime”), and “9s” refers to the number of 9s in that percentage.
6 9s means delivering 99.9999 percent availability, which is equivalent to no more than 31.5 seconds of unavailability per year.
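As a rough illustration of that definition, the percentage can be computed from the downtime recorded over a period. This is a minimal sketch: the function name `availability_percent` and the two outage durations are hypothetical, and a 365-day year is assumed.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a 365-day year


def availability_percent(downtime_seconds: float, period_seconds: float) -> float:
    """Return the percentage of the period during which the service was available."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    if not 0 <= downtime_seconds <= period_seconds:
        raise ValueError("downtime_seconds must be between 0 and period_seconds")
    uptime_seconds = period_seconds - downtime_seconds
    return 100.0 * uptime_seconds / period_seconds


# Hypothetical year with two outages lasting 10 and 20 seconds.
observed = availability_percent(10 + 20, SECONDS_PER_YEAR)
print(f"Observed availability: {observed:.5f}%")
print("Meets 6 9s" if observed >= 99.9999 else "Misses 6 9s")
```

With these made-up outages, 30 seconds of downtime stays just inside the 6 9s allowance for the year.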
We’ve seen that 6 9s refers to a 99.9999 percent availability of the service, but is this really that much more than 99.999 percent, 99.99 percent, or even 99.9 percent?
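The differences, expressed as the maximum downtime allowed per year, are as follows: 99.9 percent allows about 8.76 hours, 99.99 percent about 52.56 minutes, 99.999 percent about 5.26 minutes, and 99.9999 percent about 31.5 seconds.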
As you can see, the seemingly small difference between 99.9 percent and 99.9999 percent actually means the difference between 31.5 seconds and 8.76 hours!
What would you prefer with your service? Nearly 9 hours of downtime over the course of a year or just 30 seconds?
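The downtime figures for each level can be reproduced with a short calculation. The sketch below assumes a 365-day year, and `downtime_budget_seconds` is an illustrative name rather than anything from a standard library.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a 365-day year


def downtime_budget_seconds(nines: int, period_seconds: int = SECONDS_PER_YEAR) -> float:
    """Maximum downtime allowed in the period for an availability with `nines` 9s.

    For example, nines=3 means 99.9 percent and nines=6 means 99.9999 percent.
    """
    if nines < 1:
        raise ValueError("nines must be at least 1")
    unavailable_fraction = 10 ** -nines  # e.g. 0.000001 for 6 9s
    return period_seconds * unavailable_fraction


for nines in range(3, 7):
    seconds = downtime_budget_seconds(nines)
    print(f"{nines} 9s: {seconds:,.1f} s = {seconds / 60:,.2f} min = {seconds / 3600:,.2f} h")
```

Each extra 9 divides the allowance by ten, which is how 8.76 hours at 99.9 percent shrinks to roughly 31.5 seconds at 99.9999 percent.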
Cost of Downtime
In today’s 24/7 connected world, businesses need to operate around the clock, whether it be an e-commerce shop, the customer database that allows users to sign in to their account, the payment services used to take the transactions, the order management tools, or the logistics software needed to dispatch the goods.
This connected ecosystem means that many different services become mission-critical and this makes the reliability of all systems essential for operation. Any service interruptions can have a negative impact on the business, so organizations seek to enable resiliency to prevent any sort of downtime and quickly recover from costly failures or outages.
For example, in the United States, Amazon sells on average 4,000 products per minute. If Amazon achieved 99.9999 percent availability then they’d run the risk of missing out on 2,100 product sales through 31.5 seconds of downtime, whereas if they achieved 99.99 percent availability then this risk rises to a little over 210,000 products!
This example is only for scale. In reality, if Amazon’s United States shop becomes unavailable, customers will likely just delay their purchase until the service is back, so 99.9999 percent availability might not be classed as essential for the Amazon shop.
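The arithmetic behind the Amazon figures can be sketched as follows. It uses the 4,000 products per minute quoted above and assumes, purely for illustration, that sales are spread evenly through a 365-day year and that every sale during downtime is lost; `sales_at_risk` is a hypothetical helper name.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a 365-day year
SALES_PER_MINUTE = 4_000  # average quoted in the article


def sales_at_risk(availability_percent: float, sales_per_minute: float = SALES_PER_MINUTE) -> float:
    """Sales that would fall inside the yearly downtime allowance.

    Assumes sales are evenly distributed and all of them are lost while down.
    """
    downtime_seconds = SECONDS_PER_YEAR * (100 - availability_percent) / 100
    return sales_per_minute * downtime_seconds / 60


for level in (99.9999, 99.99):
    print(f"{level} percent: about {sales_at_risk(level):,.0f} product sales at risk per year")
```

The unrounded allowance of 31.536 seconds gives a figure slightly above the 2,100 quoted for 99.9999 percent, and a little over 210,000 for 99.99 percent.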
Where this level of availability is necessary is in industries where just one instance of downtime could be catastrophic, such as air-traffic control or stock market trading.
If the air-traffic control system were unavailable for more than 30 seconds, how far could planes fly without crashing into each other?
If stock market trading systems were unavailable for more than 30 seconds, how many fortunes could be lost through transactions being unable to be completed?
When you understand how the differences between availability levels translate into time, it seems obvious to aim for the highest possible level. However, the reality is that maintaining 99.9999 percent availability comes at a cost.
For this to happen, equipment, networks, and monitoring need to receive investment so that all system components remain operational at all times. Remember, 6 9s availability only allows for 31.5 seconds of downtime per year!
Some examples of considerations for systems operations teams include:
Basically, to achieve 6 9s availability, anything that can fail needs an alternative, redundant option that can be switched to immediately, all of which comes at a cost.
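To see why redundancy matters so much, here is a simplified sketch of how redundant components combine. It assumes the components fail independently and that switching to the spare is instant and never fails itself, which real systems rarely manage; `parallel_availability` is an illustrative name.

```python
def parallel_availability(component_availability: float, replicas: int) -> float:
    """Availability (as a fraction) of `replicas` redundant components.

    The service is treated as up when at least one replica is up.
    Assumes independent failures and instant, flawless failover.
    """
    if not 0 <= component_availability <= 1:
        raise ValueError("component_availability must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    all_down = (1 - component_availability) ** replicas
    return 1 - all_down


single = 0.999  # one component at 99.9 percent
print(f"1 replica:  {100 * parallel_availability(single, 1):.4f} percent")
print(f"2 replicas: {100 * parallel_availability(single, 2):.4f} percent")
```

Under these idealized assumptions, pairing two 99.9 percent components reaches 99.9999 percent, but only by paying for the second component.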
Of course, we are all aiming for 100 percent availability for our systems, but it is unreasonable to expect that nothing will go wrong over the course of a year. Even if nothing goes wrong, there will still be times when your systems need upgrading, and that can affect your availability.
The likelihood is that you don’t work in an environment where 6 9s is essential. However, every desired level of availability is a trade-off between the cost of maintaining that level and the cost incurred during downtime.
You should start with your customers and determine what level of access to your systems they actually need:
Once you have this information you can make some decisions on the number of 9s needed to meet their needs and compare the cost of achieving this with the cost of unavailability.
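One way to frame that comparison is to add the yearly cost of each availability level to the cost of the downtime it still allows. Everything numeric in this sketch is hypothetical (the yearly costs, the cost per hour of downtime), and it assumes the full downtime allowance is used up each year.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 365 * 24  # assumes a 365-day year


@dataclass(frozen=True)
class AvailabilityOption:
    nines: int          # 3 means 99.9 percent, 6 means 99.9999 percent
    yearly_cost: float  # hypothetical cost of maintaining this level


def downtime_cost(nines: int, cost_per_hour: float) -> float:
    """Cost of using the whole yearly downtime allowance at this level."""
    return HOURS_PER_YEAR * 10 ** -nines * cost_per_hour


def total_yearly_cost(option: AvailabilityOption, cost_per_hour: float) -> float:
    """Maintenance cost plus the cost of the downtime still permitted."""
    return option.yearly_cost + downtime_cost(option.nines, cost_per_hour)


# Hypothetical figures, for illustration only.
options = [
    AvailabilityOption(nines=3, yearly_cost=20_000),
    AvailabilityOption(nines=4, yearly_cost=60_000),
    AvailabilityOption(nines=5, yearly_cost=250_000),
    AvailabilityOption(nines=6, yearly_cost=1_000_000),
]
cost_per_hour = 10_000  # hypothetical cost of one hour of downtime

best = min(options, key=lambda option: total_yearly_cost(option, cost_per_hour))
for option in options:
    print(f"{option.nines} 9s: total {total_yearly_cost(option, cost_per_hour):,.0f} per year")
print(f"Lowest total cost: {best.nines} 9s")
```

With these invented numbers, 4 9s is the cheapest overall, which illustrates that the highest level is not automatically the best choice.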
As a product manager, you’re responsible for ensuring that your users can access your product. Because of this, you need to understand the different levels of availability and how a small change in the overall percentage makes a big difference. However, you also need to consider how available your product needs to be and what cost you’re willing to take on to maintain that.
6 9s is ideal, but depending on the cost, you might not need such a high level of availability. Before making a decision, run a cost-benefit analysis to make sure you’re prioritizing resources well.
Featured image source: IconScout