Edge Systems: How Much Redundancy Is Enough?
According to a 2020 survey conducted by Information Technology Intelligence Consulting, a single hour of IT system downtime could cost your organization $300,000 or more. And that is only the starting point.
That’s a staggering estimate, and one that reinforces the critical importance of redundant systems in both small and large data centers. System failures can lead to serious consequences for any business, including loss of productivity and revenue, missed opportunities, and eroded reputation.
Now move your processes and applications out to the Edge, where they support a manufacturing facility, a main distribution hub, a transit system, or the energy distribution grid, and calculate the cost of a failure that shuts down one of these sites. Without proper planning, the losses can increase dramatically.
To avoid downtime, organizations plan for redundancies that help ensure the system is always up and running, with a ready supply of power (and sometimes equipment) in the event of an outage. Redundancy also helps facility and IT managers schedule maintenance; without it, a system would need to be shut down in order to perform service.
Redundancy Considerations in Edge Systems
The first step in planning for redundancy in an Edge system is to identify the appropriate redundancy designation for the facility. IT deployments, regardless of size or location, rely on a common system to describe the degree of redundancy incorporated into their power and cooling systems.
- N: The power and cooling capacity needed to support the full load; this is the baseline capacity with no redundancies
- N+1: The baseline capacity plus additional components to account for failure and/or maintenance. Read literally, N+1 means a single spare unit, but data center N+1 redundancy standards typically require an extra unit for every four needed, so if 8 cooling units are required, an N+1 facility would have 10. This designation is commonly adopted for cooling systems in U.S. data centers
- 2N: A fully redundant system with a completely independent mirrored system that can assume all system operations should the first system go offline
- 2N+1: A completely paralleled backup system plus additional components to account for failure and maintenance in each system without the need to shift to a backup system
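The arithmetic behind these designations can be sketched in a few lines of Python. This is a minimal illustration, not a sizing tool: the function name `units_required` and the `spare_ratio` parameter are hypothetical, the one-spare-per-four rule is the rule of thumb quoted above, and 2N+1 is modeled as spares in each of the two systems, which is one reading of the description above.

```python
import math


def units_required(n: int, designation: str, spare_ratio: int = 4) -> int:
    """Return the total installed units for a redundancy designation.

    n            -- units needed to carry the full load (the "N")
    spare_ratio  -- one spare per this many needed units (assumed rule of thumb)
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # One extra unit for every `spare_ratio` units needed, rounded up.
    spares = math.ceil(n / spare_ratio)
    if designation == "N":
        return n
    if designation == "N+1":
        return n + spares
    if designation == "2N":
        return 2 * n
    if designation == "2N+1":
        # Assumption: spares are added to each of the two mirrored systems.
        # A stricter reading would be 2 * n + 1.
        return 2 * (n + spares)
    raise ValueError(f"unknown designation: {designation}")


for level in ("N", "N+1", "2N", "2N+1"):
    print(level, units_required(8, level))
```

With 8 required cooling units, the N+1 case reproduces the figure of 10 given in the list.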
In choosing the right redundancy level for their needs, facility managers must evaluate a number of factors, one of them being the industry or processes served. Some deployments, like those at the Edge, sit in harsh and/or uncontrolled environments that present more threats to uptime than a typical data center. In manufacturing settings, for example, dust, debris, liquids and solvents, as well as power surges and other events, may cause equipment failure. Other industries, such as healthcare and government, require the highest possible uptime in order to provide continuous service to users and to comply with industry or governmental regulations.
Another consideration is cost. While it’s important to maximize the uptime appropriate for your industry, it’s also wise not to overpay for capabilities you don’t need, so identifying the most practical level of redundancy is also a financial decision.
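One rough way to frame that financial decision is to compare the downtime cost a redundancy level is expected to avoid with what it costs per year. The sketch below assumes the $300,000-per-hour figure cited earlier as a default; the function names, the downtime hours, and the annual redundancy cost are hypothetical inputs that each organization would need to estimate for itself.

```python
def expected_downtime_cost(hours_down_per_year: float,
                           cost_per_hour: float = 300_000) -> float:
    """Estimated annual loss from downtime (hours x hourly cost)."""
    return hours_down_per_year * cost_per_hour


def redundancy_pays_off(hours_without: float,
                        hours_with: float,
                        annual_redundancy_cost: float,
                        cost_per_hour: float = 300_000) -> bool:
    """True if the avoided downtime cost exceeds the yearly cost of redundancy."""
    avoided = (expected_downtime_cost(hours_without, cost_per_hour)
               - expected_downtime_cost(hours_with, cost_per_hour))
    return avoided > annual_redundancy_cost


# Hypothetical site: redundancy cuts expected downtime from 4 hours to 1 hour
# per year and costs 500,000 per year to own and maintain.
print(redundancy_pays_off(4, 1, 500_000))
```

The comparison is deliberately simple: it ignores reputational damage and regulatory penalties, which may justify a higher redundancy level than the numbers alone suggest.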
Critical Redundancies in Edge Systems
Many businesses are shifting from enterprise to Edge to take advantage of its low latency, cost savings and other benefits. Redundancy plans designed for these deployments are important for two reasons. First, these deployments aren’t always staffed with IT professionals who can easily respond to events like power outages. Second, many Edge deployments are located in remote and/or unmanaged areas, where risks to uptime are even greater than in a more traditional data center.
Power supplies and UPS systems have the biggest impact on uptime, so redundancies are most critical there:
Your most critical servers should have redundant power supplies fed from separate power panels, so that if one source fails, the servers continue operating. Under normal operation, each of the two power supplies provides half of the power that is needed; if one is powered off for some reason, the other immediately compensates to provide full power to the device, so there is no downtime.
Another benefit of redundant power is that should one supply stop working, it can be replaced without taking the device it’s connected to offline. You can simply unplug and remove the defective power supply from the device, slide a new one into its place, and plug it in. Your second power supply will keep the device running while you’re making the switch.
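The load-sharing behavior described above can be modeled with a short sketch. It assumes an even split across whichever supplies are online; the function names and the wattage figures are hypothetical, and real supplies may not share load perfectly evenly.

```python
def psu_loads(device_load_w: float, psu_online: list[bool]) -> list[float]:
    """Split the device load evenly across the supplies that are online."""
    active = sum(psu_online)
    if active == 0:
        raise RuntimeError("no power supply online: device is down")
    share = device_load_w / active
    return [share if online else 0.0 for online in psu_online]


def survives_single_failure(device_load_w: float,
                            psu_rating_w: float,
                            psu_count: int = 2) -> bool:
    """True if the remaining supplies can carry the full load after one fails."""
    return psu_count >= 2 and (psu_count - 1) * psu_rating_w >= device_load_w


print(psu_loads(600, [True, True]))    # both supplies healthy
print(psu_loads(600, [True, False]))   # second supply failed or removed
print(survives_single_failure(600, psu_rating_w=750))
```

The second function captures the sizing point implicit in the text: redundancy only works if each supply is rated for the full load, not just its normal half.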
An uninterruptible power supply (UPS) provides continuous electricity, even when utility power is out. Found in just about every enterprise data center, UPS systems “clean up” power to IT systems and are typically backed by batteries and emergency generators. This allows the facility to operate even without utility power.
This level of support may not be practical for an Edge installation. At a minimum, each server rack should have a UPS system with on-board batteries, matched to the installed load, to maintain power in the event of a power outage or brownout. Edge UPS systems are not designed to support long-term operations, but to provide enough time for an orderly shutdown of the affected systems.
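Matching a UPS to the installed load comes down to checking that the batteries last long enough for an orderly shutdown. The following is a first-order estimate only: it assumes usable battery energy in watt-hours, a fixed inverter efficiency of 0.9, and a 1.5x safety margin, all of which are illustrative values rather than vendor data. Real runtime also depends on battery age, temperature and discharge rate, so manufacturer runtime charts should take precedence.

```python
def ups_runtime_minutes(battery_wh: float,
                        load_w: float,
                        inverter_efficiency: float = 0.9) -> float:
    """Rough runtime estimate: usable energy divided by load, in minutes."""
    if load_w <= 0:
        raise ValueError("load_w must be positive")
    return battery_wh * inverter_efficiency / load_w * 60


def covers_shutdown(battery_wh: float,
                    load_w: float,
                    shutdown_minutes: float,
                    margin: float = 1.5,
                    inverter_efficiency: float = 0.9) -> bool:
    """True if estimated runtime exceeds the shutdown time plus a safety margin."""
    runtime = ups_runtime_minutes(battery_wh, load_w, inverter_efficiency)
    return runtime >= shutdown_minutes * margin


# Hypothetical rack: 600 W load, 10-minute orderly shutdown.
print(covers_shutdown(battery_wh=1000, load_w=600, shutdown_minutes=10))
print(covers_shutdown(battery_wh=100, load_w=600, shutdown_minutes=10))
```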
If practical, redundant UPS systems in a rack should each be plugged into a different circuit breaker (ideally on a different electrical box) to ensure constant availability. That way, if an electrical problem occurs with one of the circuits or boxes, the entire rack won’t fail.
Cooling is as critical at the Edge as power, keeping equipment from overheating and minimizing the likelihood of failure. Again, it may not be practical to provide the same levels of redundancy as found in the data center, but you can still ensure the highest level of climate control.
For the single-footprint, standalone Edge installation, a cabinet-mounted air conditioner, like the Blue e+, can be installed; two can even be installed on the same footprint to provide N+1 redundancy. As the Edge installation grows, so too will the required capacity for climate control and heat removal. Close-coupled, closed-loop systems based on the LCP DX system can provide increased heat removal capacities while supporting multiple footprints in a single row or pod.
Moving beyond the single-cabinet, standalone Edge installation, row-based cooling can also provide dedicated climate control for the Spine/Edge Data Center, whether in a dedicated room, standalone container or similar space.
For any of these deployments, redundancies at the cabinet or row level provide added assurance of maximum uptime. Different Rittal cooling solutions, for example, feature redundant heat exchangers, power infeeds, fans, and temperature sensors.
The importance of redundancy cannot be overstated; without it, every piece of equipment in your Edge installation may be vulnerable to failure caused by loss of power, overheating and other threats, and the result could be catastrophic. Using these basic guidelines – and the insights of cooling experts – facility managers can have peace of mind that their equipment will continue to operate as normal, even when conditions are anything but.