Recent IT governance discussions have focused on the issue of cybersecurity risk — as they should — but managing data center and physical infrastructure risks can be just as vital to the enterprise. If data center errors cause a downtime event and bring IT operations to a halt, business-critical activities are interrupted and the company's market reputation can be damaged. Unlike cyberattacks, IT infrastructure risk is entirely manageable and usually preventable from within the organization, as long as the right oversight criteria and practices are in place.
A company's data center assets can be a source of strategic advantage, or a drain on the enterprise, depending on how they are managed. The aim of IT infrastructure governance is to align data center resources, investment, and management with the business mission. However, this is easier said than done. The complexities of modern enterprise IT infrastructure — data centers, networks, storage, cloud computing, and business continuity/disaster recovery — can be challenging for boards to grapple with, demanding a level of technology expertise that many directors don't have.
Knowing the right questions to ask, having a context for the issues involved, consulting with industry experts, and understanding the strategic implications of a company's data center infrastructure choices lead to effective board decision making. This article discusses some of the factors that boards should be aware of, and how to optimize the strategic value of data center assets. There are seven key aspects to consider:
1. Matching Infrastructure Capabilities to the Business Mission
Any enterprise that relies on 24 x 7 availability of its IT systems and networks needs to ensure data center facilities and equipment are sufficient to protect against a downtime event. In the data center industry's classification system, a facility that is designed, built, and certified to “Tier III” standards can provide that level of availability and resilience, incorporating redundant systems, back-up power sources, and other failsafe measures. Tier III Certification requires “concurrent maintainability,” which means that any piece of critical equipment can be taken offline for maintenance without having to shut down the facility. Redundant systems and the ability to isolate equipment enable live operations to continue uninterrupted.
2. Know the Risks
From a business standpoint, there are clear risks to the enterprise if a data center downtime event occurs: interruption of operating activity and financial transactions, blocked user access, and customer service impact, at a minimum, along with the resulting lost revenue and negative public relations. Such an event can damage the company and the brand, and can have liability implications if the service interruption causes any downstream harm to customers.
But there are other types of risk in data center operations, in particular the risk of accidents that can cause facility damage, personnel injury, or even death. For example, in many facilities that lack concurrent maintainability, it is common practice to perform maintenance on live systems, because equipment cannot be shut down without interrupting the production environment. OSHA reports that 80% of industrial fires, explosions, hospitalizations, and fatalities every year are caused by working under these conditions, and it has been tightening scrutiny. If a board of directors allows the company's data center to be maintained while energized, it is responsible for the dangerous site conditions. Not only will the corporation be liable for worksite accidents that occur, but board members are also personally liable, whether the risks were known and ignored or simply overlooked.
3. Build vs. Buy Decisions
Constructing an in-house 24 x 7 data center can be one of the largest capital investments a company makes, costing tens or even hundreds of millions of dollars. A board has to weigh the capital investment and operating expenses versus the ongoing costs of outsourced services, assess capacity and scalability needs, and consider internal management vs. the demands of vendor oversight.
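As a rough illustration of the trade-off between capital investment and ongoing outsourcing costs, the following Python sketch compares the cumulative cost of each path. All figures and function names are hypothetical, and the model deliberately ignores discounting, capacity growth, and risk, so it is a prompt for questions rather than a decision tool.

```python
def cumulative_cost_build(capex, annual_opex, years):
    """Total spend after `years` when building: one-time capital plus operating costs."""
    return capex + annual_opex * years


def cumulative_cost_buy(annual_fee, years):
    """Total spend after `years` when outsourcing at a flat annual fee."""
    return annual_fee * years


def break_even_year(capex, annual_opex, annual_fee, horizon):
    """Return the first whole year in which building costs no more than buying.

    Returns None if building never catches up within `horizon` years.
    """
    for year in range(1, horizon + 1):
        build = cumulative_cost_build(capex, annual_opex, year)
        buy = cumulative_cost_buy(annual_fee, year)
        if build <= buy:
            return year
    return None


if __name__ == "__main__":
    # Hypothetical inputs: $50M to build, $4M/year to operate, $9M/year to outsource.
    year = break_even_year(50_000_000, 4_000_000, 9_000_000, horizon=20)
    print("Break-even year:", year)
```

With these assumed inputs, the build option saves $5 million a year in running costs and so recovers its $50 million capital outlay in year 10. A real evaluation would also need to account for the time value of money and the non-financial factors discussed in this section.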
The array of outsourced service options includes colocation, hosted, leased, and cloud services. How can boards verify provider qualifications and operating practices? Without transparency, a vendor may be placing the enterprise at unsuspected risk. Requiring evidence of independent, third-party validation of data center providers (such as Tier III Certification) helps verify reliability. However, a board may not even be aware of a management team's decision to select an outsourced data center vendor if the ongoing operating expense falls below the budget threshold that would trigger oversight. It is critical to classify data center deployment as a strategic-level decision that needs to be reviewed at the board level. Decision-making tools such as FORCSS® enable organizations to evaluate the available alternatives by identifying and accounting for Financial, Opportunity, Risk, Compliance, Sustainability, and Service Quality considerations.
4. Start with the End in Mind
If the company does elect to build and operate its own data center, engaging qualified design and construction firms is a key first step. But all too often operations leadership does not have a seat at the table until late in the process, and the result is a facility that impedes efficiency and does not support the operations and maintenance approaches that minimize risk and cost over the long term. Losing minutes a day to inefficiency adds up to hundreds or thousands of man-hours over the life span of a facility, costing much more than any initial savings. Mandating that operations leadership be included in the design review and construction process (that is, starting the project with the end objectives clearly in mind) minimizes cost overruns and reduces risk during the project's most vulnerable period (years 1 and 2). It also accelerates the start of live operations, closing the gap between CapEx outlays and productive operations and thus enhancing project ROI.
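To see how minutes lost each day accumulate, consider the short Python sketch below. The inputs (10 minutes a day, a 20-year facility life, and the staffing levels) are assumptions chosen for illustration only, not figures from any particular facility.

```python
def lifetime_hours_lost(minutes_per_day, staff, years, days_per_year=365):
    """Hours lost over a facility's life when each staff member loses
    `minutes_per_day` to an inefficient layout or process."""
    total_minutes = minutes_per_day * staff * days_per_year * years
    return total_minutes / 60


if __name__ == "__main__":
    # Hypothetical: 10 minutes lost per person per day over a 20-year life span.
    for staff in (1, 5):
        hours = lifetime_hours_lost(10, staff, 20)
        print(f"{staff} operator(s): {hours:,.0f} hours lost")
```

Under these assumptions, a single operator losing 10 minutes a day accounts for roughly 1,200 hours over 20 years, and the total scales linearly with the number of people affected.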
5. Don't Neglect Operations
Analysis of more than 20 years of collected data on facility incidents and outages reveals that the majority are caused by human error. This does not mean responsibility lies with one isolated operator. “Human error” is most likely to occur in an environment with insufficient processes, poorly followed procedures, a lack of training and resources, and a culture that does not support rigor and consistency; in other words, when there is a failure of leadership.
Any large, complex system such as a data center demands continued vigilance to maintain performance. If the operations team is not focused on continuous quality improvement, the inevitable result is not stasis, but decline. Just as Tier Certification of data center facilities ensures that physical infrastructure is sufficient to meet the business mission, validating a data center's management and operations effectiveness ensures that day-to-day activities are not placing the enterprise at risk.
6. Trust but Verify
Enterprise IT involves complex, highly technological systems, making it difficult for boards to evaluate the range of solutions. Even with IT expertise among the directors, data centers incorporate a multitude of large-scale industrial systems (e.g., power generation, cooling), and demand multiple cross-disciplinary skillsets (e.g., architectural, construction, electrical, plumbing, fuel systems). How can an organization be sure its business-critical infrastructure is adequately protected from failure, run effectively, and yielding the greatest return? Obtaining industry-recognized credentials from independent experts guarantees thorough and rigorous assessment and verification of all critical design, construction, and operating factors.
Tier Certification of data center Design, Construction and Operation, or a Management & Operations (M&O) Stamp of Approval, assures stakeholders and the marketplace that data availability and risk management are handled effectively. Demonstrating industry-leading quality — and avoiding a downtime incident — enhances market competitiveness. It also demonstrates that a board has exercised sufficient due diligence. If the organization uses colocation/hosting vendors, those who have earned one of these credentials can be trusted to help run the company's critical IT infrastructure reliably.
7. Efficiency: For the Planet and the Bottom Line
If a company builds a data center from the ground up, it is relatively easy to incorporate energy efficiency considerations into the planning process. For enterprises operating existing data center facilities, however, implementing new efficiency measures and installing energy-saving equipment can seem like an unwelcome capital expense. Avoiding that expense is a false economy: efficiency initiatives, from improving cooling technologies to decommissioning underutilized server equipment, can provide almost immediate payback and long-term OpEx and CapEx benefits that dwarf the relatively small cost of implementation.
Leading data centers in the telecommunications, financial services, health care, and media industries have experienced significant reductions in energy and resource use, decreased ancillary costs, expanded computing capacity, and lowered carbon emissions, all while saving tens of millions of dollars. They have avoided having to invest in new data center capacity by making more efficient use of existing space, thus extending the life of their facilities and maximizing asset return. By contrast, postponing efficiency measures is not a cost-neutral option; it inevitably ends up costing the company more, as rising energy costs and underutilized assets continue to push baseline data center operating expenditures higher. Your organization could be wasting millions, or even hundreds of millions, of dollars by running a data center inefficiently. Start by asking questions about server utilization and mechanical energy efficiency, where the most significant savings can often be found.
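One way to frame the mechanical efficiency question is with power usage effectiveness (PUE), the ratio of total facility energy to IT equipment energy. PUE is not discussed in this article; it is introduced here as an assumption, and the load, PUE values, and electricity price below are hypothetical. The Python sketch estimates the annual energy cost of a facility and the savings from improving its PUE.

```python
HOURS_PER_YEAR = 8760  # 24 hours x 365 days


def annual_energy_cost(it_load_kw, pue, price_per_kwh):
    """Yearly facility energy cost: IT load scaled by PUE to include
    cooling and other overhead, priced per kilowatt-hour."""
    return it_load_kw * pue * HOURS_PER_YEAR * price_per_kwh


def annual_savings(it_load_kw, pue_before, pue_after, price_per_kwh):
    """Yearly cost reduction from lowering PUE at a constant IT load."""
    before = annual_energy_cost(it_load_kw, pue_before, price_per_kwh)
    after = annual_energy_cost(it_load_kw, pue_after, price_per_kwh)
    return before - after


if __name__ == "__main__":
    # Hypothetical: 1,000 kW IT load, PUE improved from 2.0 to 1.5, $0.10 per kWh.
    saved = annual_savings(1000, 2.0, 1.5, 0.10)
    print(f"Estimated annual savings: ${saved:,.0f}")
```

With these assumed inputs, the improvement is worth about $438,000 a year. Decommissioning underutilized servers would lower the IT load itself, which reduces cost further because every kilowatt removed is also multiplied by the PUE.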
With data centers providing the critical IT infrastructure underlying our globally connected economy and society, IT governance is more important than ever. Following the IT infrastructure best practices outlined above will allow a corporation to minimize risk and leverage its data center assets as a strategic advantage in terms of both ROI and market perception.