Managing Linux and Windows Servers From a Common Management Framework

Wei, Joseph

Now that Windows and Linux-based x86 servers are being deployed in heterogeneous environments to support mission-critical applications, the challenges of enterprise server management have become more complex than ever. But IT managers have new options as standards-based common management framework solutions are emerging to fill the void.

Until recently, the overwhelming majority of enterprise servers running mission-critical applications were firmly grounded in the world of Unix and mainframe technologies. While servers based on the Intel x86 microprocessor architecture, typically running Linux or Microsoft Windows operating systems, have proliferated steadily in the market, these machines have typically been confined to non-critical tasks. Server downtime or application failure did not carry the risk of bringing an enterprise’s business operations to its knees.

But all of that is changing. With improvements in the Windows and Linux server operating systems over the past few years, more and more x86 servers are now being used for business-critical applications. According to market research firm International Data Corp., x86 server shipments are projected to grow to over five million units a year by 2006. In addition, blade servers – which were introduced in 2001 – are now being adopted in many data centers and enterprises. Because of the flexibility and cost advantages these machines provide, blade servers are expected to have the highest growth rate of any server platform, with a 25 percent share of the total 7.6 million units of x86 servers by 2007.

As a result of this evolution, today’s enterprise data centers are now usually equipped with servers from many different vendors, each of which has its own management tools. These heterogeneous and incompatible data center elements are creating unforeseen complexity and management challenges for IT professionals, leading to hidden expenses not usually reflected in the total cost of ownership calculations for the operation of an enterprise data center.

Different Tools for Different Servers

IT managers often add new servers and hardware to their enterprise computing environments simply because more performance is required. And the decision to buy that hardware is often based on time-critical business performance requirements and cost factors, resulting in heterogeneous, rather than homogeneous, IT environments.

Because of this trend, IT administrators are now forced to deal with and master many different server management tools, some proprietary to specific server brands. This constraint of using proprietary tools has revealed that the most substantial costs facing an enterprise are not related to the purchase of a server, but to the ongoing operations and management of it. That’s because, while some system vendors provide applications that perform hardware monitoring and management, the tools available from that vendor are usually proprietary and only applicable to that particular hardware.

This has led to a phenomenon known as “console-hopping,” whereby IT administrators have to work with numerous management applications and inefficiently hop from one console to another to monitor and manage all of their disparate hardware platforms.

The problem is escalating. Because custom tools are limited in features and functionality, additional management tools are usually needed to deal with other aspects of system management. And because blade servers are housed in a chassis and share many of its elements, yet even another set of management tools that is unique to blade servers must be accommodated. All of these factors are dramatically increasing the expense of maintaining and managing data centers. Management costs have now escalated to levels that are several times the price of the initial hardware purchase.

IPMI: Industry-Standard Methodology

Fortunately for IT professionals, help is on the way in the form of the IPMI (Intelligent Platform Management Interface) specification. IPMI is a platform management standard defined by industry leaders including Intel, Dell, HP, and NEC, and intended to solve the problem of managing dissimilar hardware platforms. IPMI provides the industry with a standard methodology for accessing and controlling bare-metal hardware, even without software installed or running, effectively creating a standardized hardware management layer. Regardless of the nature of the underlying hardware, server management solutions can now use a single standard interface to discover and communicate with a hardware platform, dynamically gather its relevant information and monitor its health and performance conditions.

Using IPMI as the foundation, IT professionals now have a pathway for remotely discovering, controlling and managing x86-based, IPMI-enabled hardware platforms from different system vendors, platforms that used to be incompatible from a management standpoint. This includes bare-metal management (regardless of whether the server is turned on or the operating system is functional) and image deployment, all the way up to system performance and health monitoring of the server platforms. While many server platforms already support the IPMI standard today, it’s expected that by the end of 2004, at least 70 percent of all new servers will support this new management interface.

There is currently a long list of IPMI adopters that are endorsing the standard (see www.intel.com/design/servers/ipmi/adopterlist.htm).

IPMI support is usually implemented within a baseboard management controller (BMC) that is part of an autonomous, intelligent subsystem on the server platform. This BMC design allows the management application to determine the health of and maintain control of the server, regardless of whether the server is running or non-operational.

IPMI: Remote Possibilities

Inevitably, servers will experience failures due to hardware or application issues. Resolving these failures has typically required local IT administrators to be physically in front of the “failed” servers to perform the tasks necessary to get the servers back to normal operation.

The need for on-site IT staff, together with the associated travel costs and time delays, has increased the risk of failing to meet service level agreements. Remote, “lights-out” solutions have been expensive and proprietary, and lacked support for heterogeneous servers. As a result, data centers did not typically use remote management solutions with their x86 servers.

With IPMI, however, IT professionals can now achieve lights-out management remotely, efficiently and cost-effectively. Costs are identifiable rather than variable with IPMI-enabled servers and associated management software. To provide remote access to the interface, the microcontroller-based IPMI subsystem will generally share an Ethernet channel with the platform, or may have its own dedicated Ethernet controller.

This channel constitutes the platform’s connection onto the management network. The management channels of the individual IPMI servers, together with the management server, can all be connected to this same management network.

Despite these capabilities, problems still arise. While many vendors may provide simple hardware management applications that utilize the IPMI interface, these applications typically are not designed to manage more than a handful of servers, nor do they include the functions and capabilities to manage the servers up through the OS level. Most management software that comes with an IPMI server focuses on elementary management of one server at a time.

What is ultimately required is a server management solution that enables groups of heterogeneous servers to be managed in an automated fashion from a single console, making operations faster and more accurate. With this level of intelligence, a group of IPMI servers would be capable of being rebooted using a single command. In addition, a new group of bare-metal servers could be provisioned without the need to log onto each server individually and repeat the provisioning tasks. These IPMI-enabled automated and remote management capabilities are opening the door to new management solutions, placing an emphasis on three critical functional server management parameters: provisioning, monitoring and managing today’s heterogeneous enterprise server environments.
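As a rough illustration of rebooting a group of IPMI servers with a single command, the sketch below builds one power-control command line per server in a group. It assumes the open-source ipmitool utility is installed on the management station, which the article does not mention; the host addresses, the user name and the function names are hypothetical. The "lan" interface corresponds to IPMI v1.5 over LAN, and the -E option tells ipmitool to read the password from the IPMI_PASSWORD environment variable.

```python
import subprocess
from typing import Iterable, List

# Power actions accepted by "ipmitool chassis power".
ALLOWED_ACTIONS = {"status", "on", "off", "cycle", "reset"}


def power_command(host: str, user: str, action: str = "cycle",
                  interface: str = "lan") -> List[str]:
    """Build the ipmitool argument list for one server's BMC (assumed tool)."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported power action: {action}")
    # -E: take the password from the IPMI_PASSWORD environment variable
    return ["ipmitool", "-I", interface, "-H", host, "-U", user, "-E",
            "chassis", "power", action]


def group_power_commands(hosts: Iterable[str], user: str,
                         action: str = "cycle") -> List[List[str]]:
    """Expand one logical group command into one command per server."""
    return [power_command(host, user, action) for host in hosts]


def run_group(hosts: Iterable[str], user: str, action: str = "cycle") -> dict:
    """Run the action against every host and collect the exit codes."""
    results = {}
    for host, cmd in zip(hosts, group_power_commands(hosts, user, action)):
        results[host] = subprocess.run(cmd, capture_output=True).returncode
    return results


if __name__ == "__main__":
    # Hypothetical BMC addresses on the management network.
    web_group = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]
    for cmd in group_power_commands(web_group, "admin", "cycle"):
        print(" ".join(cmd))
```

Keeping command construction separate from execution lets an administrator review exactly what would be sent to each server before anything is power-cycled.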

The Provisioning Challenge

Many of today’s data centers and large enterprises have site licenses, which means that new servers are typically acquired without an operating system installed. Provisioning these bare-metal servers is both time- and labor-intensive, as it involves either loading up the operating system and the applications individually, or setting up a “golden” image for each server and then copying each image onto the appropriately matched bare-metal servers.

Provisioning bare-metal servers is typically performed by an IT administrator who must physically be at the location of the servers. While some add-on solutions now claim to provide bare-metal control, most still require the manual setup of the servers. Such local operations are costly, labor-intensive and prone to error.

Server configuration, IT asset inventories and software revisions are often tracked and updated manually, using spreadsheets. But with up to thousands of servers to maintain in a heterogeneous infrastructure, IT administrators are finding it impractical, and sometimes downright impossible, to keep this list updated in a timely and accurate way.

In a hosting environment where new customers are being added while others are being reconfigured frequently, the complex and time-consuming manual provisioning of servers opens the door to human error. Time, labor, money, and worst of all, potentially critical ramifications to a company’s bottom line in the event of downtime or server mismanagement, have become unwanted byproducts of the evolution to heterogeneous server computing environments. How can this challenge be met so that business requirements are satisfied?

A Matter of Intelligence

Provisioning is the task of copying a software image onto a server and bringing the system online. In the past this had been a highly manual process, requiring an IT administrator to be physically present at each server to insert CDs and load the software into that server. With the latest generation of server management solutions, however, provisioning an image or set of images has become an automated task that can remotely load the image from the network. A “golden” image is created when an operating system and its associated applications are set up, tested and saved as a file, which can then be applied to the appropriate hardware.

With the hardware configuration automatically captured as part of the image, a properly configured server management solution can use the record of each server profile and the server images to check the golden image against the new server hardware configuration. This approach eliminates the potential for loading mismatched images by applying the golden image only to servers that have the same hardware configuration as the original, thereby saving time, effort and potentially tedious and frustrating troubleshooting efforts.
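The matching step described here can be sketched in a few lines. This is only an illustration under stated assumptions: the profile fields, class names and sample servers are hypothetical, and a real product would likely compare a richer inventory than the five attributes used below.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class HardwareProfile:
    """Hypothetical subset of the hardware attributes captured with an image."""
    vendor: str
    model: str
    cpu_count: int
    memory_mb: int
    nic_model: str


@dataclass(frozen=True)
class GoldenImage:
    name: str
    captured_on: HardwareProfile  # configuration of the reference server


@dataclass(frozen=True)
class Server:
    hostname: str
    profile: HardwareProfile


def eligible_targets(image: GoldenImage, servers: Iterable[Server]) -> List[Server]:
    """Return only the servers whose hardware matches the image's origin."""
    return [s for s in servers if s.profile == image.captured_on]


if __name__ == "__main__":
    reference = HardwareProfile("VendorA", "R100", 2, 4096, "nic-x")
    other = HardwareProfile("VendorB", "B20", 2, 2048, "nic-y")
    image = GoldenImage("commerce-web", reference)
    fleet = [Server("web01", reference), Server("web02", other),
             Server("web03", reference)]
    print([s.hostname for s in eligible_targets(image, fleet)])
```

Frozen dataclasses compare field by field, so a single equality test is enough to rule out a mismatched image before any deployment starts.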

Because each server must have its own network settings, the provisioning process incorporates a step where the unique network settings for each new server can be inserted prior to bootup. All of these steps once added significant time and resources to provisioning a new server, and were very prone to human error. But with an automated process, the network settings of each new server can now be inserted while the IT administrator is setting up the provisioning rules. A properly integrated scheduler automates these tasks, and the entire provisioning process becomes a single step task without manual intervention.
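One way to picture the network-settings step is a small planning function that hands each new server its own address before deployment begins. The sketch below uses Python's standard ipaddress module; the host names, the subnet and the gateway are made-up values, and the function name is hypothetical rather than part of any product described in this article.

```python
import ipaddress
from typing import Dict, Iterable


def assign_network_settings(hostnames: Iterable[str], subnet: str,
                            gateway: str) -> Dict[str, Dict[str, str]]:
    """Give every new server a unique IP address from the subnet."""
    network = ipaddress.ip_network(subnet)
    gw = ipaddress.ip_address(gateway)
    # hosts() already excludes the network and broadcast addresses;
    # the gateway address is skipped as well.
    free = (ip for ip in network.hosts() if ip != gw)
    plan: Dict[str, Dict[str, str]] = {}
    for name in hostnames:
        try:
            ip = next(free)
        except StopIteration:
            raise ValueError("subnet has too few free addresses") from None
        plan[name] = {
            "ip": str(ip),
            "netmask": str(network.netmask),
            "gateway": str(gw),
        }
    return plan


if __name__ == "__main__":
    plan = assign_network_settings(["web01", "web02"],
                                   "192.168.10.0/29", "192.168.10.1")
    for host, settings in plan.items():
        print(host, settings)
```

Because the whole plan is computed up front, an address shortage surfaces as an error while the provisioning rules are being set up, not halfway through a deployment.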

Server Monitoring

While robust and automated provisioning capabilities are necessary to provide a common framework for managing heterogeneous server environments, a rich set of monitoring and managing features is also necessary to meet the challenge. The ideal solution should incorporate an alert service that monitors individual servers and server groups in real time, monitoring the performance characteristics of the server’s CPUs, memory and hard disk, while also tracking network capacity and utilization and storing all metrics for future reporting and analysis. The software should also be capable of monitoring such physical characteristics as temperature, fan speed and power module status.
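A threshold check is the simplest form such an alert service could take. The following sketch is illustrative only: the metric names and limits are invented for the example and do not come from the IPMI specification or from any vendor's defaults.

```python
from typing import Dict, List, Tuple

# Hypothetical (low, high) limits per metric; real limits would come from
# the hardware vendor or from site policy.
THRESHOLDS: Dict[str, Tuple[float, float]] = {
    "cpu_percent": (0.0, 90.0),
    "memory_percent": (0.0, 85.0),
    "temperature_c": (5.0, 70.0),
    "fan_rpm": (1500.0, 12000.0),
}


def check_readings(server: str, readings: Dict[str, float],
                   thresholds: Dict[str, Tuple[float, float]] = THRESHOLDS
                   ) -> List[str]:
    """Return one alert message for each reading outside its limits."""
    alerts = []
    for metric, value in readings.items():
        if metric not in thresholds:
            continue  # no limit configured for this metric
        low, high = thresholds[metric]
        if value < low:
            alerts.append(f"{server}: {metric}={value} below {low}")
        elif value > high:
            alerts.append(f"{server}: {metric}={value} above {high}")
    return alerts


if __name__ == "__main__":
    sample = {"cpu_percent": 95.0, "temperature_c": 40.0, "fan_rpm": 900.0}
    for alert in check_readings("srv1", sample):
        print(alert)
```

Storing the raw readings alongside the alerts would support the reporting and trend analysis mentioned above; that part is omitted here for brevity.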

The overall server management challenge takes on new meaning in a heterogeneous environment, particularly that of a global enterprise with many geographically dispersed sites. Today’s solutions must be capable of managing remote or local rack-mounted, pedestal and blade servers through a single interface for maximum efficiency. Along with the Intelligent Platform Management Interface (IPMI), these server management solutions include a management interface for blade servers, and a “generic” server interface, which is used to access conventional x86-based servers that do not have IPMI support.

Working from a common framework, the management software should be able to automatically discover servers as they are attached to the network, enabling IT staff to remotely perform all management tasks, including power cycling, rebooting, resetting and shutdown. Lights-out and “out-of-band” management capabilities ideally can provide around-the-clock automated server management across the enterprise.

While there are a number of software solutions that accommodate one or more of these server management functions, the ultimate value is provided in a common framework that integrates all three of these capabilities.

Server Management in Action

Perhaps the best way to understand how a common server management framework can deliver real business benefits is to examine a few potential scenarios relative to today’s enterprise computing. Take, for example, the case of a data center running an e-commerce site. The IT staff is called upon to repurpose 75 servers with the latest version of the Windows operating system, the Microsoft Commerce server and an in-house-developed application. The servers are of different vintages and from multiple vendors, and some support IPMI v1.5. And time is of the essence: The company’s marketing department needs the conversion to be finished within a week.

A quick calculation shows that a person can manually provision and set up no more than four servers in a 10-hour workday, so it would take roughly 19 person-days (75 servers divided by four per day is 18.75) to complete the conversion. Since the organization’s entire IT team consists of only three people, the work would need more than six working days, so it would not be possible to finish within the work week, even if the whole team focused on the task to the exclusion of all other duties.
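The estimate can be worked through from the figures quoted in this scenario: 75 servers, at most four servers per person per day, and a team of three. The sketch below assumes each server is completed by one person within a single workday, so partial days are rounded up; the function name is hypothetical.

```python
import math
from typing import Tuple


def manual_provisioning_estimate(servers: int, per_person_per_day: int,
                                 team_size: int) -> Tuple[int, int]:
    """Return (person-days, working days for the whole team), rounded up."""
    person_days = math.ceil(servers / per_person_per_day)
    working_days = math.ceil(servers / (per_person_per_day * team_size))
    return person_days, working_days


if __name__ == "__main__":
    # 75 / 4 = 18.75 person-days; 75 / (4 * 3) = 6.25 working days
    print(manual_provisioning_estimate(75, 4, 3))
```

With three people handling at most 12 servers a day between them, 75 servers need more than six working days, which is longer than a five-day work week.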

Using an automated server management solution, the IT staff could spend one morning building a golden reference platform, reflecting the setup that every re-purposed server should have. This golden server could then be connected into the managed environment, where it would be automatically discovered and a “gather” function would be performed, creating a master image in less than a half an hour.

Since the other 74 servers are already plugged into the same environment (albeit running different operating system versions and applications), an image deployment function would be immediately performed. Within 30 minutes, all of these servers could be up and running error-free, with the appropriate operating systems, applications and settings all under the control of one management environment.

With such an approach, the time required to install the operating system on the servers is reduced dramatically while the task itself is simplified. The system administrator simply connects the servers to the network and to power, and then sets up the server management software to deploy the appropriate operating system and application images to the servers. Automatic deployment of the image to the servers takes approximately 30 minutes for all 75 servers, meeting the deadline of installing the servers not only within the allotted week, but in less than one business day.

From Blade to Rack and Back

Another scenario further illustrates the benefits of an automated common server management framework. Take the case of an enterprise that maintains data centers in San Francisco, Tokyo and London. Because there is no single corporate IT governance in the choice of server hardware platforms, the data centers have a mix of blade servers and rack servers, purchased from a combination of vendors. Importantly, these data centers play a critical role in supporting the enterprise with various mission-critical applications. Therefore, each of the local IT groups needs to run 24 x 7 operations to properly maintain the servers. The team on duty performs continuous monitoring and maintenance of the hardware.

By deploying an automated server management framework in the data centers, the enterprise reduces the server management workload of each team member per shift, thereby immediately improving efficiency of its data center operations. The remote management capabilities of the solution provided give the IT staff the ability to take advantage of the time difference between the geographically dispersed data centers. Now, instead of running three shifts at every location, one shift per location is adequate. Although the load on each shift grows with an increased number of hardware platforms, the slight increase in headcount for the single shift is still far less than the resources formerly required for each location running three shifts a day. The freed resources can then be deployed on more challenging mission-critical tasks.

Using the common framework, each team can monitor and manage server systems located on different continents as if those systems were in the room next door. The complete platform monitoring and control and the ability to logically categorize the systems make the tasks easy and less prone to human error. And using the solution’s scheduling features automates the day-to-day maintenance tasks so that IT staff members can focus on other tasks that are much more beneficial to the enterprise.

Simplifying Complexity

By any measure, the proliferation of Windows- and Linux-based x86 servers in today’s increasingly heterogeneous computing environments is making the task of server management much more complex. But with emerging standards such as the Intelligent Platform Management Interface, and the introduction of management software solutions that support the IPMI standard, help has arrived. Automated, remote management techniques now enable IT professionals to provision, monitor and manage all of their enterprise servers from a single console, taking the sting out of the process and saving time, labor and needless expense. This common management framework is a potent antidote to today’s server management ills.

Joseph Wei

Joseph Wei is vice president of marketing for Amphus, providers of integrated enterprise server provisioning, monitoring and management solutions. He has more than 23 years of experience in enterprise computing, having held senior management positions with SGI’s Linux clustering group, NEC’s Windows server division, and DEC.

Copyright Publications & Communications, Inc. Jun 2004

Provided by ProQuest Information and Learning Company. All rights Reserved