Hardware Changes' Impact on Application Programming

Article ID: 64444

When I first began to design computers for IBM, the idea of software compatibility was unimportant. If software had to be rewritten in order to run on a new generation of computers, that was OK. Business users in the 1970s, however, made it clear to IBM and other computer vendors that compatibility was important. Their major investments were in software applications, and these applications had to run on the next generation of hardware. Not only did existing applications have to run, they had to run faster.

The requirement to reuse applications for scientific computing, especially for supercomputers, didn't exist until very recently. Rewriting applications for a new generation of scientific computers was the norm. Even in the Unix world, software compatibility was of little importance until business users began to use Unix. Now, whenever IBM announces a new Power Systems model, there is always a statement that the new model is "binary compatible" with the previous models. This statement is aimed at Unix users to assure them that older applications can still run without having to be rewritten or recompiled.

As hardware technologies have continued to evolve, hardware vendors have been able to increase performance, increase throughput, increase capacities, and add new functionality to our computer systems. In order to fully use the new hardware, application programs generally need to be rewritten. However, as long as older applications still see some performance improvements when running on new hardware, there is little incentive to rewrite the old applications.

A great example of this reluctance to rewrite old applications is the move from 32-bit computing to 64-bit computing. Computers with 64-bit hardware have been available since the early 1990s. Today, computers from mainframes to PCs have 64-bit hardware. Although it took nearly 15 years to accomplish the rewrite, 64-bit operating systems are now available for every major hardware platform. Applications are another story.

Very few 64-bit application programs exist today. Unix, Windows, and even mainframe computers overwhelmingly run 32-bit applications on 64-bit hardware. There is, of course, one glaring exception. All IBM i applications run as 64-bit applications, but that's another story. For the rest of the industry, the move from 32-bit to 64-bit software will take at least 25 years and possibly longer.

What if moving to new hardware didn't improve the performance of existing applications? Worse yet, what if performance was degraded unless the applications were rewritten for the new hardware? This could be a disaster for both computer vendors and users.

Surprisingly, this is exactly what could happen in the computer industry over the next couple of years. There is still much debate about exactly what will happen, but many in our industry are convinced that a major rewrite of all applications is the only way forward. Let me explain.

The Core of the Problem

Ever since the first microprocessors emerged in the early 1970s, the way to increase performance has been to make chips that had smaller and smaller features and that ran at higher and higher clock speeds. Higher clock speeds meant that all programs, old or new, saw some performance improvements. This approach to microprocessor design ended a few years ago, however, when the transistors on a chip became so small that much of the electricity pumped into them leaked out, producing a large amount of heat. By this time, there were also so many transistors packed so tightly on these chips that the total heat generated couldn't simply be carried away. Some chip makers believed that without very sophisticated cooling mechanisms, clock speeds above 5 GHz would melt the silicon from which the chips were made.

The result was that chip makers stopped increasing clock speeds. This is not to say that advances in silicon technology and chip design ended. Indeed, Moore's Law, which says the number of transistors on a chip doubles every two years, is still very much alive. What has changed is the way chip makers are using the additional transistors predicted by Moore's Law. They are now being used to increase the number of processors, or "cores," in the chip. Chip maker Intel, for example, predicts that in the not-too-distant future we'll see chips with hundreds of cores inside.
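To see how quickly "hundreds of cores" could arrive, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that every doubling of the transistor budget is spent on doubling the core count; the function name and the starting figure of four cores are hypothetical, not vendor data.

```python
def projected_cores(start_cores, years, doubling_period_years=2):
    """Cores per chip if each transistor doubling becomes a core doubling.

    Assumption (not from any vendor roadmap): the whole extra transistor
    budget goes to additional cores of the same size.
    """
    doublings = years // doubling_period_years
    return start_cores * 2 ** doublings


if __name__ == "__main__":
    # Hypothetical starting point: a four-core chip today.
    for years in (0, 4, 8, 12):
        print(f"after {years:2d} years: {projected_cores(4, years)} cores")
```

Under that assumption, a four-core chip reaches 128 cores after five doublings, or ten years. Real chips spend part of the budget on caches and other circuitry, so this is an upper bound on the trend rather than a forecast.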

Eight Is Enough

Although multicore chips may have solved some problems for the chip makers, they're creating enormous problems for almost everyone else in the computer industry. System manufacturers, operating system designers, compiler writers, application writers, and users are all affected by the decision to implement multicore chips. Single-threaded applications—those applications designed to run sequentially on a single processor—don't benefit from running on multicore chips. These applications must be either rewritten for multicore chips or, at the very least, recompiled with a compiler designed specifically for parallel processing.
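A minimal sketch of what "rewriting for multicore" means in practice follows. The prime-counting workload and every name in it are hypothetical, chosen only because the work splits cleanly; most business applications are far harder to divide.

```python
from concurrent.futures import ThreadPoolExecutor


def count_primes(lo, hi):
    """Count primes in [lo, hi) by trial division (deliberately simple)."""
    count = 0
    for n in range(max(lo, 2), hi):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_primes_sequential(limit):
    # Single-threaded form: one loop on one core, however many cores exist.
    return count_primes(0, limit)


def count_primes_parallel(limit, workers=4):
    # Rewritten form: split the range into independent chunks, hand each
    # chunk to a worker, then combine the partial results.
    step = max(1, -(-limit // workers))  # ceiling division, at least 1
    chunks = [(lo, min(lo + step, limit)) for lo in range(0, limit, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = pool.map(lambda c: count_primes(*c), chunks)
        return sum(partial_counts)


if __name__ == "__main__":
    print(count_primes_sequential(100), count_primes_parallel(100))
```

Both functions return the same answer; only the second gives the hardware something to run on more than one core. Note that in standard CPython, threads share one interpreter lock, so this shows the restructuring rather than an actual speedup. A process pool would be the usual next step there, and it needs a named top-level function in place of the lambda.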

By most estimates, more than 90 percent of all applications today are single threaded. Rewriting or recompiling these sequential applications to run in parallel will be difficult. Most software experts agree that somewhere between four and eight is probably the maximum number of cores that existing applications can use. Going beyond eight will require a fairly radical redesign of applications. And yet, chip makers are bound and determined to go well beyond eight cores as the only way to increase performance.
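One common way to reason about such a ceiling is Amdahl's law, a standard model that this article does not itself invoke. The sketch below assumes, hypothetically, that 80 percent of an application's work can run in parallel and the rest must stay sequential.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: speedup over one core when only part of the work
    can be spread across `cores` processors."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # 0.80 is an illustrative guess, not a measured figure.
    for cores in (1, 2, 4, 8, 16, 64):
        print(f"{cores:3d} cores: {amdahl_speedup(0.80, cores):.2f}x")
```

With an 80 percent parallel fraction, the speedup can never reach 5x no matter how many cores are added, because the sequential 20 percent still runs on one core. Most of the achievable gain is already captured by the time the core count reaches eight.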

New development tools to deal with what some authors have called "the multicore menace" are rapidly being created. A myriad of new languages and tools designed specifically for parallel programming are appearing almost daily. Microsoft, for example, has already released several new parallel-programming tools and a new programming language called F#. Intel, HP, and several other vendors have also released new programming tools and languages for multicore chips.

Many parallel-programming languages originally created for massively parallel supercomputers are also being proposed for general-purpose use, as are languages built around concurrency from the start. Two of the latter are Erlang, which Ericsson created for telecommunications switches, and Clojure, a Lisp dialect that runs on the Java Virtual Machine. Both are designed to let applications be spread across many cores.

To further complicate matters, many computer professionals believe that multicore chips, as they currently exist in conventional general-purpose processors, won't survive much longer. They point out that the problem with a large number of cores on a single chip is the inability to feed data to all the processors. The number of connections to the chip isn't increasing, meaning that the bandwidth to off-chip memories is limited. Hardware vendors, for example IBM and Intel, are proposing to stack memory chips above their processor chips to increase the number of connections to the chip and thus increase the memory bandwidth. This, too, is not a long-term solution.

The biggest news for computer hardware may be the many specialized processors designed specifically for parallel processing. One example is the Cell chip from IBM, which contains a POWER processor and eight special-purpose processors designed for parallel processing. Created originally for gaming platforms, in which intense graphics and realtime responsiveness are extremely important, these chips are now being used for a variety of applications, including supercomputer applications. It won't be long before multicore chips include a variety of different processors for specialized functions.

Intel has recently announced that it, too, is exploring system-on-chip designs—complex microchips that perform specialized tasks on top of general-purpose computations. Programming for these "hybrid architecture" chips will be difficult and will require new programming tools.

About the only thing that's clear about the future of multicore chips, and of the software technologies that will be used to create applications for massively parallel chips, is that there's no clear future. Although it's imperative that the computer industry move quickly to identify effective tools and techniques that software developers can use to create parallel applications, there's no indication that this will happen soon.

High Productivity Computing Systems

One of the most exciting projects in parallel processing was started a few years ago by the Defense Advanced Research Projects Agency (DARPA). It's called High Productivity Computing Systems (HPCS), and its goal is to provide a totally new generation of high-productivity computing systems that can be used for a wide variety of applications. A new generation of computers is needed because of the way parallel applications are written today.

Using layers of abstraction to hide complexity and to greatly enhance programming productivity has long been a staple of commercial programming. Commercial applications written in assembly language disappeared many years ago. Yet, in the world of programming highly parallel applications, programmers are still living in the Stone Age and using what amounts to parallel assembly language. The new languages and tools being developed for multicore chips are trying to raise the level of parallel programming, but they still have a long way to go.

Because parallel programming languages and tools are primitive, programmer productivity is low. Also, whenever a new generation of hardware emerges, entire applications have to be totally rewritten. There's no ability to reuse existing applications on the new hardware. HPCS is intended to solve the productivity and reuse problems. To solve these problems, DARPA funded research efforts in three companies: Cray, Sun, and IBM.

IBM's Productive, Easy-to-use, Reliable Computing System (PERCS) project, funded by DARPA, is an attempt to create a highly adaptable computing system that configures its hardware and software components to match application demands. Working with Los Alamos National Laboratory and 12 major universities, IBM aims to create systems that automatically analyze the workload and respond dynamically to changes in application demands by reconfiguring system components.

The PERCS project uses a combined hardware-software design methodology to integrate advances in chip technology, architecture, operating systems, compilers, programming languages, and programming tools to deliver scalable systems that will provide an order-of-magnitude improvement in development productivity for parallel applications by 2010.

To accomplish this, PERCS includes a new open-source, object-oriented language called X10, innovative middleware, and new programming environments that will be supported by hardware features to automate many phases of the program-development process. Some of these components are already available. Other features will be delivered in 2010 with IBM's POWER7 processors.

Whereas the goal of HPCS is to meet the need for commercially successful petascale computing systems for high-end users in government, science, and industry in 2010, IBM has a broader goal in mind. The technologies created for PERCS will be implemented in future versions of Power Systems intended for commercial applications.

End of Multicore Computing?

It should now be obvious that the computer industry will likely see major disruptions in the next few years. Reprogramming applications for multicore chips will be challenging. Up to about eight cores, operating system enhancements and compiler improvements are probably good enough to provide sufficient performance improvements for most of today's applications. Beyond eight cores, it's uncertain whether conventional applications will see any benefits, and they may even see reduced performance.

As more and more cores on a chip compete for the same data, there comes a point at which adding another core will actually slow down the application. Even with all the efforts being expended in rewriting existing applications for multicore chips, there's the strong possibility that multicore computing in its present form won't survive for more than a few years.
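The tipping point described above can be illustrated with a simple scalability model. This sketch uses the Universal Scalability Law, a general-purpose model from performance engineering rather than anything specific to the chips discussed here, and both coefficients are made-up values chosen only to make the effect visible.

```python
def relative_throughput(cores, contention=0.05, coherency=0.01):
    """Universal Scalability Law: throughput relative to a single core.

    contention: share of work serialized on shared data (assumed value).
    coherency:  cost of keeping every pair of cores consistent with each
                other's view of shared data (assumed value).
    """
    return cores / (1 + contention * (cores - 1)
                    + coherency * cores * (cores - 1))


def best_core_count(max_cores=64, **params):
    # Core count with the highest modeled throughput.
    return max(range(1, max_cores + 1),
               key=lambda n: relative_throughput(n, **params))


if __name__ == "__main__":
    for cores in (1, 4, 8, 16, 32):
        print(f"{cores:2d} cores: {relative_throughput(cores):.2f}x")
    print("best core count:", best_core_count())
```

With these illustrative coefficients, modeled throughput peaks at 10 cores and then declines, so a 16-core chip delivers less than an 8-core chip. Real applications would need measured coefficients, and the peak could fall much lower or much higher.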

Because of the limitations of multicore computing, many computer scientists, especially those in academia, are predicting not only the end of multicore computing but also the end of conventional computer architectures such as Intel's x86. They argue that the x86 architecture was never designed for parallel processing and that a multicore implementation is just a short-term fix.

Many of these same computer scientists are now calling for the creation of a new stable and enduring computer system architecture that will support massively parallel processing. Perhaps the new system design will look something like the one being created for IBM's PERCS project. Perhaps it will be something else. There's no shortage of proposals for what the future system design should be. There is, however, agreement that it will be very different from today's design.

One of the primary goals of almost all these future system design proposals, whether it's IBM's PERCS or any of the others, is to enable the reuse of existing applications. In other words, the goal of any new design is to be capable of incorporating future hardware and software technologies with minimal impact on existing applications.

Futuristic Design

Does this sound familiar? The goal for a future system design is technology independence. This should not come as a big surprise. The software development investments already made in applications for everything from supercomputing to business computing are far too valuable to simply throw away. The next generation of computer systems must find a way to protect those investments.

As a saying often attributed to Mark Twain goes, "History does not repeat itself, but it does rhyme." There's a certain amount of satisfaction in knowing that concepts that emerged in the 1970s, such as technology independence, are being revisited. Viewed as a radical, futuristic concept when it was first introduced in the S/38 in 1978, technology independence, with its ability to incorporate new hardware and software technologies without affecting existing applications, has clearly stood the test of time.

That original design of the S/38 didn't stand still. More functionality continued to be included, and in 1988 the ability to run S/36 applications was added. That merging of two systems resulted in the S/38 being reintroduced to the computing world as the AS/400. The new AS/400 became an instant success with businesses of all sizes.

In 1995, IBM introduced the first 64-bit POWER processors into the AS/400. Thanks to the AS/400's technology-independent design, not a single line of application code had to be modified or even recompiled for the new hardware. No other system has ever been able to move applications to a totally new processor architecture without requiring massive application changes. The AS/400, which was subsequently renamed to iSeries, System i, and finally IBM i, stands alone in this regard.

IBM i today has that very same technology independence that has protected the application investments of hundreds of thousands of businesses all over the world for more than 30 years. Moving to new generations of hardware and software over the years has never required rewriting or even recompiling applications. Even the move to the first commercially available multicore chips in 2001 didn't require application changes. Those same applications that moved seamlessly from one computer generation to the next will continue to move forward in the future. No other computer system can match this record.

Maybe the world is finally ready for some of this "radical" thinking. The HPCS project from DARPA is certainly trying to find ways to avoid having to rewrite applications every time the hardware changes. Microsoft and Intel are putting out new tools as fast as they can to protect their investments in x86 hardware and software, even if the whole concept of multicore chips might be flawed.

And let's not forget about productivity. IBM i and its predecessor systems were designed from the very beginning to make writing applications far more productive than conventional computer systems. Integrating many of the components that the application needs, such as a database, into the operating system is one way to improve productivity. Single-level storage, in which all storage is treated as memory, is another. Built-in security and virus resistance also can make life a lot easier for application programmers.

If, as many people believe, the computing world is at a turning point because of the limitations of multicore hardware, maybe, just maybe, a futuristic design such as the IBM i is the answer. Although it's highly unlikely that the IBM i design will be the only answer, it's comforting to know that IBM i will be there to meet the needs of business computing well into the future.

[Editor's Note: This article is adapted and used with permission from Frank Soltis's foreword to The All-Everything Operating System, by Brian W. Kelly (Let's Go Publish, 2009).]

Frank G. Soltis, who recently retired from his chief scientist job at IBM, is widely regarded as the father of the AS/400. He is the creator of the technology-independent architecture that led to a new breed of business computers, including the AS/400 and the IBM i. Dr. Soltis currently travels the world speaking on long-term IT trends and technology advancements. In addition, he works with leading IBM business partners to help guide their future product strategies.
