Recently, I was reviewing a chapter from a forthcoming university textbook about virtual machines. As I reviewed this chapter, which dealt with what the authors called co-designed virtual machines, I was struck by how long it has taken academia and the computer industry in general to recognize the value of virtual machines.
The authors of this new book point out that even though hardware and software have undergone radical changes since commercial computers first appeared in the 1950s, the interface between the hardware and software has hardly changed during that time. This hardware/software interface, which is commonly called the instruction set architecture (ISA), is decades old. At the time these ISAs were being designed, hardware was very expensive and fairly simple. It was only natural that designers of the time made the ISA a direct reflection of the hardware implementation.
As hardware became cheaper and more plentiful, the mismatch between the simple ISAs of the past and the capabilities of modern microprocessors with hundreds of millions of transistors on a single chip became painfully apparent. The answer for some computer designers was to create a new "virtual" ISA below the old ISA. Instructions from the old ISA would then be converted into instructions in the new ISA before being executed by the new hardware.
An excellent example of this type of design is the modern x86 implementation in most of today's PCs. The PC software still sees the old CISC ISA that was designed in the 1970s. The hardware, however, no longer executes this CISC ISA. Instead, the hardware first dynamically converts each instruction from the CISC ISA to a sequence of instructions in a more modern RISC ISA. The hardware then directly executes these RISC instructions. Since the mid-1990s, microprocessors from both Intel and Advanced Micro Devices (AMD) have used this virtual RISC design.
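To make the conversion step concrete, here is a minimal sketch of how one memory-operand CISC instruction might be split into simple load, operate, and store steps. The instruction syntax, the temporary registers t0 and t1, and the function name crack are all hypothetical; real x86 processors use internal formats that aren't described here.

```python
# Toy illustration only: the instruction syntax and micro-op format are
# hypothetical, not the real x86 encoding or any vendor's internal ISA.

def crack(instruction: str) -> list[str]:
    """Split one CISC-style instruction into simple RISC-style steps."""
    opcode, operands = instruction.split(maxsplit=1)
    dest, src = [part.strip() for part in operands.split(",")]

    def is_memory(operand: str) -> bool:
        return operand.startswith("[") and operand.endswith("]")

    micro_ops = []
    # A memory source operand becomes an explicit load into a temporary.
    if is_memory(src):
        micro_ops.append(f"LOAD t1, {src}")
        src = "t1"
    if is_memory(dest):
        # Read-modify-write on memory: load, operate, then store back.
        micro_ops.append(f"LOAD t0, {dest}")
        micro_ops.append(f"{opcode} t0, t0, {src}")
        micro_ops.append(f"STORE {dest}, t0")
    else:
        # Register-to-register operations map to a single step.
        micro_ops.append(f"{opcode} {dest}, {dest}, {src}")
    return micro_ops


if __name__ == "__main__":
    for op in crack("ADD [counter], eax"):
        print(op)
```

The software that issued the original instruction never sees the three-step sequence, which is exactly the limitation discussed next.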
The disadvantage of the approach I've just described is that the hardware alone is responsible for performing the complex mapping from CISC to RISC. The software is totally oblivious to the mapping and sees only the older CISC ISA. As a result, the software can't take advantage of the more modern microprocessor architecture that the hardware implements. Operating systems and applications must live with the old CISC ISA's limitations.
Fortunately, virtual machine technologies do permit the design of a virtual ISA in which hardware and software are designed concurrently and work cooperatively to support the virtual ISA. The authors of this new book about virtual machines call this approach a co-designed virtual machine. To illustrate the concept, they selected the AS/400 as a case study.
You're probably somewhat familiar with the AS/400's high-level architecture. This architecture, which first appeared in the System/38 (S/38), was one of the first co-designed virtual machines. The goal of the S/38 design wasn't to support an existing, conventional ISA, but rather to create a new higher-level ISA designed for software simplicity and hardware independence. That new ISA was called the Technology Independent Machine Interface (TIMI or MI for short). The MI is a virtual machine (VM), although the name VM wasn't used back in the 1970s, when the MI was first named. VM at that time was the name of a mainframe operating system that could support and host other operating systems. Today, the term "virtual machine" has a much broader meaning and can correctly be used to describe the MI.
An early decision of the S/38 designers was to make the MI an object-based ISA. Architected objects and instructions specific to certain object types were built into the MI from the beginning. Objects not only provided additional hardware independence, but they helped provide a level of security and integrity for the system unmatched by most other systems to this day.
Object-based programming was new in the 1970s, when the S/38 architecture was being defined, so it made no sense to require that programs running above the MI be object-based. The MI had to be designed to support programs written in conventional languages (e.g., RPG, Cobol) and object-based languages. Because conventional languages had no way to define their own object types, it was necessary to design specific types of objects directly into the MI.
In contrast, more recent implementations of object-based virtual machines, such as Java Virtual Machines, assume programs written in object-oriented languages, which can define their own object types. Specific object types, therefore, don't have to be designed into the ISAs for these machines. Apart from this, the object approach of the MI and of the more modern virtual machines is similar. The other notable distinction is one of timing: the MI predates Java and other modern virtual machine implementations by about 20 years.
Another early decision that the S/38 designers made was that the hardware wouldn't interpretively execute the MI architecture. Early studies showed that using hardware to interpret such a high-level ISA didn't provide the level of performance needed for a commercial server. Besides, most commercial applications are executed over and over again. Interpretation is most useful when a program is to be executed once or only a small number of times.
Because the MI wouldn't be directly executed, another lower-level ISA had to be created, one that the hardware could execute. Programs at the MI level would be translated into this lower-level ISA before they were executed. This translation would occur only once. The translated code, along with the original MI version of the program, would then be stored in an object for future use.
The S/38 and the early AS/400 models used a CISC ISA as the executable interface. That CISC ISA was typical of the ISAs of the 1970s and 1980s, and in many ways, it was similar to the hardware ISA in IBM's mainframes. In 1995, the executable interface was changed to a modified PowerPC ISA. Today's POWER4 and POWER5 processors implement this same ISA.
The beauty of this overall virtual machine design is that the hardware ISA can change dramatically, as it did in 1995, with no changes required for operating system or application programs. No other commercially available system in history has ever been able to accomplish this feat.
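The translate-once idea can be sketched in a few lines. This is only an illustration under stated assumptions: the ProgramObject class, the made-up "MI" instruction strings, and the translate function are hypothetical stand-ins, and the sketch assumes that a program is retranslated from its stored MI form when the target ISA changes.

```python
from dataclasses import dataclass, field


def translate(mi_code: list[str], target_isa: str) -> list[str]:
    """Hypothetical translator: tags each MI instruction with the target ISA."""
    return [f"{target_isa}:{instr.lower()}" for instr in mi_code]


@dataclass
class ProgramObject:
    """Holds the MI form of a program plus any translated executable forms."""
    mi_code: list[str]
    translated: dict[str, list[str]] = field(default_factory=dict)
    translation_count: int = 0

    def executable_for(self, target_isa: str) -> list[str]:
        # Translate only if no stored translation exists for this ISA.
        if target_isa not in self.translated:
            self.translated[target_isa] = translate(self.mi_code, target_isa)
            self.translation_count += 1
        return self.translated[target_isa]


if __name__ == "__main__":
    payroll = ProgramObject(["LOAD", "ADD", "STORE"])
    payroll.executable_for("cisc")      # first call translates
    payroll.executable_for("cisc")      # later calls reuse the stored code
    payroll.executable_for("powerpc")   # new hardware ISA, same MI program
    print(payroll.translation_count)
```

Note that the MI form of the program is never modified; only the stored translation differs from one hardware ISA to the next.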
Thus far, this article has focused on the ISA and the execution of programs in a virtual machine. This is typically as far as most virtual machine designs go. The AS/400 design goes beyond this to create a complete virtual system. All aspects of processing, memory, and I/O are virtual.
The first use of virtual concepts in computers was in the early 1960s, with the introduction of virtual memories. Simply put, virtual memory provides a logical memory view that doesn't necessarily correspond to the memory's physical structure. By the late 1960s, almost every computer vendor supported some form of virtual memory.
Computer vendors adopted virtual memory quickly because of the advent of time-sharing systems. Time-sharing was an evolution of the earlier multiprogramming operating systems, in which the system's memory was partitioned into several pieces, with a different program in each partition. When one program was waiting for an I/O operation to be completed, another program could be using the processor. If enough programs could be kept in memory, the processor could stay busy all the time. At that point, multiprogramming operating systems were still basically batch systems.
Time-sharing was a variant of multiprogramming, in which each user had an online terminal. Because these were interactive users, there was more think time involved and less demand for long periods of processor time. This type of computer could handle more users, so many more program pieces had to exist in the memory simultaneously. Interactive users wanted fast response time, so efficient management of these multiple pieces was crucial. Virtual memory held the hope of doing just that.
The original idea behind a time-sharing system was that individual users in various businesses could rent time on a central computer. This rental approach was popular among the many smaller businesses that couldn't afford their own computer. Time-sharing provided them with the resources of a large computer at a fraction of the cost of owning such a computer. Because the computer users were from unrelated businesses, there was never a need to share information between them.
Virtual memory evolved to support time-sharing by giving a separate address space to each user. The memory space of one user was isolated from the memory space of another user, thereby providing a degree of protection between them. When the computer resources were switched to operate on the program for another user, a new address space was used.
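A small sketch shows why separate address spaces isolate users. It assumes a simple paged scheme with a 4,096-byte page; the AddressSpace class, the user names, and the frame numbers are hypothetical and aren't taken from any particular system.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class AddressSpace:
    """One user's private mapping from virtual pages to physical frames."""

    def __init__(self, page_table: dict[int, int]):
        self.page_table = page_table  # virtual page number -> physical frame

    def translate(self, virtual_address: int) -> int:
        page, offset = divmod(virtual_address, PAGE_SIZE)
        if page not in self.page_table:
            # The address isn't mapped for this user at all.
            raise LookupError(f"page fault at virtual address {virtual_address:#x}")
        return self.page_table[page] * PAGE_SIZE + offset


# Two users with the same virtual addresses but different physical frames.
alice = AddressSpace({0: 7, 1: 3})
bob = AddressSpace({0: 12})

if __name__ == "__main__":
    print(alice.translate(0x10), bob.translate(0x10))
```

Because the same virtual address resolves to a different physical location for each user, neither user can name the other's memory, which is the protection described above and also the reason nothing can be shared.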
Because each user had a separate address space, there was no data or program sharing between users. To enable some form of sharing between users, the designers of these original virtual memory systems decided to keep the file system outside the virtual memory. They created two places to store data and programs: the virtual memory and the file system.
This approach allowed sharing in the file system, but it also introduced additional overhead. Data and programs could be used or changed only when they were in virtual memory. This meant that anything in the file system had to be moved into virtual memory before it could be used or changed.
In many ways, the decision to maintain two separate levels of storage defeated virtual memory's original concept. The programmer sees and manages two levels of storage: the file system and the virtual memory. Opening a file causes a disk write to a memory swap area, and closing a file requires a disk write back to the permanent location. In other words, there are two disk copies of the file while it's in use. A more efficient approach is to have only one copy of the file.
With only one copy, there's no need to reserve disk space for a swap file. With this approach, the entire file system becomes part of virtual memory. The file manager still keeps a directory, but now it relates the file name to the virtual memory location where the file data is stored. The open and close operations no longer need to physically copy the entire file from its permanent location on the disk. Just the portion (or record) the programmer is reading or working on is copied to a memory buffer. We often describe this approach by saying that the files are always used "in place," thereby improving overall system performance.
As in a two-level virtual memory, the memory is still used as a buffer. Processors can operate directly only on data in memory, not on the disk. The difference with only one level is that memory is a cache for all the disk storage, rather than just for a reserved area on the disk. Also, when one user makes a change to a file, the change is instantly available to any other user sharing the file.
This one-level storage model is called "single-level store," and it's the storage model first implemented on the original S/38. Single-level store goes beyond virtual memory to make all storage, both memory and disk, virtual.
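Memory-mapped files in conventional operating systems offer a rough analogy to using a file "in place." The sketch below is only that, an analogy: it uses Python's standard mmap module on a temporary file, and it isn't how single-level store is implemented. The function name and the sample data are hypothetical.

```python
import mmap
import os
import tempfile


def demo_in_place_sharing() -> bytes:
    """Change a file through one mapping and read it through another."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"balance=0100")
        os.close(fd)
        # Two independent "users" map the same file into memory.
        with open(path, "r+b") as user_a, open(path, "r+b") as user_b:
            with mmap.mmap(user_a.fileno(), 0) as view_a, \
                 mmap.mmap(user_b.fileno(), 0) as view_b:
                # User A updates the data in place; no copy-in or copy-out.
                view_a[8:12] = b"0250"
                # User B sees the change through its own mapping.
                return bytes(view_b[:])
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo_in_place_sharing())
```

There is one copy of the data, memory acts as a cache for it, and a change made by one user is visible to the other without an explicit read or write of the file.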
This idea of making all storage virtual has recently been rediscovered within IBM and is now being touted as a new technology for future disk storage arrays. See the "Ice Cubes" sidebar for more details about IBM's Ice Cube prototype.
I/O pervades almost every part of a computer system. Although many people think that I/O is only one-third of the processor, memory, and I/O complex, I/O is in fact considerably larger than either of the other two. More hardware and more lines of operating system code are dedicated to the I/O subsystem than to anything else in the system.
What's the I/O subsystem, and why is it so important? Loosely defined, the I/O subsystem comprises the group of components, both hardware and software, responsible for processing input from and delivering output to a variety of devices attached to the system. Whenever any system resource is required (a read from or a write to a file, a request for instructions in a program to be executed, a request for some other system object, creating or destroying an object, communicating with a device) and that resource hasn't already been brought into memory, the computer must go through the I/O subsystem to retrieve, store, create, or destroy the resource.
Because I/O is so important to any system, it makes sense to consider implementing virtual I/O. Implementing virtual I/O devices enables more efficient device sharing among multiple users and even multiple operating systems. Virtual I/O also lets you add new devices without adversely affecting application software.
These are exactly the reasons that virtual I/O was first used for the S/38 and since then for all subsequent implementations of this server. Again, the name "virtual" wasn't used to describe I/O in those early systems. Instead, virtual I/O devices at the MI level were called logical devices, and an object at the MI level, known as a Logical Unit Description (LUD), was used to describe these devices. It's important to note that a LUD doesn't describe a physical device. Instead, it describes the characteristics of a physical device that an application sees. This is just another way in which hardware independence is achieved.
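The separation between what the application sees and what is physically attached can be sketched as follows. Everything here is hypothetical: the LogicalDevice class is a loose stand-in for the idea of a logical device, not the actual LUD object, and the printer classes and the device name PRT01 are invented for illustration.

```python
from typing import Protocol


class PhysicalDevice(Protocol):
    """Anything that can accept raw bytes; hypothetical interface."""
    def send(self, data: bytes) -> None: ...


class LinePrinter:
    """Older hardware that can print only uppercase characters."""
    def __init__(self) -> None:
        self.paper: list[str] = []

    def send(self, data: bytes) -> None:
        self.paper.append(data.decode("ascii").upper())


class LaserPrinter:
    """Newer hardware that prints text unchanged."""
    def __init__(self) -> None:
        self.pages: list[str] = []

    def send(self, data: bytes) -> None:
        self.pages.append(data.decode("ascii"))


class LogicalDevice:
    """What the application sees: a name and a write operation."""
    def __init__(self, name: str, backing: PhysicalDevice) -> None:
        self.name = name
        self.backing = backing

    def write(self, text: str) -> None:
        self.backing.send(text.encode("ascii"))


def print_invoice(device: LogicalDevice) -> None:
    # Application code: knows only the logical device.
    device.write("Invoice 42")


if __name__ == "__main__":
    printer = LogicalDevice("PRT01", LinePrinter())
    print_invoice(printer)
    printer.backing = LaserPrinter()  # new hardware, same application code
    print_invoice(printer)
```

Swapping the physical device changes what ends up on paper, but print_invoice itself is untouched.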
An interesting use of virtual I/O is to substitute one technology for another. An example of this is the Virtual LAN used to communicate between logical partitions on current iSeries servers. From an application perspective, it appears that a traditional LAN is being used to communicate between two or more servers. Physically, these servers reside in separate logical partitions on a single server, and the LAN being used to communicate between servers is physically the memory bus. The virtual LAN emulates an Ethernet adapter and provides multiple TCP/IP connections between partitions. And there's never any need to change application code to use these high-speed LANs.
Currently, almost every computer vendor is overusing the term "virtualization." You can easily get the impression that it's some newly discovered technology that holds much promise for the future. Virtual technologies are important for the future, but as I've shown here, the iSeries has more than 25 years of experience implementing virtual technologies.
The decision to create a virtual system is probably the most important decision that the original designers of the S/38 ever made. Without that decision, it's difficult to say where the iSeries would be today. In all likelihood, it would have joined the many other midrange servers from Digital, Data General, Wang, and HP that were popular during the 1980s and 1990s but are now relegated to scrap heaps. Thanks to its virtual system design, today's iSeries and the i5/OS operating system can live virtually forever.
Frank G. Soltis of IBM Rochester created the technology-independent architecture used in the AS/400 and iSeries. He is IBM's iSeries chief scientist and a professor of computer engineering at the University of Minnesota.
ICE CUBES
IBM researchers at the Almaden Research Center have developed a prototype of a very large data-storage device built Lego-style with storage-array bricks that can be stacked into a cube-shaped server. Called Ice Cube, this project is an effort to define the ways for end users to easily maintain increasing amounts of data. As a proof of concept, IBM has built and demonstrated a 32 TB storage system composed of 27 bricks in a three-foot cube. Each brick contains twelve 2.5-inch hard drives, three disk controllers, a microprocessor, and an eight-port Ethernet switch. Six of the Ethernet connections, one for each side of a brick, are used to communicate with adjoining bricks at rates of up to 10 GB per second. The other two are used for external communications. According to the researchers, the hard part is the software. Their goal is to make Ice Cube a fully virtualized storage system. Users will be able to add bricks as needed to increase the storage to meet their needs, and software will automatically generate a single system image of data on the cube. If this fully virtualized storage system sounds familiar, it should. We call it single-level store. F.G.S.