Tuesday, March 07, 2006

A thing or two about buses

PCI (Peripheral Component Interconnect)

The idea of a bus is simple -- it lets you connect components to the computer's processor. Some of the components that you might want to connect include hard disks, memory, sound systems and video systems. For example, to see what your computer is doing, you normally use a CRT or LCD screen. You need special hardware to drive the screen, so the screen is driven by a graphics card. A graphics card is a small printed circuit board designed to plug into the bus. The graphics card talks to the processor using the computer's bus as a communication path.
The advantage of a bus is that it makes parts more interchangeable. If you want to get a better graphics card, you simply unplug the old card from the bus and plug in a new one. If you want two monitors on your computer, you plug two graphics cards into the bus. And so on.

Twenty or 30 years ago, the processors were so slow that the processor and the bus were synchronized -- the bus ran at the same speed as the processor, and there was one bus in the machine. Today, the processors run so fast that most computers have two or more buses. Each bus specializes in a certain type of traffic.

A typical desktop PC today has two main buses:

The first one, known as the system bus or local bus, connects the microprocessor (central processing unit) and the system memory. This is the fastest bus in the system.

The second one is a slower bus for communicating with things like hard disks and sound cards. One very common bus of this type is known as the PCI bus. These slower buses connect to the system bus through a bridge, which is a part of the computer's chipset and acts as a traffic cop, integrating the data from the other buses to the system bus.

Technically there are other buses as well. For example, the Universal Serial Bus (USB) is a way of connecting things like cameras, scanners and printers to your computer. It uses a thin wire to connect to the devices, and many devices can share that wire simultaneously. FireWire is another bus, used today mostly for video cameras and external hard drives.




Quick History

The original PC bus in the original IBM PC (1981) was 8 bits wide and operated at 4.77 MHz. IBM widened it to 16 bits for the PC/AT in 1984, and that design officially became known as the ISA bus. The 16-bit design is capable of passing along data at a rate of up to 9 MBps (megabytes per second) or so, fast enough even for many of today's applications.
Several years ago, the ISA bus was still used on many computers. That bus accepted computer cards developed for the original IBM PC in the early 1980s. The ISA bus remained in use even after more advanced technologies were available to replace it.

There were a couple of key reasons for its longevity:

Long-term compatibility with a large number of hardware manufacturers.
Before the rise of multimedia, few hardware peripherals fully utilized the speed of the newer bus.
As technology advanced and the ISA bus failed to keep up, other buses were developed. Key among these were Extended Industry Standard Architecture (EISA) -- which was 32 bits at 8 MHz -- and VESA Local Bus (VL-Bus). The cool thing about VL-Bus (named after VESA, the Video Electronics Standards Association, which created the standard) is that it was 32 bits wide and operated at the speed of the local bus, which was normally the speed of the processor itself. The VL-Bus essentially tied directly into the CPU. This worked okay for a single device, or maybe even two. But connecting more than two devices to the VL-Bus introduced the possibility of interference with the performance of the CPU. Because of this, the VL-Bus was typically used only for connecting a graphics card, a component that really benefits from high-speed access to the CPU.

Along Comes PCI

During the early 1990s, Intel introduced a new bus standard for consideration, the Peripheral Component Interconnect (PCI) bus. PCI presents a hybrid of sorts between ISA and VL-Bus. It provides direct access to system memory for connected devices, but uses a bridge to connect to the frontside bus and therefore to the CPU. Basically, this means that it is capable of even higher performance than VL-Bus while eliminating the potential for interference with the CPU.

The frontside bus is a physical connection that connects the processor to most of the other components in the computer, including main memory (RAM), hard drives and the PCI slots. These days, the frontside bus usually operates at 400 MHz, with newer systems running at 800 MHz.

The backside bus is a separate connection between the processor and the Level 2 cache. This bus operates at a faster speed than the frontside bus, usually at the same speed as the processor, so all that caching works as efficiently as possible. Backside buses have evolved over the years. In the 1990s, the backside bus was a wire that connected the main processor to an off-chip cache. This cache was actually a separate chip that required expensive memory. Since then, the Level 2 cache has been integrated into the main processor, making processors smaller and cheaper. Since the cache is now on the processor itself, in some ways the backside bus isn't really a bus anymore.
PCI can connect more devices than VL-Bus, up to five external components. Each of the five connectors for an external component can be replaced with two fixed devices on the motherboard. Also, you can have more than one PCI bus on the same computer, although this is rarely done. The PCI bridge chip regulates the speed of the PCI bus independently of the CPU's speed. This provides a higher degree of reliability and ensures that PCI-hardware manufacturers know exactly what to design for.



PCI originally operated at 33 MHz using a 32-bit-wide path. Revisions to the standard include increasing the speed from 33 MHz to 66 MHz and doubling the bit count to 64. Currently, PCI-X provides for 64-bit transfers at a speed of 133 MHz for an amazing 1-GBps (gigabyte per second) transfer rate!
PCI cards use 47 pins to connect (49 pins for a mastering card, which can control the PCI bus without CPU intervention). The PCI bus is able to work with so few pins because of hardware multiplexing, which means that the device sends more than one signal over a single pin. Also, PCI supports devices that use either 5 volts or 3.3 volts.
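
The transfer rates quoted above follow from multiplying the width of the bus by its clock speed. Here is a rough sketch of that arithmetic in Python. The function name is made up for illustration, and the calculation assumes one full-width transfer on every clock cycle, which is a best case rather than something a real system sustains. Using the rounded 33 MHz clock gives 132 MB per second for the original PCI bus; the 133 figure usually quoted comes from the exact 33.33 MHz clock.

```python
def peak_bandwidth_mb_per_s(width_bits, clock_mhz):
    """Best-case rate of a parallel bus, assuming one transfer per clock.

    width_bits / 8 is the number of bytes moved per transfer, and
    clock_mhz is millions of transfers per second, so the product
    is in megabytes per second.
    """
    return width_bits / 8 * clock_mhz


# Bus widths and clock speeds taken from the paragraph above.
for name, width_bits, clock_mhz in [
    ("PCI (32-bit, 33 MHz)", 32, 33),
    ("PCI (64-bit, 66 MHz)", 64, 66),
    ("PCI-X (64-bit, 133 MHz)", 64, 133),
]:
    rate = peak_bandwidth_mb_per_s(width_bits, clock_mhz)
    print(f"{name}: about {rate:.0f} MB per second")
```

The PCI-X line comes out at a little over 1,000 MB per second, which is where the 1-GBps figure comes from.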

PCI vs. AGP

The PCI bus was adequate for many years, providing enough bandwidth for all the peripherals most users might want to connect. All except one: graphics cards. In the mid 1990s, graphics cards were getting more and more powerful, and 3D games were demanding higher performance. The PCI bus just couldn't handle all the information passing between the main processor and the graphics processor. As a result, Intel developed the Accelerated Graphics Port (AGP). AGP is a bus dedicated completely to graphics cards. The bandwidth across the AGP bus isn't shared with any other components. Although PCI continues to be the bus of choice for most peripherals, AGP has taken over the specialized task of graphics processing.


Plug and Play

Plug and Play (PnP) means that you can connect a device or insert a card into your computer and it is automatically recognized and configured to work in your system. PnP is a simple concept, but it took a concerted effort on the part of the computer industry to make it happen. Intel created the PnP standard and incorporated it into the design for PCI. But it wasn't until several years later that a mainstream operating system, Windows 95, provided system-level support for PnP. The introduction of PnP accelerated the demand for computers with PCI, and PCI very quickly supplanted ISA as the bus of choice.
To be fully implemented, PnP requires three things:

PnP BIOS - The core utility that enables PnP and detects PnP devices. The BIOS also reads the ESCD for configuration information on existing PnP devices.
Extended System Configuration Data (ESCD) - A file that contains information about installed PnP devices.
PnP operating system - Any operating system, such as Windows XP, that supports PnP. PnP handlers in the operating system complete the configuration process started by the BIOS for each PnP device.

PnP automates several key tasks that were typically done either manually or with an installation utility provided by the hardware manufacturer. These tasks include the setting of:

o Interrupt requests (IRQ) - An IRQ, also known as a hardware interrupt, is used by the various parts of a computer to get the attention of the CPU. For example, the mouse sends an IRQ every time it is moved to let the CPU know that it's doing something. Before PCI, every hardware component needed a separate IRQ setting. But PCI manages hardware interrupts at the bus bridge, allowing it to use a single system IRQ for multiple PCI devices.

o Direct memory access (DMA) - This simply means that the device is configured to access system memory without consulting the CPU first.

o Memory addresses - Many devices are assigned a section of system memory for exclusive use by that device. This ensures that the hardware will have the needed resources to operate properly.

o Input/Output (I/O) configuration - This setting defines the ports used by the device for receiving and sending information.

While PnP makes it much easier to add devices to your computer, it is not infallible. Variations in the software routines used by PnP BIOS developers, PCI device manufacturers and Microsoft have led many to refer to PnP as "Plug and Pray." But the overall effect of PnP has been to greatly simplify the process of upgrading your computer to add new devices or replace existing ones.


How It Works

Let's say that you have just added a new PCI-based sound card to your Windows XP computer. Here's an example of how it would work.

1. You open up your computer's case and plug the sound card into an empty PCI slot on the motherboard.
2. You close the computer's case and power up the computer.
3. The system BIOS initiates the PnP BIOS.



4. The PnP BIOS scans the PCI bus for hardware. It does this by sending out a signal to any device connected to the bus, asking the device who it is.
5. The sound card responds by identifying itself. The device ID is sent back across the bus to the BIOS.
6. The PnP BIOS checks the ESCD to see if the configuration data for the sound card is already present. Since the sound card was just installed, there is no existing ESCD record for it.
7. The PnP BIOS assigns IRQ, DMA, memory address and I/O settings to the sound card and saves the data in the ESCD.
8. Windows XP boots up. It checks the ESCD and the PCI bus. The operating system detects that the sound card is a new device and displays a small window telling you that Windows has found new hardware and is determining what it is.
9. In many cases, Windows XP will identify the device, find and load the necessary drivers, and you'll be ready to go. If not, the "Found New Hardware Wizard" will open up. This will direct you to install drivers off of the disc that came with the sound card.
10. Once the driver is installed, the device should be ready for use. Some devices may require that you restart the computer before you can use them. In our example, the sound card is immediately ready for use.
11. You want to capture some audio from an external tape deck that you have plugged into the sound card. You set up the recording software that came with the sound card and begin to record.
12. The audio comes into the sound card via an external audio connector. The sound card converts the analog signal to a digital signal.
13. The digital audio data from the sound card is carried across the PCI bus to the bus controller. The controller determines which device on the PCI bus has priority to send data to the CPU. It also checks to see if data is going directly to the CPU or to system memory.
14. Since the sound card is in record mode, the bus controller assigns a high priority to the data coming from it and sends the sound card's data over the bus bridge to the system bus.
15. The system bus saves the data in system memory. Once the recording is complete, you can decide whether the data from the sound card is saved to a hard drive or retained in memory for additional processing.
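
The detection part of this walkthrough (steps 4 through 7) can be sketched in a few lines of Python. This is only a toy model to show the order of events, not how a real BIOS is written. The class names, the device ID strings, the IRQ number 11 and the I/O addresses are all made up for illustration, and DMA and memory settings are left out to keep it short.

```python
from dataclasses import dataclass, field


@dataclass
class Resources:
    irq: int       # interrupt line given to the device
    io_base: int   # first I/O port address given to the device


@dataclass
class PnpBios:
    # Stand-in for the ESCD: maps a device ID to its saved settings.
    escd: dict = field(default_factory=dict)
    shared_irq: int = 11         # hypothetical IRQ shared by PCI devices
    next_io_base: int = 0x1000   # hypothetical first free I/O address

    def scan(self, slots):
        """Ask every slot who is there and configure any new device."""
        newly_configured = []
        for device_id in slots:
            if device_id is None:        # empty slot, nothing answers
                continue
            if device_id in self.escd:   # step 6: already known
                continue
            # Step 7: assign settings and save them in the ESCD.
            self.escd[device_id] = Resources(self.shared_irq, self.next_io_base)
            self.next_io_base += 0x100
            newly_configured.append(device_id)
        return newly_configured


bios = PnpBios()
slots = ["video-card", None, "sound-card", None]
print(bios.scan(slots))   # first boot: both cards are new
print(bios.scan(slots))   # next boot: nothing new to configure
```

Both cards end up with the same IRQ in this sketch, which mirrors the point made earlier that PCI can use a single system IRQ for multiple devices.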


All aboard the PCI Express

As processor speeds steadily climb in the GHz range, many companies are working feverishly to develop a next-generation bus standard. Many feel that PCI, like ISA before it, is fast approaching the upper limit of what it can do.
All of the proposed new standards have something in common. They propose doing away with the shared-bus technology used in PCI and moving to a point-to-point switching connection. This means that a direct connection between two devices (nodes) on the bus is established while they are communicating with each other. Basically, while these two nodes are talking, no other device can access that path. By providing multiple direct links, such a bus can allow several devices to communicate with no chance of slowing each other down.

HyperTransport, a standard proposed by Advanced Micro Devices, Inc. (AMD), is touted by AMD as the natural progression from PCI. For each session between nodes, it provides two point-to-point links. Each link can be anywhere from 2 bits to 32 bits wide, supporting a maximum transfer rate of 6.4 GB per second. HyperTransport is designed specifically for connecting internal computer components to each other, not for connecting external devices such as removable drives. The development of bridge chips will enable PCI devices to access the HyperTransport bus.



PCI-Express, developed by Intel (and formerly known as 3GIO or 3rd Generation I/O), looks to be the "next big thing" in bus technology. At first, faster buses were developed for high-end servers. These were called PCI-X and PCI-X 2.0, but they weren't suitable for the home computer market, because it was very expensive to build motherboards with PCI-X.

PCI-Express is a completely different beast - it is aimed at the home computer market, and could revolutionize not only the performance of computers, but also the very shape and form of home computer systems. This new bus isn't just faster and capable of handling more bandwidth than PCI. PCI-Express is a point-to-point system, which allows for better performance and might even make the manufacturing of motherboards cheaper. PCI-Express slots do not accept older PCI cards, but the new bus is compatible with PCI at the software level, so existing operating systems keep working and a motherboard can carry both kinds of slot. That will help PCI-Express become popular more quickly than it would if everyone's PCI components were suddenly useless.

It's also scalable. A basic PCI-Express slot will be a 1x connection. This will provide enough bandwidth for high-speed Internet connections and other peripherals. The 1x means that there is one lane to carry data. If a component requires more bandwidth, PCI-Express 2x, 4x, 8x, and 16x slots can be built into motherboards, adding more lanes and allowing the system to carry more data through the connection. In fact, PCI-Express 16x slots are already available in place of the AGP graphics card slot on some motherboards. PCI-Express 16x video cards are at the cutting edge right now, costing more than $500. As prices come down and motherboards built to handle the newer cards become more common, AGP could fade into history.






Peripheral Component Interconnect (PCI) slots are such an integral part of a computer's architecture that most people take them for granted. For years, PCI has been a versatile, functional way to connect sound, video and network cards to a motherboard.




But PCI has some shortcomings. As processors, video cards, sound cards and networks have gotten faster and more powerful, PCI has stayed the same. It has a fixed width of 32 bits and can handle only 5 devices at a time. The newer, 64-bit PCI-X bus provides more bandwidth, but its greater width compounds some of PCI's other issues.




A new protocol called PCI Express (PCIe) eliminates a lot of these shortcomings, provides more bandwidth and is compatible with existing operating systems.


High-Speed Serial Connection

In the early days of computing, a vast amount of data moved over serial connections. Computers separated data into packets and then moved the packets from one place to another one at a time. Serial connections were reliable but slow, so manufacturers began using parallel connections to send multiple pieces of data simultaneously.


It turns out that parallel connections have their own problems as speeds get higher and higher -- for example, wires can interfere with each other electromagnetically -- so now the pendulum is swinging back toward highly-optimized serial connections. Improvements to hardware and to the process of dividing, labeling and reassembling packets have led to much faster serial connections, such as USB 2.0 and FireWire.


Sizing Up
Smaller PCIe cards will fit into larger PCIe slots. The computer simply ignores the extra connections. For example, a x4 card can plug into a x16 slot. A x16 card, however, would be too big for a x4 slot.



PCI Express is a serial connection that operates more like a network than a bus. Instead of one bus that handles data from multiple sources, PCIe has a switch that controls several point-to-point serial connections. These connections fan out from the switch, leading directly to the devices where the data needs to go. Every device has its own dedicated connection, so devices no longer share bandwidth like they do on a normal bus.



When the computer starts up, PCIe determines which devices are plugged into the motherboard. It then identifies the links between the devices, creating a map of where traffic will go and negotiating the width of each link. This identification of devices and connections is the same protocol PCI uses, so PCIe does not require any changes to software or operating systems.
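
The width negotiation mentioned above can be pictured as both ends of a link settling on the largest lane count they have in common. Here is a simplified sketch in Python. The function names are invented for this example, and real link setup involves a hardware training sequence that this ignores. The list of widths is the standard set of PCIe link widths.

```python
# Standard PCIe link widths, in lanes.
VALID_WIDTHS = (1, 2, 4, 8, 12, 16, 32)


def card_fits(card_lanes, slot_lanes):
    """A card physically fits any slot that is at least as wide as it is."""
    return card_lanes <= slot_lanes


def negotiate_link_width(card_lanes, slot_lanes):
    """Return the number of lanes the link will use (simplified model)."""
    for lanes in (card_lanes, slot_lanes):
        if lanes not in VALID_WIDTHS:
            raise ValueError(f"x{lanes} is not a PCIe link width")
    if not card_fits(card_lanes, slot_lanes):
        raise ValueError(f"a x{card_lanes} card is too big for a x{slot_lanes} slot")
    # The narrower end sets the limit; extra lanes in the slot go unused.
    return min(card_lanes, slot_lanes)


print(negotiate_link_width(4, 16))   # x4 card in a x16 slot runs as x4
print(card_fits(16, 4))              # a x16 card will not go into a x4 slot
```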





Each lane of a PCI Express connection contains two pairs of wires -- one to send and one to receive. Packets of data move across the lane at a rate of one bit per cycle. A x1 connection, the smallest PCIe connection, has one lane made up of four wires. It carries one bit per cycle in each direction. A x2 link contains eight wires and transmits two bits at once, a x4 link transmits four bits, and so on. Other configurations are x12, x16 and x32.





PCI Express is available for desktop and laptop PCs. Its use may lead to lower cost of motherboard production, since its connections contain fewer pins than PCI connections do. It also has the potential to support many devices, including Ethernet cards, USB 2 and video cards.



Two by Two
The "x" in an "x16" connection stands for "by." PCIe connections are scalable by one, by two, by four, and so on.



But how can one serial connection be faster than the 32 wires of PCI or the 64 wires of PCI-X? How is PCIe able to provide a vast amount of bandwidth in a serial format?



Faster Speeds, Fewer Connections
The 32-bit PCI bus has a maximum speed of 33 MHz, which allows a maximum of 133 MB of data to pass through the bus per second. The 64-bit PCI-X bus has twice the bus width of PCI. Different PCI-X specifications allow different rates of data transfer, anywhere from 512 MB to 1 GB of data per second.


Devices using PCI share a common bus, but each device using PCI Express has its own dedicated connection to the switch.



A single PCI Express lane, however, can handle 200 MB of traffic in each direction per second. A x16 PCIe connector can move an amazing 3.2 GB of data per second in each direction, or 6.4 GB per second counting both directions. At these speeds, a x1 connection can easily handle a gigabit Ethernet connection as well as audio and storage applications. A x16 connection can easily handle powerful graphics adapters.
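
Because every lane adds the same amount of capacity, the numbers for any link width are simple multiplication. Here is a small Python sketch of that. It takes the 200 MB per second per lane quoted above as its starting point and the four wires per lane mentioned earlier; the function and constant names are made up for illustration. With 16 lanes it comes to 3.2 GB per second each way, or 6.4 GB per second counting both directions.

```python
WIRES_PER_LANE = 4      # one pair to send, one pair to receive
LANE_MB_PER_S = 200     # per direction, the figure quoted in the text


def link_summary(lanes, lane_mb_per_s=LANE_MB_PER_S):
    """Wire count and best-case throughput for a link with this many lanes."""
    each_way = lanes * lane_mb_per_s
    return {
        "lanes": lanes,
        "wires": lanes * WIRES_PER_LANE,
        "mb_per_s_each_way": each_way,
        "mb_per_s_both_ways": 2 * each_way,
    }


for lanes in (1, 4, 16):
    print(link_summary(lanes))

# Gigabit Ethernet moves 1,000 megabits, or 125 MB, per second each way,
# so a single lane has room to spare.
print(link_summary(1)["mb_per_s_each_way"] > 1000 / 8)
```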


How is this possible? A few simple advances have contributed to this massive jump in serial connection speed:

Prioritization of data, which allows the system to move the most important data first and helps prevent bottlenecks
Time-dependent (real-time) data transfers
Improvements in the physical materials used to make the connections
Better handshaking and error detection
Better methods for breaking data into packets and putting the packets together again

Also, since each device has its own dedicated, point-to-point connection to the switch, signals from multiple sources no longer have to work their way through the same bus.


Taking Apart and Putting Together
PCIe breaks data into packets, marks the packets for reassembly at their destination and reassembles the packets very quickly -- so quickly that the process goes unnoticed by the rest of the computer.
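
The idea of breaking data into marked packets and putting it back together can be shown with a toy example in Python. This is not the real PCIe packet format, which carries headers and error-checking fields. Here the "mark" is just a sequence number, which is an assumption made for illustration, as are the function names. The packets are shuffled only to show that the marks are enough to restore the original order.

```python
import random


def split_into_packets(data, payload_size):
    """Cut data into (sequence number, payload) pairs."""
    return [
        (seq, data[offset:offset + payload_size])
        for seq, offset in enumerate(range(0, len(data), payload_size))
    ]


def reassemble(packets):
    """Put the payloads back in order using their sequence numbers."""
    ordered = sorted(packets, key=lambda packet: packet[0])
    return b"".join(payload for _, payload in ordered)


message = b"audio samples from the sound card " * 4
packets = split_into_packets(message, payload_size=16)
random.shuffle(packets)                  # scramble the arrival order
print(reassemble(packets) == message)    # the marks restore the data
```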




Slowing the Bus
Interference and signal degradation are common in parallel connections. Poor materials and crosstalk from nearby wires translate into noise, which slows the connection down. The additional bandwidth of the PCI-X bus means it can carry more data, which can generate even more noise. The PCI protocol also does not prioritize data, so more important data can get caught in the bottleneck. Using the Accelerated Graphics Port (AGP) slot for video cards removes a substantial amount of traffic, but not enough to compensate for faster processors and I/O devices.



PCI Express and Advanced Graphics
PCIe can eliminate the need for an AGP connection. A x16 PCIe slot can accommodate far more data per second than current AGP 8x connections allow. In addition, a x16 PCIe slot can supply 75 watts of power to the video card, as opposed to the 25-watt/42-watt AGP 8x connection. But PCIe has even more impressive potential in store for the future of graphics technology.



With the right hardware, a motherboard with two x16 PCIe connections can support two graphics adapters at the same time. Several manufacturers are developing and releasing systems to take advantage of this feature:


NVIDIA Scalable Link Interface (SLI): With an SLI-certified motherboard, two SLI graphics cards and an SLI connector, a user can put two video cards into the same system. The cards work together by splitting the screen in half. Each card controls half of the screen, and the connector makes sure that everything stays synchronized.



ATI CrossFire: Two ATI Radeon® video cards, one with a "compositing engine" chip, plug into a compatible motherboard. ATI's technology focuses on image quality and does not require identical video cards, although high-performance systems must have identical cards. CrossFire divides up the work of rendering in one of three ways:

splitting the screen in half and assigning one half to each card (called "scissoring")
dividing up the screen into tiles (like a checkerboard) and having one card render the "white" tiles and the other render the "black" tiles
having each card render alternate frames

Alienware Video Array: Two off-the-shelf video cards combine with a Video Merger Hub and proprietary software. This system will use specialized cooling and power systems to handle all the extra heat and energy from the video cards. Alienware's technology may eventually support as many as four video cards.
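
The three ways of dividing up the rendering work listed under CrossFire come down to three simple rules for deciding which card handles what. Here is a sketch of those rules in Python. The function names are invented for this example, the cards are simply numbered 0 and 1, and which card gets which half, tile color or frame is an arbitrary choice here, not something taken from ATI.

```python
def scissor(row, screen_height):
    """Split the screen in half: card 0 gets the top, card 1 the bottom."""
    return 0 if row < screen_height // 2 else 1


def checkerboard(tile_x, tile_y):
    """Tile the screen like a checkerboard and alternate cards by color."""
    return (tile_x + tile_y) % 2


def alternate_frame(frame_number):
    """Give even-numbered frames to card 0 and odd-numbered ones to card 1."""
    return frame_number % 2


print([scissor(row, 1080) for row in (0, 539, 540, 1079)])
print([[checkerboard(x, y) for x in range(4)] for y in range(2)])
print([alternate_frame(n) for n in range(6)])
```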




The Future of PCI Express
Since PCI, PCI-X and PCI Express are all compatible, all three can coexist indefinitely. So far, video cards have made the fastest transition to the PCIe format. Network and sound adapters, as well as other peripherals, have been slower in development. But since PCIe is compatible with current operating systems and can provide faster speeds, it is likely that it will eventually replace PCI as a PC standard. Gradually, PCI-based cards will become obsolete.
