New PCI Express System Interconnect 4x8G OCuLink - Impact on High-speed Interconnects?
Why is PCI Express important to the copper and optical interconnect community?
PCI Express is the main system bus that connects Intel server microprocessors to peripherals and networking cards. Along with microprocessor upgrades, the PCI Express architecture is at the core of data center servers, storage, switches and peripherals. Changes here, such as the March 2012 "Romley" introduction and the recent upgrade to the Gen3 PCI Express architecture, have large impacts on the data center hardware upgrade cycle. These are the "atomic" elements that trigger the interconnect industry upgrade cycle from 1G to 10G at the server-switch link and from 10G to 40G at the top-of-rack uplinks, out to the metro link using 40G and 100G LR4 transceivers.
In LightCounting's 10GBASE-T report and soon-to-be-released 40G/100G in the Data Center report, we profile Intel's so-called "tick-tock" upgrade model. By tracking this upgrade cycle along with the Apple iPhone/iPad upgrade cycle, one can model how everything in between upgrades - from wireless access and backhaul, metro, long haul, and CATV Internet all the way through the data center to the server MPU and storage infrastructures. Intel makes the servers and Apple makes the clients - all the networking is in between. AMD, Google, Samsung, etc. all follow these market leaders. One can view the Intel tick-tock and Apple upgrades as the "atomic clocks" for the data center.
PCI Express Introduces Cabling Specification for 4x8G at 3 meter External Copper-Optical Link
The PCI Express Special Interest Group (PCI-SIG) held its three-day conference and trade show July 10-12 in Santa Clara, CA and celebrated its 20th anniversary. PCI-SIG is an independent industry group that advances the PCI Express (PCIe) system bus used in most computers. While PCIe has mainly been an "inside the box" technology, it is now moving outside the chassis. At the meeting, PCI-SIG announced developments relevant to the high-speed interconnect, switching and server community for data centers.
In a few years, servers will need 100G uplinks. How will the industry meet this interconnect need? Will OCuLink become a new copper and optical opportunity, or yet another interconnect and connector standard to challenge Ethernet, SAS, Fibre Channel and InfiniBand? Does it threaten to kill off all of the above? In talking to our clients, we see heated debates brewing!
OCuLink - A New Interconnect Scheme - Impact on Existing Data Center Cabling/Transceivers?
Apparently, there are not enough different cables, connectors and protocols already in the data center to move simple digital 1's and 0's around! The complete specification will be out next year, along with an as-yet-undefined connector that is part of the Mini CEM [Card Electro-Mechanical] specification. Also supported is the SFF-8639 connector for SATA and SAS. LightCounting's guess is that it will look a lot like a USB connector with 4 channels and power, flat and small - not the usual monster iPass or QSFP style.
Also new is an independent spread spectrum clock integration scheme to address the key system-level challenge in extending PCIe outside the box: clock distribution between separated systems. PCIe Gen3 drops 8b/10b encoding in favor of scrambling with 128b/130b encoding, which carries only about 1.5% overhead instead of the 20% performance tax of 8b/10b. This means that a 4x10G link using 8b/10b, such as QDR InfiniBand, delivers roughly the same payload rate as a PCI Express Gen3 link at 4x8G, while PCI Express also offers faster error correction.
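The encoding arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name and variables are ours, and the only inputs are the lane counts, line rates and line-code ratios quoted in the text.

```python
def payload_rate_gbps(lanes, line_rate_gbps, payload_bits, coded_bits):
    """Aggregate payload rate of a multi-lane link after line-code overhead."""
    return lanes * line_rate_gbps * payload_bits / coded_bits


# 4 lanes at 10G with 8b/10b: 8 payload bits for every 10 bits on the wire.
link_8b10b = payload_rate_gbps(4, 10.0, 8, 10)

# 4 lanes at 8G with 128b/130b (PCIe Gen3): 128 payload bits per 130 sent.
link_128b130b = payload_rate_gbps(4, 8.0, 128, 130)

# Fraction of the raw line rate spent on encoding.
overhead_8b10b = 1 - 8 / 10
overhead_128b130b = 1 - 128 / 130

print(f"4x10G, 8b/10b:    {link_8b10b:.2f} Gb/s payload, {overhead_8b10b:.1%} overhead")
print(f"4x8G,  128b/130b: {link_128b130b:.2f} Gb/s payload, {overhead_128b130b:.1%} overhead")
```

The two payload rates come out at about 32.0 Gb/s and 31.5 Gb/s, which is why the slower-clocked Gen3 link keeps pace with the 4x10G link.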
InfiniBand has made its name by providing a very small protocol stack compared to Ethernet. InfiniBand has a very short latency of under 900 ns, compared to Ethernet at 3-4 µs, or 1-2 µs with protocol cut-through. As a result, InfiniBand is popular with supercomputers. But PCI Express is at ~600 ns in latency. InfiniBand, Ethernet, Fibre Channel, SATA, SAS, etc. have to convert from PCI Express off the server to the specific protocol and then back again, often for a reach of only 1-3 meters. This adds a lot of cost, latency, complexity and power consumption to a simple, short subsystem link.
OCuLink is clearly a response to Intel's Thunderbolt link, which is now available on Apple and Windows PCs and is closely controlled by Intel, along with the protocol router chips on the device motherboards at each end. OCuLink is a shot at becoming an interconnectivity standard that is open to the industry for use both inside and outside servers and storage systems in data centers, and in HDTVs, PCs, tablets and smartphones in the consumer space. While Thunderbolt today is designed to carry video signals over PCIe, the router chip that resides on the main board at each end will be able to transfer Ethernet, USB and just about any protocol over PCIe. AMD has its Lightning Bolt effort that uses the 5G USB protocol instead of PCI Express. These links just need to be ruggedized and they will find their way into the data center, linking servers and storage subsystems. The high volumes generated in the consumer space will enable very low price points. Other interconnect schemes for short reaches, such as Direct Attach copper, 10GBASE-T, and VCSEL-based AOCs, will find this to be very tough price competition.
Thunderbolt is currently a consumer, video-oriented interconnect that runs at 2 lanes of 10G each and can daisy-chain devices. OCuLink is being designed from the start for both the server/subsystems and consumer markets (no word on daisy-chaining or carrying power). It is likely that OCuLink will be included on server motherboards in the future and perhaps compete with Thunderbolt (and Lightning Bolt)! OCuLink could end up being a very high-speed, low-cost competitor for everything from USB to Thunderbolt to Direct Attach copper, SAS, and SFP+ AOCs in the <10 meter zone. Cabling products are likely to be out in the next 18-24 months.
Rumors around Intel's silicon photonics effort lead us to believe that a dual 28G/lane Thunderbolt will be available in 2015+, in time for the next-generation MPUs and PCI Express Gen4. It is very likely that a 2x28G link will be targeted at the data center and not the consumer. Today, Thunderbolt uses coax and an active chip in each end to span 2 meters ($50 retail). Sumitomo announced a 100 meter Thunderbolt AOC using 10G VCSELs and TIAs, LDs, etc. Intel's 2015 versions may incorporate III-V hybrid silicon lasers integrated onto a single silicon photonics chip (at least this is what the Intel YouTube video says). When produced at very high volumes for the PC, this technology can also be applied to the data center, linking servers to top-of-rack switches, SAS storage systems and GPUs used for computing instead of graphics.
Impact on Data Center Systems
PCI Express is used to connect subsystems together. OCuLink could redefine what the "subsystems" definition might be. Instead of being separated by 20 inches, each subsystem could now be 3 meters apart (using copper), 15 meters apart (using active ends), 100 meters apart (using AOCs), or across the ocean using the optical infrastructure. (A new transatlantic subsea optical link installed for high-speed flash trading enables New York stock exchanges to link with Europe in 23 milliseconds. How about a transatlantic server to top-of-rack switch link?) Since OCuLink will have low latency (600 ns), long reach (100+ meters) and high bandwidth (32G Gen3 or 64G Gen4), what is now defined as a subsystem might entail a data center system "rack or row." Servers directly linked at 64G to end-of-row switches with AOCs, SAS and SATA Express also using PCI Express as a transport layer linking SSD bays, SSD/HDDs, PCI Express flash cards, and GPU clusters - all may be considered the "subsystem." Roughly 80% or more of all data center traffic resides within this subsystem row today. This transition will not happen overnight, but the potential is there to change a lot of components and shift economic power in the business. A lot of debate is going to ensue about exactly where the Ethernet and Fibre Channel protocols need to reside - at the server, top-of-rack or end-of-row switch.
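To put the reach figures above in perspective, the rough sketch below estimates the one-way propagation delay that each cable length would add on top of the ~600 ns link latency quoted in the text. The velocity factors (0.70 for copper, 0.67 for fiber) are our own typical-value assumptions, not figures from PCI-SIG or any cable vendor, and treating the 15 meter "active ends" case as copper is also an assumption.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458

# Assumed velocity factors (fraction of c); typical values, not from the article.
VF_COPPER = 0.70
VF_FIBER = 0.67

# Link latency quoted in the article for PCI Express.
LINK_LATENCY_NS = 600


def one_way_delay_ns(length_m, velocity_factor):
    """Propagation delay in nanoseconds for a cable of the given length."""
    return length_m / (SPEED_OF_LIGHT_M_PER_S * velocity_factor) * 1e9


# Reach figures from the article, paired with the assumed medium.
reaches = [
    ("passive copper", 3, VF_COPPER),
    ("active copper (assumed)", 15, VF_COPPER),
    ("AOC", 100, VF_FIBER),
]

for name, length_m, vf in reaches:
    delay = one_way_delay_ns(length_m, vf)
    share = delay / LINK_LATENCY_NS
    print(f"{name:24s} {length_m:4d} m: {delay:6.1f} ns ({share:.0%} of link latency)")
```

Under these assumptions, a 3 meter copper cable adds only a few percent to the quoted latency, while a 100 meter AOC adds a delay of the same order as the link latency itself, so distance starts to matter at row scale.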
Bottom line: OCuLink represents another opportunity for cabling, connector and optical suppliers. Each protocol and MSA will have its place and end user preference, and will likely maintain its own infrastructure with software. Probably one of the most important issues in preserving the existing infrastructure is "backwards compatibility" with existing protocols and interconnects. Soon, in the data center, there will be Thunderbolt, Lightning Bolt and PCIe OCuLink added to the mix, each with new connectors. SATA and SAS are going PCI Express at layer 1. Fibre Channel has seen significant growth and has been affected very little by Fibre Channel-over-Ethernet (FCoE). InfiniBand has gained significant popularity in high-end data centers. PLX Technology has been a big proponent of PCIe optical links (see SSC video on PLX website) and has shown several demos with Avago microPOD and McLink products showing an Optical USB and Mini-SAS HD.
While the jury is out on whether PCI Express can become a real transmission protocol link instead of just a system bus, the mega-forces in the industry are moving PCI Express in this direction. The real changes will likely occur if PCI Express extends what is defined as a subsystem outside a single chassis, to perhaps a data center row.
"With change, comes opportunity."
By Brad Smith, VP and Chief Analyst, Data Center Interconnects, Brad@LightCounting.com
LightCounting, LLC is a leading high-speed interconnects and optical communications market research company, offering semi-annual market update, forecast, and state of the industry reports based on analysis of publicly available information and confidential data provided by more than 20 leading module and component vendors. LightCounting is the communications market’s source for accurate, detailed, and relevant information necessary for doing business in today’s highly competitive market environment. Privately held, LightCounting is headquartered in Eugene, Oregon.
858 West Park Street, Eugene, OR 97401
© Copyright 2012 by LightCounting. No portion of this newsletter may be reproduced in whole or in part without the prior written permission of LightCounting. Any written materials are protected by United States copyright laws. LightCounting offers no specific guarantee regarding the accuracy or completeness of the information presented, but the professional staff of LightCounting makes every reasonable effort to present the most reliable information available to it and to meet or exceed any applicable industry standards. LightCounting is not a registered investment advisor, and it is not the intent of this document to recommend specific companies for investment, acquisition, or other financial considerations.