Interop Report: 10GBASE-T, 10G SFP+, 40G QSFP show big with new switches

By Kimball Brown and Brad Smith

Interop 2011, now the show to attend for data center switching systems, provided some real eye-openers.  Better still, although anyone can see front-panel photos on new products’ web sites, LightCounting received permission to show candid cell-phone photos revealing the insides of several systems: views not often available.  In this report, we also discuss some of our own conclusions about the future of 10GBASE-T, Juniper’s new QFabric switching architecture, Mellanox’s entry into Ethernet switching, and the future of uplinks for 10GbE switches.

Marvell Solidifies 10GBASE-T market with purchase of Solarflare’s assets

On Monday, May 9, just before Interop opened, Solarflare announced the sale of its 10GBASE-T PHY assets to an unnamed company.  Then Marvell announced the acquisition of 10GBASE-T PHY assets from an unnamed company.  On Wednesday, May 11, Marvell suddenly announced new 40nm quad- and dual-port 10GBASE-T chips.  Although Marvell has not publicly confirmed that Solarflare was the seller, our sources say the deal is done.  Further, HP announced an eight-port 10GBASE-T module for its ProCurve E8200zl switch series using Solarflare’s (now Marvell’s) PHYs.  At the show, we snapped a quick picture of the module (see Figure 1).  This design win undoubtedly helped the sale of the assets: once a major switching OEM like HP completes the huge amount of testing and validation needed to bring a product to market, it proves that the PHY works in the field.


Figure 1. HP’s ProCurve E8200zl with the 10GBASE-T blade that is based on Solarflare’s PHY

Marvell’s entry into the market greatly legitimizes 10GBASE-T.  While one might consider the sales of Teranetics to PLX and of Solarflare’s assets to Marvell as bad news, these can be viewed positively!  Together, Broadcom and Marvell are the major purveyors of 1000BASE-T PHYs to large switching OEMs such as Cisco, and these two have shipped the majority of the 1GbE chips in the market to date.  We suspect that Cisco wanted both of its major PHY suppliers to offer 10GBASE-T before it seriously began driving the copper 10GbE standard into the market.  Marvell’s internal effort to develop 10GBASE-T PHYs was undoubtedly underway, but the looming availability of the new Romley wave of Intel-based servers, which will offer low-cost 10GBASE-T LAN-On-Motherboard (LOM) option cards, pushed Marvell to enter the market quickly.  At the same time, Solarflare decided it could no longer keep up with the 40nm and 28nm silicon-node investments needed to succeed long term in the 10GBASE-T PHY market.  Solarflare retained its controller and line card business unit.

The end result is that the 10GBASE-T market now has two major semiconductor players that can and will invest in 40nm, the follow-on 28nm silicon nodes, and future versions of the PHY so that large system OEMs can safely invest in implementing switching systems that look to win in the copper 10GbE space.  We expect Cisco and other switch OEMs to be aggressive this fall and winter with new copper 10GbE interfaces.

At Interop, we also saw low-power versions of the 10GBASE-T cards designed to reach only 7m, or north-south within the racks, and a new version of copper cable called “small-diameter CAT6A.”  Clearly, a fight is quickly brewing in switching and servers between 10GbE implemented in SFP+ (optical and direct attach copper) and 10GBASE-T with the RJ-45 jack for those wanting CAT6A.

Juniper’s new QFabric switching architecture could be a boon to transceiver makers

In February, Juniper announced its QFabric switching architecture, aimed at collapsing the traditional three-layer network: edge, aggregation, and core switches are replaced with distributed switches in a single tier.  This is one more competitive shot at Cisco with a unique product approach.  In time, other switching competitors are sure to follow, and Cisco seems to have responded by hiring one of the key switch design executives away from Juniper!

The two other components of Juniper’s architecture are a huge box that allows connections to over 6,000 ports and a management device that keeps track of the network.  The architecture centers on these three components:

        1. The QF/Node (middle tray in Figure 2) acts as the distributed decision engine of the fabric.

        2. The QF/Interconnect (the big box at the bottom in Figure 2) is the high-speed transport device.

        3. The QF/Director (on the top tray in the figure) delivers a common window, controlling all devices as one.

Figure 2. The three components of Juniper’s QFabric architecture.

One might think that collapsing three tiers of switches to one would limit the overall volume of transceiver modules sold.  However, two factors about Juniper’s new offering give transceiver makers reason for applause: quality new technology that solves users’ problems tends to sell very well, and Juniper does not force its customers to buy transceivers from only itself, allowing less markup on the modules.  The new architecture certainly improves the overall latency of the network, because packets move through fewer switch ports to reach their destinations.  Further, the overall cost of switching should diminish if this architecture takes off, since customers will not have to buy the extra layers of switches they buy now.  Although convincing users that this new architecture is in their best interests may take time, over the next couple of years we expect QFabric to become quite popular as users upgrade their data centers to 10GbE from 1GbE at the lowest-level links.  We do not expect transceiver volumes to suffer, and with a smaller markup on the transceiver modules, we expect more profit for transceiver makers.

40Gb uplinks abound, but 100Gb in Datacom still a ways out

Almost every new switch showcased at Interop that takes 10GbE links from servers as input had 4 to 8 40Gbps QSFP uplink sockets for direct attach copper and optical links.  LightCounting discovered that, right now, customers want more than 1GbE but don’t necessarily need all of the 10Gbps (perhaps 3-4G) on each link, so many of the systems we saw were oversubscribed, with 24 to 48 10GbE ports to servers (240G to 480G) and only 4 QSFP uplinks (160G).  Both Broadcom and Marvell showed off reference platforms for their 10Gb and 40Gb Ethernet switch controllers.  Figure 3 shows Broadcom’s 40GbE reference platform.  Based on technology that Broadcom picked up in its Dune Networks acquisition, completed last year, the platform enables sixteen 40Gb QSFP ports.  Extreme Networks also showed its new BlackDiamond X8 20-Terabit switching system with 768 nonblocking 10GbE ports or 192 nonblocking 40GbE ports; it’s designed to evolve to support 100GbE.
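The oversubscription in these configurations reduces to simple arithmetic. As a minimal sketch (in Python, using only the port counts quoted above), here is the downstream-to-upstream bandwidth ratio these boxes imply:

```python
def oversubscription_ratio(server_ports: int, server_gbps: int,
                           uplink_ports: int, uplink_gbps: int) -> float:
    """Ratio of server-facing (downstream) to uplink (upstream) bandwidth."""
    downstream = server_ports * server_gbps
    upstream = uplink_ports * uplink_gbps
    return downstream / upstream

# 24 x 10GbE server ports (240G) against 4 x 40G QSFP uplinks (160G):
print(oversubscription_ratio(24, 10, 4, 40))   # 1.5

# 48 x 10GbE server ports (480G) against the same four uplinks:
print(oversubscription_ratio(48, 10, 4, 40))   # 3.0
```

A ratio of 1.0 would be fully nonblocking; anything above it is oversubscribed, which is tolerable only while servers run well below line rate.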

Figure 3. Broadcom’s 40 Gb switch based on Dune Networks technology.

Although 100Gb might be a better speed long term for uplinks on 10GbE platforms, 40Gb will do for now.  In our view, 100Gb throughput in a standard MSA that switch OEMs can trust to be ubiquitous and cost-effective is at least three years off.  Keep in mind that, as the new generation of Intel Xeon-based servers hits the streets in late 2011, OEMs will have multiple low-cost options for 10GbE connectivity via 10GBASE-T and direct attach cables.  For the most part, users will upgrade to 10GbE to be able to run their networks at sustained 2 to 4 Gbps rather than the full 10Gbps.  Therefore, we believe users will not object if the 40GbE ports are oversubscribed early in the 10GbE upgrade phase.  Data center managers are most interested in the lower latency of 10GbE versus 1GbE and a gradual transition to 40G and 100G with new, more powerful servers.
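A back-of-envelope check supports that view. Assuming the 48-port, 4-uplink configuration described earlier (our pairing of those two figures, both taken from this report), 160G of uplink capacity absorbs sustained per-server rates of 2-3 Gbps; only at a full 4 Gbps on all 48 ports does demand exceed it:

```python
def uplinks_sufficient(server_ports: int, sustained_gbps: float,
                       uplink_capacity_gbps: float) -> bool:
    """True if aggregate sustained server traffic fits within uplink capacity."""
    return server_ports * sustained_gbps <= uplink_capacity_gbps

# 48 servers, 4 x 40G = 160G of uplinks, sustained rates of 2-4 Gbps each:
for rate in (2.0, 3.0, 4.0):
    print(rate, uplinks_sufficient(48, rate, 160.0))
# 2.0 True   (96G  <= 160G)
# 3.0 True   (144G <= 160G)
# 4.0 False  (192G >  160G)
```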

Unfortunately, 100Gb uplink costs are currently prohibitive for reaches beyond 100m, which require single-mode fiber.  At Interop, we saw few multimode fiber-based CXP parallel optic transceiver ports on standard line cards, but we did see a number of CXP add-in cards.  We did not see any CXP ports built into switches as standard ports.

The current 100G CFP MSA, designed for 10km reach, is far too expensive and big for wide-scale use in datacom equipment.  Beyond that, the industry needs a 100Gb transceiver that fits in a QSFP socket and works alongside current 40Gbps QSFP transceivers.  We spoke to multiple switch OEMs who would welcome this sort of 100Gb transceiver.  The problem is that the standards groups have not even created specifications for such a transceiver, and the engineering and validation processes required to produce it could further extend the delivery timeline.  We think the whole process of defining a 100Gb transceiver that fits into a QSFP package will take another 2-3 years to come to market.  In the meantime, 40Gb transceivers in a QSFP package will be the majority in datacom, with CXP for 100Gb where needed.

Mellanox introduced its new 36-Port 40 Gigabit Ethernet Switch line

Using its new silicon at the core, Mellanox introduced a new switch designed to help the company edge its way into the Ethernet switching market and add to its dominant InfiniBand position.  Mellanox showed a top-of-rack switch using nothing but QSFP ports, along with a Quad-to-Serial (QSA) direct attach copper breakout cable that connects one QSFP port on the switch to four SFP+ ports on servers.  Also offered is a QSFP-to-SFP+ jack for those wanting to link with SFP+.  Mellanox's new silicon drastically reduces the size and complexity of what we saw under the system cover, and at the system level, building the design around 40GbE QSFP ports yields huge savings in power, rack space, and cabling.  Borrowing from the InfiniBand market, Mellanox's new system boasts a latency of 230ns, the lowest by almost an order of magnitude that we have seen so far in the Ethernet space.
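The fan-out potential of this all-QSFP design is easy to quantify. In the sketch below, the 36-port count and the 4-way QSFP-to-SFP+ breakout come from the product described above; the six-port uplink reservation is our hypothetical example, not a Mellanox configuration:

```python
QSFP_PORTS = 36      # ports on the top-of-rack switch
LANES_PER_QSFP = 4   # one QSFP breaks out to four SFP+ server links

# Hypothetical split: keep 6 QSFP ports as 40GbE uplinks,
# break the remaining 30 out to 10GbE server links.
uplinks = 6
server_links = (QSFP_PORTS - uplinks) * LANES_PER_QSFP
print(server_links)  # 120 10GbE server links from one box
```

This kind of port arithmetic is what underlies the cabling and rack-space savings Mellanox claims.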

Figure 4. Mellanox’s 40Gb Ethernet and InfiniBand switch.