New Internet Protocol Sets Milestone for Fast and Friendly Trans-Atlantic Data Transport
CHICAGO -- A new milestone was reached in trans-Atlantic data transmission today by researchers at the University of Illinois at Chicago (UIC) who demonstrated the practicality of transferring even very large data sets over high-speed production networks.
UIC's National Center for Data Mining (NCDM) and Laboratory for Advanced Computing flashed a set of astronomical data across the Atlantic at 6.8 gigabits per second -- 6800 times faster than the 1 megabit per second effective speed that connects most companies to the Internet.
In the test, 1.4 terabytes of astronomical data was transmitted from Chicago to Amsterdam in a few minutes using UDT, a new protocol developed by the NCDM. In comparison, moving the same amount of data using TCP, the standard protocol used on the Internet today for data transfers, would take 25 days.
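The relationship between data size, link rate and transfer time in the figures above can be checked with a short calculation. The sketch below is only illustrative: it assumes decimal terabytes (10^12 bytes), a rate sustained for the whole transfer, and no protocol overhead. The function names are hypothetical and not part of UDT.

```python
SECONDS_PER_DAY = 86_400
TERABYTE = 10**12  # decimal terabyte in bytes (an assumption, not stated in the text)


def transfer_time_seconds(size_bytes: float, rate_bps: float) -> float:
    """Time to move size_bytes at a sustained rate of rate_bps bits per second."""
    if rate_bps <= 0:
        raise ValueError("rate_bps must be positive")
    return size_bytes * 8 / rate_bps


def implied_rate_bps(size_bytes: float, seconds: float) -> float:
    """Average bit rate implied by moving size_bytes in the given time."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return size_bytes * 8 / seconds


data_set = 1.4 * TERABYTE

# Time for the data set if 6.8 Gb/s were sustained end to end.
udt_seconds = transfer_time_seconds(data_set, 6.8e9)

# Average throughput that a 25-day transfer of the same data corresponds to.
tcp_rate = implied_rate_bps(data_set, 25 * SECONDS_PER_DAY)

print(f"{udt_seconds / 60:.1f} min at 6.8 Gb/s")
print(f"{tcp_rate / 1e6:.1f} Mb/s average over 25 days")
```

Under these assumptions the calculation gives roughly 27 minutes at a sustained 6.8 gigabits per second, and a 25-day transfer corresponds to an average of about 5.2 megabits per second.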
Moving large data sets over the Internet faces several hurdles:
First, the network infrastructure for long-distance 1 Gigabit per second and 10 Gigabit per second network links is still maturing, and software that can use this infrastructure is just being developed. The UIC computer clusters used for the test were connected to the SURFnet network in Amsterdam and the Abilene network in Chicago. The test also demonstrated the quality and power of these two networks, which are among the world's leading research networks.
Second, today's predominant network protocol, TCP, is not effective at moving massive data over long distances. UDP, another network protocol that is also widely deployed, cannot reliably transport data (some data may be lost) and is not friendly to other flows (using it for large data transfers can starve other network traffic).
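One common explanation for TCP's difficulty over long distances is that a sender can have at most one window of data in flight per round trip, so throughput is bounded by window size divided by round-trip time. The sketch below illustrates that bound. The 100 millisecond round-trip time is an assumed value for a Chicago-Amsterdam path, not a measurement from the test, and the 65,535-byte window is the largest TCP window available without the window scaling option. The function names are hypothetical.

```python
def window_limited_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput when one window is sent per round trip."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return window_bytes * 8 / rtt_seconds


def bandwidth_delay_product_bytes(rate_bps: float, rtt_seconds: float) -> float:
    """Bytes that must be in flight to keep a link of rate_bps full."""
    return rate_bps * rtt_seconds / 8


ASSUMED_RTT = 0.100       # seconds; assumed trans-Atlantic round-trip time
CLASSIC_WINDOW = 65_535   # bytes; maximum TCP window without window scaling

ceiling = window_limited_throughput_bps(CLASSIC_WINDOW, ASSUMED_RTT)
needed = bandwidth_delay_product_bytes(6.8e9, ASSUMED_RTT)

print(f"ceiling with a 64 KB window: {ceiling / 1e6:.2f} Mb/s")
print(f"in-flight data needed for 6.8 Gb/s: {needed / 1e6:.0f} MB")
```

With these assumed values, an unscaled window caps a single flow at about 5 megabits per second, the same order as the average rate implied by moving 1.4 terabytes in 25 days, while filling a 6.8 gigabit per second path would require about 85 megabytes in flight. Real TCP performance also depends on loss rates and congestion control, which this sketch ignores.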
Currently, efforts are underway to improve TCP, to develop new protocols to replace TCP, and/or to develop protocols on top of TCP and UDP that are effective for high performance data transport.
To overcome these problems, high-speed transfers of very large data sets have in the past used special-purpose research networks and employed specialized data protocols that in practice did not allow other network traffic to share the same link.
Friday's test run used a new network protocol called UDP-based Data Transport or UDT, which was developed by the National Center for Data Mining at the University of Illinois at Chicago. Unlike some other protocols now being studied for high-speed data transfer, UDP-based protocols can be used over today's Internet without making changes to the network infrastructure. Today's demonstration not only showed that UDT was fast, but also that it was friendly and could effectively coexist with thousands of other network connections.
The demonstration is part of an ongoing international effort to find and test new ways of reliably moving massive data sets around the globe using advanced networks and new data transfer protocols. Such systems hold enormous promise for advancing scientific research, in addition to numerous commercial applications. Today, although it is becoming common for global businesses to have important data in different cities, it is still quite difficult to integrate this data to create a common view.
"Using UDT, it is now practical for the first time to move even massive data sets over very long distances in a friendly fashion using today's networks," said Robert Grossman, Director of UIC's National Center for Data Mining and President of Open Data Partners.
UDT is currently being used by several international research projects. It is used by the OptIPuter, a research project developing next-generation computing infrastructures based upon advanced photonics. UDT also plays a role in research projects developing high-performance web services, which are required in order to scale today's web services to large remote and distributed data sets.
UDT is used as the network transport layer in the joint University of Illinois/Northwestern (iCAIR) project on Photonic Data Services (PDS), which is developing open source data services for next generation photonic networks, such as the OptIPuter. The OptIPuter is an example of what are sometimes called lambda grids, distributed computing infrastructures in which applications can set up their own photonic paths (lambdas) supporting data transport at Gigabit per second speeds and higher.
"Moving data at 6.8 Gigabits per second across the Atlantic using UDT is an important milestone for the OptIPuter Project and brings us a bit closer to effective data management over lambda grids," said Larry Smarr, Principal Investigator of the OptIPuter Project and Director of the California Institute for Telecommunications and Information Technology, a UC San Diego/UC Irvine partnership.
UDT is also being used as one of the layers of a UIC project called Open DMIX (for Data Mining, Data Integration, and Data Exploration), which is developing open source high performance web services for data mining.
"Using UDT and the scalable data mining and data integration web services built on top of it may emerge as an important enabling technology for the grid computing required for next generation virtual observatories," according to Alex Szalay, Alumni Centennial Professor in the Department of Physics and Astronomy at The Johns Hopkins University.
The tests were made possible by support from the following manufacturers and organizations, who have generously contributed their equipment, facilities, and know-how: OMNInet, StarLight, Nortel, SARA and CANARIE. Partial funding for the tests was provided by the National Science Foundation (Grants 0129609, 9977868 and 0225642) and the University of Illinois at Chicago.