Poland-Singapore data transfer over new CAE-1 100G trans-continental link
In early October, the Interdisciplinary Centre for Mathematical and Computational Modelling (ICM) – University of Warsaw (Poland), the A*STAR Computational Resource Centre (A*CRC, Singapore), and Zettar Inc. (U.S.) set out to jointly conduct a production trial over the newly built Collaboration Asia Europe-1 (CAE-1) 100Gbps link connecting London and Singapore. The project has shown that moving data at great speed and scale between Poland and Singapore is a reality.
ICM will present the details of this transcontinental link trial at SC19 in Denver (November 17-22, 2019) – please visit Booth No. 1393.
The link provides shorter, faster, and cheaper connectivity than the links routed via the North Atlantic Ocean, across North America, and across the Pacific Ocean, which have carried much of the R&E traffic between Europe and the Asia Pacific region to date. Furthermore, with the link, the Middle East region will be able to participate in globally distributed data-intensive research and scientific endeavors with Europe, the Asia Pacific region, and beyond.
But how well does it work in practice? The three parties decided to find out using entirely production-grade components: hardware, storage, network infrastructure, and software.
MOVING DATA AT GREAT SPEED AND SCALE
The project has established a historical first: a production data transfer trial has been carried out for the first time over the newly built CAE-1 link, with a production setup at the ICM end. It has shown that moving data at great speed and scale between Poland (and thus Central and Eastern Europe) and Singapore is a reality. Furthermore, although the project was initiated only in mid-October, all goals have been reached and some new ground has been broken as well. On the ICM side, only two technical experts were involved: Marcin Semeniuk, who configured the entire set-up on the Polish side, and Jarosław Skomiał, who was responsible for establishing the data link between Warsaw and Singapore. The idea for this production environment was proposed by the Director of ICM, Dr. Marek Michalewicz, who also coordinated the project with all international collaborators.
It is also a true international collaboration:
– ICM, aka Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw, Poland is one of the most established supercomputing centers in Eastern Europe;
– A*CRC, aka A*STAR Computational Resource Centre, is the Singapore government-funded source of HPC expertise;
– Zettar Inc. is a software startup based in Palo Alto, California, U.S. It is supported by its revenue and U.S. DOE Office of Science funding. It has delivered zx, a software application for moving data at speed and scale, since 2016 and has set a world record annually ever since.
Furthermore, the project has achieved worthy accomplishments in three areas: social, research and scientific collaboration, and engineering.
- Social
Data is the new “oil” of the modern digital age. Just as ready means of transporting oil have enabled the rapid progress the world has seen for more than a century, a complete solution for transporting data over great distances at high speed will surely spur more progress in this digital age. By employing only existing equipment, a production setup, and GA-grade software, the project has shown that a complete solution is available and can be put together in a very short time. Its cost-effectiveness should also be evident.
- Research and scientific collaboration
The project has shown concretely the following:
– More R&E regions are reachable. From now on, distributed data-intensive science and engineering collaboration between Europe, the Middle East, and the Asia Pacific region is not only feasible but can also be efficient if the right data moving solution is used.
– More world-wide participation in distributed data-intensive research collaboration is a reality. The achievement should encourage and motivate more parties along the data path and beyond to collaborate on the advancement of the global sciences and engineering.
– Data gravity is no longer a barrier to progress. Even with the tight preparation time, the attained transfer speed already shows it’s possible to move 1PB in less than two days between any two points along the data path used by the project.
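The "1PB in less than two days" figure can be checked with simple arithmetic. The sketch below is only a back-of-the-envelope estimate: it assumes a decimal petabyte (10^15 bytes) and that the reported ~60Gbps average is sustained for the whole transfer with no protocol overhead. Neither assumption comes from the trial itself, and the function name is illustrative.

```python
# Back-of-the-envelope check: time to move 1 PB at a sustained average rate.
# Assumptions (not from the trial report): 1 PB = 10**15 bytes, and the
# ~60 Gbps average is sustained with no protocol overhead.


def transfer_time_seconds(size_bytes: float, rate_bits_per_second: float) -> float:
    """Return the time needed to move size_bytes at the given rate."""
    return size_bytes * 8 / rate_bits_per_second


PETABYTE = 10**15      # decimal petabyte, in bytes (assumed definition)
AVERAGE_RATE = 60e9    # ~60 Gbps average, as reported for the trial

seconds = transfer_time_seconds(PETABYTE, AVERAGE_RATE)
hours = seconds / 3600
days = hours / 24
print(f"{hours:.1f} hours ({days:.2f} days)")
```

Under these assumptions the transfer takes about 37 hours, or roughly a day and a half, which is consistent with the "less than two days" claim.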
- Engineering
– Modest hardware can produce world-class results if the resources are utilized intelligently.
– This is a production trial – not a “for show demo”. For example, at ICM, two production Lustre file systems are employed; each is built from 20 OSTs, and each OST has 4 x 7200RPM HDDs. Not a single SSD is employed. Only a single DTN is used at each end. Both DTNs come from existing hardware inventory and are more than 2 years old.
– The attained result is at the world’s top level (~60Gbps average).
– Stock TCP is used. There is no need to use any proprietary protocol.
– Vast distance: 19,800 km (12,375 miles).
– Stunningly short preparation: 2 weeks total.
– InfiniBand (IB), typically used for interconnect in the HPC space, is not amenable to interface bonding, unlike Ethernet. Nevertheless, the two storage pools with IB interconnects are aggregated by the data mover software, Zettar zx.
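To give a sense of what stock TCP has to cope with over such a distance, the following sketch estimates the minimum round-trip time and the resulting bandwidth-delay product, that is, the amount of unacknowledged data that must be in flight across all TCP connections to sustain a given rate. It is an illustration only. It assumes the fiber path length equals the stated 19,800 km and that light travels in fiber at about two thirds of its vacuum speed; the trial's measured round-trip time is not given here and would be higher in practice.

```python
# Rough estimate of the data "in flight" on a long TCP path.
# Assumptions (not from the trial report): the fiber path length equals the
# stated 19,800 km, and light travels in fiber at about 2/3 of its vacuum
# speed. Real round-trip times are higher because of routing and equipment.

SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in vacuum, km/s
FIBER_VELOCITY_FACTOR = 2 / 3       # assumed typical value for optical fiber


def min_rtt_seconds(path_km: float) -> float:
    """Lower bound on the round-trip time over a fiber path of path_km."""
    return 2 * path_km / (SPEED_OF_LIGHT_KM_S * FIBER_VELOCITY_FACTOR)


def bdp_bytes(rate_bits_per_second: float, rtt_seconds: float) -> float:
    """Bandwidth-delay product: bytes in flight needed to sustain the rate."""
    return rate_bits_per_second * rtt_seconds / 8


rtt = min_rtt_seconds(19_800)   # distance quoted for the trial
bdp = bdp_bytes(60e9, rtt)      # ~60 Gbps average quoted for the trial
print(f"minimum RTT: {rtt * 1000:.0f} ms")
print(f"bandwidth-delay product: {bdp / 1e9:.2f} GB")
```

With these assumptions the round trip takes roughly 0.2 seconds, so about 1.5 GB of data must be in flight at any moment to sustain ~60Gbps.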
Singapore is an important hub of high-speed international connectivity. Australia is one of the main sites of the ambitious and demanding Square Kilometer Array (SKA) project. AARNet is one of the six CAE-1 consortium members. The SKA project produces a huge amount of data that needs to be shared efficiently among international collaborating organizations. Thus, a likely next step is to engage a supercomputing center in Australia and conduct a similar project, although other, even more ambitious possibilities also exist.
The current setup has been prepared within a very tight timeline. Further tuning should improve overall efficiency, and higher transfer rates should be attainable.