PCJ – a new tool for computing and data analytics


The PCJ library has been designed for modern supercomputers but it runs on any system with Java installed. It allows you to create highly scalable computing applications including implementation of Big Data or artificial intelligence (AI) algorithms. Performance tests show that applications built using PCJ are up to several dozen times faster than their counterparts using Hadoop or Spark.

PCJ, or Parallel Computing in Java, is a library that makes it simple to write parallel applications in Java. It was designed for supercomputers, but because virtually every computer today has many computing cores, it is just as useful for applications running on laptops or PCs.

Why Java?

Java has been the most popular programming language for years and maintains its position despite emerging new solutions. Unfortunately, until now there were no good tools for using it on supercomputers. These machines require parallel programming, i.e. the ability to run an application on multiple processors at the same time. Java has supported multi-threaded programming from the very beginning, but this support is limited to one node (one Java virtual machine) and is quite difficult to use. In the mid-90s there were a few alternative solutions, but ultimately none of them proved successful. On the other hand, existing tools based on traditional languages such as C and FORTRAN are mature and difficult to compete with. However, since multi-core systems have become an everyday reality, interest in new solutions is increasing.
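To make the single-node limitation concrete, here is a minimal sketch of multi-threaded programming in standard Java. It uses only the JDK; the class and method names (ParallelSum, sum) are illustrative and are not part of PCJ. All threads live inside one Java virtual machine, so this approach cannot by itself span several nodes of a supercomputer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelSum {

    // Sums the array with the given number of threads, all inside a single JVM.
    static long sum(long[] data, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            int chunk = (data.length + threads - 1) / threads; // ceiling division
            for (int t = 0; t < threads; t++) {
                final int from = Math.min(data.length, t * chunk);
                final int to = Math.min(data.length, from + chunk);
                // Each task sums its own slice [from, to) of the array.
                parts.add(pool.submit(() -> {
                    long s = 0;
                    for (int i = from; i < to; i++) {
                        s += data[i];
                    }
                    return s;
                }));
            }
            // Combine the partial sums on the calling thread.
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        long[] data = new long[1_000_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println(sum(data, cores));
    }
}
```

Even in this small case the programmer has to manage the thread pool, the partitioning of the data and the collection of results by hand.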

Why are C and FORTRAN not enough?

Currently, due to the huge demand for programmers, the market belongs to employees. Learning a specific programming language or library is a big investment, so programmers choose solutions that are popular and offer broad employment prospects. Therefore, today it is difficult to convince a computer science student to learn FORTRAN. The HPC community has recognized this problem: it has proposed making parallel programming a compulsory lecture in computer science education and has launched initiatives to attract students to HPC. Unfortunately, these efforts have had little effect. On the other hand, the development of Big Data and AI tools has led to the creation of new solutions, such as Hadoop, Spark or TensorFlow, which allow the use of multiprocessor systems without the need for in-depth knowledge of parallel programming principles. Importantly, these solutions are written for newer programming languages, such as Java, Scala or Python. Due to their wide range of business applications, these technologies are more interesting and forward-looking for programmers.

Running Hadoop, Spark, and TensorFlow on supercomputers

There is increasing interest in using supercomputers for Big Data and AI analysis. However, the tools mentioned earlier were written for smaller systems, usually containing up to 100 computing nodes (processors). In addition, they do not work with the typical software for managing workloads on supercomputers. Most importantly, the new software is not able to exploit the capabilities of large computers. This is partly due to the programming language used (Java, Scala) and partly due to programming tools that do not use supercomputer-specific hardware capabilities. Hence, supercomputer providers rewrite fragments of the libraries in C (for example, co-array C) to achieve satisfactory performance and scalability. However, this solution requires large financial expenditures and the work of many programmers, and the effect is not always in line with expectations.

How does PCJ achieve performance and scalability?

The PCJ library has been designed for creating applications that take maximum advantage of the computing capabilities of modern supercomputers. It is based on the increasingly popular PGAS (Partitioned Global Address Space) paradigm, which allows for comfortable programming in C, FORTRAN, X10, Chapel and other languages. In the PGAS model, the programmer can easily implement any parallel algorithm using a relatively small number of commands (or programming constructs). The PCJ library provides this programming model for Java in a way that is natural for the language. The programmer adds one jar file to the project and gains a wide range of programming possibilities. What's more, a standard development environment is enough to create an application. Moreover, the created application can be run on all systems where Java (more specifically, a Java virtual machine) is available.

The PCJ library allows you to create and test an application on a laptop or workstation, and then transfer it to any supercomputer, even without the need to recompile. Thanks to this, the application development process is quick and much more convenient than with traditional tools, and it does not require a supercomputer. The use of the PGAS model means that the number of constructs a programmer must know is small. It is basically one class and a dozen or so methods, while traditional solutions require a lot more.
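The idea behind the PGAS model can be sketched with plain Java threads. The example below is not the PCJ API and does not use the PCJ library: it only emulates, inside one JVM and under simplifying assumptions, the three constructs the model relies on (a partition of shared data owned by each task, remote put/get operations, and a barrier). All names in it (PgasSketch, run, partitions) are hypothetical.

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicLongArray;

public class PgasSketch {

    // Starts `tasks` threads and returns the sum gathered by task 0.
    static long run(int tasks) throws InterruptedException {
        // Global address space: slot i is the partition "owned" by task i.
        AtomicLongArray partitions = new AtomicLongArray(tasks);
        CyclicBarrier barrier = new CyclicBarrier(tasks);
        long[] result = new long[1];

        Thread[] threads = new Thread[tasks];
        for (int id = 0; id < tasks; id++) {
            final int myId = id;
            threads[id] = new Thread(() -> {
                try {
                    // "put": write a value into the partition of the next task.
                    int neighbour = (myId + 1) % tasks;
                    partitions.set(neighbour, myId + 1);

                    // Barrier: wait until every task has finished its put.
                    barrier.await();

                    // "get": task 0 reads every partition and sums the values.
                    if (myId == 0) {
                        long sum = 0;
                        for (int other = 0; other < tasks; other++) {
                            sum += partitions.get(other);
                        }
                        result[0] = sum;
                    }
                } catch (Exception e) {
                    throw new IllegalStateException(e);
                }
            });
            threads[id].start();
        }
        for (Thread t : threads) {
            t.join(); // join makes the write to result[0] visible here
        }
        return result[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(4)); // prints 10 (1 + 2 + 3 + 4)
    }
}
```

In a real PGAS library the tasks may run on different nodes, and the put and get operations then involve network communication, but the program keeps the same shape.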

PCJ-based applications are highly scalable, which makes the library suitable for implementing any Big Data or AI algorithm. What's more, performance tests show that applications built using the PCJ library scale very well and are up to several dozen times faster than their counterparts using Hadoop or Spark.

Java is considered to be much slower than C or FORTRAN. Is it true?

While creating the PCJ library, we devoted a lot of attention to comparing its performance with applications written in traditional programming languages. In some cases Java is slower (it reaches about half the speed of C code), but for some algorithms, implementations written in Java with the PCJ library are just as efficient as their counterparts in C/C++. Examples include image rendering with the ray tracing algorithm and some graph algorithms. Recently, new Java virtual machines have appeared that reduce the performance gap between C and Java. Thanks to them, applications using the PCJ library become faster. As for scalability, we carried out tests on large supercomputers, using over 200,000 computing cores simultaneously. This is several times more than the largest computer available at ICM UW.

In the case of supercomputers, application scalability is an important element. It can be limited by the algorithm or by the speed of data exchange between individual processors (communication). Data transfer inside a supercomputer very often relies on dedicated hardware solutions, such as InfiniBand, Omni-Path or Aries. At the moment, the PCJ library does not fully use such dedicated hardware, which limits communication efficiency. However, we have some ideas on how to solve this problem and we are working on it.

What are the practical applications of the PCJ library?

So far, we have focused on creating the library and testing its performance on selected algorithms. With the use of PCJ, implementations of typical performance tests, the graph processing benchmark (Graph500) and the Fast Fourier Transform have been created. PCJ has been used to parallelize an application that models the connectome of C. elegans and to analyze DNA sequences. In the latter case, the search for DNA fragments of viruses in human DNA has been accelerated more than 100 times compared to a single workstation. Competing solutions are at least 2-3 times slower.

The PCJ library was also used during grand challenge programming hackathons organized at ICM, University of Warsaw. Participating students developed applications for planning a network of charging stations for electric cars and for developing a fiber-optic network connecting 3 million locations (buildings) in Poland that do not yet have access to high-speed Internet. In both cases, the best solutions were written in Java using the PCJ library.

How can I use the PCJ library?

Using the PCJ library is very simple. Download the library file (jar) from the http://pcj.icm.edu.pl website and place it in your Java project. The PCJ web page contains a description of the library and example source code. The program can be created and tested on any computer with Java 1.8 (or newer) using a typical development environment (NetBeans, Eclipse, IntelliJ IDEA or others). The created application can be run locally or on remote computers. A PCJ application can also be run in a Hadoop or Spark environment, resulting in significantly higher performance.

Developing applications with the PCJ library is simple but requires designing a parallel algorithm, so the programmer must know the basics of parallel programming. In return, they can effectively implement any parallel algorithm without having to fit it into the MapReduce model underlying solutions such as Hadoop or Spark. Greater flexibility allows for better performance and scalability.

Even for a programmer who has only basic knowledge of Java, using the PCJ library is simple and does not pose major problems. For people starting out with parallel programming, it is much simpler than using traditional solutions such as MPI or OpenMP. We see this very clearly in the case of hackathon participants at ICM and students of computational engineering, who, given a choice of different tools for creating parallel applications, most often choose Java and the PCJ library.

At the moment, the PCJ library is available in version 5.0.6, released in November 2017, and requires Java 1.8 or newer. The current version of the library is well tested: no serious errors have been found for more than half a year. Version 5 of the library has been downloaded more than a thousand times. The development of the PCJ library has been described in 10 scientific publications, and example applications in a further 11 papers. The PCJ library has been presented at dedicated workshops organized as part of scientific conferences. It is also used in the parallel programming course of the computational engineering programme at ICM, University of Warsaw, and in the computer science programme at the Faculty of Mathematics and Computer Science of the Nicolaus Copernicus University. We have information that the PCJ library is also used to teach parallel programming in Brazil and Italy.

Who is behind the scenes?

The idea of creating the PCJ library appeared in autumn 2011 during the master's seminar run by Prof. Piotr Bała at the Nicolaus Copernicus University in Toruń. Successive attempts to use the available tools to write parallel applications in Java failed, and the idea of developing our own solution was raised. Marek Nowicki undertook work on the prototype implementation, and Łukasz Górski was working on a similar solution. The first version of the library was available at the end of 2011, and further releases followed soon after. In 2014, the library was presented at the Supercomputing Conference in the USA, where it received an award in the HPC Challenge competition. Work on the PCJ library became the basis for Marek Nowicki's Ph.D. thesis, defended at WMIiM UW, and for the theses of Łukasz Górski and Magdalena Ryczkowska (defended at IPI PAN). Łukasz Górski dealt with the implementation of selected algorithms, and Magdalena Ryczkowska with graph algorithms. Michał Szynkiewicz is also involved in the development of the library and is working on fault tolerance mechanisms. The team dealing with PCJ is small, which is an additional guarantee that the library is well designed and implemented. For comparison, teams of dozens of people work on implementations of the PGAS model for other programming languages.

How is the PCJ library developed?

The PCJ library is being developed as a typical academic, open source project. The library is free and its source code is available on GitHub. Initially, the PCJ library was created as part of our own research. In 2014, however, the international consortium CHIST-ERA granted dedicated resources for the development of the library. In Poland, within the framework of the CHIST-ERA consortium, the funding was provided by the National Science Center (NCN). It should be emphasized that in this particular CHIST-ERA call only 7% of applications were funded, including the one on PCJ prepared by ICM UW in cooperation with IBM Research Zurich (Switzerland), Queen's University Belfast (United Kingdom) and Bilkent University (Turkey). As part of the ICM activities in this project, the PCJ library was developed. The Zurich partners dealt with selected computational kernels, such as multiplication of sparse matrices. The Belfast colleagues investigated the use of graphics cards and worked on optimizing advanced properties of the Java virtual machine. The researchers from Ankara created a streaming library based on PCJ. The NCN grant ended at the end of 2017, and until the end of 2018 the remaining partners are carrying out their tasks with the support provided by ICM. We are currently looking for further financing options for PCJ library development.