A Brief Introduction

Built on deep, multilayer neural networks, deep learning is currently a major driver of the boom around machine learning. From big leaps in the quality of automatic translation to autonomous driving to beating grandmasters at chess, this technology makes headlines many times every day.

Deeplearning4j, also called DL4J, is a Java library for deep learning. It also includes a whole family of other libraries that simplify the use of deep learning models with Java. As an alternative to the many Python-based frameworks, DL4J provides a way to easily bring deep learning into production, even in enterprise environments. The full code of the article, including training data, is on GitHub.

Integration into the project

Like many other Java libraries, DL4J can easily be added as another dependency in the build tool of your choice. In this article, the dependency declarations are given in Maven format, i.e. as they would appear in the pom.xml file. Of course, you can also use another build tool such as Gradle or SBT.

Using DL4J without a build tool is not intended, however, because DL4J itself has a large number of direct and transitive dependencies. For this reason, there is no single .jar file that you could add manually as a dependency in your IDE.

DL4J and its associated libraries are modular, so you can customize your dependencies to suit the needs of your project. However, especially for beginners, this can complicate usage, as it is not necessarily obvious which submodule is needed to make a particular class available.

All DL4J modules should always use the same version. To simplify this, we define a property that we will always use below to specify the version. DL4J is currently close to its 1.0 release, and in this article we use version 1.0.0-beta7, which was released recently.


For beginners, it is advisable to start with the deeplearning4j-core module. It pulls in many more modules transitively and thus makes a large number of features available without having to search for the right dependency. The disadvantage is that bundling all the dependencies into one Uber JAR produces a large file.


DL4J supports multiple backends that allow the use of a CPU or a GPU. In the simplest case, you choose the backend by specifying a dependency. The CPU backend requires nd4j-native-platform. The GPU backend uses nd4j-cuda-X.Y-platform instead, where X.Y is replaced by the installed CUDA version. CUDA 8.0, 9.0 and 9.2 are currently supported.


Both backends rely on native binaries, which is why the platform modules include the binaries for all supported platforms. This makes it possible to distribute to several different platforms without building a specialized JAR file for each one. The CPU backend currently supports the following platforms:

  • Linux (PPC64LE, x86_64)
  • Windows (x86_64)
  • macOS (x86_64)
  • Android (ARM, ARM64, x86, x86_64)

For CUDA-enabled GPUs:

  • Linux (PPC64LE, x86_64)
  • macOS (x86_64)
  • Windows (x86_64)

Because some of DL4J's own dependencies are not yet fully compatible with newer Java versions, a few workarounds may be necessary to use Java versions newer than Java 8. The example code in the Quickstart with DL4J GitHub repository contains these workarounds and runs with Java 11.

Finally, we add a logger to our dependencies. DL4J needs a logger compatible with the SLF4J API in order to share its information with us. For our example, we use Logback Classic.



As the specification of the backend already shows, ND4J forms the foundation on which DL4J is built. ND4J is a library for fast tensor math with Java. To make maximum use of the hardware, practically all calculations are carried out outside the JVM. In this way, both CPU features such as AVX vector instructions and GPUs can be used.

If you use a GPU, keep in mind that deep learning often needs a very powerful GPU before it gains any speed advantage over a CPU. This is especially true for notebook GPUs released before the current GeForce 10 series, and even on the desktop you should have at least a GeForce GTX 960 with 4 GB of RAM. The reason for this recommendation is that GPUs shine mainly on calculations over large amounts of data. Those large amounts of data need a corresponding amount of RAM, which is only available in sufficient quantity on the more powerful models.

Loading data

Machine learning of any kind starts with collecting and loading data, and deep learning is no different. This article focuses on tabular data available as CSV. However, the process for other file formats or data types, such as images, is similar.
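To make "tabular data available as CSV" concrete, here is a minimal sketch that reads numeric CSV rows using only the Java standard library. It is not DL4J's own data loading API. The class name CsvSketch, the method readNumericCsv and the sample columns are assumptions made up for this illustration, and the parser assumes plain comma-separated numbers without quoting.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of DL4J.
final class CsvSketch {

    /**
     * Parses comma-separated numeric rows into double arrays.
     * Skips the first line if hasHeader is true and ignores blank lines.
     * Assumes no quoted fields and no embedded commas.
     */
    static List<double[]> readNumericCsv(Reader source, boolean hasHeader) {
        List<double[]> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            boolean first = true;
            while ((line = reader.readLine()) != null) {
                boolean isHeader = first && hasHeader;
                first = false;
                if (isHeader || line.trim().isEmpty()) {
                    continue;
                }
                String[] cells = line.split(",");
                double[] row = new double[cells.length];
                for (int i = 0; i < cells.length; i++) {
                    row[i] = Double.parseDouble(cells[i].trim());
                }
                rows.add(row);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up sample data: two features and a label column.
        String csv = "height,weight,label\n1.80,75.0,1\n1.65,58.5,0\n";
        List<double[]> rows = readNumericCsv(new StringReader(csv), true);
        System.out.println(rows.size() + " rows, " + rows.get(0).length + " columns");
    }
}
```

In a real project you would pass a file reader instead of the StringReader used here, and a dedicated CSV library is the safer choice as soon as the data contains quoted fields.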

As a rule, if you want good results quickly, you should understand both your data and the problem to be solved. A little expert knowledge of the data and the general problem area, together with suitable preparation of the data, can in many cases significantly reduce model complexity and training time.

Another point to note is that to train a model, you have to split your data into at least two parts. Most of the data, usually around 80 percent, is used for training and is therefore referred to as the training set. The rest of the data, usually around 20 percent, is used to assess the quality of the model and is called the test set. If the model is tuned further, it is also common to reserve a further 10 percent of the training data as a validation set. Tuning decisions are then checked against the validation set, so that the model is not inadvertently tailored to the test set.
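The split described above can be sketched in plain Java as follows. This is a minimal illustration, not DL4J's own splitting utility. The class TrainTestSplit, its nested Split type and the fixed seed are assumptions introduced for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper for illustration only; not part of DL4J.
final class TrainTestSplit {

    /** Holds the two parts of a split. */
    static final class Split<T> {
        final List<T> train;
        final List<T> test;

        Split(List<T> train, List<T> test) {
            this.train = train;
            this.test = test;
        }
    }

    /**
     * Shuffles a copy of the rows with a fixed seed and cuts it at trainFraction.
     * The input list is left unchanged.
     */
    static <T> Split<T> split(List<T> rows, double trainFraction, long seed) {
        if (trainFraction <= 0.0 || trainFraction >= 1.0) {
            throw new IllegalArgumentException("trainFraction must be between 0 and 1");
        }
        List<T> shuffled = new ArrayList<>(rows);
        Collections.shuffle(shuffled, new Random(seed));
        int cut = (int) Math.round(shuffled.size() * trainFraction);
        return new Split<>(
                new ArrayList<>(shuffled.subList(0, cut)),
                new ArrayList<>(shuffled.subList(cut, shuffled.size())));
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            rows.add(i);
        }
        // 80 percent training data, 20 percent test data.
        Split<Integer> outer = split(rows, 0.8, 42L);
        // Reserve 10 percent of the training data for validation.
        Split<Integer> inner = split(outer.train, 0.9, 42L);
        System.out.println("train=" + inner.train.size()
                + " validation=" + inner.test.size()
                + " test=" + outer.test.size());
    }
}
```

Shuffling before cutting matters because CSV files are often sorted, for example by class or by date. The fixed seed keeps the split reproducible between runs.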

When choosing the data for the test set, make sure that it forms a representative subset of all the data. Otherwise, the results on the test set say little about how well the model actually performs.
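For classification data, one common way to get closer to a representative test set is stratified sampling, where the test set takes the same fraction from every class. The article does not prescribe this technique, so the sketch below is only one possible reading of "representative". The class StratifiedSplit and the method testIndices are hypothetical names for this illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical helper for illustration only; not part of DL4J.
final class StratifiedSplit {

    /**
     * Returns the row indices selected for the test set.
     * Each class label contributes roughly testFraction of its rows.
     */
    static <L> List<Integer> testIndices(List<L> labels, double testFraction, long seed) {
        // Group row indices by their class label.
        Map<L, List<Integer>> byLabel = new LinkedHashMap<>();
        for (int i = 0; i < labels.size(); i++) {
            byLabel.computeIfAbsent(labels.get(i), k -> new ArrayList<>()).add(i);
        }
        Random random = new Random(seed);
        List<Integer> test = new ArrayList<>();
        for (List<Integer> indices : byLabel.values()) {
            // Shuffle within the class, then take its share for the test set.
            Collections.shuffle(indices, random);
            int take = (int) Math.round(indices.size() * testFraction);
            test.addAll(indices.subList(0, take));
        }
        return test;
    }

    public static void main(String[] args) {
        // Made-up, imbalanced labels: 80 rows of class "a", 20 rows of class "b".
        List<String> labels = new ArrayList<>();
        for (int i = 0; i < 80; i++) {
            labels.add("a");
        }
        for (int i = 0; i < 20; i++) {
            labels.add("b");
        }
        List<Integer> test = testIndices(labels, 0.2, 7L);
        System.out.println("test set size: " + test.size());
    }
}
```

All rows whose index is not returned form the training set. With a purely random split, a rare class can end up over- or under-represented in a small test set, and sampling per class avoids that.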