LibND4J
Native operations for nd4j. Build using CMake.
Prerequisites
GCC 4.9+
CUDA 8.0 or 9.0 (if desired)
CMake 3.8 (as of Nov 2017, in near future will require 3.9)
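As a rough sketch, the prerequisites above can be checked before starting a build. The helper name `version_ge` is hypothetical and not part of libnd4j. The minimum versions are taken from the list above, and the check only looks at whatever `gcc` and `cmake` are found on `PATH`.

```shell
#!/bin/sh
# Hypothetical helper (not part of libnd4j): succeeds when version $1 >= $2.
# Compares up to three dot-separated numeric fields.
version_ge() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" |
        sort -t. -k1,1n -k2,2n -k3,3n | head -n 1)
    [ "$lowest" = "$2" ]
}

# Report on GCC (needs 4.9+) and CMake (needs 3.8+) if they are installed.
if command -v gcc >/dev/null 2>&1; then
    gcc_version=$(gcc -dumpversion)
    version_ge "$gcc_version" 4.9 && echo "gcc $gcc_version: ok" \
        || echo "gcc $gcc_version: too old, need 4.9+"
fi
if command -v cmake >/dev/null 2>&1; then
    cmake_version=$(cmake --version | awk 'NR == 1 { print $3 }')
    version_ge "$cmake_version" 3.8 && echo "cmake $cmake_version: ok" \
        || echo "cmake $cmake_version: too old, need 3.8+"
fi
```

On systems where `gcc` is an alias for another compiler, the reported version may not be meaningful, so treat the output as a hint rather than a verdict.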
Additional build arguments
There are a few additional arguments you can pass to the buildnativeoperations.sh script:
-a XXXXXXXX // shortcut for -march/-mtune, e.g. -a native
-b release OR -b debug // enables/disables debug builds; release is the default
-j XX // defines how many threads will be used to build binaries on your box, e.g. -j 8
-cc XX // CUDA-only argument; builds binaries only for the target GPU architecture. Use this for fast builds
You can find the compute capability for your card on the NVIDIA website.
For example, a GTX 1080 has compute capability 6.1, for which you would use -cc 61 (note no decimal point).
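The conversion described above is just the compute capability with the decimal point removed. A minimal sketch, where `cc_arg` is a hypothetical helper name rather than anything shipped with libnd4j:

```shell
#!/bin/sh
# Hypothetical helper: turn a compute capability such as "6.1" into the
# form expected by -cc ("61") by dropping the decimal point.
cc_arg() {
    printf '%s\n' "$1" | tr -d '.'
}

# Print (but do not run) a CUDA build command for a GTX 1080 using 8 threads.
echo "./buildnativeoperations.sh -c cuda -cc $(cc_arg 6.1) -j 8"
```

The script only prints the command, so it can be inspected before being run by hand.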
OS Specific Requirements
Android
Download the NDK, extract it somewhere, and execute the following commands, replacing android-xxx with either android-arm or android-x86:
git clone https://github.com/deeplearning4j/libnd4j
git clone https://github.com/deeplearning4j/nd4j
export ANDROID_NDK=/path/to/android-ndk/
cd libnd4j
bash buildnativeoperations.sh -platform android-xxx
cd ../nd4j
mvn clean install -Djavacpp.platform=android-xxx -DskipTests -pl '!:nd4j-cuda-9.0,!:nd4j-cuda-9.0-platform,!:nd4j-tests'
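Since the steps above depend on `ANDROID_NDK` being set and on one of two platform names, a small pre-flight check can catch mistakes before a long build. This is only a sketch: `check_android_build` is a hypothetical function, not part of either repository.

```shell
#!/bin/sh
# Hypothetical pre-flight check for the Android build described above.
# $1 is the platform name, $2 the NDK directory.
check_android_build() {
    case "$1" in
        android-arm|android-x86) ;;
        *) echo "unsupported platform: $1" >&2; return 1 ;;
    esac
    if [ -z "$2" ] || [ ! -d "$2" ]; then
        echo "ANDROID_NDK does not point to a directory: $2" >&2
        return 1
    fi
    return 0
}

# Example: validate the settings, then print the build command to run.
if check_android_build android-arm "${ANDROID_NDK:-}"; then
    echo "bash buildnativeoperations.sh -platform android-arm"
fi
```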
OSX
Run ./setuposx.sh (Please ensure you have brew installed)
Linux
Depends on the distro. Ask in the earlyadopters channel for distro-specific details.
Ubuntu Linux 15.10
wget http://developer.download.nvidia.com/compute/cuda/7.5/Prod/local_installers/cuda-repo-ubuntu1504-7-5-local_7.5-18_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1504-7-5-local_7.5-18_amd64.deb
sudo apt-get update
sudo apt-get install cuda
sudo apt-get install cmake
sudo apt-get install gcc-4.9
sudo apt-get install g++-4.9
sudo apt-get install git
git clone https://github.com/deeplearning4j/libnd4j
cd libnd4j/
export LIBND4J_HOME=~/libnd4j/
sudo rm /usr/bin/gcc
sudo rm /usr/bin/g++
sudo ln -s /usr/bin/gcc-4.9 /usr/bin/gcc
sudo ln -s /usr/bin/g++-4.9 /usr/bin/g++
./buildnativeoperations.sh
./buildnativeoperations.sh -c cuda -cc YOUR_DEVICE_ARCH
Ubuntu Linux 16.04
sudo apt install cmake
sudo apt install nvidia-cuda-dev nvidia-cuda-toolkit nvidia-361
export TRICK_NVCC=YES
./buildnativeoperations.sh
./buildnativeoperations.sh -c cuda -cc YOUR_DEVICE_ARCH
The standard development headers are needed.
CentOS 6
yum install centos-release-scl-rh epel-release
yum install devtoolset-3-toolchain maven30 cmake3 git
scl enable devtoolset-3 maven30 bash
./buildnativeoperations.sh
./buildnativeoperations.sh -c cuda -cc YOUR_DEVICE_ARCH
Windows
Setup for All OS
Set the LIBND4J_HOME environment variable to the libnd4j folder you obtained from Git.
Note: this is required for building nd4j as well.
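A minimal sketch of setting the variable in a POSIX shell and confirming that it points at a checkout. The `~/libnd4j` location is an assumption, and `is_libnd4j_home` is a hypothetical helper that only looks for the build script named in this document.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if directory $1 looks like a libnd4j checkout,
# judged only by the presence of buildnativeoperations.sh.
is_libnd4j_home() {
    [ -f "$1/buildnativeoperations.sh" ]
}

# Assumed checkout location; change it to wherever you cloned the repository.
export LIBND4J_HOME="$HOME/libnd4j"

if is_libnd4j_home "$LIBND4J_HOME"; then
    echo "LIBND4J_HOME is set to $LIBND4J_HOME"
else
    echo "no buildnativeoperations.sh under $LIBND4J_HOME" >&2
fi
```

To keep the setting across sessions, the `export` line can go in your shell profile.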
Build for CPU first, then for GPU, by running the following on the command line:
For standard builds:
./buildnativeoperations.sh
./buildnativeoperations.sh -c cuda -cc YOUR_DEVICE_ARCH
For Debug builds:
./buildnativeoperations.sh blas -b debug
./buildnativeoperations.sh blas -c cuda -cc YOUR_DEVICE_ARCH -b debug
For release builds (default):
./buildnativeoperations.sh
./buildnativeoperations.sh -c cuda -cc YOUR_DEVICE_ARCH
OpenMP support
OpenMP 4.0+ is required to compile libnd4j. This shouldn't cause any trouble, since OpenMP 4.0 was released in 2013 and should be available on all major platforms.
Linking with MKL
We can link with MKL either at build time, or at runtime with binaries initially linked with another BLAS implementation such as OpenBLAS. In either case, simply add the path containing libmkl_rt.so (or mkl_rt.dll on Windows), say /path/to/intel64/lib/, to the LD_LIBRARY_PATH environment variable on Linux (or PATH on Windows), and build or run your Java application as usual. If you get an error message like undefined symbol: omp_get_num_procs, it probably means that libiomp5.so, libiomp5.dylib, or libiomp5md.dll is not present on your system. In that case though, it is still possible to use the GNU version of OpenMP by setting these environment variables on Linux, for example:
export MKL_THREADING_LAYER=GNU
export LD_PRELOAD=/usr/lib64/libgomp.so.1
Troubleshooting MKL
Sometimes the above steps are not enough. You may also need to add:
export LD_LIBRARY_PATH=/opt/intel/lib/intel64/:/opt/intel/mkl/lib/intel64
This ensures that MKL is found first and linked against.
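To see which directory would supply MKL first, the search path can be walked in order. The sketch below uses a hypothetical helper, `first_dir_with`. It only inspects `LD_LIBRARY_PATH` and ignores the other places the dynamic loader may search, so it is a quick sanity check rather than a full answer.

```shell
#!/bin/sh
# Hypothetical helper: print the first directory in the colon-separated
# list $2 that contains a file named $1; fail if none does.
first_dir_with() {
    old_ifs=$IFS
    IFS=:
    for dir in $2; do
        if [ -n "$dir" ] && [ -e "$dir/$1" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Show where LD_LIBRARY_PATH would first find MKL, if anywhere.
first_dir_with libmkl_rt.so "${LD_LIBRARY_PATH:-}" ||
    echo "libmkl_rt.so not found in LD_LIBRARY_PATH" >&2
```

If the printed directory is not the Intel one you expect, move that directory earlier in `LD_LIBRARY_PATH`.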
Packaging
If on Ubuntu (14.04 or above) or CentOS (6 or above), this repository is also
set to create packages for your distribution. Let's assume you have built:
for the cpu, your command-line was ./buildnativeoperations.sh ...:
cd blasbuild/cpu
make package
for the gpu, your command-line was ./buildnativeoperations.sh -c cuda ...:
cd blasbuild/cuda
make package
Uploading package to Bintray
The package upload script is in packaging. The upload command for an rpm built
for cpu is:
./packages/push_to_bintray.sh myAPIUser myAPIKey deeplearning4j blasbuild/cpu/libnd4j-0.8.0.fc7.3.1611.x86_64.rpm https://github.com/deeplearning4j
The upload command for a deb package built for cuda is:
./packages/push_to_bintray.sh myAPIUser myAPIKey deeplearning4j blasbuild/cuda/libnd4j-0.8.0.fc7.3.1611.x86_64.deb https://github.com/deeplearning4j
Running tests
Tests are written with gtest and run using CMake.
Tests are currently under tests_cpu/
There are 2 directories for running tests:
1. libnd4j_tests: These are older legacy ops tests.
2. layers_tests: This covers the newer graph operations and ops associated with samediff.
We currently run the tests through CMake, typically from within CLion.