@en{
GROUP BY
planning; thus, it allows cooperation with custom-plans provided by extension modules.
}
@en{
Choose a Linux distribution supported by CUDA Toolkit, then install the system following that distribution's installation process. NVIDIA DEVELOPER ZONE lists the Linux distributions supported by CUDA Toolkit.
}
@en{
In the case of the Red Hat Enterprise Linux 7.x or CentOS 7.x series, choose “Minimal installation” as the base environment, and also check the following add-ons.
}
@en{
After the OS installation, a few additional configurations are required before installing the GPU driver and the NVME-Strom driver in later steps.
}
The `epel-release` package provides the repository definition of EPEL.
You can obtain this package from the public FTP site of the Fedora Project. Download `epel-release-<distribution version>.noarch.rpm` and install it.
Once the `epel-release` package is installed, the yum configuration is updated to fetch software from the EPEL repository.
Fedora Project Public FTP Site
@en{
!!! Tip
    From the above URL, walk down the directories `Packages` --> `e`.
}
@en{
Install the `epel-release` package as follows.
}
$ sudo yum install https://dl.fedoraproject.org/pub/epel/7/x86_64/Packages/e/epel-release-7-11.noarch.rpm
:
================================================================================
Package Arch Version Repository Size
================================================================================
Installing:
epel-release noarch 7-11 /epel-release-7-11.noarch 24 k
Transaction Summary
================================================================================
Install 1 Package
:
Installed:
epel-release.noarch 0:7-11
Complete!
@en{
PG-Strom and related packages are distributed from HeteroDB Software Distribution Center.
You need to add the repository definition of HeteroDB-SWDC to your system to obtain this software.
}
@en{
The `heterodb-swdc` package provides the repository definition of HeteroDB-SWDC.
Open the HeteroDB Software Distribution Center in a web browser, download `heterodb-swdc-1.0-1.el7.noarch.rpm` from the top of the file list, then install this package.
Once the `heterodb-swdc` package is installed, the yum configuration is updated to fetch software from the HeteroDB-SWDC repository.
}
@en{
Install the `heterodb-swdc` package as follows.
}
$ sudo yum install https://heterodb.github.io/swdc/yum/rhel7-x86_64/heterodb-swdc-1.0-1.el7.noarch.rpm
:
================================================================================
Package Arch Version Repository Size
================================================================================
Installing:
heterodb-swdc noarch 1.0-1.el7 /heterodb-swdc-1.0-1.el7.noarch 2.4 k
Transaction Summary
================================================================================
Install 1 Package
:
Installed:
heterodb-swdc.noarch 0:1.0-1.el7
Complete!
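Before moving on, it can be worth confirming that both repositories are visible to yum. The sketch below assumes the repository IDs are `epel` and `heterodb` (the names that appear in the yum output elsewhere in this document); `missing_repos` is a hypothetical helper, not part of yum or of any package mentioned here.

```shell
# missing_repos: hypothetical helper, not part of yum.
# Reads a `yum repolist` listing on stdin and prints every repository ID
# given as an argument that is absent from the listing.
missing_repos() {
    listing=$(cat)
    for repo in "$@"; do
        # The repo ID is the first column; yum may prefix it with "!" or "*"
        # and suffix it with "/7/x86_64" or similar.
        if ! printf '%s\n' "$listing" | awk '{ print $1 }' |
             grep -Eq "^[!*]?${repo}(/|\$)"; then
            printf '%s\n' "$repo"
        fi
    done
}

# Typical use on the target machine:
#   yum -q repolist enabled | missing_repos epel heterodb
```

If the command prints nothing, both repository definitions are in place; any name it prints still needs to be installed or enabled.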
@en{
This section describes how to install CUDA Toolkit. If you have already installed the latest CUDA Toolkit, you can skip this section.
}
@en{
NVIDIA offers two ways to install CUDA Toolkit: a self-extracting archive (called a runfile) and RPM packages.
We recommend RPM installation because it makes software updates simple.
}
@en{
You can download the installation package for CUDA Toolkit from NVIDIA DEVELOPER ZONE. Choose your OS, architecture, distribution and version, then choose “rpm(network)” edition.
}
![CUDA Toolkit download page](./img/cuda-download.png)
@en{
The “rpm(network)” edition contains only the yum repository definition used to distribute CUDA Toolkit. It is similar to the EPEL repository definition set up after the OS installation.
So, after registering the CUDA repository, you need to install the related RPM packages over the network. Run the following commands.
}
$ sudo rpm -i cuda-repo-<distribution>-<version>.x86_64.rpm
$ sudo yum clean all
$ sudo yum install cuda --enablerepo=rhel-7-server-e4s-optional-rpms
or
$ sudo yum install cuda
@en{
Once the installation completes successfully, CUDA Toolkit is deployed at `/usr/local/cuda`.
}
@en{
!!! Tip
    RHEL7 does not enable the `rhel-7-server-e4s-optional-rpms` repository by default. This repository distributes the `vulkan-filesystem` package, which the CUDA Toolkit installation requires. Before installing CUDA Toolkit, either edit `/etc/yum.repos.d/redhat.repo` to enable the repository, or pass the `--enablerepo` option to the `yum` command so that the dependency can be resolved.
}
$ ls /usr/local/cuda
bin include libnsight nvml samples tools
doc jre libnvvp nvvm share version.txt
extras lib64 nsightee_plugins pkgconfig src
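The `version.txt` file in the listing above offers a quick way to confirm which CUDA Toolkit release was deployed. The sketch below assumes the file holds a single line of the form `CUDA Version 9.1.85`, as it does in CUDA 9.x era releases; `cuda_toolkit_version` is a hypothetical helper name.

```shell
# cuda_toolkit_version: hypothetical helper.
# Prints the release recorded in <prefix>/version.txt, assuming the file
# holds a line of the form "CUDA Version 9.1.85".
cuda_toolkit_version() {
    prefix=${1:-/usr/local/cuda}
    if [ ! -r "$prefix/version.txt" ]; then
        echo "no version.txt under $prefix" >&2
        return 1
    fi
    sed -n 's/^CUDA Version[[:space:]]*//p' "$prefix/version.txt"
}

cuda_toolkit_version /usr/local/cuda ||
    echo "CUDA Toolkit does not seem to be installed" >&2
```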
@en{
Once the installation is complete, make sure the system recognizes the GPU devices correctly.
The `nvidia-smi` command shows information about the GPUs installed on your system, as follows.
}
$ nvidia-smi
Wed Feb 14 09:43:48 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 387.26 Driver Version: 387.26 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla V100-PCIE... Off | 00000000:02:00.0 Off | 0 |
| N/A 41C P0 37W / 250W | 0MiB / 16152MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
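For scripted checks, the table above is awkward to parse. `nvidia-smi` also offers a CSV query mode, which the sketch below uses; the `summarize_gpus` helper and its output format are illustrative assumptions, not part of the NVIDIA tooling.

```shell
# summarize_gpus: hypothetical helper.
# Reads "index, name, driver_version, memory.total" CSV lines on stdin,
# prints one summary line per GPU, and fails if no GPU line was seen.
summarize_gpus() {
    awk -F', *' '
        NF >= 4 { printf "GPU%s: %s (driver %s, %s)\n", $1, $2, $3, $4; n++ }
        END     { exit (n > 0 ? 0 : 1) }
    '
}

if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=index,name,driver_version,memory.total \
               --format=csv,noheader | summarize_gpus ||
        echo "nvidia-smi reported no usable GPU" >&2
else
    echo "nvidia-smi is not installed" >&2
fi
```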
@en{
!!! Tip
    If the nouveau driver, which conflicts with the nvidia driver, is loaded, the system cannot load the nvidia driver immediately.
    In this case, configure the system to disable the nouveau driver, then reboot the operating system.
    The runfile installer of CUDA Toolkit disables the nouveau driver by itself. With RPM installation, apply the configuration below.
}
@en{
To disable the nouveau driver, put the following configuration into `/etc/modprobe.d/disable-nouveau.conf`, then run the `dracut` command to apply it to the boot image of the Linux kernel.
}
# cat > /etc/modprobe.d/disable-nouveau.conf <<EOF
blacklist nouveau
options nouveau modeset=0
EOF
# dracut -f
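After the reboot, you may want to confirm that nouveau is really gone. A minimal sketch, assuming the standard `/proc/modules` layout in which the module name is the first column; `module_listed` is a hypothetical helper.

```shell
# module_listed: hypothetical helper.
# Succeeds if the named kernel module appears in an lsmod-style listing
# read from stdin (module name in the first column).
module_listed() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

if [ -r /proc/modules ]; then
    if module_listed nouveau < /proc/modules; then
        echo "nouveau is still loaded; reboot after running dracut -f"
    else
        echo "nouveau is not loaded"
    fi
fi
```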
@en{
This section describes PostgreSQL installation with RPM.
We do not cover installation from source because many documents already describe that approach, and the `./configure` script has many options.
}
@en{
PostgreSQL is also distributed in the packages of Linux distributions; however, it is usually not the latest version, and is often older than the versions PG-Strom supports. For example, Red Hat Enterprise Linux 7.x and CentOS 7.x distribute the PostgreSQL v9.2.x series, which the PostgreSQL community has already declared end-of-life (EOL).
}
@en{
The PostgreSQL Global Development Group provides a yum repository to distribute the latest PostgreSQL and related packages.
As with EPEL, you install a small package that sets up the yum repository, then install PostgreSQL and related software.
}
@en{
Here is the list of yum repository definitions: http://yum.postgresql.org/repopackages.php.
Repository definitions are provided per PostgreSQL major version and Linux distribution. Choose the one for your Linux distribution and for PostgreSQL v9.6 or later.
}
@en{
All you need to install are the yum repository definition and the PostgreSQL packages. If you choose PostgreSQL v10, the packages below are required to install PG-Strom.
}
$ sudo yum install -y https://download.postgresql.org/pub/repos/yum/10/redhat/rhel-7-x86_64/pgdg-redhat10-10-2.noarch.rpm
$ sudo yum install -y postgresql10-server postgresql10-devel
:
================================================================================
Package Arch Version Repository Size
================================================================================
Installing:
postgresql10-devel x86_64 10.2-1PGDG.rhel7 pgdg10 2.0 M
postgresql10-server x86_64 10.2-1PGDG.rhel7 pgdg10 4.4 M
Installing for dependencies:
postgresql10 x86_64 10.2-1PGDG.rhel7 pgdg10 1.5 M
postgresql10-libs x86_64 10.2-1PGDG.rhel7 pgdg10 354 k
Transaction Summary
================================================================================
Install 2 Packages (+2 Dependent packages)
:
Installed:
postgresql10-devel.x86_64 0:10.2-1PGDG.rhel7
postgresql10-server.x86_64 0:10.2-1PGDG.rhel7
Dependency Installed:
postgresql10.x86_64 0:10.2-1PGDG.rhel7
postgresql10-libs.x86_64 0:10.2-1PGDG.rhel7
Complete!
@en{
The RPM packages provided by the PostgreSQL Global Development Group install software under the `/usr/pgsql-<version>` directory, so make sure the PATH environment variable is configured appropriately.
The `postgresql-alternatives` package sets up symbolic links to the related commands under `/usr/local/bin`, which simplifies operations. It also lets you switch the target version with the `alternatives` command when multiple versions of PostgreSQL are installed.
}
$ sudo yum install postgresql-alternatives
:
Resolving Dependencies
--> Running transaction check
---> Package postgresql-alternatives.noarch 0:1.0-1.el7 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
:
================================================================================
Package Arch Version Repository Size
================================================================================
Installing:
postgresql-alternatives noarch 1.0-1.el7 heterodb 9.2 k
Transaction Summary
================================================================================
:
Installed:
postgresql-alternatives.noarch 0:1.0-1.el7
Complete!
@en{
This section describes the steps to install PG-Strom.
We recommend RPM installation, but also describe the steps to build PG-Strom from the source code.
}
@en{
PG-Strom and related packages are distributed from HeteroDB Software Distribution Center.
If the repository definition has already been added, only a few steps remain.
}
@en{
We provide a separate RPM package of PG-Strom for each base PostgreSQL version. The `pg_strom-PG96` package is built for PostgreSQL 9.6, and `pg_strom-PG10` is built for PostgreSQL v10.
}
$ sudo yum install pg_strom-PG10
:
================================================================================
Package Arch Version Repository Size
================================================================================
Installing:
pg_strom-PG10 x86_64 1.9-180301.el7 heterodb 320 k
Transaction Summary
================================================================================
:
Installed:
pg_strom-PG10.x86_64 0:1.9-180301.el7
Complete!
@en{
That’s all for package installation.
}
@en{
For developers, we also describe the steps to build and install PG-Strom from the source code.
}
@en{
As with the RPM packages, you can download a tarball of the source code from the HeteroDB Software Distribution Center.
However, tarball releases lag behind development, so it may be preferable to check out the master branch of PG-Strom on GitHub to use the latest development version.
}
$ git clone https://github.com/heterodb/pg-strom.git
Cloning into 'pg-strom'...
remote: Counting objects: 13797, done.
remote: Compressing objects: 100% (215/215), done.
remote: Total 13797 (delta 208), reused 339 (delta 167), pack-reused 13400
Receiving objects: 100% (13797/13797), 11.81 MiB | 1.76 MiB/s, done.
Resolving deltas: 100% (10504/10504), done.
@en{
The build configuration of PG-Strom must strictly match that of the target PostgreSQL. For example, if a particular `struct` ends up with an inconsistent layout because of a build-time configuration difference, it may cause serious bugs that are hard to track down.
To avoid such inconsistency, PG-Strom has no configure script of its own; instead, it references the build configuration of PostgreSQL through the `pg_config` command.
If the PATH environment variable leads to the `pg_config` command of the target PostgreSQL, just run `make` and `make install`.
Otherwise, pass the `PG_CONFIG=...` parameter to the `make` command to specify the full path of the `pg_config` command.
}
$ cd pg-strom
$ make PG_CONFIG=/usr/pgsql-10/bin/pg_config
$ sudo make install PG_CONFIG=/usr/pgsql-10/bin/pg_config
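The selection rule described above (an explicit `PG_CONFIG` wins, otherwise whatever `pg_config` is first on PATH) can be sketched as a small check to run before `make`. `resolve_pg_config` is a hypothetical helper, not something the PG-Strom Makefile provides.

```shell
# resolve_pg_config: hypothetical helper.
# Prints the pg_config path a build would use: an explicit PG_CONFIG
# wins, otherwise the first pg_config found on PATH.
resolve_pg_config() {
    if [ -n "${PG_CONFIG:-}" ]; then
        printf '%s\n' "$PG_CONFIG"
    else
        command -v pg_config
    fi
}

if cfg=$(resolve_pg_config); then
    echo "building against: $cfg"
    # e.g. make PG_CONFIG="$cfg" && sudo make install PG_CONFIG="$cfg"
else
    echo "pg_config not found; pass PG_CONFIG=/path/to/pg_config to make" >&2
fi
```

Printing the resolved path first makes it harder to build against the wrong PostgreSQL by accident when several versions are installed.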
@en{
The database cluster has not been constructed yet, so run the `initdb` command to set up the initial database of PostgreSQL.
The default path of the database cluster on RPM installation is `/var/lib/pgsql/<version number>/data`.
If you installed the `postgresql-alternatives` package, this default path can be referenced as `/var/lib/pgdata` regardless of the PostgreSQL version.
}
$ sudo su - postgres
$ initdb -D /var/lib/pgdata/
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.
The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".
Data page checksums are disabled.
fixing permissions on existing directory /var/lib/pgdata ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok
WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
Success. You can now start the database server using:
pg_ctl -D /var/lib/pgdata/ -l logfile start
@en{
Next, edit `postgresql.conf`, the configuration file of PostgreSQL.
At a minimum, the parameters below must be edited for PG-Strom to work.
Review other parameters according to the usage of the system and the expected workloads.
}
@en{
- `shared_preload_libraries`: PG-Strom must be loaded through `shared_preload_libraries` when the server starts; it cannot be loaded on demand. Therefore, you must add the configuration below.
    - `shared_preload_libraries = '$libdir/pg_strom'`
- `max_worker_processes`
    - `max_worker_processes = 100`
- `shared_buffers`: The default value is too small for the data sizes PG-Strom is meant to handle, so storage workloads limit overall performance and the GPU may not be used efficiently.
    - `shared_buffers = 10GB`
- `work_mem`: The default value is too small for the optimal query execution plan to be chosen on analytic queries.
    - `work_mem = 1GB`
}
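One way to apply the four settings together is to append them to `postgresql.conf` in a single step. The sketch below writes to a scratch file unless a path is supplied; the `PGCONF` variable and the `append_pg_strom_settings` helper are assumptions made for illustration only.

```shell
# append_pg_strom_settings: hypothetical helper.
# Appends the PG-Strom related settings listed above to the given file.
append_pg_strom_settings() {
    # The quoted 'EOF' keeps $libdir literal instead of expanding it.
    cat >> "$1" <<'EOF'

# --- PG-Strom ---
shared_preload_libraries = '$libdir/pg_strom'
max_worker_processes = 100
shared_buffers = 10GB
work_mem = 1GB
EOF
}

# PGCONF is an assumed variable; without it a scratch file is used so
# the sketch can be tried without touching a real cluster.
conf=${PGCONF:-$(mktemp)}
append_pg_strom_settings "$conf"
grep -E '^(shared_preload_libraries|max_worker_processes|shared_buffers|work_mem) ' "$conf"
```

When a parameter appears more than once in `postgresql.conf`, PostgreSQL uses the last occurrence, so appended lines take precedence over earlier ones. The values are the ones given in this document and should still be sized for your own hardware.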
@en{
SSD-to-GPU Direct SQL in particular tries to open many files simultaneously, so the resource limit on the number of file descriptors per process should be raised.
We also recommend not limiting the core file size, so that a core dump of PostgreSQL is reliably generated on a crash.
}
@en{
If the PostgreSQL service is launched by systemd, you can put the resource limit configuration in `/etc/systemd/system/postgresql-XX.service.d/pg_strom.conf`.
RPM installation sets up the configuration below by default.
The line that sets the environment variable `CUDA_ENABLE_COREDUMP_ON_EXCEPTION` is commented out. This is a developer option that, if enabled, generates a GPU core dump on any CUDA/GPU level error. See CUDA-GDB:GPU core dump support for more details.
}
[Service]
LimitNOFILE=65536
LimitCORE=infinity
#Environment=CUDA_ENABLE_COREDUMP_ON_EXCEPTION=1
@en{
Start the PostgreSQL service.
If PG-Strom is set up properly, it writes log messages showing the GPU devices it recognized.
In the example below, PG-Strom recognized a Tesla V100 (PCIe; 16GB edition) device.
}
# systemctl start postgresql-10
# systemctl status -l postgresql-10
* postgresql-10.service - PostgreSQL 10 database server
Loaded: loaded (/usr/lib/systemd/system/postgresql-10.service; disabled; vendor preset: disabled)
Active: active (running) since Sat 2018-03-03 15:45:23 JST; 2min 21s ago
Docs: https://www.postgresql.org/docs/10/static/
Process: 24851 ExecStartPre=/usr/pgsql-10/bin/postgresql-10-check-db-dir ${PGDATA} (code=exited, status=0/SUCCESS)
Main PID: 24858 (postmaster)
CGroup: /system.slice/postgresql-10.service
|-24858 /usr/pgsql-10/bin/postmaster -D /var/lib/pgsql/10/data/
|-24890 postgres: logger process
|-24892 postgres: bgworker: PG-Strom GPU memory keeper
|-24896 postgres: checkpointer process
|-24897 postgres: writer process
|-24898 postgres: wal writer process
|-24899 postgres: autovacuum launcher process
|-24900 postgres: stats collector process
|-24901 postgres: bgworker: PG-Strom ccache-builder2
|-24902 postgres: bgworker: PG-Strom ccache-builder1
`-24903 postgres: bgworker: logical replication launcher
Mar 03 15:45:19 saba.heterodb.com postmaster[24858]: 2018-03-03 15:45:19.195 JST [24858] HINT: Run 'nvidia-cuda-mps-control -d', then start server process. Check 'man nvidia-cuda-mps-control' for more details.
Mar 03 15:45:20 saba.heterodb.com postmaster[24858]: 2018-03-03 15:45:20.509 JST [24858] LOG: PG-Strom: GPU0 Tesla V100-PCIE-16GB (5120 CUDA cores; 1380MHz, L2 6144kB), RAM 15.78GB (4096bits, 856MHz), CC 7.0
Mar 03 15:45:20 saba.heterodb.com postmaster[24858]: 2018-03-03 15:45:20.510 JST [24858] LOG: NVRTC - CUDA Runtime Compilation vertion 9.1
Mar 03 15:45:23 saba.heterodb.com postmaster[24858]: 2018-03-03 15:45:23.378 JST [24858] LOG: listening on IPv6 address "::1", port 5432
Mar 03 15:45:23 saba.heterodb.com postmaster[24858]: 2018-03-03 15:45:23.378 JST [24858] LOG: listening on IPv4 address "127.0.0.1", port 5432
Mar 03 15:45:23 saba.heterodb.com postmaster[24858]: 2018-03-03 15:45:23.442 JST [24858] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
Mar 03 15:45:23 saba.heterodb.com postmaster[24858]: 2018-03-03 15:45:23.492 JST [24858] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
Mar 03 15:45:23 saba.heterodb.com postmaster[24858]: 2018-03-03 15:45:23.527 JST [24858] LOG: redirecting log output to logging collector process
Mar 03 15:45:23 saba.heterodb.com postmaster[24858]: 2018-03-03 15:45:23.527 JST [24858] HINT: Future log output will appear in directory "log".
Mar 03 15:45:23 saba.heterodb.com systemd[1]: Started PostgreSQL 10 database server.
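To pick the GPU recognition lines out of a long startup log, a filter along the following lines can help. It assumes the `LOG: PG-Strom: GPUn ...` message format shown above; `pg_strom_gpus` is a hypothetical helper name.

```shell
# pg_strom_gpus: hypothetical helper.
# Prints the device description from every "PG-Strom: GPUn ..." log line
# read on stdin, and fails if there is none.
pg_strom_gpus() {
    sed -n 's/^.*LOG: *PG-Strom: \(GPU[0-9][0-9]* .*\)$/\1/p' | grep .
}

if command -v journalctl >/dev/null 2>&1; then
    journalctl -u postgresql-10 --no-pager 2>/dev/null | pg_strom_gpus ||
        echo "no PG-Strom GPU line found in the journal" >&2
fi
```

An empty result usually means PG-Strom was not loaded, in which case `shared_preload_libraries` is the first setting to re-check.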
@en{
Finally, create the database objects related to PG-Strom, such as SQL functions.
These steps are packaged using the EXTENSION feature of PostgreSQL, so all you need to run is `CREATE EXTENSION` on the SQL command line.
}
@en{
Please note that this step is needed for each new database.
If you want PG-Strom to be pre-configured when a new database is created, create the PG-Strom extension in the `template1` database; its configuration is copied to each new database by the `CREATE DATABASE` command.
}
$ psql postgres -U postgres
psql (10.2)
Type "help" for help.
postgres=# CREATE EXTENSION pg_strom ;
CREATE EXTENSION
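The `template1` approach can be scripted with `psql -c`. In the sketch below, `create_pg_strom` is a hypothetical wrapper and `DRY_RUN` an assumed switch; by default it only prints the commands it would run, and `IF NOT EXISTS` keeps the statement safe to repeat.

```shell
# create_pg_strom: hypothetical wrapper around psql.
# Creates the pg_strom extension in each database named as an argument.
# With DRY_RUN=1 (the default here) the commands are only printed.
create_pg_strom() {
    for db in "$@"; do
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "psql -U postgres -d $db -c 'CREATE EXTENSION IF NOT EXISTS pg_strom'"
        else
            psql -U postgres -d "$db" -c 'CREATE EXTENSION IF NOT EXISTS pg_strom'
        fi
    done
}

# template1 covers databases created later; postgres covers the
# database used in the example above.
create_pg_strom template1 postgres
```

Set `DRY_RUN=0` to execute the statements for real. Databases that already existed before `template1` was changed still need the extension created individually.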
@en{
That’s all for the installation.
}
@en{
This section also introduces the NVME-Strom Linux kernel module. Although it is an independent software module, it works closely with core features of PG-Strom such as SSD-to-GPU Direct SQL Execution.
}
@en{
Like other PG-Strom related modules, NVME-Strom is distributed at the [HeteroDB Software Distribution Center](https://heterodb.github.io/swdc/) as free software. Note that it is free of charge, but it is not open source software.
If the `heterodb-swdc` package is already set up on your system, the `yum install` command downloads the RPM file and installs the `nvme_strom` package.
}
$ sudo yum install nvme_strom
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
* base: mirrors.cat.net
* epel: ftp.iij.ad.jp
* extras: mirrors.cat.net
* ius: mirrors.kernel.org
* updates: mirrors.cat.net
Resolving Dependencies
--> Running transaction check
---> Package nvme_strom.x86_64 0:1.3-1.el7 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
================================================================================
Package Arch Version Repository Size
================================================================================
Installing:
nvme_strom x86_64 1.3-1.el7 heterodb 273 k
Transaction Summary
================================================================================
Install 1 Package
Total download size: 273 k
Installed size: 1.5 M
Is this ok [y/d/N]: y
Downloading packages:
No Presto metadata available for heterodb
nvme_strom-1.3-1.el7.x86_64.rpm | 273 kB 00:00
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Installing : nvme_strom-1.3-1.el7.x86_64 1/1
:
<snip>
:
DKMS: install completed.
Verifying : nvme_strom-1.3-1.el7.x86_64 1/1
Installed:
nvme_strom.x86_64 0:1.3-1.el7
Complete!
@en{
License activation is needed to use all the features of the NVME-Strom module provided by HeteroDB,Inc. You can operate the system without a license, but the features below are restricted.
}
@en{
You can obtain a license file from HeteroDB,Inc. It is a plain-text file like the one below.
}
IAgIVdKxhe+BSer3Y67jQW0+uTzYh00K6WOSH7xQ26Qcw8aeUNYqJB9YcKJTJb+QQhjmUeQpUnboNxVwLCd3HFuLXeBWMKp11/BgG0FSrkUWu/ZCtDtw0F1hEIUY7m767zAGV8y+i7BuNXGJFvRlAkxdVO3/K47ocIgoVkuzBfLvN/h9LffOydUnHPzrFHfLc0r3nNNgtyTrfvoZiXegkGM9GBTAKyq8uWu/OGonh9ybzVKOgofhDLk0rVbLohOXDhMlwDl2oMGIr83tIpCWG+BGE+TDwsJ4n71Sv6n4bi/ZBXBS498qShNHDGrbz6cNcDVBa+EuZc6HzZoF6UrljEcl=
----
VERSION:2
SERIAL_NR:HDB-TRIAL
ISSUED_AT:2019-05-09
EXPIRED_AT:2019-06-08
GPU_UUID:GPU-a137b1df-53c9-197f-2801-f2dccaf9d42f
@en{
Copy the license file to `/etc/heterodb.license`, then restart PostgreSQL.
The startup log messages of PostgreSQL include the license information, which confirms that the license was activated successfully.
}
$ pg_ctl restart
:
LOG: PG-Strom version 2.2 built for PostgreSQL 11
LOG: PG-Strom: GPU0 Tesla P40 (3840 CUDA cores; 1531MHz, L2 3072kB), RAM 22.38GB (384bits, 3.45GHz), CC 6.1
:
LOG: HeteroDB License: { "version" : 2, "serial_nr" : "HDB-TRIAL", "issued_at" : "9-May-2019", "expired_at" : "8-Jun-2019", "gpus" : [ { "uuid" : "GPU-a137b1df-53c9-197f-2801-f2dccaf9d42f", "pci_id" : "0000:02:00.0" } ] }
LOG: listening on IPv6 address "::1", port 5432
LOG: listening on IPv4 address "127.0.0.1", port 5432
:
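Because the license file carries its validity period in a plain-text trailer, you can check the expiry date without restarting PostgreSQL. The sketch below assumes the `KEY:VALUE` layout shown in the sample license and treats the `EXPIRED_AT` day itself as still valid, which is an assumption; both helper names are hypothetical.

```shell
# license_field: hypothetical helper.
# Prints the value of one KEY:VALUE field from the plain-text trailer.
# $1 = license file, $2 = key (for example EXPIRED_AT)
license_field() {
    awk -F: -v k="$2" '$1 == k { sub(/^[^:]*:/, ""); print; exit }' "$1"
}

# license_valid_on: hypothetical helper.
# Succeeds if the given day (YYYY-MM-DD, default: today) is not later
# than EXPIRED_AT. Treating the expiry day as valid is an assumption.
license_valid_on() {
    expires=$(license_field "$1" EXPIRED_AT)
    [ -n "$expires" ] || return 1
    today=${2:-$(date +%Y-%m-%d)}
    # Dropping the dashes turns ISO dates into comparable integers.
    [ "$(printf '%s' "$today" | tr -d '-')" -le \
      "$(printf '%s' "$expires" | tr -d '-')" ]
}

lic=/etc/heterodb.license
if [ -r "$lic" ]; then
    if license_valid_on "$lic"; then
        echo "license valid until $(license_field "$lic" EXPIRED_AT)"
    else
        echo "license has no expiry date or has already expired" >&2
    fi
fi
```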
@en{
The NVME-Strom Linux kernel module has the following parameters.

| Parameter | Type | Default | Description |
|---|---|---|---|
| `verbose` | int | 0 | Enables detailed debug output |
| `fast_ssd_mode` | int | 0 | Operating mode for fast NVME-SSD |
| `p2p_dma_max_depth` | int | 1024 | Maximum number of asynchronous P2P DMA requests that can be enqueued on the I/O-queue of an NVME device |
| `p2p_dma_max_unitsz` | int | 256 | Maximum length of data blocks, in kB, to be read by a single P2P DMA request |
}
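On Linux, a loaded module's parameters are normally visible under `/sys/module/<module>/parameters` when the module exports them. Whether NVME-Strom exposes all four parameters there is an assumption of this sketch, as are the `show_module_params` helper and its optional second argument (an alternative sysfs root, useful for testing).

```shell
# show_module_params: hypothetical helper.
# Prints "name = value" for every parameter exposed by a loaded module.
# $1 = module name, $2 = sysfs module root (default /sys/module)
show_module_params() {
    dir=${2:-/sys/module}/$1/parameters
    if [ ! -d "$dir" ]; then
        echo "$1: no $dir (module not loaded, or no exported parameters)" >&2
        return 1
    fi
    for f in "$dir"/*; do
        if [ -r "$f" ]; then
            printf '%s = %s\n' "${f##*/}" "$(cat "$f")"
        fi
    done
}

show_module_params nvme_strom || true

# A persistent setting is usually made with an "options" line in a
# modprobe.d file; the file name below is an assumption:
#   echo "options nvme_strom fast_ssd_mode=1" | sudo tee /etc/modprobe.d/nvme_strom.conf
```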
@en{
Here is an extra explanation of the `fast_ssd_mode` parameter.
When the NVME-Strom Linux kernel module gets a request for SSD-to-GPU direct data transfer, it first checks whether the required data blocks are cached in the page cache of the operating system.
If `fast_ssd_mode` is `0`, NVME-Strom first copies the cached data blocks from the page cache to the userspace buffer of the caller, then tells the application to invoke a normal host->device data transfer through the CUDA API. This is suitable for NVME-SSDs that are not especially fast, such as PCIe x4 grade devices.
On the other hand, if you use a fast PCIe x8 grade NVME-SSD or multiple SSDs in striping mode, SSD-to-GPU direct data transfer may be faster than a normal host->device data transfer after the buffer copy. If `fast_ssd_mode` is not `0`, NVME-Strom starts SSD-to-GPU direct data transfer regardless of the page cache state.
However, it never starts SSD-to-GPU direct data transfer if the page cache is dirty.
}
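The rules above amount to a small decision table. The function below is only a reading of this prose, not the kernel module's actual code; in particular, treating uncached blocks as always going direct is an assumption, and `choose_transfer` is a hypothetical name.

```shell
# choose_transfer: hypothetical illustration of the rules in the text.
# Usage: choose_transfer CACHED DIRTY FAST_SSD_MODE  (each 0 or non-zero)
# Prints "direct" for SSD-to-GPU direct transfer, or "host-copy" for a
# copy to the userspace buffer followed by a host->device transfer.
choose_transfer() {
    cached=$1 dirty=$2 mode=$3
    if [ "$cached" -eq 0 ]; then
        echo direct       # assumption: blocks not in the page cache go direct
    elif [ "$dirty" -ne 0 ]; then
        echo host-copy    # a dirty page cache is never bypassed
    elif [ "$mode" -ne 0 ]; then
        echo direct       # fast SSD: go direct even though a clean copy is cached
    else
        echo host-copy    # fast_ssd_mode=0: reuse the cached copy
    fi
}

choose_transfer 1 0 0
choose_transfer 1 0 1
```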
@en{
Here is an extra explanation of the `p2p_dma_max_depth` parameter.
The NVME-Strom Linux kernel module creates DMA requests for SSD-to-GPU direct data transfer, then enqueues them to the I/O-queue of the source NVME devices.
When more asynchronous DMA requests are enqueued than the NVME device can handle, the latency of individual DMA requests becomes very long because the NVME-SSD controller processes DMA requests in order of arrival. (On the other hand, throughput is maximized because the NVME-SSD controller receives DMA requests continuously.)
If the turn-around time of a DMA request is too long, it may be wrongly treated as an error, leading to a timeout of the I/O request and an error status. Thus, it makes no sense to enqueue more DMA requests than the number of pending requests needed to keep the NVME device fully busy.
The `p2p_dma_max_depth` parameter controls the number of asynchronous P2P DMA requests that can be enqueued at once per NVME device. If an application tries to enqueue more DMA requests than this limit, the caller thread blocks until running DMA requests complete. This avoids unintentionally overloading the NVME devices.
}
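Taken together with `p2p_dma_max_unitsz`, the depth limit bounds how much data can be covered by queued requests on one device. A short worked calculation with the default values from the parameter table, assuming 1 MB = 1024 kB:

```shell
# Upper bound on data covered by queued P2P DMA requests per NVME device,
# using the default values from the parameter table.
p2p_dma_max_depth=1024     # requests per device
p2p_dma_max_unitsz=256     # kB per request

max_kb=$((p2p_dma_max_depth * p2p_dma_max_unitsz))
echo "at most ${max_kb} kB ($((max_kb / 1024)) MB) queued per device"
```

With the defaults this is 262144 kB, or 256 MB, per device. It is an upper bound only, since individual requests may be shorter than the maximum unit size.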