当前位置: 首页 > 工具软件 > MonetDB > 使用案例 >

The column-store pioneer | MonetDB

程谦
2023-12-01
MonetDB是一个开源的列存储数据库产品, 起源于阿姆斯特丹大学1990年的一个研究项目MAGNUM, 2002年一位学生以该项目为基础的博士论文并以Monet命名. 第一个版本是2004年9月30日发布的. 
MonetDB的架构, 包含3个层次.
第一层top layer, 负责查询接口, 输出MAL(MonetDB Assembly Language)指令.
第二层backend 层. 接收MAL指令, 负责基于成本的优化.
第三层database kernel层. 为数据库内核层, 负责数据访问和存储Binary Association Tables(BATs). 
MonetDB可以利用CPU cache, 提高效率.
MonetDB architecture is represented in three layers, each with its own set of optimizers.[10] The front-end is the top layer, providing query interfaces for SQL, SciQL, SPARQL and JAQL.[11] Queries are parsed into domain-specific representations, like relational algebra for SQL, and optimized. The generated logical execution plans are then translated into MonetDB Assembly Language (MAL) instructions, which are passed to the next layer. The middle or back-end layer provides a number of cost-based optimizers for the MAL. The bottom layer is the database kernel, which provides access to the data stored in Binary Association Tables (BATs). Each BAT is a table consisting of an Object-identifier and value columns, representing a single column in the database.[10]

MonetDB internal data representation also relies on the memory addressing ranges of contemporary CPUs using demand paging of memory mapped files, and thus departing from traditional DBMS designs involving complex management of large data stores in limited memory.


MonetDB的扩展插件, 
包括基于SQL:2003标准的SQL插件, 空间几何插件, 科学应用插件, 数据分析插件等.
A number of extensions exist for MonetDB that extend the functionality of the database engine. Due to the three-layer architecture, top-level query interfaces can benefit from optimizations one in the backend and kernel layers.

SQL[edit]
MonetDB/SQL is a top-level extension, which provides complete support for transactions in compliance with the SQL:2003 standard.[10]

GIS[edit]
MonetDB/GIS is an extension to MonetDB/SQL with support for the Simple Features standard of Open Geospatial Consortium (OGS).[1]

SciQL[edit]
SciQL an SQL-based query language for science applications with arrays as first class citizens. SciQL allows MonetDB to effectively function as an array database. It is used in the SciLens and will be further extended for the Human Brain Project.[12]

DataCell[edit]
MonetDB/DataCell adds stream processing facilities on top of the column-store architecture of MonetDB. It provides facilities for data analysis of-the-fly with the database system itself.[3] [10]

RDF/SPARQL[edit]
MonetDB/RDF is a SPARQL-based extension for working with linked data, which adds support for RDF and allowing MonetDB to function as a triplestore. Under development for the Linked Open Data 2 project.[2]

MonetDB还支持多核并行查询, 多主机集群, 数据压缩, 主键, 外键, 等. 

PostgreSQL连接MonetDB的fdw.

更多文档参考

MonetDB在CentOS 6.x x64中的安装 : 
下载最新的稳定版本源码 : 
查看配置帮助.
# wget https://www.monetdb.org/downloads/sources/Jan2014-SP3/MonetDB-11.17.21.tar.bz2
# tar -jxvf MonetDB-11.17.21.tar.bz2
# cd MonetDB-11.17.21
# ./configure --help
`configure' configures MonetDB 11.17.21 to adapt to many kinds of systems.

Usage: ./configure [OPTION]... [VAR=VALUE]...

To assign environment variables (e.g., CC, CFLAGS...), specify them as
VAR=VALUE.  See below for descriptions of some of the useful variables.

Defaults for the options are specified in brackets.

Configuration:
  -h, --help              display this help and exit
      --help=short        display options specific to this package
      --help=recursive    display the short help of all the included packages
  -V, --version           display version information and exit
  -q, --quiet, --silent   do not print `checking ...' messages
      --cache-file=FILE   cache test results in FILE [disabled]
  -C, --config-cache      alias for `--cache-file=config.cache'
  -n, --no-create         do not create output files
      --srcdir=DIR        find the sources in DIR [configure dir or `..']

Installation directories:
  --prefix=PREFIX         install architecture-independent files in PREFIX
                          [/usr/local]
  --exec-prefix=EPREFIX   install architecture-dependent files in EPREFIX
                          [PREFIX]

By default, `make install' will install all the files in
`/usr/local/bin', `/usr/local/lib' etc.  You can specify
an installation prefix other than `/usr/local' using `--prefix',
for instance `--prefix=$HOME'.

For better control, use the options below.

Fine tuning of the installation directories:
  --bindir=DIR            user executables [EPREFIX/bin]
  --sbindir=DIR           system admin executables [EPREFIX/sbin]
  --libexecdir=DIR        program executables [EPREFIX/libexec]
  --sysconfdir=DIR        read-only single-machine data [PREFIX/etc]
  --sharedstatedir=DIR    modifiable architecture-independent data [PREFIX/com]
  --localstatedir=DIR     modifiable single-machine data [PREFIX/var]
  --libdir=DIR            object code libraries [EPREFIX/lib]
  --includedir=DIR        C header files [PREFIX/include]
  --oldincludedir=DIR     C header files for non-gcc [/usr/include]
  --datarootdir=DIR       read-only arch.-independent data root [PREFIX/share]
  --datadir=DIR           read-only architecture-independent data [DATAROOTDIR]
  --infodir=DIR           info documentation [DATAROOTDIR/info]
  --localedir=DIR         locale-dependent data [DATAROOTDIR/locale]
  --mandir=DIR            man documentation [DATAROOTDIR/man]
  --docdir=DIR            documentation root [DATAROOTDIR/doc/MonetDB]
  --htmldir=DIR           html documentation [DOCDIR]
  --dvidir=DIR            dvi documentation [DOCDIR]
  --pdfdir=DIR            pdf documentation [DOCDIR]
  --psdir=DIR             ps documentation [DOCDIR]

Program names:
  --program-prefix=PREFIX            prepend PREFIX to installed program names
  --program-suffix=SUFFIX            append SUFFIX to installed program names
  --program-transform-name=PROGRAM   run sed PROGRAM on installed program names

System types:
  --build=BUILD     configure for building on BUILD [guessed]
  --host=HOST       cross-compile to build programs to run on HOST [BUILD]
  --target=TARGET   configure for building compilers for TARGET [HOST]

Optional Features:
  --disable-option-checking  ignore unrecognized --enable/--with options
  --disable-FEATURE       do not include FEATURE (same as --enable-FEATURE=no)
  --enable-FEATURE[=ARG]  include FEATURE [ARG=yes]
  --enable-silent-rules   less verbose build output (undo: "make V=1")
  --disable-silent-rules  verbose build output (undo: "make V=0")
  --enable-gdk            enable support for GDK (default=auto)
  --enable-monetdb5       enable support for MonetDB5 (default=auto)
  --enable-rdf            enable support for RDF [experimental/unsupported]
                          (default=no)
  --enable-datacell       enable datacell stream components
                          [experimental/unsupported] (default=no)
  --enable-fits           enable support for FITS (default=no)
  --enable-sql            enable support for MonetDB/SQL (default=auto)
  --enable-geom           enable support for geom module (default=auto)
  --enable-jaql           enable support for MonetDB/JAQL (default=auto)
  --enable-gsl            enable support for GSL (default=no)
  --enable-odbc           compile the MonetDB ODBC driver (default=auto)
  --enable-testing        enable support for testing (default=auto)
  --enable-developer      enable support for MonetDB development (default=yes
                          for development sources)
  --enable-console        enables direct console on the server (involves
                          security risks) (default=yes)
  --enable-jdbc           build the MonetDB JDBC driver
  --enable-merocontrol    build the Merovingian control driver
  --enable-static-analysis
                          configure for static code analysis (use only if you
                          know what you are doing)
  --enable-static[=PKGS]  build static libraries [default=no]
  --enable-shared[=PKGS]  build shared libraries [default=yes]
  --enable-dependency-tracking
                          do not reject slow dependency extractors
  --disable-dependency-tracking
                          speeds up one-time build
  --disable-largefile     omit support for large files
  --enable-bits           obsolete: specify by setting CC in the environment
  --enable-oid32          use 32 bits for OIDs on a 64-bit architecture
  --enable-strict         enable strict compiler flags (default=yes for
                          development sources)
  --enable-fast-install[=PKGS]
                          optimize for fast installation [default=yes]
  --disable-libtool-lock  avoid locking (might break parallel builds)
  --enable-debug          enable full debugging (default=yes for development
                          sources)
  --enable-assert         enable assertions in the code (default=yes for
                          development sources)
  --enable-optimize       enable extra optimization (default=no)
  --enable-profile        enable profiling (default=no)
  --enable-instrument     enable instrument (default=no)
  --disable-rpath         do not hardcode runtime library paths

Optional Packages:
  --with-PACKAGE[=ARG]    use PACKAGE [ARG=yes]
  --without-PACKAGE       do not use PACKAGE (same as --with-PACKAGE=no)
  --with-password-backend=HASHALG
                          password hash algorithm, one of MD5, SHA1,
                          RIPEMD160, SHA224, SHA256, SHA384, SHA512, defaults
                          to SHA512
  --with-logdir=DIR       Where to put log files (LOCALSTATEDIR/log/monetdb/)
  --with-rundir=DIR       Where to put pid files (LOCALSTATEDIR/run/monetdb/)
  --with-bits=BITS        obsolete: specify by setting CC in the environment
  --with-pic[=PKGS]       try to use only PIC/non-PIC objects [default=use
                          both]
  --with-gnu-ld           assume the C compiler uses GNU ld [default=no]
  --with-sysroot=DIR Search for dependent libraries within DIR
                        (or the compiler's sysroot if not specified).
  --with-translatepath=PROG
                          program to translate paths from configure-time
                          format to execute-time format. Take care that this
                          program can be given paths like ${prefix}/etc which
                          should be translated carefully.
  --with-anttranslatepath=PROG
                          program to translate paths from configure-time
                          format to a format that can be given to the ant
                          program (default: 'readlink -f' or value for
                          --with-translatepath)
  --with-perl=FILE        perl is installed as FILE
  --with-perl-libdir=DIR  relative path for Perl library directory (where Perl
                          modules should be installed)
  --with-python2=FILE     python2 is installed as FILE
  --with-python3=FILE     python3 is installed as FILE
  --with-python2-libdir=DIR
                          relative path for Python 2 library directory (where
                          Python 2 modules should be installed)
  --with-python3-libdir=DIR
                          relative path for Python 3 library directory (where
                          Python 3 modules should be installed)
  --with-rubygem-dir=DIR  Ruby gems are installed in DIR
  --with-rubygem=FILE     ruby gem is installed as FILE
  --with-ant=FILE         ant is installed as FILE
  --with-java=DIR         java, javac, jar and javadoc are installed in
                          DIR/bin
  --with-bz2=DIR          bz2 library is installed in DIR
  --with-pthread=DIR      pthread library is installed in DIR
  --with-readline=DIR     readline library is installed in DIR
  --with-gnu-ld           assume the C compiler uses GNU ld [default=no]
  --with-libiconv-prefix[=DIR]  search for libiconv in DIR/include and DIR/lib
  --without-libiconv-prefix     don't search for libiconv in includedir and libdir
  --with-valgrind         include valgrind support (default=no)
  --with-sphinxclient=DIR sphinxclient library is installed in DIR
  --with-unixodbc=DIR     unixODBC library is installed in DIR
  --with-mseed=DIR        mseed library is installed in DIR
  --with-geos=DIR         geos library is installed in DIR
  --with-hwcounters=DIR   hwcounters library is installed in DIR

Some influential environment variables:
  PKG_CONFIG  path to pkg-config utility
  PKG_CONFIG_PATH
              directories to add to pkg-config's search path
  PKG_CONFIG_LIBDIR
              path overriding pkg-config's built-in search path
  CC          C compiler command
  CFLAGS      C compiler flags
  LDFLAGS     linker flags, e.g. -L<lib dir> if you have libraries in a
              nonstandard directory <lib dir>
  LIBS        libraries to pass to the linker, e.g. -l<library>
  CPPFLAGS    (Objective) C/C++ preprocessor flags, e.g. -I<include dir> if
              you have headers in a nonstandard directory <include dir>
  CPP         C preprocessor
  YACC        The `Yet Another Compiler Compiler' implementation to use.
              Defaults to the first program found out of: `bison -y', `byacc',
              `yacc'.
  YFLAGS      The list of arguments that will be passed by default to $YACC.
              This script will default YFLAGS to the empty string to avoid a
              default value of `-d' given by some make applications.
  openssl_CFLAGS
              C compiler flags for openssl, overriding pkg-config
  openssl_LIBS
              linker flags for openssl, overriding pkg-config
  pcre_CFLAGS C compiler flags for pcre, overriding pkg-config
  pcre_LIBS   linker flags for pcre, overriding pkg-config
  libxml2_CFLAGS
              C compiler flags for libxml2, overriding pkg-config
  libxml2_LIBS
              linker flags for libxml2, overriding pkg-config
  raptor_CFLAGS
              C compiler flags for raptor, overriding pkg-config
  raptor_LIBS linker flags for raptor, overriding pkg-config
  curl_CFLAGS C compiler flags for curl, overriding pkg-config
  curl_LIBS   linker flags for curl, overriding pkg-config
  zlib_CFLAGS C compiler flags for zlib, overriding pkg-config
  zlib_LIBS   linker flags for zlib, overriding pkg-config
  valgrind_CFLAGS
              C compiler flags for valgrind, overriding pkg-config
  valgrind_LIBS
              linker flags for valgrind, overriding pkg-config
  cfitsio_CFLAGS
              C compiler flags for cfitsio, overriding pkg-config
  cfitsio_LIBS
              linker flags for cfitsio, overriding pkg-config
  atomic_ops_CFLAGS
              C compiler flags for atomic_ops, overriding pkg-config
  atomic_ops_LIBS
              linker flags for atomic_ops, overriding pkg-config
  gsl_CFLAGS  C compiler flags for gsl, overriding pkg-config
  gsl_LIBS    linker flags for gsl, overriding pkg-config

Use these variables to override the choices made by `configure' or to help
it to find libraries and programs with nonstandard names/locations.

Report bugs to <info@monetdb.org>.

配置, 这里的配置选项包含了大量的插件, 如果中间遇到依赖关系的报错, 安装解决对应的缺失库即可 : 
# ./configure --prefix=/opt/monetdb11.17.21 --enable-gdk --enable-monetdb5 --enable-rdf --enable-datacell --enable-fits --enable-sql --enable-geom --enable-jaql --enable-gsl --enable-odbc --enable-testing --enable-console --enable-jdbc --enable-merocontrol

报错
checking for ant... no
checking for java... /usr/bin/java
checking for javac... no
checking for jar... no
checking for javadoc... no
configure: error: MonetDB JDBC requires ant and Java
解决
# yum install -y ant


报错
checking for pcre... no
configure: error: PCRE library not found but required for MonetDB5
解决
# yum install -y pcre pcre-devel


报错
checking for raptor... no
configure: error: raptor library required for RDF support
解决
# yum install -y raptor-devel


报错
checking for curl... no
checking for zlib... yes
checking sphinxclient.h usability... no
checking sphinxclient.h presence... no
checking for sphinxclient.h... no
checking odbcinst.h usability... no
checking odbcinst.h presence... no
checking for odbcinst.h... no
configure: error: unixODBC required for building ODBC driver
解决
# yum install -y curl-devel unixODBC-devel.x86_64


报错
checking libmseed.h usability... no
checking libmseed.h presence... no
checking for libmseed.h... no
checking for geos-config... no
configure: error: geos library required for geom module
解决, 需要安装geos
http://trac.osgeo.org/geos/
# yum install -y gcc-c++
# wget http://download.osgeo.org/geos/geos-3.4.2.tar.bz2
# tar -jxvf geos-3.4.2.tar.bz2
# cd geos-3.4.2
# ./configure --prefix=/opt/geos3.4.2
# gmake && gmake install
# ln -s /opt/geos3.4.2 /opt/geos

# vi /etc/ld.so.conf
/opt/geos/lib
# ldconfig

# vi /etc/profile
export PATH=/opt/geos/bin:$PATH
重新进shell环境


报错
checking for cfitsio... no
configure: error: cfitsio library required for FITS support
解决, 需要安装fitsio
http://heasarc.gsfc.nasa.gov/docs/software/fitsio/fitsio.html
# wget ftp://heasarc.gsfc.nasa.gov/software/fitsio/c/cfitsio3370.tar.gz
# tar -zxvf cfitsio3370.tar.gz
# cd cfitsio
# ./configure --prefix=/opt/cfitsio
# make && make install
# vi /etc/ld.so.conf
/opt/cfitsio/lib
# ldconfig
如果ldconfig不生效, 需要指定cfitsio的LIBS和CFLAGS
# cfitsio_LIBS=-L/opt/cfitsio/lib cfitsio_CFLAGS=-I/opt/cfitsio/include ./configure --prefix=/opt/monetdb11.17.21 --enable-gdk --enable-monetdb5 --enable-rdf --enable-datacell --enable-fits --enable-sql --enable-geom --enable-jaql --enable-gsl --enable-odbc --enable-testing --enable-console --enable-jdbc --enable-merocontrol


报错
checking for atomic_ops... no
checking for gsl... no
configure: error: gsl library required for GSL support
解决
# yum install -y gsl-devel

所有错误解决后, 编译通过, 有些未启用的特性, 可以安装对应的库, 重新编译即可支持.
如 bzip2-devel, libatomic_ops-devel, 

# cfitsio_LIBS=-L/opt/cfitsio/lib cfitsio_CFLAGS=-I/opt/cfitsio/include ./configure --prefix=/opt/monetdb11.17.21 --enable-gdk --enable-monetdb5 --enable-rdf --enable-datacell --enable-fits --enable-sql --enable-geom --enable-jaql --enable-gsl --enable-odbc --enable-testing --enable-console --enable-jdbc --enable-merocontrol

MonetDB is configured as follows:
* Compilation specifics:
    Host:      x86_64-unknown-linux-gnu
    Compiler:  gcc (gcc-4.4.7; gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4) Copyright (C) 2010 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.)
    CFLAGS:    -g -O2 $(X_CFLAGS)
    X_CFLAGS: 
    LDFLAGS:  

* Enabled/disabled build options:
    strict     is disabled (by default)
    assert     is disabled (by default)
    debug      is disabled (by default)
    optimize   is disabled (by default)
    developer  is disabled (by default)
    instrument is disabled (by default)
    profile    is disabled (by default)

* Enabled/disabled components:
    gdk       is enabled
    monetdb5  is enabled
    sql       is enabled
    jaql      is enabled
    geom      is enabled
    gsl       is enabled
    fits      is enabled
    rdf       is enabled
    datacell  is enabled
    odbc      is enabled
    jdbc      is enabled
    control   is enabled
    testing   is enabled

* Available features/extensions:
    atomic_ops    = no  (atomic_ops library not found)
    bz2           = no  (bz2 library not found)
    cfitsio       = yes
    curl          = yes
    geos          = yes
    getaddrinfo   = yes
    gsl           = yes
    hwcounters    = no  (no supported harwdware counters interface found)
    java          = yes
    java_control  = yes
    java_jdbc     = yes
    libxml2       = yes
    mseed         = no  (libmseed.h header not found)
    openssl       = yes
    pcre          = yes
    perl          = yes
    pthread       = yes
    python2       = yes
    python3       = no  (Python 3 executable not found)
    raptor        = yes
    readline      = yes
    rubygem       = no  (no rubygem executable found)
    setsockopt    = no  (by default)
    sphinxclient  = no  (sphinxclient.h header not found)
    unixodbc      = yes
    valgrind      = no  (by default)
    winsock2      = yes
    zlib          = yes

* Important options:
    OID size:  64 bits

# gmake
# gmake install

jdbc驱动make时可能有问题, 可以去除, --disable-jdbc, 然后make, make install
添加到环境变量.
vi /etc/profile
export PATH=/opt/monetdb11.17.21/bin:$PATH
export MANPATH=/opt/monetdb11.17.21/share/man:$MANPATH

目录结构
[root@176 MonetDB-11.17.21]# cd /opt/monetdb11.17.21/
[root@176 monetdb11.17.21]# ll
total 28
drwxr-xr-x 2 root root 4096 Aug 13 11:38 bin
drwxr-xr-x 3 root root 4096 Aug 13 11:38 etc
drwxr-xr-x 3 root root 4096 Aug 13 11:38 include
drwxr-xr-x 5 root root 4096 Aug 13 11:38 lib
drwxr-xr-x 3 root root 4096 Aug 13 11:38 lib64
drwxr-xr-x 6 root root 4096 Aug 13 11:38 share
drwxr-xr-x 3 root root 4096 Aug 13 11:38 var


[其他]
1. 如果使用过程中发现有fits库的问题, 可以在编译时取消--enable-fits .
例如
merovingian.log
2014-08-15 09:57:16 MSG test[19577]: #WARNING: LoaderException:loadLibrary:Loading error could not locate library fits (from within file 'fits')


[参考]

 类似资料:

相关阅读

相关文章

相关问答