Apache Superset Secondary Development

葛烨
2023-12-01

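Basic Concepts

 Superset is a data exploration platform open-sourced by Airbnb and designed to be visual, intuitive, and interactive (formerly named Panoramix and Caravel; it has since entered the Apache Incubator)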

Core Components

Flask

 One of the best-known Python web frameworks, noted for being lightweight and highly extensible

  • Jinja2
    Template engine

  • Werkzeug
    WSGI toolkit

Gunicorn

 Gunicorn is an open-source Python WSGI HTTP server that uses the pre-fork model; it was ported from Ruby's Unicorn project

WSGI

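 WSGI, the Python **W**eb **S**erver **G**ateway **I**nterface, is an interface between Python applications or frameworks and web servers. It has no official implementation because WSGI is closer to a protocol: any application that follows it can run on any compliant server, and vice versa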
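To make the interface concrete, here is a minimal sketch of a WSGI application, assuming Python 3 and only the standard library. The names `simple_app` and `call_app` are illustrative and are not part of Superset or Gunicorn.

```python
from wsgiref.util import setup_testing_defaults


def simple_app(environ, start_response):
    """A WSGI application is a callable taking (environ, start_response)."""
    body = ("Hello from " + environ.get("PATH_INFO", "/")).encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]  # the response body is an iterable of bytes


def call_app(app, path="/"):
    """Invoke a WSGI app the way a server would, without opening a socket."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the keys WSGI requires
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


if __name__ == "__main__":
    print(call_app(simple_app, "/superset"))
```

Because the server side only needs a callable with this signature, the same `simple_app` could be served by Gunicorn or by `wsgiref.simple_server` without changes.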

Pre-Fork

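 Each process handles one request at a time. The model is based on select, so at most 1024 processes can be created at once (select handles at most 1024 descriptors by default)
 Processes are created in advance: pre-fork spawns child processes before requests arrive and uses them to handle requests, one child process per request, with the processes independent of one another
 This speeds up request handling to some extent, because no process has to be created when a request arrives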

Django

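 Django is an open-source web application framework written in Python. It follows the MVC design pattern and makes it simpler to build complex, database-driven websites
 Django emphasizes component reusability and "pluggability", rapid development, and the DRY principle (Don't Repeat Yourself)

 Core components
* An object-relational mapper that mediates between data models (defined as Python classes) and the relational database
* A URL dispatcher based on regular expressions
* A view system for handling requests
* A template system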
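The regex-based URL dispatcher in the list above can be illustrated with a plain-Python sketch. This is not Django's actual implementation; the patterns and view names are hypothetical and only show the idea of matching a path against ordered regular expressions.

```python
import re

# (pattern, view name) pairs, checked in order; all names are hypothetical
URL_PATTERNS = [
    (re.compile(r"^dashboard/(?P<dashboard_id>\d+)/$"), "dashboard_detail"),
    (re.compile(r"^slice/(?P<slice_id>\d+)/$"), "slice_detail"),
]


def resolve(path):
    """Return (view_name, kwargs) for the first matching pattern, else None."""
    for pattern, view_name in URL_PATTERNS:
        match = pattern.match(path)
        if match:
            # Named groups become keyword arguments for the view
            return view_name, match.groupdict()
    return None
```

For example, `resolve("dashboard/42/")` picks the first pattern and passes the captured id along as a string.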

PyDruid

 A Python connector for Druid
 Exposes a simple API to create, execute, and analyze Druid queries

Pandas

 Pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive

SciPy

 SciPy is a Python module built on Numpy that bundles a wide range of mathematical algorithms and convenience functions

Scikit-learn

 Machine Learning in Python

D3.js

 D3.js is a JavaScript library for manipulating data

Installation

Base Environment

OS

$ uname -a
Linux 2.6.32-431.el6.x86_64 #1 SMP Fri Nov 22 03:15:09 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

$ cat /proc/version
Linux version 2.6.32-431.el6.x86_64 (mockbuild@c6b8.bsys.dev.centos.org) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC) ) #1 SMP Fri Nov 22 03:15:09 UTC 2013

# For Fedora and RHEL-derivatives
# [Doc]: Other System https://superset.apache.org/installation.html#os-dependencies
$ sudo yum upgrade python-setuptools -y
$ sudo yum install gcc libffi-devel python-devel python-pip python-wheel openssl-devel libsasl2-devel openldap-devel -y

Machines

# External network (http://192.168.1.10:9097/)
superset01                     192.168.1.10           Superset
druid01                        192.168.1.11           Druid
druid02                        192.168.1.12           MySQL

# Cluster configuration
Cluster                         druid cluster
Coordinator Host                192.168.1.11
Coordinator Port                8081
Coordinator Endpoint            druid/coordinator/v1/metadata
Broker Host                     192.168.1.13
Broker Port                     8082
Broker Endpoint                 druid/v2
Cache Timeout                   86400               # 1day: result_backend


# Production (http://192.168.2.10:9097)
druid-prd01                     192.168.2.10         Superset
druid-prd02                     192.168.2.11         Druid

# Cluster configuration
Cluster                         druid cluster
Coordinator Host                192.168.2.11
Coordinator Port                8081
Coordinator Endpoint            druid/coordinator/v1/metadata
Broker Host                     192.168.2.13
Broker Port                     8082
Broker Endpoint                 druid/v2
Cache Timeout                   86400                 # 1day: result_backend
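As a reading aid for the cluster settings above, the sketch below joins host, port, and endpoint into a URL. The helper `endpoint_url` is hypothetical, and the `http` scheme is an assumption; the listing itself does not state how Superset combines these fields.

```python
def endpoint_url(host, port, endpoint, scheme="http"):
    """Join host, port, and endpoint into one URL (scheme is an assumption)."""
    return "{}://{}:{}/{}".format(scheme, host, port, endpoint.strip("/"))


if __name__ == "__main__":
    # Values copied from the first cluster configuration above
    print(endpoint_url("192.168.1.11", 8081, "druid/coordinator/v1/metadata"))
    print(endpoint_url("192.168.1.13", 8082, "druid/v2"))
```

Printing the two URLs makes it easy to test them with a browser or curl before entering them in the Superset UI.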

Python-Related

Python

$ python --version
  Python 2.7.8

[Note]: Superset is tested using Python 2.7 and Python 3.4+. Python 3 is the recommended version, Python 2.6 won't be supported.

## Upgrade Python (stable: Python 2.7.12 | 3.4.5, latest: Python 3.5.2 [2016/12/15])
https://www.python.org/downloads/

# Download the matching Python version from the Python FTP server
$ wget http://python.org/ftp/python/2.7.12/Python-2.7.12.tgz

# Build
$ tar -zxvf Python-2.7.12.tgz
$ cd /root/software/Python-2.7.12
$ ./configure --prefix=/usr/local/python27
$ make
$ make install

$ ls /usr/local/python27/ -al

  drwxr-xr-x.  6 root root 4096 12月 15 14:22 .
  drwxr-xr-x. 13 root root 4096 12月 15 14:20 ..
  drwxr-xr-x.  2 root root 4096 12月 15 14:22 bin
  drwxr-xr-x.  3 root root 4096 12月 15 14:21 include
  drwxr-xr-x.  4 root root 4096 12月 15 14:22 lib
  drwxr-xr-x.  3 root root 4096 12月 15 14:22 share


# Replace the original Python 2.6
$ which python
  /usr/local/bin/python
# mv /usr/bin/python /usr/bin/python_old
$ mv /usr/local/bin/python /usr/local/bin/python_old
$ ln -s /usr/local/python27/bin/python /usr/local/bin/
$ python --version
  Python 2.7.12

# Point yum back at the old Python 2.6
$ vim /usr/bin/yum

  # Change the first line to python2.6
  #!/usr/bin/python2.6

$ yum --version | sed '2,$d'
  3.2.29

Pip

$ pip --version
  pip 9.0.1 from /usr/local/lib/python2.7/site-packages (python 2.7)

# upgrade setup tools and pip
$ pip install --upgrade setuptools pip

## Installing pip in an offline environment
# Download setuptools-32.0.0.tar.gz from https://pypi.python.org/pypi/setuptools#code-of-conduct
$ tar zxvf setuptools-32.0.0.tar.gz
$ cd setuptools-32.0.0
$ python setup.py install

# Download pip-9.0.1.tar.gz from https://pypi.python.org/pypi/pip
$ wget --no-check-certificate https://pypi.python.org/packages/11/b6/abcb525026a4be042b486df43905d6893fb04f05aac21c32c638e939e447/pip-9.0.1.tar.gz#md5=35f01da33009719497f01a4ba69d63c9
$ tar zxvf pip-9.0.1.tar.gz
$ cd pip-9.0.1
$ python setup.py install
  Installed /usr/local/python27/lib/python2.7/site-packages/pip-9.0.1-py2.7.egg
  Processing dependencies for pip==9.0.1
  Finished processing dependencies for pip==9.0.1

$ pip --version
  pip 9.0.1 from /root/software/pip-9.0.1 (python 2.7)

Virtualenv

$ pip install virtualenv

# Python 3 ships a comparable tool as the built-in venv module (formerly exposed as the pyvenv script)
$ virtualenv venv
$ source venv/bin/activate

## Installing virtualenv in an offline environment
# Download virtualenv-15.1.0.tar.gz from https://pypi.python.org/pypi/virtualenv#downloads
$ tar zxvf virtualenv-15.1.0.tar.gz
$ cd virtualenv-15.1.0
$ python setup.py install

$ virtualenv --version
  15.1.0

Superset-Related

Superset Initialization

$ pip install superset

## Installing superset in an offline environment
# Download superset-0.15.0.tar.gz from https://pypi.python.org/pypi/superset
$ tar zxvf superset-0.15.0.tar.gz
$ cd superset-0.15.0
$ python setup.py install

# Create an admin user
$ fabmanager create-admin --app superset

  Username [admin]:        # login name
  User first name [admin]: # first name
  User last name [user]:   # lastname
  Email [admin@fab.org]:   # email, must unique
  Password: 
  Repeat for confirmation: 
  Error: the two entered values do not match
  Password:             #superset
  Repeat for confirmation: #superset
  // ...
  Recognized Database Authentications.
  2016-12-14 17:53:40,945:INFO:flask_appbuilder.security.sqla.manager:Added user superset db upgrade
  Admin User superset db upgrade created.

# Initialize the database
$ superset db upgrade

  // ...
  INFO  [alembic.runtime.migration] Context impl SQLiteImpl.
  INFO  [alembic.runtime.migration] Will assume transactional DDL.


# Load some data to play with
$ superset load_examples

  Loading examples into <SQLA engine=u'sqlite:root/.superset/superset.db'>
  Creating default CSS templates
  Loading energy related dataset
  Creating table [wb_health_population] reference
  2016-12-14 17:58:09,568:INFO:root:Creating database reference
  2016-12-14 17:58:09,575:INFO:root:sqlite:root/.superset/superset.db
  Loading [World Bank's Health Nutrition and Population Stats]
  Creating table [wb_health_population] reference
  2016-12-14 17:58:30,840:INFO:root:Creating database reference
  2016-12-14 17:58:30,846:INFO:root:sqlite:root/.superset/superset.db


# Create default roles and permissions
$ superset init

  Loading examples into <SQLA engine=u'sqlite:root/.superset/superset.db'>
  Creating default CSS templates
  Loading energy related dataset
  Creating table [wb_health_population] reference
  2016-12-14 17:58:09,568:INFO:root:Creating database reference
  2016-12-14 17:58:09,575:INFO:root:sqlite:root/.superset/superset.db
  Loading [World Bank's Health Nutrition and Population Stats]
  Creating table [wb_health_population] reference
  2016-12-14 17:58:30,840:INFO:root:Creating database reference
  2016-12-14 17:58:30,846:INFO:root:sqlite:root/.superset/superset.db
  Creating slices
  Creating a World's Health Bank dashboard
  Loading [Birth names]
  Done loading table!
  --------------------------------------------------------------------------------
  Creating table [birth_names] reference
  2016-12-14 17:58:52,276:INFO:root:Creating database reference
  2016-12-14 17:58:52,280:INFO:root:sqlite:root/.superset/superset.db
  Creating some slices
  Creating a dashboard
  Loading [Random time series data]
  Done loading table!
  --------------------------------------------------------------------------------
  Creating table [random_time_series] reference
  2016-12-14 17:58:53,953:INFO:root:Creating database reference
  2016-12-14 17:58:53,957:INFO:root:sqlite:root/.superset/superset.db
  Creating a slice
  Loading [Random long/lat data]
  Done loading table!
  --------------------------------------------------------------------------------
  Creating table reference
  2016-12-14 17:59:09,732:INFO:root:Creating database reference
  2016-12-14 17:59:09,736:INFO:root:sqlite:root/.superset/superset.db
  Creating a slice
  Loading [Multiformat time series]
  Done loading table!
  --------------------------------------------------------------------------------
  Creating table [multiformat_time_series] reference
  2016-12-14 17:59:10,421:INFO:root:Creating database reference
  2016-12-14 17:59:10,426:INFO:root:sqlite:root/.superset/superset.db
  Creating some slices
  Loading [Misc Charts] dashboard
  Creating the dashboard


# Start the web server on port 8088
$ superset runserver -p 8088

# To start a development web server, use the -d switch
# superset runserver -d

# Refresh Druid Datasource (after config it)
$ superset refresh_druid

Virtualenv Workspace

# superset01 192.168.1.10
$ cd /root
$ virtualenv -p /usr/local/bin/python --system-site-packages --always-copy superset
$ source superset/bin/activate

# See the pitfalls part below, in the section on the dependency libraries that installing superset needs to download
# pip install --download package -r requirements.txt
$ pip install -r /root/requirements.txt

$ superset runserver -a 0.0.0.0 -p 8088

# rsync is recommended; see the deployment part
$ cd /root
$ tar zcvf virtualenv.tar.gz virtualenv/
$ scp virtualenv.tar.gz root@192.168.1.13:/root/

# 192.168.1.13
$ cd /root/virtualenv/superset
$ source bin/activate

VirtualenvWrapper

## [Extension]
# virtualenvwrapper is an extension of virtualenv that makes it easy to create, delete, copy, and switch between virtual environments
$ pip install virtualenvwrapper
$ mkdir ~/virtualenv
$ vim ~/.bashrc
  # Add
  export WORKON_HOME=~/virtualenv
  source /usr/local/bin/virtualenvwrapper.sh

$ mkvirtualenv --python=/usr/bin/python superset
  Running virtualenv with interpreter /usr/bin/python
  New python executable in /root/virtualenv/superset/bin/python
  Installing setuptools, pip, wheel...done.
  virtualenvwrapper.user_scripts creating /root/virtualenv/superset/bin/predeactivate
  virtualenvwrapper.user_scripts creating /root/virtualenv/superset/bin/postdeactivate
  virtualenvwrapper.user_scripts creating /root/virtualenv/superset/bin/preactivate
  virtualenvwrapper.user_scripts creating /root/virtualenv/superset/bin/postactivate
  virtualenvwrapper.user_scripts creating /root/virtualenv/superset/bin/get_env_details
(superset) [root@superset01 virtualenv]# 
(superset) [root@superset01 virtualenv]# deactivate

$ workon superset
(superset) [root@superset01 virtualenv]# lsvirtualenv -b
superset

Deployment

Copying

# Using rsync instead of scp ensures that symlinks are copied as well
$ rsync -avuz -e ssh /home/superset/superset-0.15.4/ yuzhouwan@middle:/home/yuzhouwan/superset-0.15.4

  //...
  sent 142935894 bytes  received 180102 bytes  3920986.19 bytes/sec
  total size is 359739823  speedup is 2.51

# Verify the file count under the superset directory on both the local and the target machine
$ find | wc -l
  10113
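The same count can be reproduced from Python, which helps when comparing the two machines in a script. This is a minimal sketch assuming Python 3; `count_entries` is a hypothetical helper that mirrors what `find | wc -l` counts (the starting directory plus every directory, file, and symlink beneath it).

```python
import os


def count_entries(root):
    """Count the root itself plus every directory and file below it."""
    total = 1  # `find` prints the starting directory as well
    for _dirpath, dirnames, filenames in os.walk(root):
        # Symlinks are listed once and not followed, as with plain `find`
        total += len(dirnames) + len(filenames)
    return total


if __name__ == "__main__":
    print(count_entries("."))
```

Running it in the superset directory on both sides and comparing the two numbers is equivalent to the manual check above.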

# Repeat the steps above to rsync from the jump host to the production machine
$ rsync -avuz -e ssh /home/yuzhouwan/superset-0.15.4/ root@192.168.2.10:/home/superset/superset-0.15.4

# The Python that the virtualenv depends on
$ rsync -avuz -e ssh /root/software yuzhouwan@middle:/home/yuzhouwan
$ rsync -avuz -e ssh /home/yuzhouwan/software root@druid-prd01:/root

$ cd /root/software
$ tar zxvf Python-2.7.12.tgz
$ cd Python-2.7.12

$ ./configure --prefix=/usr --enable-shared CFLAGS=-fPIC
$ make && make install
$ /sbin/ldconfig -v | grep /       # necessary!
$ python -V
  Python 2.7.12

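Dynamic Libraries

# Although the symlinks were rsynced over, the target machine does not have the corresponding Python dynamic libraries in the relevant directory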
$ file /root/superset/lib/python2.7/lib-dynload

  /root/superset/lib/python2.7/lib-dynload: broken symbolic link to `/usr/local/python27/lib/python2.7/lib-dynload`

# The global Python environment must match the one used when the VirtualEnv was created in the networked environment
$ ./configure --prefix=/usr/local/python27 --enable-shared CFLAGS=-fPIC
$ make && make install
$ /sbin/ldconfig -v | grep /

$ ls /usr/local/python27/lib/python2.7/lib-dynload -sail
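Checking one path with `file` works, but a copied virtualenv may contain several dangling links. The sketch below walks a directory tree and lists every broken symlink; it assumes Python 3 on a POSIX system, and `broken_symlinks` is a hypothetical helper.

```python
import os


def broken_symlinks(root):
    """Return the sorted paths of symlinks under root whose target is missing."""
    broken = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # islink() is true for the link itself; exists() follows the link
            if os.path.islink(path) and not os.path.exists(path):
                broken.append(path)
    return sorted(broken)


if __name__ == "__main__":
    for path in broken_symlinks("/root/superset"):
        print(path, "->", os.readlink(path))
```

Each printed target shows which global Python directory still has to exist on the destination machine.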

User Permissions

# Create a user
$ adduser superset
$ cd /home/superset
# If the directory name carries a version number, create a symlink
$ chown -R superset:superset superset-0.15.4
$ ln -s superset-0.15.4 superset

$ chown -h superset:superset superset
$ su - superset

Metadata Storage

# Change the database
$ vim ./lib/python2.7/site-packages/superset/config.py

  # SQLALCHEMY_DATABASE_URI = 'sqlite:///' + os.path.join(DATA_DIR, 'superset.db')
  SQLALCHEMY_DATABASE_URI = 'mysql+pymysql://user:password@mysql01:3306/superset1?charset=utf8'

$ mysql -hmysql01 -P3306 -uuser -ppassword
> use superset1;
> show tables;
  +-------------------------+
  | Tables_in_superset1     |
  +-------------------------+
  | ab_permission           |
  | ...                     |
  | url                     |
  +-------------------------+
  28 rows in set (0.00 sec)

# mysqldump -hmysql01 -P3306 -uuser -ppassword superset1 > superset1.sql
$ mysqldump -hmysql01 -P3306 -uuser -ppassword --single-transaction superset1 > superset1.sql
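When the MySQL password contains characters such as `@` or `/`, the `SQLALCHEMY_DATABASE_URI` string has to be URL-escaped. The sketch below builds the URI from its parts, assuming Python 3; `mysql_uri` is a hypothetical helper, and the example credentials are the placeholders used above.

```python
from urllib.parse import quote_plus


def mysql_uri(user, password, host, port, database, charset="utf8"):
    """Build a mysql+pymysql SQLAlchemy URI, escaping user and password."""
    return "mysql+pymysql://{}:{}@{}:{}/{}?charset={}".format(
        quote_plus(user), quote_plus(password), host, port, database, charset
    )


if __name__ == "__main__":
    print(mysql_uri("user", "password", "mysql01", 3306, "superset1"))
    # A password with reserved characters is percent-encoded
    print(mysql_uri("user", "p@ss/word", "mysql01", 3306, "superset1"))
```

The resulting string can be pasted into config.py as the value of `SQLALCHEMY_DATABASE_URI`.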

References
* mysqldump: 1044 Access denied when using LOCK TABLES
* How to fix mysqldump: Got error: 1044: Access denied for user

Start

$ cd /home/superset/superset-0.15.4
$ source bin/activate
$ mkdir logs
$ nohup superset runserver -a 0.0.0.0 -p 9097 -w 4 > logs/superset.log 2>&1 &

Running Locally

Dependencies

Windows-Related

Microsoft Visual C++ 9.0 is required (Unable to find vcvarsall.bat)
Description

 error: Microsoft Visual C++ 9.0 is required (Unable to find vcvarsall.bat). Get it from http://aka.ms/vcpython27

Solution
# download vcredist_x64.exe from http://www.microsoft.com/en-us/download/details.aspx?id=2092
$ pip install wheel setuptools
# Download and install VCForPython27.msi
‘openssl/opensslv.h’: No such file or directory
Solution
# download openssl-0.9.8h-1-setup.exe from http://gnuwin32.sourceforge.net/packages/openssl.htm
Cannot open include file: ‘stdint.h’: No such file or directory
Solution
# Microsoft Visual C++ 2015 Redistributable Update 3
# download vc_redist.x64.exe from https://www.microsoft.com/zh-CN/download/details.aspx?id=53840
$ vim D:\apps\Python27\Lib\distutils\msvc9compiler.py

  def get_build_version():
    return 9.0
  def find_vcvarsall(version):
    return r'C:\Users\yuzhouwan\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\vcvarsall.bat'

$ cd superset-0.15.4
$ python setup.py install

# The VCForPython27.msi provided by Microsoft uses VC2008 by default, whereas stdint.h has only been supported since VC2012
# VCForPython27.msi has not been maintained since 2014, so I decided to try Ubuntu or remote debugging instead ...

Ubuntu-Related

Installing VMware

 15+ VMWARE WORKSTATION PRO 12.X UNIVERSAL LICENSE KEYS FOR WIN & LIN

Python-Related

Make sure that you use the correct version of ‘pip’
Description
  Try to run this command from the system terminal. Make sure that you use the correct version of 'pip' installed for your Python interpreter located at 'D:\apps\Python27\python.exe'
Solution
# Install pip: download the installer from https://bootstrap.pypa.io/get-pip.py
$ python get-pip.py

$ pip --version
  pip 8.1.1 from d:\apps\python27\lib\site-packages (python 2.7)
‘Connection to pypi.python.org timed out. (connect timeout=15)’
Description
$ pip install --upgrade pip
  'Connection to pypi.python.org timed out. (connect timeout=15)'
Solution
# Set a proxy
$ export https_proxy="http://10.10.10.10:8080"
$ pip install --upgrade pip
$ pip --version
  pip 9.0.1 from d:\apps\python27\lib\site-packages (python 2.7)
setup.py failed with error code 1
Description
Command "d:\apps\python27\python.exe -u -c "import setuptools, tokenize;__file__='c:\\users\\yuzhouwan\\appdata\\local\\temp\\pip-build-zzbhrq\\sasl\\setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record c:\users\yuzhouwan\appdata\local\temp\pip-erwavd-record\install-record.txt --single-version-externally-managed --compile" failed with error code 1 in c:\users\yuzhouwan\appdata\local\temp\pip-build-zzbhrq\sasl\
Solution
$ pip install --upgrade setuptools pip
$ pip install superset

# Download superset-0.15.4.tar.gz from https://pypi.python.org/pypi/superset
$ tar zxvf superset-0.15.4.tar.gz
$ cd superset-0.15.4
$ python setup.py install

Setting Up the Development Environment

Dependencies

$ cd /root/software
$ tar zxvf Python-2.7.12.tgz
$ cd Python-2.7.12

$ ./configure --prefix=/usr/local/python27 --enable-shared CFLAGS=-fPIC
$ make && make install
$ /sbin/ldconfig -v | grep /
$ python -V
  Python 2.7.12

$ mv /usr/local/bin/python /usr/local/bin/python_bak
$ ln -s /usr/local/python27/bin/python /usr/local/bin/python

Virtual Environment

$ cd /root
$ virtualenv -p /usr/local/bin/python --system-site-packages env
$ cd env
$ mkdir code

Code

# windows
$ cd E:\Github\super\env
$ git init
$ git remote add origin https://github.com/asdf2014/superset.git
$ git pull origin master

# SFTP
# Upload to /root/env/code

Installation

$ cd /root/env/code
$ source /root/env/bin/activate

$ cd /root/env/code/superset/static
$ mv assets assets_bak
$ ln -s ../assets assets

$ cd /root/env/code
$ python setup.py develop

  Finished processing dependencies for superset==0.15.4

$ pip freeze | grep superset
  superset==0.15.4

# Create an admin user
$ fabmanager create-admin --app superset

  Username [admin]:        # login name
  User first name [admin]: # first name
  User last name [user]:   # lastname
  Email [admin@fab.org]:   # email, must unique
  Password: 
  Repeat for confirmation: 
  Error: the two entered values do not match
  Password:             #superset
  Repeat for confirmation: #superset
  // ...
  Recognized Database Authentications.
  2016-12-14 17:53:40,945:INFO:flask_appbuilder.security.sqla.manager:Added user superset db upgrade
  Admin User superset db upgrade created.

$ superset db upgrade
$ superset init
$ superset load_examples

Npm

# [Mac OS]
$ sudo yum group install "Development Tools" --setopt=group_package_types=mandatory,default,optional --skip-broken -y
$ sudo yum install curl git m4 ruby texinfo bzip2-devel curl-devel expat-devel ncurses-devel zlib-devel -y

# ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/linuxbrew/go/install)"    # Do not run this as root!
$ wget https://raw.githubusercontent.com/Homebrew/linuxbrew/go/install --no-check-certificate
$ mv install install.rb
$ vim install.rb

 # abort "Don't run this as root!" if Process.uid == 0

$ mkdir -p /root/.linuxbrew/bin
$ export PATH="/root/.linuxbrew/bin:$PATH"
$ ruby install.rb

$ vim ~/.bashrc

 export PATH="$HOME/.linuxbrew/bin:$PATH"
 export MANPATH="$HOME/.linuxbrew/share/man:$MANPATH"
 export INFOPATH="$HOME/.linuxbrew/share/info:$INFOPATH"


# [CentOS]
$ yum install npm
$ cd /root/env/code/superset/assets    # package.json
$ npm install

# if visit https://github.com/jquery/jquery.git return timeout
$ vim /etc/hosts

 192.30.253.112 github.com
 151.101.100.133 assets-cdn.github.com
 192.30.253.117 api.github.com
 192.30.253.121 codeload.github.com

Testing

$ cd /root/env/code
$ chmod 777 *sh
$ cd /root/env/code/superset/bin
$ chmod 777 superset

$ cd /root/env/code
$ bash run_tests.sh

Remote Development in the IDE

Remote Debug

 See the Remote Debug section of another post on my blog: 《Python

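Secondary Development

Others Category

Problem

Description

 Aggregating at the HBase Region level groups out a large number of Regions, which makes the DistributionPieViz chart sluggish to render and unattractive

Solution
Adding row_limit excludes the data outside the top N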
$ cd /root/superset-0.15.4
$ vim ./lib/python2.7/site-packages/superset/viz.py

  fieldsets = ({
    'label': None,
    'fields': (
      'metrics', 'groupby',
      'limit',
      'pie_label_type',
      ('donut', 'show_legend'),
      'labels_outside',
      'row_limit',
    )
  },)
others_category aggregates the data outside the top N
$ cd /root/superset-0.15.4
$ vim ./lib/python2.7/site-packages/superset/viz.py

  fieldsets = ({
    'label': None,
    'fields': (
      'metrics', 'groupby',
      'limit',
      'pie_label_type',
      ('donut', 'show_legend'),
      'labels_outside',
      'row_limit',
      'others_category',
    )
  },)

$ vim ./lib/python2.7/site-packages/superset/forms.py

  'others_category': (BetterBooleanField, {
    "label": _("Others category"),
    "default": True,
    "description": _("Aggregate data outside of topN into a single category")
  }),


# models.py
# The Others category was not kept in last place; the result was sorted again
# The "others_category": "y" attribute was not passed down

self.status = None
self.error_message = None
self.others_category = form_data.get("others_category")

top_n = 10
if top_n > 0 and len(df) > top_n:
    df_head = df.head(top_n)
    df_tail = df.iloc[top_n:]
    # Sum each metric over the rows outside the top N
    other_row = {metric: df_tail[metric].sum() for metric in metrics}
    # Label the aggregated row 'Others' in every non-metric (groupby) column
    for column in df.columns:
        other_row.setdefault(column, 'Others')
    # pd is pandas, imported at the top of models.py
    df_other = pd.DataFrame([other_row], columns=df.columns)
    df = pd.concat([df_head, df_other], ignore_index=True)

Tips: submitted as PR #2176, Aggregate data outside of topN into a single category
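The aggregation idea itself does not depend on pandas. The sketch below shows it on plain tuples, assuming Python 3; `top_n_with_others` is a hypothetical helper and not the code from the PR. It keeps the Others row in last place, which is the behaviour the notes above were after.

```python
def top_n_with_others(rows, top_n, label="Others"):
    """Keep the top_n largest (name, value) rows and fold the rest into one row."""
    ranked = sorted(rows, key=lambda row: row[1], reverse=True)
    head, tail = ranked[:top_n], ranked[top_n:]
    if tail:
        # Appended after sorting, so the aggregated row always comes last
        head.append((label, sum(value for _name, value in tail)))
    return head


if __name__ == "__main__":
    regions = [("r1", 5), ("r2", 50), ("r3", 20), ("r4", 1)]
    print(top_n_with_others(regions, 2))
```

With many HBase Regions, the pie chart then only has `top_n + 1` slices to draw.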

Abnormal Y-Axis Data

Description

 The Y axis should start at 0 but instead starts at a negative value, -997m

Solution

 Submitted as PR #2307, Some problem in Y Axis

Later Optimizations

MySQL Time Zone Issues

Query

Description
$ vim lib/python2.7/site-packages/superset/config.py

 from dateutil import tz

 # Druid query timezone
 # tz.tzutc() : Using utc timezone
 # tz.tzlocal() : Using local timezone
 # other tz can be overridden by providing a local_config
 DRUID_IS_ACTIVE = True
 DRUID_TZ = tz.tzlocal()        # +08:00

 # DRUID_TZ = tz.gettz('Asia/Shanghai')
Solution

 Submitted as PR #2143, Using the time zone with specific name for querying Druid
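To see what the `+08:00` setting means for a query timestamp, here is a minimal sketch using only the standard library (Python 3). It uses a fixed UTC+8 offset as a stand-in for `Asia/Shanghai`, which is an assumption made to avoid third-party packages; `to_local` is a hypothetical helper, not part of Superset.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset standing in for Asia/Shanghai (assumption: no DST to handle)
UTC_PLUS_8 = timezone(timedelta(hours=8))


def to_local(utc_dt, tz=UTC_PLUS_8):
    """Interpret a naive datetime as UTC and convert it to the given zone."""
    return utc_dt.replace(tzinfo=timezone.utc).astimezone(tz)


if __name__ == "__main__":
    print(to_local(datetime(2016, 12, 14, 17, 0)).isoformat())
```

A named zone such as `tz.gettz('Asia/Shanghai')` carries the same offset plus the zone's history, which is why the config above prefers it over a bare local setting.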

Display

Description
  dttm.tz_convert(dttm.tzinfo._filename.split('zoneinfo/')[1]) - pytz.timezone(dttm.tzinfo._filename.split('zoneinfo/')[1]).localize(EPOCH)
Solution

 Submitted as PR #2370, Fix timezone issues in slices

Upgrading Superset

# Upgrade directly with pip install
$ pip freeze | grep superset
  superset==0.13.2

$ pip install superset==-1
  versions: 0.12.0, 0.13.0, 0.13.1, 0.13.2, 0.14.0, 0.14.1, 0.15.0, 0.15.1, 0.15.3, 0.15.4

$ pip install superset==0.15.4

# The previous configuration was gone afterwards, so some config adjustments are needed
$ vim ./lib/python2.7/site-packages/superset/config.py

# SQLALCHEMY_DATABASE_URI = 'sqlite:///' + os.path.join(DATA_DIR, 'superset.db')
  SQLALCHEMY_DATABASE_URI = 'mysql+pymysql://root:root@192.168.1.12:3306/superset?charset=utf8'

$ vim /root/superset-0.15.4/bin/activate

  # VIRTUAL_ENV="/root/superset"
  VIRTUAL_ENV="/root/superset-0.15.4"

# then could just run "superset runserver -a 0.0.0.0 -p 9097"

Unknown column ‘datasources.filter_select_enabled’ in ‘field list’

Description
InternalError: (pymysql.err.InternalError) (1054, u"Unknown column 'datasources.filter_select_enabled' in 'field list'") [SQL: u'SELECT datasources.created_on AS datasources_created_on, datasources.changed_on AS datasources_changed_on, datasources.id AS datasources_id, datasources.datasource_name AS datasources_datasource_name, datasources.is_featured AS datasources_is_featured, datasources.is_hidden AS datasources_is_hidden, datasources.filter_select_enabled AS datasources_filter_select_enabled, datasources.description AS datasources_description, datasources.default_endpoint AS datasources_default_endpoint, datasources.user_id AS datasources_user_id, datasources.cluster_name AS datasources_cluster_name, datasources.offset AS datasources_offset, datasources.cache_timeout AS datasources_cache_timeout, datasources.params AS datasources_params, datasources.perm AS datasources_perm, datasources.changed_by_fk AS datasources_changed_by_fk, datasources.created_by_fk AS datasources_created_by_fk \nFROM datasources \nWHERE datasources.datasource_name = %(datasource_name_1)s \n LIMIT %(param_1)s'] [parameters: {u'param_1': 1, u'datasource_name_1': u'bi-dfp-oms-detail'}]
Solution
$ superset db upgrade
$ superset refresh_druid

Issues with Druid timezones

Description

 Those methods that named tzutc and tzlocal in tz work for me…
 Oh no.. They are not working when i upgrade superset from v0.13.2 into v0.15.4, even if i try to use DRUID_TZ = tz.gettz('Asia/Shanghai') :-(

 See: Issues with Druid timezones #1369

Solution
$ cd /root/superset-0.15.4
$ ./bin/python -m pip freeze | grep superset

  superset==0.13.2

$ ./bin/python -m pip uninstall superset
$ ./bin/python -m pip install superset==0.15.4
$ ./bin/python -m pip freeze | grep superset

  superset==0.15.4

$ ./bin/python ./bin/easy_install lib/pycharm-debug.egg
# config remote python

$ ./bin/python ./bin/superset runserver -a 0.0.0.0 -p 9097
# nohup ./bin/python ./bin/superset runserver -a 0.0.0.0 -p 9097 > logs/superset.log 2>&1 &

$ ./bin/python ./bin/superset db upgrade
$ ./bin/python ./bin/superset refresh_druid

pydevd Cannot Remote Debug

Description

 After upgrading from 0.13.2 to 0.15.4, two processes are started in debug mode (which prevents pydevd from remote debugging)

$ ps -ef | grep superset | grep -v grep

  root     22567  1632 19 12:05 pts/0    00:00:03 ./bin/python ./bin/superset runserver -d -p 9097
  root     22578 22567 24 12:05 pts/0    00:00:03 /root/superset-0.15.4/bin/python ./bin/superset runserver -d -p 9097
Solution
Start directly with cli.py (did not work)
$ vim ./lib/python2.7/site-packages/superset/cli.py

  # append
  manager.run()

$ ./bin/python ./lib/python2.7/site-packages/superset/cli.py runserver -a 0.0.0.0  -p 9097

$ ps -ef | grep superset | grep -v grep

  root     25238  1632 35 13:07 pts/0    00:00:03 ./bin/python ./lib/python2.7/site-packages/superset/cli.py runserver -d -p 9097
  root     25247 25238 55 13:07 pts/0    00:00:03 /root/superset-0.15.4/bin/python ./lib/python2.7/site-packages/superset/cli.py runserver -d -p 9097
Attempt to resolve the WARNING:werkzeug: * Debugger is active! problem
$ vim lib/python2.7/site-packages/werkzeug/serving.py

  class ThreadedWSGIServer(ThreadingMixIn, BaseWSGIServer):

    """A WSGI server that does threading."""
    multithread = True

$ vim lib/python2.7/site-packages/flask/app.py

  options.setdefault('use_reloader', self.debug)

$ vim superset/__init__.py

 Submitted as PR #2136, Fix werkzeug instance was created twice in Debug Mode
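The two processes come from Werkzeug's reloader, which starts a child process to serve requests while the parent watches for file changes. As far as I know, the reloader marks the child by setting the `WERKZEUG_RUN_MAIN` environment variable to `true`. The sketch below checks for it; `is_reloader_child` is a hypothetical helper and is not the change made in the PR above.

```python
import os


def is_reloader_child(environ=None):
    """Return True when running inside the process spawned by the reloader."""
    environ = os.environ if environ is None else environ
    return environ.get("WERKZEUG_RUN_MAIN") == "true"


if __name__ == "__main__":
    if is_reloader_child():
        print("serving process: attach the remote debugger here")
    else:
        print("watcher (or plain) process: skip debugger setup")
```

Attaching pydevd only when this returns True is one way to avoid connecting from both processes; the alternative is to run without the reloader.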


Switching from Sqlite3 to MySQL

Trying SQLite's built-in dump command

# superset01                192.168.1.10        Superset
$ cd /root/.superset
$ ll -sail

  1285 43256 -rw-r--r--   1 root root 44288000 Jan 22 14:06 superset.db

$ sqlite3 superset.db
sqlite> .databases
  seq  name             file                                                      
  ---  ---------------  ----------------------------------------------------------
  0    main             /root/.superset/superset.db

  sqlite> .tables
  ab_permission            columns                  multiformat_time_series
  ab_permission_view       css_templates            query                  
  ab_permission_view_role  dashboard_slices         random_time_series     
  ab_register_user         dashboard_user           slice_user             
  ab_role                  dashboards               slices                 
  ab_user                  datasources              sql_metrics            
  ab_user_role             dbs                      table_columns          
  ab_view_menu             energy_usage             tables                 
  access_request           favstar                  url                    
  alembic_version          logs                     wb_health_population   
  birth_names              long_lat               
  clusters                 metrics                

# not suit for mysql
# sqlite> .output superset.sql
# sqlite> .dump

$ vim dump_for_mysql.py

  # https://github.com/EricHigdon/sqlite3tomysql

$ sqlite3 superset.db .dump | python dump_for_mysql.py > superset.sql

$ ls -sail

  1285 43256 -rw-r--r--   1 root root 44288000 Jan 22 14:06 superset.db
  18631 76968 -rw-r--r--   1 root root 78812197 Jan 22 14:35 superset.sql

$ vim superset.sql

  id INTEGER NOT NULL, 
  # Replace with an auto-increment (primary key) column
  id INTEGER PRIMARY KEY NOT NULL AUTO_INCREMENT, 

$ scp superset.sql root@192.168.1.12:/home/mysql
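Editing a 78 MB dump by hand in vim is tedious, so the `id` column rewrite above can be scripted. This is a minimal sketch assuming Python 3; `make_id_auto_increment` is a hypothetical helper and is not taken from the sqlite3tomysql script referenced above. It assumes no table also declares a separate `PRIMARY KEY (id)` constraint.

```python
import re

# Matches a column named exactly `id`; `\b` keeps `user_id` and similar untouched
_ID_COLUMN = re.compile(r"\bid INTEGER NOT NULL\s*,")


def make_id_auto_increment(schema_sql):
    """Rewrite `id INTEGER NOT NULL,` into a MySQL auto-increment primary key."""
    return _ID_COLUMN.sub(
        "id INTEGER PRIMARY KEY NOT NULL AUTO_INCREMENT,", schema_sql
    )


if __name__ == "__main__":
    sample = "CREATE TABLE t (\n\tid INTEGER NOT NULL, \n\tuser_id INTEGER NOT NULL\n);"
    print(make_id_auto_increment(sample))
```

Applied line by line to superset.sql, this performs the same substitution as the manual edit.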

Implementing sqlite3tomysql.py Myself

# druid02    192.168.1.12    MySQL
$ ps -ef | grep mysql | grep -v druid | grep -v grep

  mysql    11435  8530  0 14:13 pts/4    00:00:00 /bin/sh /home/mysql/bin/mysqld_safe --defaults-file=/home/mysql/my.cnf
  mysql    12192 11435  0 14:13 pts/4    00:00:00 /home/mysql/bin/mysqld --defaults-file=/home/mysql/my.cnf --basedir=/home/mysql --datadir=/home/mysql/data --plugin-dir=/home/mysql/lib/mysql/plugin --log-error=/home/mysql/data/druid02.err --open-files-limit=8192 --pid-file=/home/mysql/data/druid02.pid --socket=/home/mysql/data/mysql.sock --port=3306
  mysql    12223  8530  0 14:13 pts/4    00:00:00 mysql -uroot -p -S /home/mysql/data/mysql.sock


$ su - mysql
$ mysql -uroot -p -S /home/mysql/data/mysql.sock
mysql> show databases;
mysql> create database superset;
mysql> show databases;
mysql> use superset;

# Run sqlite3tomysql.py
  mysql -uroot -p superset2 -S /home/mysql/data/mysql.sock  --default-character-set=utf8 < superset.sql.schema.sql
  mysql -uroot -p superset2 -S /home/mysql/data/mysql.sock  --default-character-set=utf8 < superset.sql.data.sql

# To avoid foreign-key dependencies between tables, use source .superset.sql.schema.sql in the mysql command line and import in batches, repeating as needed

Metadata Storage

# superset01                192.168.1.10        Superset
$ cd /root/superset
$ find ./ -name config.py
  ./lib/python2.7/site-packages/caravel/config.py
  ./lib/python2.7/site-packages/sqlalchemy/testing/config.py
  ./lib/python2.7/site-packages/pandas/core/config.py
  ./lib/python2.7/site-packages/superset/config.py
  ./lib/python2.7/site-packages/setuptools/config.py
  ./lib/python2.7/site-packages/numpy/distutils/command/config.py
  ./lib/python2.7/site-packages/gunicorn/config.py
  ./lib/python2.7/site-packages/panoramix/config.py
  ./lib/python2.7/site-packages/flask/config.py
  ./lib/python2.7/site-packages/alembic/testing/config.py
  ./lib/python2.7/site-packages/alembic/config.py

$ vim ./lib/python2.7/site-packages/superset/config.py
  # SQLALCHEMY_DATABASE_URI = 'sqlite:///' + os.path.join(DATA_DIR, 'superset.db')
  SQLALCHEMY_DATABASE_URI = 'mysql+pymysql://root:root@192.168.1.12:3306/superset?charset=utf8'

Start

# First run the series of superset initialization steps
$ nohup superset runserver -a 0.0.0.0 -p 9097 -w 4 > logs/superset.log 2>&1 &

Tips: for the code and the steps, see: Convert SQLite into MySQL

Parameter Tuning

# Increase the number of gunicorn workers as appropriate (default: 2)
$ cd /root/superset
$ source bin/activate
$ mkdir logs
$ nohup ./bin/python ./bin/superset runserver -a 0.0.0.0 -p 9097 -w 4 > logs/superset.log 2>&1 &
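For choosing the `-w` value, the Gunicorn documentation suggests starting from (2 x number of cores) + 1. The sketch below computes that starting point, assuming Python 3; `suggested_workers` is a hypothetical helper, and the result is only a first guess to tune under real load.

```python
import multiprocessing


def suggested_workers(cores=None):
    """Return (2 x cores) + 1, the starting point suggested by Gunicorn's docs."""
    if cores is None:
        cores = multiprocessing.cpu_count()
    return 2 * cores + 1


if __name__ == "__main__":
    print(suggested_workers())
```

On a two-core machine this gives 5, slightly above the `-w 4` used above.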

Logs

ExtDeprecationWarning: Importing flask.ext.cache is deprecated, use flask_cache instead.

Description
(superset) [root@superset01 superset-0.15.4]# ./bin/python ./lib/python2.7/site-packages/superset/cli.py runserver -d -p 9097
/root/superset-0.15.4/lib/python2.7/site-packages/flask/exthook.py:71: ExtDeprecationWarning: Importing flask.ext.script is deprecated, use flask_script instead.
.format(x=modname), ExtDeprecationWarning
/root/superset-0.15.4/lib/python2.7/site-packages/flask/exthook.py:71: ExtDeprecationWarning: Importing flask.ext.sqlalchemy is deprecated, use flask_sqlalchemy instead.
.format(x=modname), ExtDeprecationWarning
/root/superset-0.15.4/lib/python2.7/site-packages/flask/exthook.py:71: ExtDeprecationWarning: Importing flask.ext.sqlalchemy._compat is deprecated, use flask_sqlalchemy._compat instead.
.format(x=modname), ExtDeprecationWarning
/root/superset-0.15.4/lib/python2.7/site-packages/flask_cache/__init__.py:152: UserWarning: Flask-Cache: CACHE_TYPE is set to null, caching is effectively disabled.
warnings.warn("Flask-Cache: CACHE_TYPE is set to null, "
/root/superset-0.15.4/lib/python2.7/site-packages/flask/exthook.py:71: ExtDeprecationWarning: Importing flask.ext.cache is deprecated, use flask_cache instead.
.format(x=modname), ExtDeprecationWarning
Solution
$ vim ./bin/superset

  +import warnings
  +from flask.exthook import ExtDeprecationWarning
  +warnings.simplefilter('ignore', ExtDeprecationWarning)
  +
  from superset.cli import manager

 Submitted as PR #2138, Fix ExtDeprecationWarning
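The effect of `warnings.simplefilter('ignore', ...)` can be checked in isolation. The sketch below assumes Python 3 and uses a local stand-in class, because importing the real `flask.exthook.ExtDeprecationWarning` requires an old Flask release; `noisy_import` and `count_warnings` are hypothetical names.

```python
import warnings


class ExtDeprecationWarning(DeprecationWarning):
    """Local stand-in for flask.exthook.ExtDeprecationWarning."""


def noisy_import():
    """Simulate an import that emits the deprecation warning."""
    warnings.warn("Importing flask.ext.cache is deprecated", ExtDeprecationWarning)
    return "imported"


def count_warnings(silence):
    """Run noisy_import() and return how many warnings were recorded."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        if silence:
            # The most recently added filter is checked first, so this wins
            warnings.simplefilter("ignore", ExtDeprecationWarning)
        noisy_import()
    return len(caught)


if __name__ == "__main__":
    print(count_warnings(silence=False), count_warnings(silence=True))
```

The filter has to be installed before the module that triggers the warning is imported, which is why the patch above places it ahead of `from superset.cli import manager`.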


Pitfalls

Email Must Be Unique When Creating a User

Recognized Database Authentications.
2016-12-14 18:12:36,007:ERROR:flask_appbuilder.security.sqla.manager:Error adding new user to database. (sqlite3.IntegrityError) column email is not unique [SQL: u'INSERT INTO ab_user (first_name, last_name, username, password, active, email, last_login, login_count, fail_login_count, created_on, changed_on, created_by_fk, changed_by_fk) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)'] [parameters: (u'superset', u'yuzhouwan', u'superset', 'pbkdf2:sha1:1000$e3imUMx0$83b38fb2a0f628d1379379bb353fc80697c435a1', 1, u'yuzhouwan@gmail.com', None, None, None, '2016-12-14 18:12:36.004721', '2016-12-14 18:12:36.004773', None, None)]
No user created an error occured

 Log in with the admin / admin account and make the change there
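The `IntegrityError` in the log comes from a uniqueness constraint on the `email` column. A minimal sketch of the same failure, assuming a simplified two-column table (the real `ab_user` table has many more columns, and `add_user` is a hypothetical helper):

```python
import sqlite3


def add_user(conn, username, email):
    """Insert a user; return False if a unique column is already taken."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO ab_user (username, email) VALUES (?, ?)",
                (username, email),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
# Simplified, hypothetical version of the ab_user table.
conn.execute("CREATE TABLE ab_user (username TEXT UNIQUE, email TEXT UNIQUE)")

first = add_user(conn, "admin", "yuzhouwan@gmail.com")
second = add_user(conn, "superset", "yuzhouwan@gmail.com")  # same email again
print(first, second)
```

The second insert is rejected even though the username differs, which matches the situation in the log above.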

Missing Dependency Packages

Description

RuntimeError: Compression requires the (missing) zlib module

Solution

$ yum install zlib
$ yum install zlib-devel

# Go to the Python 2.7 source directory, then rebuild and reinstall; the symlinks do not need to be recreated
$ cd /root/software/Python-2.7.12
$ make
$ make install

# Go to the setuptools directory and reinstall
$ cd /root/software/setuptools-32.0.0
$ python setup.py install
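After rebuilding, it can help to confirm that the optional extension modules were actually compiled in. A small sketch (`missing_modules` is a hypothetical helper, not part of any tool mentioned above):

```python
import importlib


def missing_modules(names):
    """Return the subset of module names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


# zlib is the module from the error above; the others are extension modules
# that a source build also leaves out when their development headers are absent.
print(missing_modules(["zlib", "ssl", "sqlite3", "bz2"]))
```

An empty list means every module in the list is importable by the interpreter that ran the script.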

Python Cannot Load Modules (RedHat Problem)

pip: command not found

# Invoke pip as a module instead
$ python -m pip --version
  pip 9.0.1 from /root/software/pip-9.0.1 (python 2.7)

# Add a command alias
$ vim ~/.bashrc

  # If it has not taken effect yet, run this line directly in the shell
  alias pip='python -m pip'

$ pip --version
  pip 9.0.1 from /root/software/pip-9.0.1 (python 2.7)

virtualenv: command not found

$ vim ~/.bashrc
  alias virtualenv='python -m virtualenv'

$ virtualenv --version
  15.1.0
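The same `python -m <module>` trick works from a script, which avoids depending on shell aliases. A hedged sketch (`module_command` and `run_module` are hypothetical helper names):

```python
import importlib.util
import subprocess
import sys


def module_command(module, *args):
    """Build the argv equivalent of `python -m <module> <args>`."""
    return [sys.executable, "-m", module] + list(args)


def run_module(module, *args):
    """Run the module if it is importable; return its output, else None."""
    if importlib.util.find_spec(module) is None:
        return None
    result = subprocess.run(
        module_command(module, *args),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout.strip()


print(module_command("pip", "--version"))
print(run_module("pip", "--version"))
```

Using `sys.executable` guarantees that the module runs under the same interpreter as the script, which is the property the aliases above are trying to achieve.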

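Installing superset requires downloading dependency libraries

sasl/sasl.h: No such file or directory

Description
gcc: error trying to exec 'cc1plus': execvp: No such file or directory
error: command 'gcc' failed with exit status 1

cc1plus: warning: command line option "-Wstrict-prototypes" is valid for Ada/C/ObjC but not for C++
In file included from sasl/saslwrapper.cpp:254:
sasl/saslwrapper.h:22:23: error: sasl/sasl.h: No such file or directory
Solution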
$ gcc -v
  Using built-in specs.
  Target: x86_64-redhat-linux
  Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk --disable-dssi --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre --enable-libgcj-multifile --enable-java-maintainer-mode --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib --with-ppl --with-cloog --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
  Thread model: posix
  gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC)

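# Install g++
# g++ is the C++ compiler; once it is installed, gcc can find the C++ toolchain (cc1plus) it needs and the build succeeds
# Note: the sasl/sasl.h header itself is normally provided by the cyrus-sasl-devel package on RHEL/CentOS
# wget ftp://rpmfind.net/linux/centos/6.8/os/x86_64/Packages/gcc-c++-4.4.7-17.el6.x86_64.rpm (the version must exactly match the installed gcc, 4.4.7-4)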
# http://rpm.pbone.net/index.php3/stat/4/idpl/25438297/dir/scientific_linux_6/com/gcc-c++-4.4.7-4.el6.x86_64.rpm.html
# http://rpm.pbone.net/index.php3/stat/4/idpl/25440518/dir/scientific_linux_6/com/libstdc++-devel-4.4.7-4.el6.x86_64.rpm.html
$ rpm -ivh libstdc++-devel-4.4.7-4.el6.x86_64.rpm
$ rpm -ivh gcc-c++-4.4.7-4.el6.x86_64.rpm

$ g++ -v
  Using built-in specs.
  Target: x86_64-redhat-linux
  Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk --disable-dssi --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre --enable-libgcj-multifile --enable-java-maintainer-mode --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib --with-ppl --with-cloog --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
  Thread model: posix
  gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC)

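Command line option "-Wstrict-prototypes" is valid for Ada/C/ObjC but not for C++

Description

 cc1plus: warning: command line option "-Wstrict-prototypes" is valid for Ada/C/ObjC but not for C++

Solution
# The cmake version is too old (in this case it was not installed at all)
# https://cmake.org/ (stable: 3.6.3, latest: 3.7.1, date: 2016/12/16)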
# https://cmake.org/cmake/help/v3.6/
$ wget --no-check-certificate  https://cmake.org/files/v3.6/cmake-3.6.3.tar.gz # To connect to cmake.org insecurely
$ tar zxvf cmake-3.6.3.tar.gz
$ cd cmake-3.6.3
$ ./bootstrap
$ make
$ gmake install

$ cmake -version
  cmake version 3.6.3
  CMake suite maintained and supported by Kitware (kitware.com/cmake).

# reboot (should)

$ cd ~
$ mkdir virtualenv
$ cd virtualenv
$ virtualenv env1
$ virtualenv --python=/usr/bin/python env1

# new problem
# IOError: [Errno 40] Too many levels of symbolic links: '/root/virtualenv/env1/bin/python'
# env1 cannot simply be deleted with rm -rf; use rmvirtualenv instead
$ rmvirtualenv env1
$ cd env1
$ source bin/activate      # run deactivate to exit
(env1) [root@edeppreapp01 env1]# python -V
  Python 2.7.12

Could not find a version that satisfies the requirement pytz>dev

Description
# Installing the dependencies one by one is tedious
Could not find a version that satisfies the requirement pytz>dev (from celery==3.1.23) (from versions: )
Could not find a version that satisfies the requirement billiard<3.4,>=3.3.0.23 (from celery==3.1.23) (from versions: )
No matching distribution found for amqp<2.0,>=1.4.9 (from kombu==3.0.35)
No matching distribution found for anyjson>=0.3.3 (from kombu==3.0.35)
No matching distribution found for kombu<3.1,>=3.0.34 (from celery==3.1.23)
No matching distribution found for celery==3.1.23 (from superset)
Could not find suitable distribution for Requirement.parse('werkzeug==0.11.10')
pip install thrift-0.9.3.tar.gz
No matching distribution found for six (from sasl==0.2.1)
No matching distribution found for sasl>=0.2.1 (from thrift-sasl==0.2.1)
No local packages or working download links found for thrift-sasl>=0.2.1
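Solution
$ pip list
$ pip freeze > requirements.txt
$ mkdir packages
# Download into the packages directory created above
# (pip 9 still accepts --download; newer pip versions use `pip download -d packages -r requirements.txt`)
$ pip install --download packages -r requirements.txt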

$ cd packages
$ scp celery-3.1.23-py2.py3-none-any.whl root@druid01:/root/software/packages

# --find-links makes pip look in the given directory for superset's dependencies and install them in turn
$ python -m pip install --no-index --find-links=packages superset # -r requirements.txt
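Before copying wheels to the offline machine, it is useful to know exactly which requirements pip could not resolve. The sketch below extracts them from a saved pip log; the regular expression only covers the two message formats quoted in the description above, and `unresolved_requirements` is a hypothetical helper name.

```python
import re

# Matches the two pip messages quoted above; group 1 is the requirement.
_PATTERN = re.compile(
    r"^(?:Could not find a version that satisfies the requirement"
    r"|No matching distribution found for)\s+(\S+)"
)


def unresolved_requirements(log_text):
    """Return the distinct requirements pip failed to resolve, in order."""
    seen = []
    for line in log_text.splitlines():
        match = _PATTERN.match(line.strip())
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


sample = """\
Could not find a version that satisfies the requirement pytz>dev (from celery==3.1.23) (from versions: )
No matching distribution found for amqp<2.0,>=1.4.9 (from kombu==3.0.35)
No matching distribution found for celery==3.1.23 (from superset)
"""
print(unresolved_requirements(sample))
```

The resulting list can be compared against the files in the `packages` directory to see which wheels or tarballs are still missing.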

ImportError: No module named _ssl

Solution

# Install the SSL packages
$ yum install yum-downloadonly -y

$ yum -y install ncurses ncurses-devel gcc-c++ libxml2-devel gd gd-devel libpng libpng-devel libjpeg libjpeg-devel libmcrypt libmcrypt-devel openldap-devel openldap-servers openldap-clients autoconf freetype-devel libtool-ltdl-devel openssl openssl-devel gcc automake autoconf libtool make --downloadonly --downloaddir=.

$ yum -y install GeoIP gmp libevent libmcrypt libtidy libXpm libxslt mhash mysql mysql-server nfs-utils nginx perl-DBD-MySQL perl-DBI php php-common php-fpm php-gd php-mbstring php-mcrypt php-mhash php-mysql php-pdo php-xml t1lib --downloadonly --downloaddir=.

$ rpm -Uvh --force --nodeps *.rpm


# Rebuild Python
$ cd /root/software/Python-2.7.12
$ vim Modules/Setup.dist

  # Uncomment these lines
  SSL=/usr/local/ssl
  _ssl _ssl.c \
         -DUSE_SSL -I$(SSL)/include -I$(SSL)/include/openssl \
         -L$(SSL)/lib -lssl -lcrypto

# --enable-shared generates the dynamic library libpython2.7.so.1.0
$ ./configure --enable-shared CFLAGS=-fPIC
$ make && make install

# This did not work
$ python --version
  Python 2.7.12

$ python
Python 2.7.12 (default, Dec 19 2016, 10:58:27) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import ssl
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/ssl.py", line 97, in <module>
    import _ssl             # if we can't import it, let the error propagate
ImportError: No module named _ssl
>>> quit()

# Install the missing openssl-devel
$ rpm -aq | grep openssl
  openssl-1.0.1e-42.el6_7.4.x86_64

$ yum install openssl-devel -y

$ rpm -aq | grep openssl
  openssl-1.0.1e-42.el6_7.4.x86_64
  openssl-devel-1.0.1e-42.el6_7.4.x86_64

# Edit the Setup file
$ vim /root/software/Python-2.7.12/Modules/Setup
  # Socket module helper for socket(2)
  _socket socketmodule.c timemodule.c

  # Socket module helper for SSL support; you must comment out the other
  # socket line above, and possibly edit the SSL variable:
  #SSL=/usr/local/ssl
  _ssl _ssl.c \
  -DUSE_SSL -I$(SSL)/include -I$(SSL)/include/openssl \
  -L$(SSL)/lib -lssl -lcrypto

# Rebuild
$ cd /root/software/Python-2.7.12
$ make && make install

$ python
Python 2.7.12 (default, Dec 19 2016, 11:08:33) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import ssl
>>>

$ cd /root/virtualenv/superset/bin
[root@olap03-sit bin]# python
Python 2.7.12 (default, Dec 19 2016, 11:08:33) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import ssl

$ /root/virtualenv/superset/bin/python
Python 2.7.12 (default, Dec 16 2016, 16:23:17) 
[GCC 4.4.6 20120305 (Red Hat 4.4.6-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import ssl
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/python27/lib/python2.7/ssl.py", line 97, in <module>
    import _ssl             # if we can't import it, let the error propagate
ImportError: No module named _ssl


$ mv /root/virtualenv/superset/bin/python /root/virtualenv/superset/bin/python_old
$ ln -s /usr/local/bin/python /root/virtualenv/superset/bin/

$ ./python
Python 2.7.12 (default, Dec 19 2016, 11:08:33) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import ssl
>>> quit()
[root@olap03-sit bin]# pwd
/root/virtualenv/superset/bin
[root@olap03-sit bin]# /root/virtualenv/superset/bin/python
Python 2.7.12 (default, Dec 19 2016, 11:08:33) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import ssl
>>> quit()
[root@olap03-sit bin]# python
Python 2.7.12 (default, Dec 19 2016, 11:08:33) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import ssl
>>>
$ source bin/activate
(superset) [root@olap03-sit superset]# which python
/root/virtualenv/superset/bin/python
(superset) [root@olap03-sit superset]# python
Python 2.7.12 (default, Dec 19 2016, 11:08:33) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import ssl
>>> quit()

# ImportError: No module named gunicorn.app.base
import gunicorn.app.base
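The session above compares several interpreters by hand. The same comparison can be scripted; this is a rough sketch in which `can_import` is a hypothetical helper and the two hard-coded paths are simply the ones that appear in the session (they may not exist elsewhere).

```python
import subprocess
import sys


def can_import(interpreter, module):
    """Return True if `interpreter -c "import module"` exits cleanly."""
    try:
        result = subprocess.run(
            [interpreter, "-c", "import " + module],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:  # interpreter path missing or not executable
        return False
    return result.returncode == 0


# Paths taken from the session above; sys.executable is added so the
# sketch still reports something meaningful on other machines.
for path in ["/usr/local/bin/python",
             "/root/virtualenv/superset/bin/python",
             sys.executable]:
    print(path, can_import(path, "ssl"))
```

A virtualenv keeps its own copy of (or link to) the interpreter it was created with, so an environment created before the rebuild can keep failing until its `bin/python` is replaced, as was done with the symlink above.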


python: error while loading shared libraries: libpython2.7.so.1.0

Description

# --enable-shared generates the dynamic library libpython2.7.so.1.0
$ ./configure --prefix=/usr/local/python27 --enable-shared CFLAGS=-fPIC
$ make && make install
$ python -V
  python: error while loading shared libraries: libpython2.7.so.1.0: cannot open shared object file: No such file or directory

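Solution

$ yum reinstall python-libs        # did not work

$ ll /usr/local/python27/lib/libpython2.7.so.1.0       # did not work
# Note: ld.so.conf expects one bare directory path per line (`include` is meant for config files),
# and the directory added below differs from the /usr/local/python27 prefix used above,
# which is consistent with ldconfig not listing it afterwards
$ vim /etc/ld.so.conf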

  include ld.so.conf.d/*.conf
  include /usr/local/Python2.7/lib

$ /sbin/ldconfig -v | grep /
  /lib:
  /lib64:
  /usr/lib:
  /usr/lib64:
  /lib64/tls: (hwcap: 0x8000000000000000)
  /usr/lib64/sse2: (hwcap: 0x0000000004000000)
  /usr/lib64/tls: (hwcap: 0x8000000000000000)

$ ./configure --prefix=/usr --enable-shared CFLAGS=-fPIC
$ make && make install
$ /sbin/ldconfig -v | grep /
$ python -V
  Python 2.7.12
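Once an interpreter starts again, its build configuration shows where the shared library is expected to live, which is the directory the dynamic linker has to know about. A small sketch using the standard `sysconfig` module (`shared_library_info` is a hypothetical helper; on some platforms the variables are simply unset):

```python
import os
import sysconfig


def shared_library_info():
    """Collect the build settings that decide where libpython is looked up."""
    libdir = sysconfig.get_config_var("LIBDIR")
    ldlibrary = sysconfig.get_config_var("LDLIBRARY")
    return {
        "enable_shared": bool(sysconfig.get_config_var("Py_ENABLE_SHARED")),
        "libdir": libdir,
        "ldlibrary": ldlibrary,
        "found_in_libdir": bool(
            libdir and ldlibrary
            and os.path.exists(os.path.join(libdir, ldlibrary))
        ),
    }


for key, value in sorted(shared_library_info().items()):
    print(key, "=", value)
```

If `enable_shared` is true and `libdir` is not one of the directories printed by `ldconfig -v`, the loader error from the description is the expected outcome.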


ImportError: No module named pysqlite2

Solution

$ vim /root/superset/lib/python2.7/site-packages/sqlalchemy/dialects/sqlite/pysqlite.py

# Switch to sqlite3
@classmethod
def dbapi(cls):
    try:
        # Change this line to: from sqlite3 import dbapi2 as sqlite
        from pysqlite2 import dbapi2 as sqlite
    except ImportError as e:
        try:
            from sqlite3 import dbapi2 as sqlite  # try 2.5+ stdlib name.
        except ImportError:
            raise e
    return sqlite

# On Red Hat 5.3, sqlite3 has to be installed from source first and Python installed afterwards; only then is _sqlite3.so produced
$ wget https://sqlite.org/snapshot/sqlite-snapshot-201612131847.tar.gz
$ sqlite3 --version
  3.6.20
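The `sqlite3 --version` command reports the command-line tool, which is not necessarily the library that Python's `sqlite3` module was compiled against. A quick check from Python (`sqlite_report` is a hypothetical helper):

```python
import sqlite3


def sqlite_report():
    """Report the SQLite library version seen by the stdlib sqlite3 module."""
    conn = sqlite3.connect(":memory:")
    try:
        (engine_version,) = conn.execute("SELECT sqlite_version()").fetchone()
    finally:
        conn.close()
    return {"linked_library": sqlite3.sqlite_version, "engine": engine_version}


print(sqlite_report())
```

If this script runs at all, `_sqlite3` was built successfully; the printed version tells which SQLite the interpreter is really using.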


pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available.

Solution

Method 1
# Remove all aliases and undo the Python path changes made in the superset sources
$ which pip
  alias pip='python -m pip'
  /root/superset/bin/python

$ vim ~/.bashrc
# alias pip='python -m pip'
# alias virtualenv='python -m virtualenv'

# Source global definitions
# export WORKON_HOME=~/virtualenv
# source /usr/local/bin/virtualenvwrapper.sh

$ source ~/.bashrc
$ deactivate
$ yum install python-pip

$ unalias pip
$ which pip
  /usr/bin/pip

$ superset runserver -a 0.0.0.0 -p 9999

$ cd /usr/local/lib/python2.7/site-packages

$ python
  Python 2.7.12 (default, Dec 19 2016, 11:08:33) 
  [GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux2
  Type "help", "copyright", "credits" or "license" for more information.
  >>> import ssl
  >>> ssl
  <module 'ssl' from '/usr/local/lib/python2.7/ssl.pyc'>
  >>> quit()

$ vim mypkpath.pth
  /usr/local/lib/python2.7

$ vim ~/.bashrc
  alias python=/usr/local/bin/python
  alias pip=/usr/bin/pip

$ source ~/.bashrc        # did not work (superset's Python scripts all begin with #!/root/superset/bin/python)
$ vim /root/superset/bin/superset
  #!/usr/local/bin/python
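Shell aliases have no effect on scripts that are launched through their shebang line, which is why the alias approach failed. A small sketch for checking which interpreter a script will use (`read_shebang` is a hypothetical helper; the demo writes a throwaway file rather than touching the real `superset` script):

```python
import os
import tempfile


def read_shebang(path):
    """Return the interpreter named on a script's first line, or None."""
    with open(path, "rb") as handle:
        first_line = handle.readline().decode("utf-8", "replace").strip()
    if not first_line.startswith("#!"):
        return None
    return first_line[2:].strip()


# Demo on a temporary file that mimics the header of bin/superset.
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as tmp:
    tmp.write("#!/root/superset/bin/python\nprint('hello')\n")
demo_path = tmp.name
print(read_shebang(demo_path))
os.remove(demo_path)
```

Pointing `read_shebang` at each file under the environment's `bin` directory shows at a glance which scripts still reference the old interpreter.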
Method 2
# Use --prefix so that Python and its third-party libraries are installed under /usr/lib
$ ./configure --prefix=/usr --enable-shared CFLAGS=-fPIC
$ make && make install
$ /sbin/ldconfig -v | grep /
$ python -V
  Python 2.7.12


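Error while processing cluster ‘druid cluster’ (sqlite3.OperationalError) database is locked

Description

 [Web UI] Sources - Druid Clusters configuration - Refresh Druid Metadata

Cause

 The web request cannot keep a long-lived connection open, so it times out

Solution

 superset refresh_druid

Tips: As of the latest version, v0.22.1, this problem has been fixed; clicking the “Sources - Refresh Druid Metadata” button on the page completes the operation directly (2017-12-12)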
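For readers unfamiliar with the error text itself, the sketch below reproduces `database is locked` with plain `sqlite3`: one connection holds the write lock while a second one gives up after a short timeout. This illustrates how SQLite reports write contention in general; it is not a claim about what Superset does internally, and `provoke_lock` is a hypothetical helper.

```python
import os
import shutil
import sqlite3
import tempfile


def provoke_lock(db_path, timeout=0.1):
    """Return the error message a second writer gets while a lock is held."""
    # isolation_level=None disables implicit transactions, so BEGIN is explicit.
    writer = sqlite3.connect(db_path, isolation_level=None)
    writer.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
    writer.execute("BEGIN IMMEDIATE")  # take the write lock and keep it
    writer.execute("INSERT INTO t VALUES (1)")
    other = sqlite3.connect(db_path, timeout=timeout, isolation_level=None)
    try:
        other.execute("BEGIN IMMEDIATE")  # waits `timeout` seconds, then fails
        return None
    except sqlite3.OperationalError as exc:
        return str(exc)
    finally:
        other.close()
        writer.execute("ROLLBACK")
        writer.close()


workdir = tempfile.mkdtemp()
try:
    print(provoke_lock(os.path.join(workdir, "demo.db")))
finally:
    shutil.rmtree(workdir, ignore_errors=True)
```

SQLite allows only one writer at a time, so long-running write transactions are a common reason for moving the metadata database to a server such as MySQL.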


An unknown error occurred. (Status: 0) Maybe the request timed out?

Description

 Some charts fail to display properly

Solution

# Turn on debug mode to see detailed logs and locate the problem
$ vim ./lib/python2.7/site-packages/superset/config.py

  # DEBUG = False
  DEBUG = True
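Edits made inside `site-packages` are lost when the package is reinstalled. As a hedged alternative, Superset's `config.py` attempts to import overrides from a module named `superset_config` found on `PYTHONPATH`; assuming the installed version does so (worth confirming at the bottom of that `config.py`), the same setting can live in a separate file:

```python
# superset_config.py: place it in a directory that is on PYTHONPATH.
# Assumption: the installed Superset version imports this module from its
# own config.py; check that file before relying on this override.

# Turn on Flask debug mode to get detailed logs while troubleshooting.
DEBUG = True
```

Remember to set it back to `False` once the problem has been located, since debug mode is not intended for production use.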

ImportError: No module named pymysql

Solution

 pip install pymysql
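`pymysql` is needed when the SQLAlchemy connection string uses the `mysql+pymysql://` scheme. The sketch below builds such a string and escapes special characters in the credentials; `mysql_uri`, the host name, and the credentials are all hypothetical placeholders.

```python
from urllib.parse import quote_plus


def mysql_uri(user, password, host, database, port=3306):
    """Build a SQLAlchemy URI for MySQL using the pymysql driver."""
    return "mysql+pymysql://{}:{}@{}:{}/{}".format(
        quote_plus(user), quote_plus(password), host, port, database
    )


# Placeholder values for illustration only.
print(mysql_uri("superset", "p@ss/word", "db.example.com", "superset"))
```

Escaping matters because characters such as `@` or `/` in a password would otherwise be parsed as part of the URI structure.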

Host 'druid01' is not allowed to connect to this MySQL server

Description

 nohup superset runserver -a 0.0.0.0 -p 8888 2>&1 &

2017-01-22 16:36:53,013:ERROR:flask_appbuilder.security.sqla.manager:DB Creation and initialization failed: (pymysql.err.InternalError) (1130, u"Host 'druid01' is not allowed to connect to this MySQL server")

Solution

GRANT ALL PRIVILEGES ON *.* TO 'root'@'druid01' IDENTIFIED BY 'root' WITH GRANT OPTION;


Permission for Druid

Solution

 After adding a new data source, run superset init to update the permission-related tables


Update Druid Cluster’s Name

Solution

alter table datasources drop FOREIGN KEY `datasources_ibfk_2`;
update clusters set cluster_name='Druid Cluster' where cluster_name='druid cluster';
update datasources set cluster_name ='Druid Cluster' where cluster_name ='druid cluster';
alter table datasources add constraint  `datasources_ibfk_2`  FOREIGN KEY (`cluster_name`) REFERENCES `clusters` (`cluster_name`);
# show create table datasources; # troubleshooting


An unexpected error occurred: “https://registry.yarnpkg.com/convert-source-map: ETIMEDOUT”

Description

$ yarn
  yarn install v1.3.2
  info No lockfile found.
  [1/4] Resolving packages...
  error An unexpected error occurred: "https://registry.yarnpkg.com/@vx%2fbounds: ETIMEDOUT".
  info If you think this is a bug, please open a bug report with the information provided in "/home/superset/software/incubator-superset-0.22.1/superset/assets/yarn-error.log".
  info Visit https://yarnpkg.com/en/docs/cli/install for documentation about this command.

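Solution

# Due to mysterious forces from outer space, the registry's original IP address has to be replaced first
$ vim /etc/hosts
  104.16.59.173 registry.yarnpkg.com

# Limit network concurrency to reduce the chance of a timeout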
$ yarn --network-concurrency 1


Resources

Doc

Help Doc

fabmanager --help

  Usage: fabmanager [OPTIONS] COMMAND [ARGS]...

  This is a set of commands to ease the creation and maintenance of your
  flask-appbuilder applications.

  All commands that import your app will assume by default that your running
  on your projects directory just before the app directory. will assume also
  that on the __init__.py your initializing AppBuilder like this (using a
  var named appbuilder) just like the skeleton app::

  appbuilder = AppBuilder(......)

  If your using different namings use app and appbuilder parameters.

  Options:
  --help  Show this message and exit.

  Commands:
  babel-compile     Babel, Compiles all translations
  babel-extract     Babel, Extracts and updates all messages...
  collect-static    Copies flask-appbuilder static files to your...
  create-addon      Create a Skeleton AddOn (needs internet...
  create-admin      Creates an admin user
  create-app        Create a Skeleton application (needs internet...
  create-db         Create all your database objects (SQLAlchemy...
  list-users        List all users on the database
  list-views        List all registered views
  reset-password    Resets a user's password'
  run               Runs Flask dev web server.
  security-cleanup  Cleanup unused permissions from views and...
  version           Flask-AppBuilder package version



Post author:Benedict Jin
Post link: https://yuzhouwan.com/posts/743/
Copyright Notice: All articles in this blog are licensed under CC BY-NC-SA 4.0 unless stating additionally.
