Notes on pylearn2:
pylearn2 consists of three main parts: models, training algorithms, and datasets.
Model: stores the parameters. Many established models are implemented, such as RBMs, CNNs, and autoencoders; in particular, the models from the LISA lab's papers are all implemented.
Training algorithm: adjusts the parameters in the Model. It also provides other functionality, such as setting up a Monitor to track quantities during training (for example an accuracy curve). Several algorithms are already implemented as classes, such as SGD and BGD.
Dataset: the data used to train the algorithm. It is simply an interface between the raw data and the model, making the data format transparent to the model; since implementations differ and data come in many types, in principle any input format is supported. If the data is a matrix, use the DenseDesignMatrix class directly; data in NumPy or pickle format can be used as-is, and larger datasets are supported via HDF5. This module can also apply ZCA or PCA preprocessing to the data.
In practice, everything is driven by a configuration file whose main sections correspond to the three parts above; each section takes a number of parameters that are set according to the actual application.
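To make the three-part split concrete, here is a minimal plain-NumPy sketch of the same idea — a dataset as a dense design matrix, a "model" that is just a parameter container, and a "training algorithm" that updates it. This is an illustration of the concepts only, not the pylearn2 API; all names here are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Dataset": a dense design matrix, one example per row, one feature per column.
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5])      # targets from a known linear map

# "Model": just a container for parameters.
w = np.zeros(3)

# "Training algorithm": gradient descent on mean squared error.
for _ in range(500):
    grad = X.T @ (X @ w - y) / len(X)   # gradient of the squared-error cost
    w -= 0.1 * grad                     # update with learning rate 0.1

print(np.round(w, 3))                   # recovers roughly [1.0, -2.0, 0.5]
```

In pylearn2 these three roles are filled by configurable classes instead, wired together in the configuration file.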
import theano.tensor as T  # Theano's symbolic tensor module; many libraries build their data handling on it
import theano
###### pylearn2: a machine learning library; its development has since been discontinued.
import pylearn2.train  # the Train class
import pylearn2.models.mlp as p2_md_mlp  # multilayer perceptron
import pylearn2.datasets.dense_design_matrix
""" The DenseDesignMatrix class and related code. Functionality for representing data that can be described as a dense matrix (rather than a sparse matrix) with each row containing an example and each column corresponding to a different feature. DenseDesignMatrix also supports other "views" of the data, for example a dataset of images can be viewed either as a matrix of flattened images or as a stack of 2D multi-channel images. However, the images must all be the same size, so that each image may be mapped to a matrix row by the same transformation. """
import pylearn2.training_algorithms.sgd as p2_alg_sgd  # training_algorithms: the training algorithms
""" SGD = (Minibatch) Stochastic Gradient Descent. A TrainingAlgorithm that does stochastic gradient descent on minibatches of training examples.
"""
import pylearn2.training_algorithms.learning_rule
""" A module containing different learning rules for use with the SGD training algorithm.
A pylearn2 learning rule is an object which computes new parameter values given (1) a learning rate (2) current parameter values and (3) the current estimated gradient."""
import pylearn2.costs.mlp.dropout as p2_ct_mlp_dropout
""" Functionality for training with dropout.
Implements the dropout training technique described in "Improving neural networks by preventing co-adaptation of feature detectors" Geoffrey E. Hinton, Nitish Srivastava, Alex Krizhevsky, Ilya Sutskever, Ruslan R. Salakhutdinov arXiv 2012 This paper suggests including each unit with probability p during training, then multiplying the outgoing weights by p at the end of training. We instead include each unit with probability p and divide its state by p during training. Note that this means the initial weights should be multiplied by p relative to Hinton's. The SGD learning rate on the weights should also be scaled by p^2 (use W_lr_scale rather than adjusting the global learning rate, because the learning rate on the biases should not be adjusted). During training, each input to each layer is randomly included or excluded for each example. The probability of inclusion is independent for each input and each example. Each layer uses "default_input_include_prob" unless that layer's name appears as a key in input_include_probs, in which case the input inclusion probability is given by the corresponding value. Each feature is also multiplied by a scale factor. The scale factor for each layer's input scale is determined by the same scheme as the input probabilities.
"""
import pylearn2.termination_criteria as p2_termcri
""" Termination criteria used to determine when to stop running a training algorithm. """