FM

Editor: 小牛编辑
2023-12-01

Recommendation Algorithm

Currently, PyTorch on Angel supports a series of recommendation algorithms.

In detail, the following methods are currently implemented:

We use DeepFM as an example to illustrate the detailed process of running an algorithm. The steps are similar for the other algorithms.

Example of DeepFM

  1. Generate the PyTorch script model. First, go to the python/recommendation directory and run the following command:

     python deepfm.py --input_dim 148 --n_fields 13 --embedding_dim 10 --fc_dims 10 5 1
    

    The parameters are:

    • input_dim: the feature dimension of the data
    • n_fields: the number of fields in the data
    • embedding_dim: the dimension of the embedding layer
    • fc_dims: the dimensions of the fully connected (fc) layers in DeepFM. "10 5 1" indicates a two-layer MLP composed of one 10x5 layer and one 5x1 layer.

    This Python script generates a TorchScript model that encodes the dataflow graph of DeepFM. The output file is named deepfm.pt.
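As a rough illustration of the fc_dims convention described above, the following sketch pairs consecutive entries into layer shapes. The helper name `fc_layer_shapes` is hypothetical and is not part of deepfm.py; it only mirrors the reading of "10 5 1" given in the text.

```python
from typing import List, Tuple


def fc_layer_shapes(fc_dims: List[int]) -> List[Tuple[int, int]]:
    """Pair consecutive fc_dims entries into (in_dim, out_dim) layer shapes.

    Hypothetical helper for illustration only. It follows the reading given
    in the text, where "10 5 1" means one 10x5 layer and one 5x1 layer.
    """
    if len(fc_dims) < 2:
        raise ValueError("fc_dims needs at least two entries to form a layer")
    # zip the list with itself shifted by one: (d0, d1), (d1, d2), ...
    return list(zip(fc_dims[:-1], fc_dims[1:]))


if __name__ == "__main__":
    # The value passed on the command line above: --fc_dims 10 5 1
    print(fc_layer_shapes([10, 5, 1]))
```

Under this reading, a longer list such as "64 32 16 1" would describe three layers.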

  2. Prepare the input data. The input data for DeepFM must be in libsvm or libffm format. Each line of the input represents one data sample. In libsvm format, a line looks like:

     label feature1:value1 feature2:value2
    

    PyTorch on Angel allows multi-hot fields, which means a field can appear multiple times in one data sample. In libffm format, each entry also carries its field:

     label field1:feature1:value1 field2:feature2:value2
    
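To make the two line formats above concrete, here is a minimal parsing sketch. It is not the reader used by PyTorch on Angel; the function names are hypothetical, and it assumes that labels and values are numeric and that field and feature identifiers are integer indices.

```python
from typing import List, Tuple

LibsvmSample = Tuple[float, List[Tuple[int, float]]]
LibffmSample = Tuple[float, List[Tuple[int, int, float]]]


def parse_libsvm_line(line: str) -> LibsvmSample:
    """Parse 'label feature:value ...' into (label, [(feature, value), ...])."""
    tokens = line.split()
    if not tokens:
        raise ValueError("empty line")
    label = float(tokens[0])
    features = []
    for token in tokens[1:]:
        feature, value = token.split(":")
        # Assumption: feature identifiers are integer indices.
        features.append((int(feature), float(value)))
    return label, features


def parse_libffm_line(line: str) -> LibffmSample:
    """Parse 'label field:feature:value ...' into (label, [(field, feature, value), ...])."""
    tokens = line.split()
    if not tokens:
        raise ValueError("empty line")
    label = float(tokens[0])
    entries = []
    for token in tokens[1:]:
        field, feature, value = token.split(":")
        # A multi-hot field simply shows up in more than one entry.
        entries.append((int(field), int(feature), float(value)))
    return label, entries


if __name__ == "__main__":
    print(parse_libsvm_line("1 3:1.0 7:0.5"))
    # Field 0 appears twice here, which is the multi-hot case.
    print(parse_libffm_line("0 0:3:1 0:7:1 1:12:0.5"))
```

Running a few lines of the training file through a check like this before uploading it to HDFS can catch malformed entries early.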
  3. Train the model. After obtaining the model file (deepfm.pt) and the input data, submit a job through Spark on Angel to train the model. The command is:

     source ./spark-on-angel-env.sh  
     $SPARK_HOME/bin/spark-submit \
           --master yarn-cluster \
           --conf spark.ps.instances=5 \
           --conf spark.ps.cores=1 \
           --conf spark.ps.jars=$SONA_ANGEL_JARS \
           --conf spark.ps.memory=5g \
           --conf spark.ps.log.level=INFO \
           --conf spark.driver.extraJavaOptions=-Djava.library.path=$JAVA_LIBRARY_PATH:.:./torch/torch-lib \
           --conf spark.executor.extraJavaOptions=-Djava.library.path=$JAVA_LIBRARY_PATH:.:./torch/torch-lib \
           --conf spark.executor.extraLibraryPath=./torch/torch-lib \
           --conf spark.driver.extraLibraryPath=./torch/torch-lib \
           --conf spark.executorEnv.OMP_NUM_THREADS=2 \
           --conf spark.executorEnv.MKL_NUM_THREADS=2 \
           --queue $queue \
           --name "deepfm on angel" \
           --jars $SONA_SPARK_JARS  \
           --archives torch.zip#torch \
           --files deepfm.pt \
           --driver-memory 5g \
           --num-executors 5 \
           --executor-cores 1 \
           --executor-memory 5g \
           --class com.tencent.angel.pytorch.examples.supervised.RecommendationExample \
           ./pytorch-on-angel-0.3.0.jar \
           trainInput:$input batchSize:128 torchModelPath:deepfm.pt \
           stepSize:0.001 numEpoch:10 testRatio:0.1 \
           angelModelOutputPath:$output
    

    The parameters are:

    • trainInput: the input path (HDFS) of the training data
    • batchSize: the batch size for each optimization step
    • torchModelPath: the name of the generated TorchScript model file
    • stepSize: the learning rate
    • numEpoch: the number of epochs for training
    • testRatio: the fraction of the training examples held out for testing
    • angelModelOutputPath: the output path (HDFS) for the trained model
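
The application arguments at the end of the command use a `key:value` form. The sketch below shows one way such arguments could be split into a dictionary. It is an illustration only, not the parser used by RecommendationExample. The function name `parse_job_args` and the sample HDFS paths are hypothetical placeholders.

```python
from typing import Dict, List


def parse_job_args(args: List[str]) -> Dict[str, str]:
    """Split 'key:value' arguments into a dict of strings.

    Hypothetical helper for illustration. Only the first ':' separates the
    key from the value, so values such as 'hdfs://...' keep their colons.
    """
    params: Dict[str, str] = {}
    for arg in args:
        key, sep, value = arg.partition(":")
        if not sep or not key:
            raise ValueError(f"expected key:value, got {arg!r}")
        params[key] = value
    return params


if __name__ == "__main__":
    # Placeholder paths; substitute the real $input and $output locations.
    params = parse_job_args([
        "trainInput:hdfs://namenode/path/to/train",
        "batchSize:128",
        "torchModelPath:deepfm.pt",
        "stepSize:0.001",
        "numEpoch:10",
        "testRatio:0.1",
        "angelModelOutputPath:hdfs://namenode/path/to/output",
    ])
    print(int(params["batchSize"]), float(params["stepSize"]))
```

Values stay as strings in this sketch, so each one is converted at the point of use.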

Project repository: https://github.com/Angel-ML/angel
Official website: https://angelml.ai/