Question:

Trouble implementing backpropagation in a neural network

龙焱
2023-03-14

I have a simple feed-forward neural network with 2 input neurons (plus 1 bias neuron), 4 hidden neurons (plus 1 bias neuron), and one output neuron. The feed-forward mechanism seems to work fine, but I have trouble fully understanding how to implement the backpropagation algorithm.

There are 3 classes:

  • Neural::Net; builds the network and feeds the input values forward (no backpropagation yet)
  • Neural::Neuron; holds a neuron's properties (index, output, weight, etc.)
  • Neural::Connection; a struct-like class that randomizes the weights and holds the output, delta weight, etc.

To make things clear: I have taken a calculus course, so I understand some of the concepts involved. This is fairly advanced material, but I would really like to get it working.

The transfer function is a logistic function. The weight of a synapse is "attached" to the neuron that outputs its value.
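
For reference, a minimal sketch of the logistic function and its derivative; the derivative is where the output * (1 - output) factor in the code below comes from (these names are illustrative, not identifiers from the classes):

#include <cmath>

// Logistic (sigmoid) transfer: maps any weighted sum into (0, 1).
double logistic(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// Its derivative, written in terms of the function's own output:
// d/dx logistic(x) = logistic(x) * (1 - logistic(x)).
double logisticDerivative(double output) { return output * (1.0 - output); }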

Here is my attempt at the backpropagation function:

void Net::backPropagate(const vector<double>& targetVals) {
        Layer& outputLayer = myLayers.back();
        assert(targetVals.size() == outputLayer.size());
        cout << "good2" << endl;
        // Starting with the output layer
        for (unsigned int i = 0; i < outputLayer.size(); ++i) { // Traversing output layer
            double output = outputLayer[i].getOutput(); cout << "good3" << endl;
            double error = output * (1 - output) * (pow(targetVals[i] - output,2)); cout << "good4" << endl;
            outputLayer[i].setError(error); // Calculating error
            double newWeight = outputLayer[i].getWeight();
                  newWeight += (error * outputLayer[i].getOutput());
            outputLayer[i].setWeight(newWeight); // Setting new weight
            cout << "good5" << endl;
        }

        for (unsigned int i = myLayers.size() - 2; i > 0; --i) { // Traversing hidden layers all the way to input layer
            Layer& currentLayer = myLayers[i];
            Layer& nextLayer = myLayers[i + 1];
            for (unsigned int j = 0; j < currentLayer.size(); ++j) { // Traversing current layer
                const double& output = currentLayer[j].getOutput();
                double subSum = 0.0; // Initializing subsum
                for (unsigned int k = 0; k < nextLayer.size(); ++k) { // Traversing next layer
                    subSum += pow(nextLayer[k].getError() * currentLayer[j].getWeight(),2); // Getting their backpropagated error and weight
                }
                double error = output*(1 - output)*(subSum);
                currentLayer[j].setError(error);
                double newWeight = currentLayer[j].getWeight();
                newWeight += error * output;
                currentLayer[j].setWeight(newWeight);
            }
        }
}
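
For comparison, the textbook delta rule for logistic units uses the raw difference (target - output) at the output layer rather than its square, and the hidden-layer term sums each downstream neuron's delta times the connecting weight, also unsquared. A minimal standalone sketch of those standard formulas (the names are illustrative, separate from the classes above):

#include <vector>

// Output neuron: delta = out * (1 - out) * (target - out).
double outputDelta(double out, double target) {
    return out * (1.0 - out) * (target - out);
}

// Hidden neuron j: delta_j = out_j * (1 - out_j) * sum_k(delta_k * w_jk),
// where w_jk is the weight from neuron j to downstream neuron k.
double hiddenDelta(double out, const std::vector<double>& downstreamDeltas,
                   const std::vector<double>& weightsToDownstream) {
    double sum = 0.0;
    for (size_t k = 0; k < downstreamDeltas.size(); ++k)
        sum += downstreamDeltas[k] * weightsToDownstream[k];
    return out * (1.0 - out) * sum;
}

Each weight is then updated with the output of the neuron feeding into it, scaled by a learning rate: w += learningRate * delta * upstreamOutput.

And here is the whole code: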
// STL_Practice.cpp : Defines the entry point for the console application.
//
#include <iostream>
#include <cassert>
#include <cstdlib>
#include <vector>
#include <cmath> // for exp() and pow() used below
#include <time.h>
#include "ConsoleColor.hpp"

using namespace std;

namespace Neural {
    class Neuron;
    typedef vector<Neuron> Layer;

    // ******************** Class: Connection ******************** //
    class Connection {
    public:
        Connection();
        void setOutput(const double& outputVal) { myOutputVal = outputVal; }
        void setWeight(const double& weight) { myDeltaWeight = myWeight- weight; myWeight = weight; }
        double getOutput(void) const { return myOutputVal; }
        double getWeight(void) const { return myWeight; }
    private:
        static double randomizeWeight(void) { return rand() / double(RAND_MAX); }
        double myOutputVal;
        double myWeight;
        double myDeltaWeight;
    };

    Connection::Connection() { 
        myOutputVal = 0;
        myWeight = Connection::randomizeWeight();
        myDeltaWeight = myWeight;
        cout << "Weight: " << myWeight << endl;
    }

    // ******************** Class: Neuron ************************ //
    class Neuron {
    public:
        Neuron();
        void setIndex(const unsigned int& index) { myIndex = index; }
        void setOutput(const double& output) { myConnection.setOutput(output); }
        void setWeight(const double& weight) { myConnection.setWeight(weight); }
        void setError(const double& error) { myError = error; }
        unsigned int getIndex(void) const { return myIndex; }
        double getOutput(void) const { return myConnection.getOutput(); }
        double getWeight(void) const { return myConnection.getWeight(); }
        double getError(void) const { return myError; }
        void feedForward(const Layer& prevLayer);
        void printOutput(void) const;

    private:
        inline static double transfer(const double& weightedSum);
        Connection myConnection;
        unsigned int myIndex;
        double myError;
    };

    Neuron::Neuron() : myIndex(0), myConnection() { } 
    double Neuron::transfer(const double& weightedSum) { return 1 / double((1 + exp(-weightedSum))); }
    void Neuron::printOutput(void) const { cout << "Neuron " << myIndex << ':' << myConnection.getOutput() << endl; }
    void Neuron::feedForward(const Layer& prevLayer) {
        // Weight sum of the previous layer's output values
        double weightedSum = 0;
        for (unsigned int i = 0; i < prevLayer.size(); ++i) {
            weightedSum += prevLayer[i].getOutput()*myConnection.getWeight();
            cout << "Neuron " << i << " from prevLayer has output: " << prevLayer[i].getOutput() << endl;
            cout << "Weighted sum: " << weightedSum << endl;
        }
        // Transfer function
        myConnection.setOutput(Neuron::transfer(weightedSum));
        cout << "Transfer: " << myConnection.getOutput() << endl;
    }

    // ******************** Class: Net *************************** //
    class Net {
    public:
        Net(const vector<unsigned int>& topology);
        void setTarget(const vector<double>& targetVals);
        void feedForward(const vector<double>& inputVals);
        void backPropagate(const vector<double>& targetVals);
        void printOutput(void) const;
    private:
        vector<Layer> myLayers;
    };
    Net::Net(const vector<unsigned int>& topology) {
        assert(topology.size() > 0);
        for (unsigned int i = 0; i < topology.size(); ++i) { // Creating the layers
            myLayers.push_back(Layer(((i + 1) == topology.size()) ? topology[i] : topology[i] + 1)); // +1 is for bias neuron
            // Setting each neurons index inside layer
            for (unsigned int j = 0; j < myLayers[i].size(); ++j) {
                myLayers[i][j].setIndex(j); 
            }
            // Console log
            cout << red;
            if (i == 0) {
                cout << "Input layer (" << myLayers[i].size() << " neurons including bias neuron) created." << endl;
                myLayers[i].back().setOutput(1);
            }
            else if (i < topology.size() - 1) { 
                cout << "Hidden layer " << i << " (" << myLayers[i].size() << " neurons including bias neuron) created." << endl; 
                myLayers[i].back().setOutput(1);
            }
            else { cout << "Output layer (" << myLayers[i].size() << " neurons) created." << endl; }
            cout << white;
        }
    }
    void Net::feedForward(const vector<double>& inputVals) {
        assert(myLayers[0].size() - 1 == inputVals.size());
        for (unsigned int i = 0; i < inputVals.size(); ++i) { // Setting input vals to input layer
            cout << yellow << "Setting input vals...";
            myLayers[0][i].setOutput(inputVals[i]); // myLayers[0] is the input layer
            cout << "myLayer[0][" << i << "].getOutput()==" << myLayers[0][i].getOutput() << white << endl;
        }
        for (unsigned int i = 1; i < myLayers.size() - 1; ++i) { // Updating hidden layers
            for (unsigned int j = 0; j < myLayers[i].size() - 1; ++j) { // - 1 because bias neurons do not have input
                cout << "myLayers[" << i << "].size()==" << myLayers[i].size() << endl;
                cout << green << "Updating neuron " << j << " inside layer " << i << white << endl;
                myLayers[i][j].feedForward(myLayers[i - 1]); // Updating the neurons output based on the neurons of the previous layer
            }
        }
        for (unsigned int i = 0; i < myLayers.back().size(); ++i) { // Updating output layer
            cout << green << "Updating output neuron " << i << ": " << white << endl;
            const Layer& prevLayer = myLayers[myLayers.size() - 2];
            myLayers.back()[i].feedForward(prevLayer); // Updating the neurons output based on the neurons of the previous layer
        }
    }
    void Net::printOutput(void) const {
        for (unsigned int i = 0; i < myLayers.back().size(); ++i) {
            cout << blue;  myLayers.back()[i].printOutput(); cout << white;
        }
    }
    void Net::backPropagate(const vector<double>& targetVals) {
        Layer& outputLayer = myLayers.back();
        assert(targetVals.size() == outputLayer.size());
        cout << "good2" << endl;
        // Starting with the output layer
        for (unsigned int i = 0; i < outputLayer.size(); ++i) { // Traversing output layer
            double output = outputLayer[i].getOutput(); cout << "good3" << endl;
            double error = output * (1 - output) * (pow(targetVals[i] - output,2)); cout << "good4" << endl;
            outputLayer[i].setError(error); // Calculating error
            double newWeight = outputLayer[i].getWeight();
                  newWeight += (error * outputLayer[i].getOutput());
            outputLayer[i].setWeight(newWeight); // Setting new weight
            cout << "good5" << endl;
        }

        for (unsigned int i = myLayers.size() - 2; i > 0; --i) { // Traversing hidden layers all the way to input layer
            Layer& currentLayer = myLayers[i];
            Layer& nextLayer = myLayers[i + 1];
            for (unsigned int j = 0; j < currentLayer.size(); ++j) { // Traversing current layer
                const double& output = currentLayer[j].getOutput();
                double subSum = 0.0; // Initializing subsum
                for (unsigned int k = 0; k < nextLayer.size(); ++k) { // Traversing next layer
                    subSum += pow(nextLayer[k].getError() * currentLayer[j].getWeight(),2); // Getting their backpropagated error and weight
                }
                double error = output*(1 - output)*(subSum);
                currentLayer[j].setError(error);
                double newWeight = currentLayer[j].getWeight();
                newWeight += error * output;
                currentLayer[j].setWeight(newWeight);
            }
        }
    }
}

int main(int argc, char* argv[]) {
    srand(time(NULL));
    vector<unsigned int> myTopology;
    myTopology.push_back(2);
    myTopology.push_back(4);
    myTopology.push_back(1);

    cout << myTopology.size() << endl << endl; // myTopology == {2, 4, 1}
    Neural::Net myNet(myTopology);
    for (unsigned int i = 0; i < 50; ++i) {
        myNet.feedForward({1, 1});
        myNet.backPropagate({0});
    }
    for (unsigned int i = 0; i < 50; ++i){
        myNet.feedForward({0, 0});
        myNet.backPropagate({1});
    }
    cout << "Feeding 0,0" << endl;
    myNet.feedForward({0, 0});
    myNet.printOutput();
    cout << "Feeding 1,1" << endl;
    myNet.feedForward({1, 1});
    myNet.printOutput();

    return 0;
}

1 Answer

邢昂然
2023-03-14

You could try training until the network's error reaches 0%, but that would likely take too long or prove impossible. A minimum error of 0.01 (1%) is commonly used instead, with output thresholds such as: >0.9 = 1 and <0.1 = 0.
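
As a minimal sketch of that thresholding (a hypothetical helper, shown in C++ to match the question's code):

// Interpret a single sigmoid output with the 0.9 / 0.1 thresholds:
// above 0.9 counts as 1, below 0.1 counts as 0, otherwise undecided.
int threshold(double output) {
    if (output > 0.9) return 1;
    if (output < 0.1) return 0;
    return -1; // undecided
}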

To calculate the error of a network with a single output neuron, add Sum(Math.Abs(idealOutput - a.Value)) to a list for each input, then average that list to get the error.

Here is my C# implementation:

int epoch = 0;
double error = 1.0;
Network = network;

// Keep training until the average error drops below minError.
while (error > minError && epoch < int.MaxValue)
{
    var errors = new List<double>();
    for (int i = 0; i < inputs.Count; i++)
    {
        Algorithm(inputs[i], ideals[i]); // One training pass (feed forward + backpropagate).

        // Sum of absolute differences between ideal and actual outputs for this input.
        int n = 0;
        errors.Add(Network.Layers[Network.Layers.Count - 1].Neurons.Sum(a => Math.Abs(ideals[i][n++] - a.Value)));
    }
    error = errors.Average(); // Average error over the whole training set.
    Console.WriteLine("Epoch: #{0} --- Error: {1}", epoch, error);
    epoch++;
}