Support-Vector-Data-Description-SVDD

Support Vector Data Description (SVDD)

MATLAB code for anomaly detection or fault detection using SVDD

Version 2.1, 11-MAY-2021

Email: iqiukp@outlook.com


Main features

  • SVDD model for one-class or binary classification
  • Multiple kinds of kernel functions (linear, gaussian, polynomial, sigmoid, laplacian)
  • Visualization of decision boundaries for 2D or 3D data
  • Parameter Optimization using Bayesian optimization, Genetic Algorithm, and Particle Swarm Optimization
  • Weighted SVDD model

Notices

  • This version of the code is not compatible with MATLAB releases earlier than R2016b.
  • Labels must be 1 for positive samples and -1 for negative samples.
  • For detailed applications, please see the demonstrations.
  • This code is for reference only.

How to use

01. banana-shaped dataset

A class named DataSet is defined to generate and partition the 2D or 3D banana-shaped dataset.

[data, label] = DataSet.generate;
[data, label] = DataSet.generate('dim', 2);
[data, label] = DataSet.generate('dim', 2, 'num', [200, 200]);
[data, label] = DataSet.generate('dim', 3, 'num', [200, 200], 'display', 'on');

% 'single' --- The training set contains only positive samples. 
[trainData, trainLabel, testData, testLabel] = DataSet.partition(data, label, 'type', 'single');

% 'hybrid' --- The training set contains positive and negative samples. 
[trainData, trainLabel, testData, testLabel] = DataSet.partition(data, label, 'type', 'hybrid');
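For readers without the toolbox at hand, a two-class banana-shaped dataset of this kind can be sketched in NumPy. This is an illustrative stand-in, not the toolbox's `DataSet.generate` implementation; the function name, noise model, and parameters are assumptions:

```python
import numpy as np

def banana_data(n_pos=200, n_neg=200, noise=0.2, seed=0):
    """Two interleaved banana-shaped clusters in 2D with labels +1 / -1.
    Loosely mimics the classic 'banana' benchmark (illustrative only)."""
    rng = np.random.default_rng(seed)

    def banana(n, sign):
        t = rng.uniform(0.1 * np.pi, 0.9 * np.pi, n)          # arc angle
        x = sign * (np.cos(t) + rng.normal(0, noise, n))
        y = sign * (np.sin(t) + rng.normal(0, noise, n)) - sign * 0.5
        return np.column_stack([x, y])

    data = np.vstack([banana(n_pos, +1), banana(n_neg, -1)])
    label = np.concatenate([np.ones(n_pos), -np.ones(n_neg)])
    return data, label

data, label = banana_data()
print(data.shape, label.shape)  # (400, 2) (400,)
```

The two arcs curve toward each other, so a linear boundary separates them poorly, which is why this shape is a common test case for kernel methods.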

02. Kernel functions

A class named Kernel is defined to compute the kernel matrix.

%{
        type   -
        
        linear      :  k(x,y) = x'*y
        polynomial  :  k(x,y) = (γ*x'*y+c)^d
        gaussian    :  k(x,y) = exp(-γ*||x-y||^2)
        sigmoid     :  k(x,y) = tanh(γ*x'*y+c)
        laplacian   :  k(x,y) = exp(-γ*||x-y||)
    
    
        degree -  d
        offset -  c
        gamma  -  γ
%}
kernel = Kernel('type', 'gaussian', 'gamma', value);
kernel = Kernel('type', 'polynomial', 'degree', value);
kernel = Kernel('type', 'linear');
kernel = Kernel('type', 'sigmoid', 'gamma', value);
kernel = Kernel('type', 'laplacian', 'gamma', value);

For example, to compute the kernel matrix between X and Y:

X = rand(5, 2);
Y = rand(3, 2);
kernel = Kernel('type', 'gaussian', 'gamma', 2);
kernelMatrix = kernel.computeMatrix(X, Y);
>> kernelMatrix

kernelMatrix =

    0.5684    0.5607    0.4007
    0.4651    0.8383    0.5091
    0.8392    0.7116    0.9834
    0.4731    0.8816    0.8052
    0.5034    0.9807    0.7274
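The Gaussian kernel formula above, k(x,y) = exp(-γ‖x-y‖²), can be reproduced independently of the toolbox. A minimal NumPy sketch (the function name is assumed; the values will differ from the matrix above because X and Y are random):

```python
import numpy as np

def gaussian_kernel_matrix(X, Y, gamma):
    """k(x, y) = exp(-gamma * ||x - y||^2) for every row pair of X and Y."""
    # squared distances via the expansion ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y
    sq = (X ** 2).sum(1)[:, None] + (Y ** 2).sum(1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))   # clip tiny negative round-off

X = np.random.rand(5, 2)
Y = np.random.rand(3, 2)
K = gaussian_kernel_matrix(X, Y, gamma=2)
print(K.shape)  # (5, 3)
```

As a sanity check, the diagonal of `gaussian_kernel_matrix(X, X, gamma)` is all ones, since k(x,x) = exp(0) = 1 for any γ.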

03-1. Simple SVDD model for dataset containing only positive samples

[data, label] = DataSet.generate('dim', 3, 'num', [200, 200], 'display', 'on');
[trainData, trainLabel, testData, testLabel] = DataSet.partition(data, label, 'type', 'single');
kernel = Kernel('type', 'gaussian', 'gamma', 0.2);
cost = 0.3;
svddParameter = struct('cost', cost,...
                       'kernelFunc', kernel);
svdd = BaseSVDD(svddParameter);

% train SVDD model
svdd.train(trainData, trainLabel);
% test SVDD model
results = svdd.test(testData, testLabel);

In this code, svdd.train also accepts the training data alone:

% train SVDD model
svdd.train(trainData);

The training and test results:

*** SVDD model training finished ***
running time            = 0.0069 seconds
iterations              = 9 
number of samples       = 140 
number of SVs           = 23 
ratio of SVs            = 16.4286% 
accuracy                = 95.0000%


*** SVDD model test finished ***
running time            = 0.0013 seconds
number of samples       = 260 
number of alarm points  = 215 
accuracy                = 94.2308%

03-2. Simple SVDD model for dataset containing both positive and negative samples

[data, label] = DataSet.generate('dim', 3, 'num', [200, 200], 'display', 'on');
[trainData, trainLabel, testData, testLabel] = DataSet.partition(data, label, 'type', 'hybrid');
kernel = Kernel('type', 'gaussian', 'gamma', 0.05);
cost = 0.9;
svddParameter = struct('cost', cost,...
                       'kernelFunc', kernel);
svdd = BaseSVDD(svddParameter);

% train SVDD model
svdd.train(trainData, trainLabel);
% test SVDD model
results = svdd.test(testData, testLabel);

The training and test results:

*** SVDD model training finished ***
running time            = 0.0074 seconds
iterations              = 9 
number of samples       = 160 
number of SVs           = 12 
ratio of SVs            = 7.5000% 
accuracy                = 97.5000%


*** SVDD model test finished ***
running time            = 0.0013 seconds
number of samples       = 240 
number of alarm points  = 188 
accuracy                = 96.6667%

04. Visualization

A class named SvddVisualization is defined to visualize the training and test results.

Based on the trained SVDD model, the ROC curve of the training results (only supported for datasets containing both positive and negative samples) is

% Visualization 
svplot = SvddVisualization();
svplot.ROC(svdd);

The decision boundaries (only supported for 2D/3D dataset) are

% Visualization 
svplot = SvddVisualization();
svplot.boundary(svdd);

The distance between the test data and the hypersphere is

svplot.distance(svdd, results);

For the test results, the test data and decision boundary (only supported for 2D/3D dataset) are

svplot.testDataWithBoundary(svdd, results);

05. Parameter Optimization

A class named SvddOptimization is defined to optimize the parameters.

% optimization setting 
optimization.method = 'bayes'; % 'bayes', 'ga', 'pso' 
optimization.variableName = { 'cost', 'gamma'};
optimization.variableType = {'real', 'real'}; % 'integer' 'real'
optimization.lowerBound = [10^-2, 2^-6];
optimization.upperBound = [10^0, 2^6];
optimization.maxIteration = 20;
optimization.points = 10;
optimization.display = 'on';

% SVDD parameter
svddParameter = struct('cost', cost,...
                       'kernelFunc', kernel,...
                       'optimization', optimization);

The parameter optimization process is visualized in a figure (not included here).

Notice

  • The optimization method can be set to 'bayes', 'ga', or 'pso'.
  • The parameter names are limited to 'cost', 'degree', 'offset', and 'gamma'.
  • Parameter optimization for the polynomial kernel function can only be achieved using Bayesian optimization.
  • The parameter type of 'degree' should be set to 'integer'.
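The struct above only declares the search space; the optimizer then proposes (cost, gamma) candidates inside the bounds and keeps the best-scoring one. As a generic illustration of that loop (not the toolbox's implementation), here is a random-search sketch in NumPy over the same bounds, with a placeholder objective standing in for cross-validated SVDD accuracy:

```python
import numpy as np

# search space from the example above: cost in [1e-2, 1], gamma in [2^-6, 2^6],
# sampled log-uniformly since both parameters span several orders of magnitude
rng = np.random.default_rng(0)
lower = np.array([1e-2, 2.0 ** -6])
upper = np.array([1.0, 2.0 ** 6])

def objective(cost, gamma):
    # placeholder score -- in practice this would train an SVDD with these
    # parameters and return its cross-validated accuracy
    return -((np.log(cost) + 1.0) ** 2 + np.log(gamma) ** 2)

best, best_score = None, -np.inf
for _ in range(50):
    cost, gamma = np.exp(rng.uniform(np.log(lower), np.log(upper)))
    score = objective(cost, gamma)
    if score > best_score:
        best, best_score = (cost, gamma), score
print(best)
```

Bayesian optimization, GA, and PSO all refine this idea by choosing the next candidates more cleverly than uniform sampling, but the bounded search space and the scalar objective are the same.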

06. Cross Validation

In this code, two cross-validation methods are supported: 'K-Folds' and 'Holdout'. For example, 5-fold cross-validation is

svddParameter = struct('cost', cost,...
                       'kernelFunc', kernel,...
                       'KFold', 5);

For example, Holdout cross-validation with a ratio of 0.3 is

svddParameter = struct('cost', cost,...
                       'kernelFunc', kernel,...
                       'Holdout', 0.3);
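K-fold cross-validation partitions the training samples into k disjoint folds, training on k-1 folds and validating on the held-out one in turn. A minimal NumPy sketch of the index split (illustrative; the toolbox handles this internally when 'KFold' is set):

```python
import numpy as np

def kfold_indices(n, k=5, seed=0):
    """Shuffle n sample indices and split them into k disjoint folds."""
    idx = np.random.default_rng(seed).permutation(n)
    return np.array_split(idx, k)

folds = kfold_indices(100, 5)
print([len(f) for f in folds])  # [20, 20, 20, 20, 20]
```

Holdout with a ratio of 0.3 is the degenerate one-split case: 30% of the samples are held out for validation and the rest are used for training.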

07. Dimensionality reduction using PCA

For example, reducing the data to 2 dimensions can be set as

% SVDD parameter
svddParameter = struct('cost', cost,...
                       'kernelFunc', kernel,...
                       'PCA', 2);

Notice: PCA only needs to be set in svddParameter; the training and test data do not need to be processed separately.
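The point of that notice is that the projection must be fitted on the training data only and then reused for the test data. A NumPy sketch of what this amounts to (illustrative; the toolbox does the equivalent internally when 'PCA' is set):

```python
import numpy as np

def pca_fit_transform(train, test, n_components=2):
    """Fit PCA on the training data only, then apply the same mean and
    projection to the test data (avoids information leakage)."""
    mean = train.mean(axis=0)
    # principal axes from the SVD of the centered training data
    _, _, Vt = np.linalg.svd(train - mean, full_matrices=False)
    W = Vt[:n_components].T                 # (n_features, n_components)
    return (train - mean) @ W, (test - mean) @ W

train = np.random.rand(50, 6)
test = np.random.rand(20, 6)
tr2, te2 = pca_fit_transform(train, test, 2)
print(tr2.shape, te2.shape)  # (50, 2) (20, 2)
```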

08. Weighted SVDD

An observation-weighted SVDD is supported in this code. For example, the weighted SVDD can be set as

weight = rand(size(trainData, 1), 1);
% SVDD parameter
svddParameter = struct('cost', cost,...
                       'kernelFunc', kernel,...
                       'weight', weight);

Notice: the size of 'weight' should be m×1, where m is the number of training samples.
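In the usual observation-weighted SVDD formulation, the penalty term becomes C·Σᵢ wᵢξᵢ, so each dual variable is box-constrained by 0 ≤ αᵢ ≤ wᵢ·C instead of sharing a single upper bound; samples with small weights are cheaper to leave outside the hypersphere. This is the standard formulation and an assumption here; check the toolbox source for its exact definition. A one-line NumPy sketch of the per-sample bounds:

```python
import numpy as np

# per-sample upper bounds on the dual variables alpha in weighted SVDD:
# 0 <= alpha_i <= weight_i * cost (assumed standard formulation)
cost = 0.3
weight = np.random.rand(140)        # one weight per training sample, in (0, 1)
upper_bound = cost * weight         # shape (m,), replaces the single bound C
print(upper_bound.shape)            # (140,)
```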

