LowRankModels.jl is a Julia package for modeling and fitting generalized low rank models (GLRMs).
GLRMs model a data array by a low rank matrix and include many well-known models in data analysis, such as principal components analysis (PCA), matrix completion, robust PCA, nonnegative matrix factorization, k-means, and many more.
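For readers who want the model stated concretely, the optimization problem behind a GLRM can be sketched as follows. The notation is an assumption introduced here for illustration, since none is defined in this text: A is the m-by-n data array, Ω is the set of observed entries, x_i is the i-th row of the factor X (m-by-k), y_j is the j-th column of the factor Y (k-by-n), L_j is the loss function for column j, and r_i and \tilde{r}_j are regularizers on the factors.

$$
\min_{X,\,Y} \;\; \sum_{(i,j)\in\Omega} L_j\!\left(x_i y_j,\; A_{ij}\right) \;+\; \sum_{i=1}^{m} r_i(x_i) \;+\; \sum_{j=1}^{n} \tilde{r}_j(y_j)
$$

With quadratic loss, no regularization, and every entry observed, this problem reduces to PCA; other choices of loss and regularizer give the other models listed above.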
For more information on GLRMs, see our paper.
There is a Python interface to this package, and a GLRM implementation in the H2O machine learning platform with interfaces in a variety of languages.
LowRankModels.jl makes it easy to mix and match loss functions and regularizers to construct a model suitable for a particular data set.
In particular, it supports:
using different loss functions for different columns of the data array, which is useful when data types are heterogeneous (e.g., real, boolean, and ordinal columns);
fitting the model to only some of the entries in the table, which is useful for data tables with many missing (unobserved) entries;
and adding offsets and scalings to the model without destroying sparsity, which is useful when the data is poorly scaled.
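The second point in the list, fitting only the observed entries, can be illustrated with a small standalone sketch. It is written in plain Python and does not use LowRankModels.jl or its API. The function names `fit_masked_low_rank` and `predict`, the use of `None` to mark a missing entry, and the step size and iteration count are all assumptions chosen for this example. It uses quadratic loss with no regularization and plain stochastic gradient descent, which is not necessarily the algorithm the package uses.

```python
import random


def fit_masked_low_rank(A, k, steps=3000, lr=0.01, seed=0):
    """Fit A ~ X * Y (X is m-by-k, Y is k-by-n) on observed entries only.

    Missing entries are marked with None and never enter the loss.
    Illustrative only: quadratic loss, no regularization, plain SGD.
    """
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    # Small random initialization of both factors.
    X = [[rng.gauss(0.0, 0.1) for _ in range(k)] for _ in range(m)]
    Y = [[rng.gauss(0.0, 0.1) for _ in range(n)] for _ in range(k)]
    # The set of observed index pairs (called Omega in GLRM notation).
    observed = [(i, j) for i in range(m) for j in range(n)
                if A[i][j] is not None]
    for _ in range(steps):
        for i, j in observed:
            err = sum(X[i][l] * Y[l][j] for l in range(k)) - A[i][j]
            for l in range(k):
                x_il, y_lj = X[i][l], Y[l][j]
                # Gradient step on 0.5 * err**2 for this single entry.
                X[i][l] -= lr * err * y_lj
                Y[l][j] -= lr * err * x_il
    return X, Y


def predict(X, Y, i, j):
    """Model value for entry (i, j), whether or not it was observed."""
    return sum(X[i][l] * Y[l][j] for l in range(len(Y)))


# A rank-1 table (outer product of [1, 2, 3, 4] and [1, 2, 3]) with two
# entries hidden: the true values are 6 at (1, 2) and 4 at (3, 0).
A = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, None],
    [3.0, 6.0, 9.0],
    [None, 8.0, 12.0],
]
X, Y = fit_masked_low_rank(A, k=1)
print(predict(X, Y, 1, 2), predict(X, Y, 3, 0))
```

Because the loss sums only over observed index pairs, the hidden cells have no influence on the fit, and the fitted low rank model can then be used to impute them.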