Getting Started with Linear Algebra and Plotting on Spark: the Breeze Linear Algebra Library and the breeze-viz Plotting Tool for Scala and Java

岳时铭

2023-12-01

// Official guide: https://github.com/scalanlp/breeze/wiki/Quickstar

// The editor may have mangled the layout; paste the code into an IDE and reformat it there. The code itself runs correctly.

import breeze.linalg._
import breeze.linalg.operators.OpSet
import breeze.plot._

def breezeTest: Unit = {

  // Vectors support element access and update. A DenseVector is a column vector.
  val x = DenseVector.zeros[Double](5) // a dense vector of length 5

  SparseVector.zeros[Double](5) // a sparse vector; no storage is allocated for zero entries

  // Like NumPy, negative indices are supported: for i < 0 the index counts from the end, i.e. x(i) == x(x.length + i).
  // x accepts indices in [-5, 5), so x(-1) == x(5 - 1) == x(4).
  x(-1) = 6.0

  println("first x : " + x + "\t size : " + x.length)
  println("-1 index : " + x(-1))

  /**
   * Unlike Scalala, all Vectors are column vectors. Row vectors are
   * represented as Transpose[Vector[T]].
   *
   * When slicing, using a Range is much faster than using an arbitrary sequence.
   * 3 to 4 is a Range.
   */

  val range: Range.Inclusive = 3 to 4
  x(3 to 4) := 0.5
  x(-4) = 2.0
  println("second x : " + x) // DenseVector(0.0, 2.0, 0.0, 0.5, 0.5)

  // Works like Scala's own slice: the second argument is `until`, not `to`, i.e. slice(start: Int, until: Int).
  val subVector: DenseVector[Double] = x.slice(2, 5)
  println("subVector: " + subVector)

  /**
   * := is the vectorized-set operator.
   *
   * The slice operator constructs a read-through and write-through view
   * of the given elements in the underlying vector.
   *
   * Use := to assign into that view. The right-hand side can be a scalar
   * or a compatibly sized Vector.
   */

  x(0 to 1) := DenseVector(.6, .5) // DenseVector(0.6, 0.5, 0.0, 0.5, 0.5)

  println("third x : " + x)

  /**
   * DenseMatrix
   *
   * Dense matrices are constructed the same way (through a factory method)
   * and likewise support element access and update.
   */

  val m = DenseMatrix.zeros[Int](5, 5)
  println("\nfirst m : \n" + m + "\n")
  println("m flattened to a dense vector: " + m.toDenseVector)

  // The columns of m can be accessed as DenseVectors, and the rows as DenseMatrices.
  // m(::, 1) selects the column at index 1; m(4, ::) selects the row at index 4.
  println(s"rows: ${m.rows},  cols: ${m.cols}")

  m(::, 1) := DenseVector(8, 9, 10, 22, 11) // column vector
  println("m(::,1): " + m(::, 1))

  m(4, ::) := DenseVector(1, 2, 3, 4, 5).t // transpose to match row shape
  println("\nsecond m : \n" + m + "\n")

  // This implicit lets := assign a DenseVector[Double] into a DenseVector[Int]; each value is truncated with toInt.
  implicit val d2I = new OpSet.InPlaceImpl2[DenseVector[Int], DenseVector[Double]] {
    def apply(v: DenseVector[Int], v2: DenseVector[Double]): Unit = {
      v := DenseVector(v2.toArray.map(_.toInt))
    }
  }

  val mCol0 = DenseVector(0.0, 2.0, 0.0, 4.5, 1.5)
  m(::, 0) := mCol0
  println("\nthird m : \n" + m + "\n")

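  // Assigning a matrix of a different shape fails:
  // m := DenseMatrix.zeros[Int](3, 3)
  // java.lang.IllegalArgumentException: requirement failed: Matrices must have same number of row

  // A matrix of the same shape can be assigned; this resets m to all zeros.
  m := DenseMatrix.zeros[Int](5, 5)
  println("\nfourth m : \n" + m + "\n")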

  /**
   * Sub-matrices can be sliced and updated, and literal matrices can be
   * specified using a simple tuple-based syntax.
   *
   * Unlike Scalala, only range slices are supported (for example x(0 to 1),
   * where x is a column vector), and only the columns (or rows for a
   * transposed matrix) can have a Range step size different from 1.
   *
   * In Breeze both rows and columns can be sliced with ranges:
   * m(0 to 1, 0 to 1)
   */

  m(0 to 1, 0 to 1) := DenseMatrix((3, 1), (-1, -2))

  println("\nfifth m : \n" + m + "\n")

  /**
   * Expected output (m was reset to all zeros before the top-left block was assigned):
   *
   *  3   1   0   0   0
   * -1  -2   0   0   0
   *  0   0   0   0   0
   *  0   0   0   0   0
   *  0   0   0   0   0
   */

  /**
   * The Linear Algebra Cheat-Sheet lists these operations:
   * https://github.com/scalanlp/breeze/wiki/Linear-Algebra-Cheat-Sheet
   *
   * Like Matlab and NumPy, Breeze supports a family of elementwise and in-place operations:
   *
   * Operation                            Breeze               Matlab         NumPy
   * Elementwise addition                 a + b                a + b          a + b
   * Elementwise multiplication           a :* b               a .* b         a * b
   * Elementwise comparison               a :< b               a < b [1]      a < b
   * Inplace addition                     a :+= 1.0            a += 1         a += 1
   * Inplace elementwise multiplication   a :*= 2.0            a *= 2         a *= 2
   * Vector dot product                   a dot b, a.t * b†    dot(a,b)       dot(a,b)
   * Elementwise sum                      sum(a)               sum(sum(a))    a.sum()
   * Elementwise max                      a.max                max(a)         a.max()
   * Elementwise argmax                   argmax(a)            argmax(a)      a.argmax()
   * Ceiling (rounds up)                  ceil(a)              ceil(a)        ceil(a)
   * Floor                                floor(a)             floor(a)       floor(a)
   *
   * [1] Matlab gives a matrix of 1/0 instead of true/false.
   */

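  /**
   * Broadcasting
   *
   * Sometimes an operation should be applied to every row or every column
   * of a matrix as a unit. For example, you might compute the mean of each
   * row (useful for mean-centering in PCA) or add a vector to every column.
   *
   * Adapting a matrix so that an operation applies column-wise or row-wise
   * is called broadcasting.
   *
   * Languages such as R and NumPy broadcast implicitly, which means they will
   * not stop you if you accidentally add a matrix and a vector.
   *
   * In Breeze you signal your intent with the broadcasting operator *,
   * which is meant to evoke "foreach" visually.
   */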

  import breeze.stats.mean

  val dm = DenseMatrix((1.0, 2.0, 3.0), (4.0, 5.0, 6.0)) // a 2x3 matrix, i.e. three 2-dimensional column vectors
  val res = dm(::, *) + DenseVector(3.0, 4.0) // adds a 2-dimensional column vector to every column
  println("\nfirst res : \n" + res + "\n")
  println(s"rows : ${dm.rows}, cols: ${dm.cols}")

  res(::, *) := DenseVector(3.0, 4.0) // overwrites every column of res
  println("\nsecond res : \n" + res + "\n")

  // Mean of each row of dm
  val dmean: DenseVector[Double] = mean(dm(*, ::))
  println("\nfirst dm : \n" + dm + "\n")
  println("mean of each row of dm : " + dmean)
  println("mean of each column of dm : " + mean(dm(::, *)))

} 
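
The broadcasting calls at the end of breezeTest can be checked by hand. The sketch below models, in plain Scala without Breeze, what dm(::, *) + v, mean(dm(*, ::)) and mean(dm(::, *)) compute for the same 2x3 matrix. BroadcastSketch and its helper names are hypothetical and exist only for illustration; this is a model of the results, not of how Breeze implements broadcasting.

```scala
object BroadcastSketch {
  // Row-major 2x3 matrix holding the same values as dm in breezeTest.
  val dm: Vector[Vector[Double]] = Vector(
    Vector(1.0, 2.0, 3.0),
    Vector(4.0, 5.0, 6.0)
  )

  // Models dm(::, *) + v: v is added to every column,
  // so v(r) ends up added to every entry of row r.
  def addToEachColumn(m: Vector[Vector[Double]], v: Vector[Double]): Vector[Vector[Double]] = {
    require(m.length == v.length, "vector length must equal the number of rows")
    m.zip(v).map { case (row, add) => row.map(_ + add) }
  }

  // Models mean(dm(*, ::)): one mean per row.
  def rowMeans(m: Vector[Vector[Double]]): Vector[Double] =
    m.map(row => row.sum / row.length)

  // Models mean(dm(::, *)): one mean per column.
  def colMeans(m: Vector[Vector[Double]]): Vector[Double] =
    m.transpose.map(col => col.sum / col.length)

  def main(args: Array[String]): Unit = {
    println(addToEachColumn(dm, Vector(3.0, 4.0))) // Vector(Vector(4.0, 5.0, 6.0), Vector(8.0, 9.0, 10.0))
    println(rowMeans(dm))                          // Vector(2.0, 5.0)
    println(colMeans(dm))                          // Vector(2.5, 3.5, 4.5)
  }
}
```

The row means have one entry per row (2 values) and the column means one entry per column (3 values), which matches the shapes of the two mean results printed by breezeTest.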


def figure: Unit ={
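  /**
   * Breeze-Viz
   * The API changes substantially between versions, and it is less powerful than matplotlib.
   *
   * Breeze continues much of Scalala's plotting functionality, although the API differs in places.
   * The scaladoc presents it as traits in the breeze.plot package.
   * First, plot a few curves and save them. All of the actual plotting is done by the very solid JFreeChart library.
   */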
  val a = new DenseVector[Int]((1 to 3).toArray)
  val b = new DenseMatrix[Int](3, 3, (1 to 9).toArray)

  val f = Figure()
  val p = f.subplot(0)
  val x: DenseVector[Double] = linspace(0.0, 1.0) // sample points over the plotted interval [0, 1]
  p += plot(x, x :^ 2.0)
  p += plot(x, x :^ 3.0, '.')
  p.xlabel = "x axis"
  p.ylabel = "y axis"
  f.saveas("/opt/scala/breeze-viz-ana/lines.png")


  /**
   * subplot adds another subplot.
   * Draw a histogram: 100,000 normally distributed random numbers placed into 100 buckets.
   */
  val p2 = f.subplot(2, 1, 1)
  val g = breeze.stats.distributions.Gaussian(0, 1) // Gaussian (normal) distribution
  p2 += hist(g.sample(100000), 100)
  p2.title = "A normal distribution"
  p2.xlabel = "x-axis"
  p2.ylabel = "y-axis[count]"
  f.saveas("/opt/scala/breeze-viz-ana/subplots.png")

  val f2 = Figure()
  f2.subplot(0) += image(DenseMatrix.rand(200, 200))
  f2.saveas("/opt/scala/breeze-viz-ana/image.png")
}
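
The histogram in figure groups the samples into buckets and plots one count per bucket. As a rough model of that bucketing step, the sketch below counts values into equal-width bins over [min, max] in plain Scala. Equal-width binning is an assumption about the general technique rather than a description of breeze.plot's internals, and HistogramSketch.binCounts is a hypothetical helper, not part of Breeze.

```scala
object HistogramSketch {
  // Counts `values` into `bins` equal-width buckets spanning [min, max].
  def binCounts(values: Seq[Double], bins: Int): Vector[Int] = {
    require(bins > 0, "need at least one bucket")
    require(values.nonEmpty, "need at least one value")
    val lo = values.min
    val hi = values.max
    val width = (hi - lo) / bins
    val counts = Array.fill(bins)(0)
    for (v <- values) {
      // The maximum value would map to index `bins`, so clamp it into the last bucket.
      // If every value is identical (width == 0), everything goes into bucket 0.
      val idx = if (width == 0.0) 0 else math.min(((v - lo) / width).toInt, bins - 1)
      counts(idx) += 1
    }
    counts.toVector
  }

  def main(args: Array[String]): Unit = {
    // Two buckets over [0, 1]: [0, 0.5) and [0.5, 1].
    println(binCounts(Seq(0.0, 0.1, 0.4, 0.6, 1.0), 2)) // Vector(3, 2)
  }
}
```

Whatever the number of buckets, the counts always sum to the number of samples; more buckets only spread the same samples over narrower intervals.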
