clj-ml

授权协议 Readme
开发语言 Python
所属分类 神经网络/人工智能、 机器学习/深度学习
软件类型 开源软件
地区 不详
投 递 者 喻高寒
操作系统 跨平台
开源组织
适用人群 未知
 软件概览

clj-ml

A machine learning library for Clojure built on top of Weka and friends.

This library (specifically, some dependencies) requires Java 1.7+.

Installation

Installing from Clojars

[cc.artifice/clj-ml "0.8.5"]

Installing from Maven

(add Clojars repository)

<dependency>
  <groupId>cc.artifice</groupId>
  <artifactId>clj-ml</artifactId>
  <version>0.8.5</version>
</dependency>

Supported algorithms

  • Filters

    • Discretization (supervised, unsupervised, PKI)
    • Nominal to binary (supervised, unsupervised)
    • Numeric to nominal
    • String to word vector
    • Attribute manipulation (reorder, add, remove range, remove percentage, etc.)
    • Resample (supervised, unsupervised)
    • Replace missing values with mean (numeric attributes) or mode (nominal attributes)
  • Classifiers

    • k-Nearest neighbor
    • Decision trees: C4.5/J4.8, Boosted stump, Random forest, Rotation forest, M5P
    • Naive Bayes
    • Multilayer perceptrons
    • Support vector machines (grid-based training), SMO, Spegasos
    • Raced Incremental Logit Boost
  • Regression

    • Linear
    • Logistic
    • Pace
    • Additive gradient boosting
  • Clusterers

    • k-Means
    • Cobweb
    • Expectation-maximization

Usage

API documenation can be found here.

I/O of data

user> (use 'clj-ml.io)
nil

user> (def ds (load-instances :arff "file:///home/josh/git/clj-ml/iris.arff"))
#'user/ds
user> ds
#<Instances @relation iris

@attribute sepallength numeric
@attribute sepalwidth numeric
@attribute petallength numeric
@attribute petalwidth numeric
@attribute class {Iris-setosa,Iris-versicolor,Iris-virginica}

@data
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
5,3.6,1.4,0.2,Iris-setosa
5.4,3.9,1.7,0.4,Iris-setosa
4.6,3.4,1.4,0.3,Iris-setosa
...

user> (def ds (load-instances :arff "http://repository.seasr.org/Datasets/UCI/arff/iris.arff"))
#'user/ds

user> (save-instances :csv "iris.csv" ds)
nil
user> (println (slurp "iris.csv"))
sepallength,sepalwidth,petallength,petalwidth,class
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
5,3.6,1.4,0.2,Iris-setosa
5.4,3.9,1.7,0.4,Iris-setosa
4.6,3.4,1.4,0.3,Iris-setosa
5,3.4,1.5,0.2,Iris-setosa
4.4,2.9,1.4,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
5.4,3.7,1.5,0.2,Iris-setosa
...

user> (def ds (load-instances :csv "file:///home/josh/git/clj-ml/iris.csv"))
#'user/ds
user> ds
#<Instances @relation stream

@attribute sepallength numeric
@attribute sepalwidth numeric
@attribute petallength numeric
@attribute petalwidth numeric
@attribute class {Iris-setosa,Iris-versicolor,Iris-virginica}

@data
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
5,3.6,1.4,0.2,Iris-setosa
5.4,3.9,1.7,0.4,Iris-setosa
4.6,3.4,1.4,0.3,Iris-setosa
5,3.4,1.5,0.2,Iris-setosa

Working with datasets

user> (use 'clj-ml.data)
nil

user> (def ds (make-dataset "my-name" [:length :width {:style nil} {:kind [:good :bad]}]
                            [[12 24 "longish" :good]
                             [8 5 "shortish" :bad]]))
#'user/ds
user> ds
#<ClojureInstances @relation my-name

@attribute length numeric
@attribute width numeric
@attribute style string
@attribute kind {good,bad}

@data
12,24,longish,good
8,5,shortish,bad>

user> (dataset-seq ds)
(#<Instance 12,24,longish,good> #<Instance 8,5,shortish,bad>)

user> (map instance-to-map (dataset-seq ds))
({:kind :good, :style "longish", :width 24.0, :length 12.0}
{:kind :bad, :style "shortish", :width 5.0, :length 8.0})

user> (map instance-to-vector (dataset-seq ds))
([12.0 24.0 "longish" :good] [8.0 5.0 "shortish" :bad])

Filtering datasets

user> (use 'clj-ml.filters 'clj-ml.io)
nil

user> (def ds (load-instances :csv "file:///home/josh/git/clj-ml/iris.csv"))
#'user/ds

user> (def discretize (make-filter :unsupervised-discretize
                                   {:dataset-format ds
                                    :attributes [:sepallength :petallength]}))
#'user/discretize

user> (def filtered-ds (filter-apply discretize ds))
#'user/filtered-ds

user> (map instance-to-map (dataset-seq filtered-ds))
({:class :Iris-setosa, :petalwidth 0.2, :petallength :'(-inf-1.59]',
 :sepalwidth 3.5, :sepallength :'(5.02-5.38]'}
{:class :Iris-setosa, :petalwidth 0.2, :petallength :'(-inf-1.59]',
 :sepalwidth 3.0, :sepallength :'(4.66-5.02]'}
{:class :Iris-setosa, :petalwidth 0.2, :petallength :'(-inf-1.59]',
 :sepalwidth 3.2, :sepallength :'(4.66-5.02]'}
{:class :Iris-setosa, :petalwidth 0.2, :petallength :'(-inf-1.59]',
 :sepalwidth 3.1, :sepallength :'(-inf-4.66]'}
{:class :Iris-setosa, :petalwidth 0.2, :petallength :'(-inf-1.59]',
 :sepalwidth 3.6, :sepallength :'(4.66-5.02]'}
...) ;; the petallength and sepallength attributes are now nominal

Equivalently,

user> (def filtered-ds (->> "file:///home/josh/git/clj-ml/iris.csv"
                            (load-instances :csv)
                            (make-apply-filter :unsupervised-discretize
                                               {:attributes [:sepallength :petallength]})))

Using classifiers

user> (use 'clj-ml.classifiers 'clj-ml.data 'clj-ml.utils)
nil

user> (def ds (-> (load-instances :arff "file:///home/josh/git/clj-ml/iris.arff")
                  (dataset-set-class :class)))
#'user/ds

user> (def classifier (-> (make-classifier :decision-tree :c45)
                          (classifier-train ds)))
#'user/classifier

user> (def instance (-> (first (dataset-seq ds))
                        (instance-set-class-missing)))

user> (classifier-classify classifier instance)
:Iris-setosa

Evaluation:

user> (def evaluation (classifier-evaluate classifier :cross-validation ds 10))
#'user/evaluation

user> (clojure.pprint/pprint (dissoc evaluation :summary :confusion-matrix))
{:incorrect 7.0,
 :root-relative-squared-error 36.693518966642074,
 :sf-entropy-gain -4076.3670930399717,
 :recall
 {:Iris-setosa 0.9795918367346939,
  :Iris-versicolor 0.94,
  :Iris-virginica 0.94},
 :kb-information 217.7935138195151,
 :kb-relative-information 13741.240800360849,
 :false-positive-rate
 {:Iris-setosa 0.0,
  :Iris-versicolor 0.04040404040404041,
  :Iris-virginica 0.030303030303030304},
 :percentage-correct 95.30201342281879,
 :roc-area
 {:Iris-setosa 0.984845423317842,
  :Iris-versicolor 0.9456,
  :Iris-virginica 0.9496},
 :kb-mean-information 1.4617014350303028,
 :percentage-unclassified 0.0,
 :percentage-incorrect 4.697986577181208,
 :root-mean-squared-error 0.17297908222448935,
 :unclassified 0.0,
 :correlation-coefficient
 {:nan "Can't compute correlation coefficient: class is nominal!"},
 :correct 142.0,
 :sf-mean-entropy-gain -27.358168409664238,
 :mean-absolute-error 0.04083212821368881,
 :relative-absolute-error 9.187228848079984,
 :error-rate 0.04697986577181208,
 :kappa 0.9295222650179066,
 :f-measure
 {:Iris-setosa 0.9896907216494846,
  :Iris-versicolor 0.9306930693069307,
  :Iris-virginica 0.94},
 :false-negative-rate
 {:Iris-setosa 0.02040816326530612,
  :Iris-versicolor 0.06,
  :Iris-virginica 0.06},
 :evaluation-object #<Evaluation weka.classifiers.Evaluation@6a7272ca>,
 :average-cost 0.0,
 :precision
 {:Iris-setosa 1.0,
  :Iris-versicolor 0.9215686274509803,
  :Iris-virginica 0.94}}

user> (println (:summary evaluation))

Correctly Classified Instances         142               95.302  %
Incorrectly Classified Instances         7                4.698  %
Kappa statistic                          0.9295
Mean absolute error                      0.0408
Root mean squared error                  0.173 
Relative absolute error                  9.1872 %
Root relative squared error             36.6935 %
Total Number of Instances              149     
Ignored Class Unknown Instances                  1     

nil
user> (println (:confusion-matrix evaluation))
=== Confusion Matrix ===

  a  b  c   <-- classified as
 48  1  0 |  a = Iris-setosa
  0 47  3 |  b = Iris-versicolor
  0  3 47 |  c = Iris-virginica

nil

Saving and restoring (trained) classifiers:

user> (serialize-to-file classifier "my-classifier.bin")
"my-classifier.bin"

user> (def classifier2 (deserialize-from-file "my-classifier.bin"))
#'user/classifier2

user> (classifier-classify classifier2 instance)
:Iris-setosa

Text document handling:

user> (def docs [{:id 10
                  :title "Document title 1"
                  :fulltext "This is the fulltext..."
                  :has-class? false}
                 {:id 11
                  :title "Another document title"
                  :fulltext "Some more \"fulltext\"; rabbit artificial machine bananas"
                  :has-class? true}])
#'user/docs

user> (docs-to-dataset docs "bananas-model" "my-models" :stemmer true :lowercase false)
#<Instances @relation 'docs-weka.filters.unsupervised.attribute.StringToWordVector...'

@attribute class {no,yes}
@attribute title-1 numeric
@attribute title-Another numeric
@attribute title-Document numeric
@attribute title-document numeric
@attribute title-titl numeric
@attribute fulltext-Some numeric
@attribute fulltext-This numeric
@attribute fulltext-artifici numeric
@attribute fulltext-banana numeric
@attribute fulltext-fulltext numeric
@attribute fulltext-is numeric
@attribute fulltext-machin numeric
@attribute fulltext-more numeric
@attribute fulltext-rabbit numeric
@attribute fulltext-the numeric

@data
{0 yes,1 0.480453,3 0.480453,7 0.480453,11 0.480453,15 0.480453}
{2 0.480453,4 0.480453,6 0.480453,8 0.480453,9 0.480453,12 0.480453,13 0.480453,14 0.480453}>
user>

Words appearing in the dataset will only be those appearing in thedocuments (or a subset; by default, the most common 1000 words). Thispresents a problem when new documents are loaded and used in aclassifier trained on other documents. The classifier will not knowhow to handle word attributes that were not present in the trainingset.

The docs-to-dataset function provides the ability to save thetraining documents dataset and "filter" the testing documents throughthis dataset to ensure the same word attributes are extracted for bothsets. The following example shows that the words "foo, bar, baz, quux"are ignored in the new (testing) documents, and all the originalattributes in the training dataset are retained.

user> (docs-to-dataset docs "Topic" "Sports" 1 "/tmp"
                       :stemmer true :lowercase false :training true)
#<Instances @relation 'docs-weka.filters.unsupervised.attribute.StringToWordVector...'

@attribute class {no,yes}
@attribute title-1 numeric
@attribute title-Another numeric
@attribute title-Document numeric
@attribute title-document numeric
@attribute title-titl numeric
@attribute fulltext-Some numeric
@attribute fulltext-This numeric
@attribute fulltext-artifici numeric
@attribute fulltext-banana numeric
@attribute fulltext-fulltext numeric
@attribute fulltext-is numeric
@attribute fulltext-machin numeric
@attribute fulltext-more numeric
@attribute fulltext-rabbit numeric
@attribute fulltext-the numeric

@data
{2 0.480453,4 0.480453,6 0.480453,8 0.480453,9 0.480453,12 0.480453,13 0.480453,14 0.480453}
{0 yes,1 0.480453,3 0.480453,7 0.480453,11 0.480453,15 0.480453}>

user> (def docs2 [{:title "Document title 1 foo bar"
                   :fulltext "baz rabbit quux"
                   :terms {"Topic" ["Sports"]}}])
#'user/docs2

user> (docs-to-dataset docs2 "Topic" "Sports" 1 "/tmp"
                       :stemmer true :lowercase false :testing true)
#<Instances @relation 'docs-weka.filters.unsupervised.attribute.StringToWordVector...'

@attribute class {no,yes}
@attribute title-1 numeric
@attribute title-Another numeric
@attribute title-Document numeric
@attribute title-document numeric
@attribute title-titl numeric
@attribute fulltext-Some numeric
@attribute fulltext-This numeric
@attribute fulltext-artifici numeric
@attribute fulltext-banana numeric
@attribute fulltext-fulltext numeric
@attribute fulltext-is numeric
@attribute fulltext-machin numeric
@attribute fulltext-more numeric
@attribute fulltext-rabbit numeric
@attribute fulltext-the numeric

@data
{0 yes,1 0.480453,3 0.480453,14 0.480453}>
user>

Using clusterers

user> (use 'clj-ml.clusterers)
nil

user> (def ds (-> (load-instances :arff "file:///home/josh/git/clj-ml/iris.arff")
                  (dataset-remove-attribute-at 4)))
#'user/ds
user> ds
#<Instances @relation iris

@attribute sepallength numeric
@attribute sepalwidth numeric
@attribute petallength numeric
@attribute petalwidth numeric

@data
5.1,3.5,1.4,0.2
4.9,3,1.4,0.2
4.7,3.2,1.3,0.2
4.6,3.1,1.5,0.2
5,3.6,1.4,0.2
5.4,3.9,1.7,0.4
4.6,3.4,1.4,0.3
...

user> (def clusterer (make-clusterer :k-means {:number-clusters 3}))
#'user/clusterer

user> (clusterer-build clusterer ds)
nil

user> clusterer
#<SimpleKMeans 
kMeans
======

Number of iterations: 6
Within cluster sum of squared errors: 6.998114004826762
Missing values globally replaced with mean/mode

Cluster centroids:
                           Cluster#
Attribute      Full Data          0          1          2
                   (150)       (61)       (50)       (39)
=========================================================
sepallength       5.8433     5.8885      5.006     6.8462
sepalwidth         3.054     2.7377      3.418     3.0821
petallength       3.7587     4.3967      1.464     5.7026
petalwidth        1.1987      1.418      0.244     2.0795


>

user> (def clustered-ds (clusterer-cluster clusterer ds))
#'user/clustered-ds

user> clustered-ds
#<ClojureInstances @relation 'clustered iris'

@attribute sepallength numeric
@attribute sepalwidth numeric
@attribute petallength numeric
@attribute petalwidth numeric
@attribute class {0,1,2}

@data
5.1,3.5,1.4,0.2,1
4.9,3,1.4,0.2,1
4.7,3.2,1.3,0.2,1
4.6,3.1,1.5,0.2,1
5,3.6,1.4,0.2,1
5.4,3.9,1.7,0.4,1
4.6,3.4,1.4,0.3,1
5,3.4,1.5,0.2,1
4.4,2.9,1.4,0.2,1
...

Example: Home price prediction

http://www.ibm.com/developerworks/library/os-weka1/

user> (def homes (make-dataset "homes" [:house-size :lot-size :bedrooms
                                        :granite :bathroom :sellingPrice]
                               [[3529, 9191, 6, 0, 0, 205000]
                                [3247, 10061, 5, 1, 1, 224900]
                                [4032, 10150, 5, 0, 1, 197900]
                                [2397, 14156, 4, 1, 0,189900]
                                [2200, 9600, 4, 0, 1, 195000] 
                                [3536, 19994, 6, 1, 1,325000] 
                                [2983, 9365, 5, 0, 1, 230000]]))
#'user/homes

user> (def homes (dataset-set-class homes :sellingPrice))
#'user/homes

user> homes
#<ClojureInstances @relation homes

@attribute house-size numeric
@attribute lot-size numeric
@attribute bedrooms numeric
@attribute granite numeric
@attribute bathroom numeric
@attribute sellingPrice numeric

@data
3529,9191,6,0,0,205000
3247,10061,5,1,1,224900
4032,10150,5,0,1,197900
2397,14156,4,1,0,189900
2200,9600,4,0,1,195000
3536,19994,6,1,1,325000
2983,9365,5,0,1,230000>

user> (def reg (classifier-train (make-classifier :regression :linear) homes))
#'user/reg

user> reg
#<LinearRegression 
Linear Regression Model

sellingPrice =

    -26.6882 * house-size +
      7.0551 * lot-size +
  43166.0767 * bedrooms +
  42292.0901 * bathroom +
 -21661.1208>
user> 

user> (classifier-predict-numeric reg (make-instance homes [3198, 9669, 5, 1, 1, nil]))
219328.35717359098

Example: Predicting survival on the Titanic

https://www.kaggle.com/c/titanic-gettingStarted

First globally replace all double quoted strings ""foo"" withbackslash quoted strings: \"foo\". Weka does not handle the former.

user> (require '[clj-ml.io :refer [load-instances]]
               '[clj-ml.data :refer [dataset-set-class dataset-class-index dataset-class-name]]
               '[clj-ml.filters :refer [make-apply-filter]]
               '[clj-ml.classifiers :refer [classifier-evaluate make-classifier]])
nil
user> (def titanicds (load-instances :csv "file:///home/josh/git/clj-ml/titanic-train.csv"))
user> titanicds
#<Instances @relation stream

@attribute PassengerId numeric
@attribute Survived numeric
@attribute Pclass numeric
@attribute Name {'Braund, Mr. Owen Harris','Cumings, Mrs. John Bradley (Florence Briggs Thayer)', ...}
@attribute Sex {male,female}
@attribute Age numeric
@attribute SibSp numeric
@attribute Parch numeric
@attribute Ticket {'A/5 21171','PC 17599','STON/O2. 3101282',113803.0, ...}
@attribute Fare numeric
@attribute Cabin {C85,C123,E46,G6,C103,D56,A6,'C23 C25 C27', ...}
@attribute Embarked {S,C,Q}

@data
1,0,3,'Braund, Mr. Owen Harris',male,22,1,0,'A/5 21171',7.25,?,S
2,1,1,'Cumings, Mrs. John Bradley (Florence Briggs Thayer)',female,38,1,0,'PC 17599',71.2833,C85,C
3,1,3,'Heikkinen, Miss. Laina',female,26,0,0,'STON/O2. 3101282',7.925,?,S
4,1,1,'Futrelle, Mrs. Jacques Heath (Lily May Peel)',female,35,1,0,113803.0,53.1,C123,S
5,0,3,'Allen, Mr. William Henry',male,35,0,0,373450.0,8.05,?,S
6,0,3,'Moran, Mr. James',male,?,0,0,330877.0,8.4583,?,Q
7,0,1,'McCarthy, Mr. Timothy J',male,54,0,0,17463.0,51.8625,E46,S
8,0,3,'Palsson, Master. Gosta Leonard',male,2,3,1,349909.0,21.075,?,S
9,1,3,'Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)',female,27,0,2,347742.0,11.1333,?,S
10,1,2,'Nasser, Mrs. Nicholas (Adele Achem)',female,14,1,0,237736.0,30.0708,?,C
11,1,3,'Sandstrom, Miss. Marguerite Rut',female,4,1,1,'PP 9549',16.7,G6,S
...
>

#'user/titanicds
user> (def titanicds (dataset-set-class titanicds :Survived))
#'user/titanicds
user> (dataset-class-index titanicds)
1

user> (def titanicds (make-apply-filter :numeric-to-nominal
                                        {:attributes [:Survived]}
                                        titanicds))
#'user/titanicds
user> titanicds
#<Instances @relation stream-weka.filters.unsupervised.attribute.NumericToNominal-R2

@attribute PassengerId numeric
@attribute Survived {0,1}
@attribute Pclass numeric
...
>

user> (def titanicds (make-apply-filter :replace-missing-values {} titanicds))

user> (def titanicds (make-apply-filter :remove-attributes
                                        {:attributes [:PassengerId :Name :Ticket :Cabin]}
                                        titanicds))
#'user/titanicds
user> titanicds
#<Instances @relation 'stream-weka.filters.unsupervised.attribute.NumericToNominal...'

@attribute Survived {0,1}
@attribute Pclass numeric
@attribute Sex {male,female}
@attribute Age numeric
@attribute SibSp numeric
@attribute Parch numeric
@attribute Fare numeric
@attribute Embarked {S,C,Q}

@data
0,3,male,22,1,0,7.25,S
1,1,female,38,1,0,71.2833,C
1,3,female,26,0,0,7.925,S
1,1,female,35,1,0,53.1,S
0,3,male,35,0,0,8.05,S
0,3,male,?,0,0,8.4583,Q
...
>

user> (dataset-class-index titanicds)
0
user> (dataset-class-name titanicds)
:Survived

user> (def evaluation (classifier-evaluate (make-classifier :decision-tree :random-forest)
                                           :cross-validation titanicds 10))
#'user/evaluation
user> (println (:summary evaluation))

Correctly Classified Instances         727               81.5937 %
Incorrectly Classified Instances       164               18.4063 %
Kappa statistic                          0.6039
Mean absolute error                      0.2409
Root mean squared error                  0.3819
Relative absolute error                 50.9302 %
Root relative squared error             78.532  %
Total Number of Instances              891     

nil
user> (println (:confusion-matrix evaluation))
=== Confusion Matrix ===

   a   b   <-- classified as
 483  66 |   a = 0
  98 244 |   b = 1

nil

Ok, looks good, let's try training on the full training data andtesting on the testing data.

user> (require '[clj-ml.data :refer [dataset-as-maps dataset-seq]]
               '[clj-ml.classifiers :refer [classifier-train classifier-classify]])
user> (def titanic-testds (load-instances :csv "file:///home/josh/git/clj-ml/titanic-test.csv"))
nil
user> (def titanic-test-passids (map (comp int :PassengerId)
                                     (dataset-as-maps titanic-testds)))
#'user/titanic-test-passids
user> titanic-test-passids
(892 893 894 895 896 897 898 899 900 ...)

user> (def titanic-testds (->> titanic-testds
                               (make-apply-filter :remove-attributes
                                                  {:attributes [:PassengerId :Name :Ticket :Cabin]})
                               (make-apply-filter :replace-missing-values {})
                               (make-apply-filter :add-attribute
                                                  {:type :nominal :name :Survived
                                                   :column 0 :labels ["0" "1"]})))
#'user/titanic-testds

user> (def titanic-testds (dataset-set-class titanic-testds :Survived))
#'user/titanic-testds

user> titanic-testds
#<Instances @relation 'stream-weka.filters.unsupervised.attribute.Remove...'

@attribute Survived {0,1}
@attribute Pclass numeric
@attribute Sex {male,female}
@attribute Age numeric
@attribute SibSp numeric
@attribute Parch numeric
@attribute Fare numeric
@attribute Embarked {Q,S,C}

@data
?,3,male,34.5,0,0,7.8292,Q
?,3,female,47,1,0,7,S
?,2,male,62,0,0,9.6875,Q
?,3,matitanle,27,0,0,8.6625,S
?,3,female,22,1,1,12.2875,S
?,3,male,14,0,0,9.225,S
?,3,female,30,0,0,7.6292,Q
...
>

user> (def classifier (classifier-train (make-classifier :decision-tree :random-forest) titanicds))
#'user/classifier

user> (def preds (for [instance (dataset-seq titanic-testds)]
                      (name (classifier-classify classifier instance))))
#'user/preds

user> preds
("0" "1" "0" "0" "0" "0" "1" "0" "0" "0" ...)

#'user/preds

user> (spit "titanic-predictions.csv"
            (clojure.string/join "\n" (cons "Survived,PassengerId"
                                            (map (fn [c1 c2] (format "%s,%d" c1 c2))
                                                 preds titanic-test-passids))))
nil

user> (println (slurp "titanic-predictions.csv"))
Survived,PassengerId
0,892
1,893
0,894
0,895
0,896
0,897
1,898
0,899
0,900
0,901
0,902
...

How to add a Weka classifier

  • In classifiers.clj:
    • Add the appropriate import to the top of the file.
    • Create another implementation of make-classifier-options (using defmethod, like the others). At this point, you must decide the pair of keywords that identify your algorithm, such as :decision-tree :c45. List all the Weka options that the classifier accepts. Use check-options for options that are either present or absent, and check-option-values for options that require a value in addition to the option.
    • Add documentation to the (defmulti make-classifier ...) docstring.
    • Create another implementation of make-classifier (using defmethod, like the others).
  • Ideally, add some test cases in classifers_test.clj.

Thanks YourKit!

YourKit is kindly supporting open source projects with itsfull-featured Java Profiler. YourKit, LLC is the creator ofinnovative and intelligent tools for profiling Java and .NETapplications. Take a look at YourKit's leading software products: YourKit JavaProfiler and YourKit .NETProfiler.

License

MIT License

Authors

  • 1.definition of ML the design and development of algorithms that allow computers to evolve behaviors based on empirical data 2.application computer vision(数码相机中的人脸识别)、speech recognition(科大讯飞、百度语音输入法) 

  • http://m.sciencemag.org/site/feature/data/compsci/machine_learning.xhtml

  • 朴素贝叶斯 参考[1] 事件A和B同时发生的概率为在A发生的情况下发生B或者在B发生的情况下发生A P(A∩B)=P(A)∗P(B|A)=P(B)∗P(A|B) 所以有: P(A|B)=P(B|A)∗P(A)P(B) 对于给出的待分类项,求解在此项出现的条件下各个目标类别出现的概率,哪个最大,就认为此待分类项属于哪个类别 工作原理 1、假设现在有样本x=(a1,a2,a3,…an)这个待分类项(并

  • from: https://sgfin.github.io/learning-resources/ This is a not-particularly-systematic attempt to curate a handful of my favorite resources for learning statistics and machine learning. This isn’t me

 相关资料
  • graphql-clj A Clojure library designed to provide GraphQL implementation. Demo Demo Project with GraphiQL What's new in version 0.2 Simplified APIs Rewrite schema and query validator for simplicity an

  • Current Status (January 16, 2020) After developing and using this Clojure wrapper for AWS CDK for the pastsix months, we've decided to use TypeScript and AWS' library directly. We arediscontinuing mai

  • clj-http 是一个新的 Clojure HTTP 客户端开发包,主要特点是设计简单、可靠、可扩展和可测试。 示例代码: (require '[clj-http.client :as client])(client/get "http://rest-test.heroku.com/")=> {:status 200 :headers {"date" "Sun, 01 Aug 2010 0

  • clj-docker-client An idiomatic, data-driven, REPL friendly Clojure Docker client inspired from Cognitect's AWS client. NOTICE: Consider using contajners too. Supports more container engines like Podma

  • 我最近开始学习Clojure,想知道是否有一种执行简单. clj文件的标准方法。 我已经安装了Leiningen,并创建了我的第一个项目称为我的东西使用lein新应用程序我的东西。 管理我的东西。core,我从lein run开始: 接下来,我尝试了lein repl,接着是: 我还使用lein repl进行了一些基本评估: 我试着在我的东西里定义这个函数。核心: 我得到以下错误:clojure.

  • 我遵循这里的说明,并试图通过调用lein jar从clj源生成一个Java类。 但是,当我稍微编辑代码以添加自己的测试函数时: .. 然后用lein jar生成一个Java类文件(我在文章末尾追加了project.clj),我发现生成的jar包含作为内部类的方法: 那是一些。示例类只包含main方法,不包含foo: 所以问题是:我们如何指定一个clj Clojure文件来生成一个Java类,该类包