Notebooks come alive when interactive widgets are used. Users can visualize and control changes in the data and the model. Learning becomes an immersive and fun experience.
Project Jupyter was born out of the IPython Project in 2014 and evolved rapidly to support interactive data science and scientific computing across all major programming languages. It has had a major impact on how data scientists quickly test and prototype their ideas and showcase their work to peers and the open-source community.
However, learning and experimenting with data become truly immersive when the user can interactively control the parameters of the model and see the effect in (almost) real time. Most of the common rendering in Jupyter is static. However, there is a big effort to introduce elements called ipywidgets, which render fun and interactive controls in the Jupyter notebook.
Widgets are eventful Python objects that have a representation in the browser, often as a control like a slider, textbox, etc., through a front-end (HTML/JavaScript) rendering channel.
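To make "eventful Python object" concrete, here is a minimal sketch using the `ipywidgets` package. The slider name, its range, and the callback are illustrative assumptions, not taken from the project's notebooks.

```python
import ipywidgets as widgets

# A slider widget: a Python object that also has a browser-side representation.
# The name and range are hypothetical, chosen only for illustration.
noise_slider = widgets.FloatSlider(
    value=1.0, min=0.0, max=5.0, step=0.1,
    description="Noise:",
)

observed_values = []

def on_value_change(change):
    # `change` is a dict-like object; "new" holds the updated value.
    observed_values.append(change["new"])

# "Eventful": register a callback that fires whenever the `value` trait changes.
noise_slider.observe(on_value_change, names="value")

# Setting the value from Python fires the same event as dragging the slider.
noise_slider.value = 2.5
```

In a notebook, evaluating `noise_slider` as the last expression of a cell renders the control, and moving it calls `on_value_change` in the same way.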
We demonstrate simple linear regression of a single variable using interactive control elements. Note that the idea can easily be extended to complex multivariate, nonlinear, or kernel-based regression. However, for simplicity of visualization, we stick to the single-variable case in this demo.
First, we show the data generation process as a function of input variables and statistical properties of the associated noise.
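As a rough sketch of what such a data generation step might look like, the function below draws a single input variable and adds Gaussian noise to a linear ground truth. The function name, the linear form, and all parameter names and defaults are assumptions made for illustration; the project's notebook may use a different ground-truth function.

```python
import numpy as np

def generate_data(n_samples=50, slope=2.0, intercept=1.0,
                  noise_mean=0.0, noise_std=1.0,
                  x_min=-5.0, x_max=5.0, seed=0):
    """Return (x, y) with y = slope * x + intercept + Gaussian noise.

    Hypothetical helper: the linear ground truth is an assumption.
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(x_min, x_max, n_samples)
    noise = rng.normal(loc=noise_mean, scale=noise_std, size=n_samples)
    y = slope * x + intercept + noise
    return x, y

# With zero noise the samples fall exactly on the line.
x, y = generate_data(noise_std=0.0)
```

Because every statistical property is an ordinary keyword argument, each one can later be bound to a slider.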
Next, we introduce interactive controls for the following hyperparameters.
Users can interact with the linear regression model using these controls. Note how the test and training scores are also updated dynamically to show a trend of over-fitting or under-fitting as the model complexity changes. One can go back to the data generation control and increase or decrease the noise magnitude to see its impact on the fitting quality and the bias/variance trade-off.
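The train/test score comparison can be sketched as follows. This assumes that "model complexity" means the degree of a polynomial fit and that the ground truth is a noisy sine curve. Both are assumptions made here for illustration, and `r2_score` and `train_test_scores` are hypothetical helper names.

```python
import numpy as np

def r2_score(y_true, y_pred):
    # Coefficient of determination: 1 - (residual SS / total SS).
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

def train_test_scores(degree, noise_std=0.5, n_samples=60, seed=0):
    """Fit a polynomial of the given degree; return (train R^2, test R^2)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-3.0, 3.0, n_samples)
    # Assumed ground truth: a sine curve plus Gaussian noise.
    y = np.sin(x) + rng.normal(0.0, noise_std, n_samples)

    # x is already in random order, so slicing gives a random 70/30 split.
    n_train = int(0.7 * n_samples)
    x_train, x_test = x[:n_train], x[n_train:]
    y_train, y_test = y[:n_train], y[n_train:]

    coeffs = np.polyfit(x_train, y_train, degree)
    train_score = r2_score(y_train, np.polyval(coeffs, x_train))
    test_score = r2_score(y_test, np.polyval(coeffs, x_test))
    return train_score, test_score

# Compare the two scores as complexity grows.
scores = {degree: train_test_scores(degree) for degree in (1, 3, 5, 7)}
```

The training score cannot decrease as the degree grows, because each lower-degree model is a special case of the higher-degree one. The test score carries no such guarantee, and a widening gap between the two is the over-fitting signal. In a notebook, a function like this could be passed to `ipywidgets.interact`, for example `interact(train_test_scores, degree=(1, 9))`, to drive it from a slider.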
Check this article I wrote on Medium about this project.