This project aims to teach you the fundamentals of Machine Learning in Python. It contains the example code and solutions to the exercises in my O'Reilly book Hands-on Machine Learning with Scikit-Learn and TensorFlow.
Warning: there is now a newer edition of this book, please check out github.com/ageron/handson-ml2.
Use any of the following services.
WARNING: Please be aware that these services provide temporary environments: anything you do will be deleted after a while, so make sure you download any data you care about.
Recommended: open this repository in Colaboratory:
Or open it in Binder:
Or open it in Deepnote:
Browse this repository using jupyter.org's notebook viewer:
Note: github.com's notebook viewer also works but it is slower and the math equations are not always displayed correctly.
Read the Docker instructions.
Start by installing Anaconda (or Miniconda) and git. If you have a TensorFlow-compatible GPU, also install the GPU driver and the appropriate versions of CUDA and cuDNN (see TensorFlow's documentation for more details).
Next, clone this project by opening a terminal and typing the following commands (do not type the leading $ sign on each line; it just indicates that these are terminal commands):
$ git clone https://github.com/ageron/handson-ml.git
$ cd handson-ml
Next, run the following commands:
$ conda env create -f environment.yml
$ conda activate tf1
$ python -m ipykernel install --user --name=python3
Finally, start Jupyter:
$ jupyter notebook
If you need further instructions, read the detailed installation instructions.
Which Python version should I use?
I recommend Python 3.7. If you follow the installation instructions above, that's the version you will get. Most code will work with other versions of Python 3, but some libraries do not support Python 3.8 or 3.9 yet.
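If you are unsure which interpreter your notebook kernel is running, a quick check along these lines may help. This is only a minimal sketch: the function name describe_python_version and the wording of the messages are made up for illustration and are not part of this project.

```python
import sys

# The version recommended in this README.
RECOMMENDED = (3, 7)


def describe_python_version(version_info=sys.version_info):
    """Hypothetical helper: compare an interpreter version to the recommended one."""
    major, minor = version_info[0], version_info[1]
    if major != 3:
        return "unsupported: Python 3 is required"
    if (major, minor) == RECOMMENDED:
        return "recommended"
    if (major, minor) > RECOMMENDED:
        return "newer than recommended: some libraries may not install"
    return "older than recommended: most code should still work"


if __name__ == "__main__":
    print(sys.version.split()[0], "->", describe_python_version())
```

Running it inside the activated environment (or in a notebook cell) shows which case applies to the kernel you are actually using.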
I'm getting an error when I call load_housing_data()
Make sure you call fetch_housing_data() before you call load_housing_data(). If you're getting an HTTP error, make sure you're running the exact same code as in the notebook (copy/paste it if needed). If the problem persists, please check your network configuration.
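The reason for the ordering is that load_housing_data() only reads a file that fetch_housing_data() has downloaded and extracted. The sketch below illustrates that dependency using only the standard library. It is not the notebook's code: the notebook's loader returns a pandas DataFrame, and the URL and directory used here are assumptions, so copy the real values from the notebook.

```python
import csv
import tarfile
import urllib.request
from pathlib import Path

# Assumed values for illustration; use the ones from the notebook.
HOUSING_URL = ("https://raw.githubusercontent.com/ageron/handson-ml/"
               "master/datasets/housing/housing.tgz")
HOUSING_PATH = Path("datasets") / "housing"


def fetch_housing_data(housing_url=HOUSING_URL, housing_path=HOUSING_PATH):
    """Download the archive and extract housing.csv into housing_path."""
    housing_path = Path(housing_path)
    housing_path.mkdir(parents=True, exist_ok=True)
    tgz_path = housing_path / "housing.tgz"
    urllib.request.urlretrieve(housing_url, tgz_path)
    with tarfile.open(tgz_path) as housing_tgz:
        housing_tgz.extractall(path=housing_path)


def load_housing_data(housing_path=HOUSING_PATH):
    """Read housing.csv as a list of dicts (the notebook uses pandas instead)."""
    csv_path = Path(housing_path) / "housing.csv"
    if not csv_path.exists():
        # This is the situation you hit when the fetch step was skipped.
        raise FileNotFoundError(
            f"{csv_path} not found: call fetch_housing_data() first")
    with open(csv_path, newline="") as f:
        return list(csv.DictReader(f))
```

Calling load_housing_data() on a fresh checkout fails because housing.csv does not exist yet; calling fetch_housing_data() once creates it, after which loading works offline.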
I'm getting an SSL error on MacOSX
You probably need to install the SSL certificates (see this StackOverflow question). If you downloaded Python from the official website, then run /Applications/Python\ 3.7/Install\ Certificates.command in a terminal (change 3.7 to whatever version you installed). If you installed Python using MacPorts, run sudo port install curl-ca-bundle in a terminal.
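To see where your Python installation looks for CA certificates, before or after applying the fix, something like the following may help. It is a diagnostic sketch only: describe_ca_configuration is a hypothetical helper name, and it relies solely on the standard ssl module.

```python
import ssl


def describe_ca_configuration():
    """Hypothetical helper: report where this Python looks for CA certificates."""
    paths = ssl.get_default_verify_paths()
    context = ssl.create_default_context()
    return {
        "cafile": paths.cafile,                  # CA bundle file in use, or None
        "capath": paths.capath,                  # CA directory in use, or None
        "openssl_cafile": paths.openssl_cafile,  # OpenSSL's compiled-in default file
        # Certificates loaded eagerly; a CA directory is read lazily,
        # so 0 here does not always mean that verification will fail.
        "loaded_ca_certs": context.cert_store_stats()["x509_ca"],
    }


if __name__ == "__main__":
    for key, value in describe_ca_configuration().items():
        print(f"{key}: {value}")
```

If both cafile and capath are None and no certificates are loaded, that is consistent with missing certificates, which is what the commands above are meant to install.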
I've installed this project locally. How do I update it to the latest version?
See INSTALL.md
How do I update my Python libraries to the latest versions, when using Anaconda?
See INSTALL.md
I would like to thank everyone who contributed to this project, either by providing useful feedback, filing issues or submitting Pull Requests. Special thanks go to Haesun Park and Ian Beauregard who reviewed every notebook and submitted many PRs, including help on some of the exercise solutions. Thanks as well to Steven Bunkley and Ziembla who created the docker directory, and to GitHub user SuperYorio who helped on some exercise solutions.