This video series will teach you how to solve Machine Learning problems using Python's popular scikit-learn library. There are 10 video tutorials totaling 4.5 hours, each with a corresponding Jupyter notebook. Each notebook contains everything you see in its video: code, output, images, and comments.
Note: The notebooks in this repository have been updated to use Python 3.9.1 and scikit-learn 0.23.2. The original notebooks (shown in the video) used Python 2.7 and scikit-learn 0.16, and can be downloaded from the archive branch. You can read about how I updated the code in this blog post.
You can watch the entire series on YouTube, and view all of the notebooks using nbviewer.
Once you complete this video series, I recommend enrolling in my online course, Machine Learning with Text in Python, to gain a deeper understanding of scikit-learn and Natural Language Processing.
What is Machine Learning, and how does it work? (video, notebook)
Setting up Python for Machine Learning: scikit-learn and Jupyter Notebook (video, notebook)
Getting started in scikit-learn with the famous iris dataset (video, notebook)
Training a Machine Learning model with scikit-learn (video, notebook)
Comparing Machine Learning models in scikit-learn (video, notebook)
Data science pipeline: pandas, seaborn, scikit-learn (video, notebook)
Cross-validation for parameter tuning, model selection, and feature selection (video, notebook)
Efficiently searching for optimal tuning parameters (video, notebook)
Evaluating a classification model (video, notebook)
Building a Machine Learning workflow (video, notebook)
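Several of the topics listed above (the iris dataset, model training, cross-validation, and parameter search) fit together in a few lines of scikit-learn. The following is a minimal illustrative sketch, not code taken from the notebooks: the choice of `KNeighborsClassifier`, the `n_neighbors` range, the 75/25 split, and the `random_state` value are all assumptions made for the example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load the iris dataset as a feature matrix X and a target vector y
X, y = load_iris(return_X_y=True)

# Hold out a test set so the final score comes from unseen data
# (the split size and random_state are arbitrary choices for this sketch)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Estimate the out-of-sample accuracy of one model with 5-fold cross-validation
knn = KNeighborsClassifier(n_neighbors=5)
cv_scores = cross_val_score(knn, X_train, y_train, cv=5, scoring="accuracy")

# Search over one tuning parameter (the range 1-15 is an illustrative assumption)
grid = GridSearchCV(
    KNeighborsClassifier(),
    param_grid={"n_neighbors": list(range(1, 16))},
    cv=5,
    scoring="accuracy",
)
grid.fit(X_train, y_train)

# GridSearchCV refits the best model on all training data; score it on the test set
test_accuracy = grid.score(X_test, y_test)

print(f"Mean CV accuracy (n_neighbors=5): {cv_scores.mean():.3f}")
print(f"Best n_neighbors: {grid.best_params_['n_neighbors']}")
print(f"Test accuracy: {test_accuracy:.3f}")
```

The test set is kept out of both the cross-validation and the grid search, so the last number is an estimate of performance on data the model has never seen.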
At the PyCon 2016 conference, I taught a 3-hour tutorial that builds upon this video series and focuses on text-based data. You can watch the tutorial video on YouTube.
Here are the topics I covered:
Visit this GitHub repository to access the tutorial notebooks and many other recommended resources.