LightGBM - A fast, distributed, high performance gradient boosting framework
Explainable Boosting Machines - interpretable model developed at Microsoft Research that uses bagging, gradient boosting, and automatic interaction detection to estimate generalized additive models.
AutoML
Neural Network Intelligence - An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
Archai - Reproducible Rapid Research for Neural Architecture Search (NAS).
Oscar - Object-Semantics Aligned Pre-training for Vision-Language Tasks.
TorchGeo - a PyTorch domain library, similar to torchvision, that provides datasets, transforms, samplers, and pre-trained models specific to geospatial data.
Swin Transformer - an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
Time Series
luminol - anomaly detection and correlation library.
SR-CNN - Spectral Residual based anomaly detection algorithm, SR-CNN implementation.
Greykite - flexible, intuitive and fast forecasts through its flagship algorithm, Silverkite.
Microsoft Finance Time Series Forecasting Framework - a forecasting package that utilizes cutting-edge time series forecasting and parallelization on the cloud to produce accurate forecasts for financial data.
Turing-NLG - Turing Natural Language Generation, 17 billion-parameter language model.
DeBERTa - Decoding-enhanced BERT with Disentangled Attention
UniLM - Unified Language Model Pre-training / Pre-training for NLP and Beyond
Unicoder - Unicoder model for understanding and generation.
NeuronBlocks - build your NLP DNN models like playing with Lego.
Multilingual Model Transfer - new deep learning models for bootstrapping language understanding models for languages with no labeled data using labeled data from other languages.
MT-DNN - multi-task deep neural networks for natural language understanding.
OpenKP - automatically extracting keyphrases that are salient to a document's meaning, an essential step in semantic document understanding.
DeText - a deep neural text understanding framework for ranking and classification tasks.
Genalog - an open source, cross-platform python package allowing generation of synthetic document images with custom degradations and text alignment capabilities.
FastFormers - highly efficient transformer models for NLU.
VERSEAGILITY - a Python-based toolkit to ramp up your custom natural language processing (NLP) task, allowing you to bring your own data and bring models into production. It is a central component of the Microsoft Data Science Toolkit.
DPU Utilities - Utilities used by the Deep Program Understanding team.
Online Machine Learning
Vowpal Wabbit - fast, efficient, and flexible online machine learning techniques for reinforcement learning, supervised learning, and more.
Recommendation
Recommenders - examples and best practices for building recommendation systems (A2SVD, DKN, xDeepFM, LightGBM, LSTUR, NAML, NPA, NRMS, RLRMC, SAR, Vowpal Wabbit are invented/contributed by Microsoft).
RobustDG - Toolkit for building machine learning models that generalize to unseen domains and are robust to privacy and other attacks.
SHAP - a game theoretic approach to explain the output of any machine learning model (Scott Lundberg, Microsoft Research).
LIME - explaining the predictions of any machine learning classifier (Marco Tulio Ribeiro, Microsoft Research).
BackwardCompatibilityML - Project for open sourcing research efforts on Backward Compatibility in Machine Learning
confidential-ml-utils - Python utilities for training and deploying ML models against data you can't see.
presidio - context aware, pluggable and customizable data protection and anonymization service for text and images.
Presidio-research - This package features data-science related tasks for developing new recognizers for Presidio.
Confidential ONNX Inference Server - An Open Enclave port of the ONNX inference server with data encryption and attestation capabilities to enable confidential inference on Azure Confidential Computing.
Responsible-AI-Widgets - responsible AI user interfaces for Fairlearn, interpret-community, and Error Analysis, as well as foundational building blocks that they rely on.
Error Analysis - A toolkit to help analyze and improve model accuracy.
Secure Data Sandbox - A toolkit for conducting machine learning trials against confidential data.
shrike - Python utilities to aid "compliant experiment" in Azure Machine Learning - training ML models without seeing the training data.
Optimization
ONNXRuntime - cross-platform, high performance ML inference and training accelerator.
nnfusion - flexible and efficient deep neural network compiler.
Reinforcement Learning
AirSim - open source simulator for autonomous vehicles built on Unreal Engine / Unity, from Microsoft Research.
TextWorld - TextWorld is a sandbox learning environment for the training and evaluation of reinforcement learning (RL) agents on text-based games.
Moab - Project Moab, a new open-source balancing robot to help engineers and developers learn how to build real-world autonomous control systems with Project Bonsai.
COCO Dataset - COCO is a large-scale object detection, segmentation, and captioning dataset.
MS MARCO - collection of datasets focused on deep learning in search.
InnerEye CreateDataset - InnerEye dataset creation tool for InnerEye-DeepLearning library. Transforms DICOM data into mask for training Deep Learning models.
Bench ML - Python library to benchmark popular pre-built cloud AI APIs.
debugpy - An implementation of the Debug Adapter Protocol for Python
kineto - A CPU+GPU Profiling library that provides access to timeline traces and hardware performance counters contributed by Azure AI Platform team.
SuperBenchmark - a benchmarking and diagnosis tool for AI infrastructure (software & hardware).
Pipeline
GitHub Actions - Automate all your software workflows, now with world-class CI/CD. Build, test, and deploy your code right from GitHub.
Azure Pipelines - Automate your builds and deployments with Pipelines so you spend less time with the nuts and bolts and more time being creative.
Dagli - framework for defining machine learning models, including feature generation and transformations, as a DAG.
Platform
AI for Earth API Platform - distributed infrastructure that provides secure, scalable, and customizable API hosting, designed to handle the needs of long-running/asynchronous machine learning model inference.
Planetary Computer Hub - a development environment that makes our data and APIs accessible through familiar, open-source tools, and allows users to easily scale their analyses.
Poultry barn mapping - code for detecting poultry barns from high-resolution aerial imagery and an accompanying dataset of predicted barns over the United States.
A TALE OF THREE CITIES - Analyzing the safety (311) dataset published by Azure Open Datasets for Chicago, Boston and New York City using SparkR, SparkSQL, and Azure Databricks, with visualization using ggplot2 and leaflet.
MLOps Solution Accelerator - this repository helps ML teams to accelerate their model deployment to production leveraging Azure.
Anomaly Detection Solution Accelerator - implement Anomaly Detection which is the technique of identifying rare events or observations which can raise suspicions by being statistically different from the rest of the observations.
Classification Solution Accelerator - This is a classification solution accelerator to help you build and deploy a binary classification project.
Community
AI@Edge Community - find the resources you need to create solutions using intelligence at the edge through combinations of hardware, machine learning (ML), artificial intelligence (AI) and Microsoft Azure services.
Global AI Community - empowers developers who are passionate about AI to share knowledge through events and meetups.
Deep Learning Lab (Japan) - provides information on development cases and the latest technology trends related to deep learning.
Dev Intro to Data Science - In this 28-video series, you will learn important concepts and technologies to build your end-to-end machine learning applications on Azure.
AI System - system for AI Education Resource (Chinese).
AI Edu - AI education materials for Chinese students, teachers and IT professionals (Chinese).
---
Contributing
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.