InnerEye-DeepLearning

License: MIT License
Development language: Python
Category: Neural networks / artificial intelligence, machine learning / deep learning
Software type: Open source software
Region: Unknown
Submitted by: 万高畅
Operating system: Cross-platform
Open source organization:
Target audience: Unknown
Software overview

InnerEye-DeepLearning

Overview

This is a deep learning toolbox to train models on medical images (or more generally, 3D images). It integrates seamlessly with cloud computing in Azure.

On the modelling side, this toolbox supports

  • Segmentation models
  • Classification and regression models
  • Sequence models
  • Adding cloud support to any PyTorch Lightning model, via a bring-your-own-model setup
  • Active label cleaning and noise robust learning toolbox (stand-alone folder)

Classification, regression, and sequence models can be built with only images as inputs, or a combination of images and non-imaging data as input. This supports typical use cases on medical data where measurements, biomarkers, or patient characteristics are often available in addition to images.
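
As an illustration of the bring-your-own-model route mentioned above, the sketch below shows the kind of plain PyTorch Lightning module such a setup is designed to wrap. The class, layer sizes, and names are illustrative assumptions only and are not taken from the toolbox.

# Minimal sketch of a plain PyTorch Lightning module of the kind the
# bring-your-own-model setup can add cloud support to. All names and layer
# sizes below are illustrative, not part of the InnerEye toolbox itself.
import torch
from torch import nn
import pytorch_lightning as pl

class TinyClassifier(pl.LightningModule):
    def __init__(self, in_features: int = 16, num_classes: int = 2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, 32),
            nn.ReLU(),
            nn.Linear(32, num_classes),
        )
        self.loss = nn.CrossEntropyLoss()

    def forward(self, x):
        return self.net(x)

    def training_step(self, batch, batch_idx):
        images, labels = batch
        loss = self.loss(self(images), labels)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)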

On the user side, this toolbox focuses on enabling machine learning teams to achieve more. It is cloud-first, and relies on Azure Machine Learning Services (AzureML) for execution, bookkeeping, and visualization. Taken together, this gives:

  • Traceability: AzureML keeps a full record of all experiments that were executed, including a snapshot of the code. Tags are automatically added to the experiments, which can later help filter and find old experiments.
  • Transparency: All team members have access to each other's experiments and results.
  • Reproducibility: Two model training runs using the same code and data will result in exactly the same metrics. All sources of randomness, such as multithreading, are controlled for (see the sketch after this list).
  • Cost reduction: Using AzureML, all compute (virtual machines, VMs) is requested at the time of starting the training job, and freed up at the end. Idle VMs will not incur costs. In addition, Azure low-priority nodes can be used to further reduce costs (up to 80% cheaper).
  • Scale out: Large numbers of VMs can be requested easily to cope with a burst in jobs.
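
On the reproducibility point above: fixing all random seeds and forcing deterministic kernels is the usual way to control such sources of randomness. The snippet below is a generic PyTorch sketch of that idea, not a copy of the toolbox's internal code.

# Generic sketch of controlling randomness for reproducible PyTorch training.
# This illustrates the idea only; it is not the toolbox's own implementation.
import random
import numpy as np
import torch

def make_deterministic(seed: int = 0) -> None:
    random.seed(seed)                            # Python's built-in RNG
    np.random.seed(seed)                         # NumPy RNG
    torch.manual_seed(seed)                      # CPU and current GPU RNGs
    torch.cuda.manual_seed_all(seed)             # all visible GPUs
    torch.backends.cudnn.deterministic = True    # force deterministic cuDNN kernels
    torch.backends.cudnn.benchmark = False       # disable non-deterministic autotuning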

Despite the cloud focus, all training and model testing works just as well on local compute, which is important for model prototyping, debugging, and in cases where the cloud can't be used. In particular, if you already have GPU machines available, you will be able to utilize them with the InnerEye toolbox.

In addition, our toolbox supports:

  • Cross-validation using AzureML's built-in support, where the models for individual folds are trained in parallel. This is particularly important for the long-running training jobs often seen with medical images.
  • Hyperparameter tuning using Hyperdrive.
  • Building ensemble models.
  • Easy creation of new models via a configuration-based approach, and inheritance from an existing architecture (see the sketch below).
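
To make the last point concrete, the self-contained sketch below illustrates the general inheritance idea: a model is described by a configuration class, and a new model is derived by subclassing an existing configuration and overriding a few fields. The classes and field names here are illustrative assumptions, not the toolbox's actual API; the real configuration classes ship with the repository.

# Self-contained sketch of configuration-based model creation via inheritance.
# The classes and field names are illustrative only, not the toolbox's API.
from dataclasses import dataclass

@dataclass
class SegmentationConfigSketch:
    """Stand-in for an existing model configuration."""
    num_epochs: int = 2
    learning_rate: float = 1e-4
    crop_size: tuple = (64, 64, 64)

@dataclass
class LongerTrainingConfig(SegmentationConfigSketch):
    """A new model: reuse the existing architecture, override a few settings."""
    num_epochs: int = 40
    learning_rate: float = 1e-3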

Once training in AzureML is done, the models can be deployed from within AzureML or via Azure Stack Hub.

Getting started

We recommend using our toolbox with Linux or with the Windows Subsystem for Linux (WSL2). Much of the core functionality works fine on Windows, but PyTorch's full feature set is only available on Linux. Read more about WSL here.

Clone the repository into a subfolder of the current directory:

git clone --recursive https://github.com/microsoft/InnerEye-DeepLearning
cd InnerEye-DeepLearning
git lfs install
git lfs pull

After that, you need to set up your Python environment:

  • Install conda or miniconda for your operating system.
  • Create a Conda environment from the environment.yml file in the repository root, and activate it:
conda env create --file environment.yml
conda activate InnerEye
  • If environment creation fails with odd error messages on a Windows machine, please continue here.

Now try to run the HelloWorld segmentation model - that's a very simple model that will train for 2 epochs on any machine, no GPU required. You need to set the PYTHONPATH environment variable to point to the repository root first. Assuming that your current directory is the repository root folder, on Linux bash that is:

export PYTHONPATH=`pwd`
python InnerEye/ML/runner.py --model=HelloWorld

(Note the "backtick" around the pwd command, this is not a standard single quote!)

On Windows:

set PYTHONPATH=%cd%
python InnerEye/ML/runner.py --model=HelloWorld

If that works: Congratulations! You have successfully built your first model using the InnerEye toolbox.

If it fails, please check the troubleshooting page on the Wiki.

Further detailed instructions, including setup in Azure, are here:

  1. Setting up your environment
  2. Training a Hello World segmentation model
  3. Setting up Azure Machine Learning
  4. Creating a dataset
  5. Building models in Azure ML
  6. Sample Segmentation and Classification tasks
  7. Debugging and monitoring models
  8. Model diagnostics
  9. Move a model to a different workspace
  10. Working with FastMRI models
  11. Active label cleaning and noise robust learning toolbox

Deployment

We offer a companion set of open-sourced tools that help to integrate trained CT segmentation models with clinical software systems:

  • The InnerEye-Gateway is a Windows service running in a DICOM network that can route anonymized DICOM images to an inference service.
  • The InnerEye-Inference component offers a REST API that integrates with the InnerEye-Gateway, to run inference on InnerEye-DeepLearning models.

Details can be found here.

More information

  1. Project InnerEye
  2. Releases
  3. Changelog
  4. Testing
  5. How to do pull requests
  6. Contributing

Licensing

MIT License

You are responsible for the performance, the necessary testing, and if needed any regulatory clearance for any of the models produced by this toolbox.

Contact

If you have any feature requests, or find issues in the code, please create an issue on GitHub.

Please send an email to InnerEyeInfo@microsoft.com if you would like further information about this project.

Publications

Oktay O., Nanavati J., Schwaighofer A., Carter D., Bristow M., Tanno R., Jena R., Barnett G., Noble D., Rimmer Y., Glocker B., O’Hara K., Bishop C., Alvarez-Valle J., Nori A.: Evaluation of Deep Learning to Augment Image-Guided Radiotherapy for Head and Neck and Prostate Cancers. JAMA Netw Open. 2020;3(11):e2027426. doi:10.1001/jamanetworkopen.2020.27426

Bannur S., Oktay O., Bernhardt M., Schwaighofer A., Jena R., Nushi B., Wadhwani S., Nori A., Natarajan K., Ashraf S., Alvarez-Valle J., Castro D. C.: Hierarchical Analysis of Visual COVID-19 Features from Chest Radiographs. ICML 2021 Workshop on Interpretable Machine Learning in Healthcare. https://arxiv.org/abs/2107.06618

Bernhardt M., Castro D. C., Tanno R., Schwaighofer A., Tezcan K. C., Monteiro M., Bannur S., Lungren M., Nori S., Glocker B., Alvarez-Valle J., Oktay O.: Active label cleaning: Improving dataset quality under resource constraints. ArXiv pre-print (under peer review). https://arxiv.org/abs/2109.00574. Accompanying code: InnerEye-DataQuality

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Credits

This toolbox is maintained by the Microsoft InnerEye team, and has received valuable contributions from a number of people outside our team. We would like to thank in particular our interns, Yao Quin, Zoe Landgraf, Padmaja Jonnalagedda, Mathias Perslev, as well as the AI Residents Patricia Gillespie and Guilherme Ilunga.
