NVIDIA Deep Learning SDK (the various libraries related to NVIDIA deep learning)

夔桐
2023-12-01


  • Mixed precision in AI frameworks (Automatic Mixed Precision): mixed-precision computation that uses Tensor Cores, with up to 3x speedup; (Get up to 3X speedup running on Tensor Cores with just a few lines of code added to your existing training script)
  • Deep Learning Primitives (cuDNN): the standard library for GPU-accelerated deep learning; (High-performance building blocks for deep neural network applications including convolutions, activation functions, and tensor transformations)
  • Input Data Processing (DALI): a highly parallel data loading and data augmentation library (mainly for images and video); (An open source data loading and augmentation library that is fast, portable and flexible)
  • Multi-GPU Communication (NCCL): an excellent tool for collective communication, with a double-tree implementation; (Collective communication routines, such as all-gather, reduce, and broadcast that accelerate multi-GPU deep learning training)
  • Deep Learning Inference Engine (TensorRT): an excellent tool for inference; (High-performance deep learning inference runtime for production deployment) (TensorFlow-to-ONNX-to-TensorRT example)
  • Deep Learning for Video Analytics (DeepStream SDK): High-level C++ API and runtime for GPU-accelerated transcoding and deep learning inference
  • Optical Flow for Video Inference (Optical Flow SDK): Set of high-level APIs that expose the latest hardware capability of Turing GPUs dedicated for computing the optical flow of pixels between images. Also useful for calculating stereo disparity and depth estimation.
  • High level SDK for tuning domain specific DNNs (Transfer Learning Toolkit): transfer learning; (Enabling end to end Deep Learning workflows for industries)
  • AI enabled Annotation for Medical Imaging (AI-Assisted Annotation SDK): no permission to open the page??; (AI-assisted annotation for medical imaging related data labeling)
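  • Deep Learning GPU Training System (DIGITS): a web-based visualization tool for datasets, models, and training (very similar to TensorGou); it is only a visualization layer wrapped around core components such as the compute framework; (Rapidly train highly accurate deep neural networks (DNNs) for image classification, segmentation and object detection tasks)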
  • Linear Algebra (cuBLAS): the standard library for GPU matrix computation; (GPU-accelerated BLAS functionality that delivers 6x to 17x faster performance than CPU-only BLAS libraries)
  • Sparse Matrix Operations (cuSPARSE): the standard library for sparse matrix computation (it really is used for model weight pruning); (GPU-accelerated linear algebra subroutines for sparse matrices that deliver up to 8x faster performance than CPU BLAS (MKL), ideal for applications such as natural language processing)
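
The mixed-precision entry in the list above relies on doing most arithmetic in FP16. One numerical issue that mixed-precision training usually has to handle is that small gradients underflow to zero in FP16, which is commonly addressed with loss scaling. The following is a minimal sketch of that idea using only the Python standard library (the `struct` module's half-precision `e` format); it does not call any NVIDIA or framework API, and `to_fp16`, `LOSS_SCALE`, and the gradient value are hypothetical names and numbers chosen for illustration.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Illustrative values only (assumptions, not taken from any framework).
LOSS_SCALE = 1024.0   # a power of two, so scaling itself adds no rounding error
true_grad = 1e-8      # smaller than the smallest positive FP16 value (about 6e-8)

# Without scaling, the gradient is lost when stored in FP16.
unscaled = to_fp16(true_grad)

# With scaling: multiply before the FP16 store, divide again in higher precision.
scaled_fp16 = to_fp16(true_grad * LOSS_SCALE)
recovered = scaled_fp16 / LOSS_SCALE

print("FP16 without scaling:", unscaled)
print("FP16 with scaling, then unscaled:", recovered)
```

In a real training script the framework's automatic mixed precision support would manage the scale factor; the sketch only shows why the scale factor is needed.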
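
The cuSPARSE entry in the list above mentions weight pruning. After pruning, most weights are zero, so storing and multiplying only the nonzero entries saves work. The sketch below shows this in plain Python with the compressed sparse row (CSR) layout, one of the storage formats sparse libraries commonly support. It is not the cuSPARSE API; `dense_to_csr`, `csr_matvec`, and `pruned_weights` are hypothetical names used only for illustration.

```python
def dense_to_csr(matrix):
    """Convert a dense row-major matrix (list of lists) to CSR arrays."""
    values, col_indices, row_ptr = [], [], [0]
    for row in matrix:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_indices.append(j)
        # row_ptr[i + 1] marks where row i ends in `values`
        row_ptr.append(len(values))
    return values, col_indices, row_ptr


def csr_matvec(values, col_indices, row_ptr, x):
    """Compute y = A @ x, touching only the stored nonzeros of A."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_indices[k]]
        y.append(acc)
    return y


# A small weight matrix after (hypothetical) pruning: 3 of 9 entries survive.
pruned_weights = [
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 3.0],
]

values, col_indices, row_ptr = dense_to_csr(pruned_weights)
y = csr_matvec(values, col_indices, row_ptr, [1.0, 2.0, 3.0])
print(values, col_indices, row_ptr, y)
```

The multiply loop runs once per stored nonzero rather than once per matrix entry, which is where the savings come from when the matrix is mostly zeros.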