pytorch / vision

Datasets, Transforms and Models specific to Computer Vision


torchvision

The torchvision library provides popular datasets, model architectures, and image transformations for computer vision. It includes:

  • Training recipes for object detection, image classification, instance segmentation, video classification and semantic segmentation.
  • 60+ pretrained models to use for fine-tuning (or training afresh).
  • Dataset loaders for popular vision datasets such as ImageNet, COCO, Cityscapes and more!

Tasks

Models are available for the following tasks:

  • Image Classification
  • Object Detection
  • Action Classification
  • Instance Segmentation