General

Detection

COCO

Classes : 80 object categories, 91 stuff categories, 5 captions per image
Labeled images : >200K labeled (multiple instances are labeled in a single image)
Total object instances : 1.5 million object instances
Total images : 330K images
Keypoints : 250,000 people with keypoints

KITTI

Classes :
Label dataset : 80,256 labeled objects
Total images : 7,481 training images and 7,518 test images

ImageNet

Classes : 1000
Label dataset : 1,034,908 images with bounding box annotations
Total number of non-empty WordNet synsets: 21,841
Total number of images: 14,197,122
Number of synsets with SIFT features: 1000
Number of images with SIFT features: 1.2 million

BDD100K

An open driving video dataset, described by its authors as the largest of its kind at release. It was used for challenges at a CVPR workshop.

DOTA

Dataset for Object deTection in Aerial images: 11,268 images, 18 classes.

MaskedFace-Net

MaskedFace-Net is a dataset of human faces with a correctly or incorrectly worn mask (133,783 images) based on the dataset Flickr-Faces-HQ (FFHQ).

CIFAR-10

The CIFAR-10 dataset consists of 60,000 32x32 colour images in 10 classes, with 6,000 images per class. There are 50,000 training images and 10,000 test images.
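The CIFAR-10 figures quoted above are internally consistent, which is a useful sanity check before relying on a catalog entry. A minimal sketch in Python, using only the numbers stated in the entry (the variable names are illustrative and not part of any CIFAR-10 API):

```python
# Figures quoted in the CIFAR-10 entry above.
num_classes = 10
images_per_class = 6000
train_images = 50000
test_images = 10000

# Per-class count times class count should equal train + test.
total_images = num_classes * images_per_class
assert total_images == train_images + test_images == 60000

# Share of the dataset held out for testing.
test_fraction = test_images / total_images
print(f"total={total_images}, test fraction={test_fraction:.3f}")
```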

LISA Traffic Sign Detection Dataset

The LISA Traffic Sign Dataset is a set of videos and annotated frames containing US traffic signs.

Exclusively Dark (ExDark) Image Dataset

12 classes, 7,363 labeled images

The 20BN-SOMETHING-SOMETHING Dataset V2

Total number of videos : 220,847
Training set : 168,913
Validation set : 24,777
Test set (w/o labels) : 27,157
Labels : 174
Quality : 100px
FPS : 12
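The three split sizes listed for Something-Something V2 should add up to the stated total. A small sketch that checks this and derives the split proportions, assuming only the counts given in the entry (the dictionary keys are illustrative names):

```python
# Split sizes quoted in the Something-Something V2 entry above.
splits = {
    "train": 168_913,
    "validation": 24_777,
    "test": 27_157,  # released without labels
}
total_videos = 220_847

# The splits should partition the full set of videos.
assert sum(splits.values()) == total_videos

# Proportion of videos in each split.
fractions = {name: count / total_videos for name, count in splits.items()}
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%}")
```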

PASCAL VOC 2007, 2009, 2010, 2011 datasets

Classification/Detection Competitions, Segmentation Competition, Person Layout Taster Competition datasets

LabelMe dataset

LabelMe is a web-based image annotation tool that allows researchers to label images and share the annotations with the rest of the community. If you use the database, we only ask that you contribute to it, from time to time, by using the labeling tool.

CMU/VASC & PIE Face dataset

The CMU Multi-PIE face database contains more than 750,000 images of 337 people.

Yale Face dataset

The Yale Face Database (size 6.4MB) contains 165 grayscale images in GIF format of 15 individuals.

Caltech

Cars, Motorcycles, Airplanes, Faces, Leaves, Backgrounds

Caltech 101

Pictures of objects belonging to 101 categories

Caltech 256

Pictures of objects belonging to 256 categories

Daimler Pedestrian Detection Benchmark

The training set contains 15,560 pedestrian and non-pedestrian samples (image cut-outs) and 6,744 additional full images not containing pedestrians for bootstrapping. The test set contains more than 21,790 images with 56,492 pedestrian labels (fully visible or partially occluded), captured from a vehicle in urban traffic.

MIT Pedestrian dataset

CBCL Pedestrian Database

CVC Pedestrian Datasets

MIT Face dataset

CBCL Face Database

MIT Street dataset

CBCL Street Database

INRIA Person Data Set

A large set of marked up images of standing or walking people

INRIA car dataset

A set of car and non-car images taken in a parking lot near INRIA

INRIA horse dataset

A set of horse and non-horse images

H3D Dataset

3D skeletons and segmented regions for 1000 people in images

HRI RoadTraffic dataset

A large-scale vehicle detection dataset

BelgaLogos

10,000 images of natural scenes, with 37 different logos and 2,695 logo instances, each annotated with a bounding box.

FlickrBelgaLogos

10,000 images of natural scenes grabbed from Flickr, with 2,695 logo instances cut and pasted from the BelgaLogos dataset.

FlickrLogos-32

The dataset FlickrLogos-32 contains photos depicting logos and is meant for the evaluation of multi-class logo detection/recognition as well as logo retrieval methods on real-world images. It consists of 8240 images downloaded from Flickr.

TME Motorway Dataset

30,000+ frames with vehicle rear annotation and classification (cars and trucks) on motorway/highway sequences. Annotations were generated semi-automatically using laser-scanner data. Distance estimation and a consistent target ID over time are available.

PHOS (Color Image Database for illumination invariant feature selection)

PHOS is a color image database of 15 scenes captured under different illumination conditions. Every scene contains 15 different images: 9 images captured under various strengths of uniform illumination, and 6 images under different degrees of non-uniform illumination. The images contain objects of different shape, color and texture and can be used for illumination invariant feature detection and selection.
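The per-scene breakdown above implies an overall image count that the entry does not state directly. A short sketch deriving it from the quoted figures; the total of 225 is computed here, not taken from the source, and the variable names are illustrative:

```python
# Figures quoted in the PHOS entry above.
num_scenes = 15
uniform_images_per_scene = 9
non_uniform_images_per_scene = 6

# The two illumination groups should account for all 15 images per scene.
images_per_scene = uniform_images_per_scene + non_uniform_images_per_scene
assert images_per_scene == 15

# Derived total across all scenes (an inference, not a quoted figure).
total_images = num_scenes * images_per_scene
print(f"images per scene={images_per_scene}, derived total={total_images}")
```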

CaliforniaND: An Annotated Dataset For Near-Duplicate Detection In Personal Photo Collections

California-ND contains 701 photos taken directly from a real user’s personal photo collection, including many challenging non-identical near-duplicate cases, without the use of artificial image transformations. The dataset is annotated by 10 different subjects, including the photographer, regarding near duplicates.

USPTO Algorithm Challenge, Detecting Figures and Part Labels in Patents

Contains drawing pages from US patents with manually labeled figure and part labels.

Abnormal Objects Dataset

Contains 6 object categories similar to object categories in Pascal VOC that are suitable for studying the abnormalities stemming from objects.

Human detection and tracking using RGB-D camera

Collected in a clothing store. Captured with Kinect (640x480, about 30 fps)

Multi-Task Facial Landmark (MTFL) dataset

This dataset contains 12,995 face images collected from the Internet. The images are annotated with (1) five facial landmarks, (2) attributes of gender, smiling, wearing glasses, and head pose.

WIDER FACE: A Face Detection Benchmark

WIDER FACE dataset is a face detection benchmark dataset with images selected from the publicly available WIDER dataset. It contains 32,203 images and 393,703 face annotations.
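The two counts above give a rough sense of how crowded the images are. A minimal sketch computing the mean number of annotations per image; `annotations_per_image` is a hypothetical helper written for this illustration, not part of any WIDER FACE tooling:

```python
def annotations_per_image(num_annotations: int, num_images: int) -> float:
    """Return the mean number of annotations per image."""
    if num_images <= 0:
        raise ValueError("num_images must be positive")
    return num_annotations / num_images


# Figures quoted in the WIDER FACE entry above.
wider_face_density = annotations_per_image(393_703, 32_203)
print(f"WIDER FACE: about {wider_face_density:.1f} faces per image")
```

The same helper applies to any entry in this list that quotes both an image count and an annotation count.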

PIROPO Database: People in Indoor ROoms with Perspective and Omnidirectional cameras

Multiple sequences recorded in two different indoor rooms, using both omnidirectional and perspective cameras, containing people in a variety of situations (people walking, standing, and sitting). Both annotated and non-annotated sequences are provided, where ground truth is point-based. In total, more than 100,000 annotated frames are available.

The Boxy vehicle detection dataset

A vehicle detection dataset with 1.99 million annotated vehicles in 200,000 images. It contains AABB and keypoint labels.

The Bosch Small Traffic Lights Dataset

A dataset for traffic light detection, tracking, and classification.

DriveU Traffic Light Dataset (DTLD)

It contains more than 40,000 images and 230,000 annotated traffic lights, which its authors describe as the largest database for traffic light detection so far. Labels include bounding boxes, track identities, and the following attributes: phase, pictogram, relevancy, occlusion, number of light units, and orientation.

ETH (ETH Pedestrian)

ETH is a dataset for pedestrian detection. The testing set contains 1,804 images in three video clips. The dataset was captured from a stereo rig mounted on a car, with a resolution of 640 x 480 (bayered) and a framerate of 13–14 FPS.

TUD-Brussels Pedestrian

The TUD-Brussels test set contains 508 images with 1,498 annotated pedestrians. The experiment quoted in the source uses only 288 images that contain pedestrians.

The 2D Shape Structure Dataset

The 2D Shape Structure database is a public, user-generated dataset of 2D shape decompositions into a hierarchy of shape parts with geometric relationships reta…

ICS-FORTH + Modelling of 2D Shapes with Ellipses

The dataset contains more than 4,536 2D shapes included in standard as well as in home-built datasets. Our goal is to represent a given 2D shape with an au…

Mobile Phone and Webcam Hand Images for Personal Authentication and Identification

This work attempts to provide two Hand Images Databases for hand biometrics: one is created using a mobile phone camera of modest quality, which we called mob…

Detail 2D Projection DataSet

Detail 2D Projection DataSet is a database of 2d projections of mechanical details with holes. The dataset consists of 13 shape categories where each category i…

3DVis

The 3DVis dataset includes a set of 12 heterogeneous scenes for testing 3D scene registration and analysis methods. Models include homogeneous shapes, repetitiv…

PASCAL Context

We would like to announce the release of PASCAL-Context dataset. We augmented PASCAL VOC 2010 dataset with annotations for 400+ additional categories. In the cu…

SHOT 3D shape description

The 3D shape description dataset consists of multiple sub-datasets Descriptor Matching - Dataset 1 & 2 (Stanford) These datasets, created from some of the m…

THUR15000

We introduce a labeled dataset of categorized images for evaluating sketch based image retrieval. Using Flickr, we downloaded about 3000 images for each of the …

RGB-D Person Re-identification

The RGB-D Person Re-identification dataset is for person re-identification using depth information. The main motivation is that the standard techniques (such as…

TUD Shapes 1+2

This material is supplementary to Michael Stark, Bernt Schiele. How Good are Local Features for Classes of Geometric Objects. Eleventh IEEE International C…

EITZ Sketch Quality

Humans have used sketching to depict our visual world since prehistoric times. Even today, sketching is possibly the only rendering technique readily available …

EITZ Sketch-Based Image Retrieval

We introduce a benchmark for evaluating the performance of large scale sketch-based image retrieval systems. The necessary data is acquired in a controlled user…

ICG Sketch Retrieval

The ICG Sketch Retrieval dataset consists of XXX hand-drawn sketches for five categories. It is used for content-based image retrieval using shape features for …

Simpsons 40 years

Simpsons Homer 40 years is a dataset showing Homer Simpson over the course of 40 years. It is used for video segmentation and shape matching between frames….

Leaves

The Leaves dataset from X contains X images of leaves. Leaves dataset taken by Markus Weber. California Institute of Technology PhD student under Pietro Per…

ETHZ Shape

The ETHZ Shape classes dataset from Vittorio Ferrari consists of five object classes and a total of 255 images. All classes contain significant intra-class …

Weizmann Horses

The multi-scale Weizmann horses dataset (originally from Eran Borenstein, adapted by Jamie Shotton) consists of 656 images, which are split into 50+50 training, 50+50 vali…

ETHZ Extended Shape

The ETHZ Extended Shape classes dataset from Konrad Schindler is a larger dataset of shape categories, created by merging ETHZ shape classes with Konrad Schindler…

Tools2D

The Tools 2D dataset from Bronstein, Bronstein, Bruckstein, and Kimmel is used for partial similarity experiments and consists of 15 shapes: 5 humans, 5 horses and …

Mythological Creatures

The Mythological Creatures dataset consists of articulated shapes (silhouettes) for partial similarity experiments and contains 15 shapes: 5 humans, 5 horses and 5 cent…

SIID

The SIID silhouette dataset contains… and is from the Shape Indexing of Image Database (SIID). Download SIID silhouette dataset http://www.lems.brown.edu/…

KIMIA216

The Kimia 216 dataset has 18 classes, each consisting of 12 images. It contains shape silhouettes for birds, bones, brick, camels, car, children, classic cards, elephan…

KIMIA99

The Kimia 99 dataset has 9 classes, each consisting of 11 images. They are part of the Shape Indexing of Image Database (SIID) project, which also contains the SIID…

KIMIA25

The Kimia 25 consists of 6 classes and 25 images. They are part of the Shape Indexing of Image Database (SIID) project, which also contains the SIID silhouette …

MPEG-7 Core Experiment CE-Shape-1

MPEG-7 Core Experiment CE-Shape-1 is a popular database for shape matching evaluation consisting of 70 shape categories, where each category is represented …

Labeled and Annotated Sequences for Integral Evaluation of SegmenTation Algorithms

ChangeDetection.net

MVTec AD

It contains over 5000 high-resolution images divided into fifteen different object and texture categories.

References

https://pub.towardsai.net/50-object-detection-datasets-from-different-industry-domains-1a53342ae13d
