Detection
- Classes : 80 object categories,91 stuff categories,5 captions per image
- Labeled datasets : >200K labeled (multiple instances are labeled in single image)
- Total object instance : 1.5 million object instances (multiple instances are in a single image)
- Total images : 330K images
- 250,000 people with keypoints
- Classes :
- Label dataset : 80,256 labeled objects
- Total images : 7481 training images and 7518 test images
- Classes : 1000
- Label dataset : - Number of images with bounding box annotations: 1,034,908
- Total number of non-empty WordNet synsets: 21841
- Total number of images: 14197122
- Number of synsets with SIFT features: 1000
- Number of images with SIFT features: 1.2 million
- largest open driving video dataset as part of the CVPR
- Dataset for Object deTection in Aerial Images , 11268 Items ,18 Classes ,
- MaskedFace-Net is a dataset of human faces with a correctly or incorrectly worn mask (133,783 images) based on the dataset Flickr-Faces-HQ (FFHQ).
CIFAR-10 -The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.
LISA Traffic Sign Detection Dataset
- The LISA Traffic Sign Dataset is a set of videos and annotated frames containing US traffic signs.
Exclusively Dark (ExDark) Image Dataset
- 12 class , 7363 Labels
The 20BN-SOMETHING-SOMETHING Dataset V2
- Total number of videos 220,847/ Training Set 168,913,Validation Set 24,777 / Test/ Set (w/o labels) 27,157 / Labels 174 / Quality 100px / FPS 12
PASCAL VOC 2007,2009,2010,2011 dataset
- Classification/Detection Competitions, Segmentation Competition, Person Layout Taster Competition datasets
- LabelMe is a web-based image annotation tool that allows researchers to label images and share the annotations with the rest of the community. If you use the database, we only ask that you contribute to it, from time to time, by using the labeling tool.
- The CMU Multi-PIE face database contains more than 750,000 images of 337 people recorded
- The Yale Face Database (size 6.4MB) contains 165 grayscale images in GIF format of 15 individuals.
- Cars, Motorcycles, Airplanes, Faces, Leaves, Backgrounds
- Pictures of objects belonging to 101 categories
- Pictures of objects belonging to 256 categories
Daimler Pedestrian Detection Benchmark
- 15,560 pedestrian and non-pedestrian samples (image cut-outs) and 6744 additional full images not containing pedestrians for bootstrapping. The test set contains more than 21,790 images with 56,492 pedestrian labels (fully visible or partially occluded), captured from a vehicle in urban traffic.
- CVC Pedestrian Datasets
- CBCL Pedestrian Database
- CBCL Face Database
- CBCL Street Database
- A large set of marked up images of standing or walking people
- A set of car and non-car images taken in a parking lot nearby INRIA
- A set of horse and non-horse images
- 3D skeletons and segmented regions for 1000 people in images
- A large-scale vehicle detection dataset
- 10000 images of natural scenes, with 37 different logos, and 2695 logos instances, annotated with a bounding box.
- 10000 images of natural scenes grabbed on Flickr, with 2695 logos instances cut and pasted from the BelgaLogos dataset.
- The dataset FlickrLogos-32 contains photos depicting logos and is meant for the evaluation of multi-class logo detection/recognition as well as logo retrieval methods on real-world images. It consists of 8240 images downloaded from Flickr.
- 30000+ frames with vehicle rear annotation and classification (car and trucks) on motorway/highway sequences. Annotation semi-automatically generated using laser-scanner data. Distance estimation and consistent target ID over time available.
PHOS (Color Image Database for illumination invariant feature selection)
- Phos is a color image database of 15 scenes captured under different illumination conditions. More particularly, every scene of the database contains 15 different images: 9 images captured under various strengths of uniform illumination, and 6 images under different degrees of non-uniform illumination. The images contain objects of different shape, color and texture and can be used for illumination invariant feature detection and selection.
CaliforniaND: An Annotated Dataset For Near-Duplicate Detection In Personal Photo Collections
- California-ND contains 701 photos taken directly from a real user’s personal photo collection, including many challenging non-identical near-duplicate cases, without the use of artificial image transformations. The dataset is annotated by 10 different subjects, including the photographer, regarding near duplicates.
USPTO Algorithm Challenge, Detecting Figures and Part Labels in Patents
- Contains drawing pages from US patents with manually labeled figure and part labels.
- Contains 6 object categories similar to object categories in Pascal VOC that are suitable for studying the abnormalities stemming from objects.
Human detection and tracking using RGB-D camera
- Collected in a clothing store. Captured with Kinect (640*480, about 30fps)
Multi-Task Facial Landmark (MTFL) dataset
- This dataset contains 12,995 face images collected from the Internet. The images are annotated with (1) five facial landmarks, (2) attributes of gender, smiling, wearing glasses, and head pose.
WIDER FACE: A Face Detection Benchmark
- WIDER FACE dataset is a face detection benchmark dataset with images selected from the publicly available WIDER dataset. It contains 32,203 images and 393,703 face annotations.
PIROPO Database: People in Indoor ROoms with Perspective and Omnidirectional cameras
- Multiple sequences recorded in two different indoor rooms, using both omnidirectional and perspective cameras, containing people in a variety of situations (people walking, standing, and sitting). Both annotated and non-annotated sequences are provided, where ground truth is point-based. In total, more than 100,000 annotated frames are available.
The Boxy vehicle detection dataset
- A vehicle detection dataset with 1.99 million annotated vehicles in 200,000 images. It contains AABB and keypoint labels.
The Bosch Small Traffic Lights Dataset
- A dataset for traffic light detection, tracking, and classification.
DriveU Traffic Light Dataset (DTLD)
- It contains more than 40.000 images and 230 000 annotated traffic lights and is the largest database for traffic light detection so far containing bounding box labels, track identities and furthermore the following attributes: phase, pictogram, relevancy, occlusion, number of light units and orientation.
- ETH is a dataset for pedestrian detection. The testing set contains 1,804 images in three video clips. The dataset is captured from a stereo rig mounted on car, with a resolution of 640 x 480 (bayered), and a framerate of 13–14 FPS.
- In this experiment, we only use 288 images which contain pedestrians . TUD-Brussels test set contains 508 images containing 1498 annotated pedestrians.
The 2D Shape Structure Dataset
- The 2D Shape Structure database is a public, user-generated dataset of 2D shape decompositions into a hierarchy of shape parts with geometric relationships reta…
ICS-FORTH + Modelling of 2D Shapes with Ellipses
- The dataset contains more than 4,536 2D shapes included in standard as well as in home-build datasets. Our goal is to represent a given 2D shape with an au…
Mobile Phone and Webcam Hand Images for Personal Authentication and Identification
- This work attempts to provide two Hand Images Databases for hand biometrics: one is created using a mobile phone camera of modest quality, which we called mob…
- Detail 2D Projection DataSet is a database of 2d projections of mechanical details with holes. The dataset consists of 13 shape categories where each category i…
- The 3DVis dataset includes a set of 12 heterogeneous scenes for testing 3D scene registration and analysis methods. Models include homogeneous shapes, repetitiv…
- We would like to announce the release of PASCAL-Context dataset. We augmented PASCAL VOC 2010 dataset with annotations for 400+ additional categories. In the cu…
- The 3D shape description dataset consists of multiple sub-datasets Descriptor Matching - Dataset 1 & 2 (Stanford) These datasets, created from some of the m…
- We introduce a labeled dataset of categorized images for evaluating sketch based image retrieval. Using Flickr, we downloaded about 3000 images for each of the …
RGB-D Person Re-identification
- The RGB-D Person Re-identification dataset is for person re-identification using depth information. The main motivation is that the standard techniques (such as…
- This material is supplementary to Michael Stark, Bernt Schiele. How Good are Local Features for Classes of Geometric Objects. Eleventh IEEE International C…
- Humans have used sketching to depict our visual world since prehistoric times. Even today, sketching is possibly the only rendering technique readily available …
EITZ Sketch-Based Image Retrieval
- We introduce a benchmark for evaluating the performance of large scale sketch-based image retrieval systems. The necessary data is acquired in a controlled user…
- The ICG Sketch Retrieval dataset consists of XXX hand-drawn sketches for five categories. It is used for content-based image retrieval using shape features for …
- Simpsons Homer 40 years is a dataset showing Homer Simpson over the course of 40 years. It is used for video segmentation and shape matching between frames….
Leaves -The Leaves dataset from X contains X images of leaves. Leaves dataset taken by Markus Weber. California Institute of Technology PhD student under Pietro Per…
- The ETHZ Shape classes dataset from Vittorio Ferrari [?] consists of five object classes and a total of 255 images. All classes contain significant intra-class …
- The multi-scale Weizmann horses (originally from Eran Borenstein, adapted by Jamie Shotton) consists of 656 images which is split into 50+50training, 50+50 vali…
- The ETHZ Extended Shape classes dataset from Konrad Schindler is larger dataset of shape categories, created by merging ETHZ shape classes with Konrad Schindler…
- The Tools 2D dataset from Bronstein, Bronstein, Bruckstein, and Kimmel [?] for partial similarity experiments and consists of 15 shapes: 5 humans, 5 horses and …
Mythological Creatures -The Mythological Creatures consists of articulated shapes (silhouettes) for partial similarity experiments and contains 15 shapes: 5 humans, 5 horses and 5 cent…
- The SIID silhouette dataset contains… and is from the Shape Indexing of Image Database (SIID). Download SIID silhouette dataset http://www.lems.brown.edu/…
- The Kimia 216 has 18 classes each consisting of 12 images. It contains shapes silhouettes for birds, bones, brick, camels, car, children, classic cards, elephan…
- The Kimia 99 has 9 classes each consisting of each 11 images. They are part of the Shape Indexing of Image Database (SIID) project, which also contains the SIID…
- The Kimia 25 consists of 6 classes and 25 images. They are part of the Shape Indexing of Image Database (SIID) project, which also contains the SIID silhouette …
MPEG-7 Core Experiment CE-Shape-1
- MPEG-7 Core Experiment CE-Shape-1 [?] is a popular database for shape matching evaluation consisting of 70 shape categories, where each category is represented …
Labeled and Annotated Sequences for Integral Evaluation of SegmenTation Algorithms
- It contains over 5000 high-resolution images divided into fifteen different object and texture categories.
References
- https://pub.towardsai.net/50-object-detection-datasets-from-different-industry-domains-1a53342ae13d