Search results
Results from the WOW.Com Content Network
UC Merced Land Use Dataset These images were manually extracted from large images from the USGS National Map Urban Area Imagery collection for various urban areas around the US. This is a 21 class land use image dataset meant for research purposes. There are 100 images for each class. 2,100 Image chips of 256x256, 30 cm (1 foot) GSD
Covertype Dataset Data for predicting forest cover type strictly from cartographic variables. Many geographical features given. 581,012 Text Classification 1998 [311] [312] J. Blackard et al. Abscisic Acid Signaling Network Dataset Data for a plant signaling network. Goal is to determine set of rules that governs the network. None. 300 Text
The set of images in the MNIST database was created in 1994. Previously, NIST released two datasets: Special Database 1 (NIST Test Data I, or SD-1); and Special Database 3 (or SD-2). They were released on two CD-ROMs. SD-1 was the test set, and it contained digits written by high school students, 58,646 images written by 500 different writers.
CIFAR-10 is a labeled subset of the 80 Million Tiny Images dataset from 2008, published in 2009. When the dataset was created, students were paid to label all of the images. [5] Various kinds of convolutional neural networks tend to be the best at recognizing the images in CIFAR-10.
scikit-learn (formerly scikits.learn and also known as sklearn) is a free and open-source machine learning library for the Python programming language. [3] It features various classification, regression and clustering algorithms including support-vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific ...
The ImageNet project is a large visual database designed for use in visual object recognition software research. More than 14 million [1] [2] images have been hand-annotated by the project to indicate what objects are pictured and in at least one million of the images, bounding boxes are also provided. [3]
A training data set is a data set of examples used during the learning process and is used to fit the parameters (e.g., weights) of, for example, a classifier. [9] [10]For classification tasks, a supervised learning algorithm looks at the training data set to determine, or learn, the optimal combinations of variables that will generate a good predictive model. [11]
The 80 Million Tiny Images dataset was retired from use by its creators in 2020, [5] after a paper by researchers Abeba Birhane and Vinay Prabhu found that some of the labeling of several publicly available image datasets, including 80 Million Tiny Images, contained racist and misogynistic slurs which were causing models trained on them to exhibit racial and sexual bias.