Search results
Results from the WOW.Com Content Network
Dataset of legal contracts with rich expert annotations; ~13,000 labels; CSV and PDF; natural language processing, QnA; 2021; The Atticus Project.
Vietnamese Image Captioning Dataset (UIT-ViIC): Vietnamese Image Captioning Dataset; 19,250 captions for 3,850 images; CSV and PDF; natural language processing, computer vision; 2020; [112]; Lam et al.
Kaggle is a data science competition platform and online community for data scientists and machine learning practitioners under Google LLC. Kaggle enables users to find and publish datasets, explore and build models in a web-based data science environment, work with other data scientists and machine learning engineers, and enter competitions to solve data science challenges.
The Fashion MNIST dataset is a large freely available database of fashion images that is commonly used for training and testing various machine learning systems. [1] [2] Fashion-MNIST was intended to serve as a replacement for the original MNIST database for benchmarking machine learning algorithms, as it shares the same image size, data format and the structure of training and testing splits.
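Because Fashion-MNIST shares MNIST's data format, a reader written for one should work for the other. The sketch below assumes that format is the IDX layout used by the MNIST distribution files: a big-endian header holding a magic number and the dimension sizes, followed by one unsigned byte per pixel. `parse_idx_images` is a hypothetical helper, and the small buffer is synthetic, not real image data.

```python
import struct

IMAGE_MAGIC = 2051  # 0x00000803: unsigned-byte data, 3 dimensions (assumed IDX layout)

def parse_idx_images(buf):
    """Parse an IDX image buffer into a list of images (each a list of pixel rows)."""
    # Header: magic number, image count, rows, columns, all big-endian 32-bit ints.
    magic, count, rows, cols = struct.unpack(">IIII", buf[:16])
    if magic != IMAGE_MAGIC:
        raise ValueError(f"unexpected magic number {magic}")
    if len(buf) != 16 + count * rows * cols:
        raise ValueError("buffer length does not match header")
    images = []
    offset = 16
    for _ in range(count):
        image = []
        for _ in range(rows):
            # Each pixel is a single unsigned byte (0-255).
            image.append(list(buf[offset:offset + cols]))
            offset += cols
        images.append(image)
    return images

# Synthetic buffer holding two 2x2 "images", built only to exercise the parser.
demo = struct.pack(">IIII", IMAGE_MAGIC, 2, 2, 2) + bytes([0, 255, 128, 64, 1, 2, 3, 4])
images = parse_idx_images(demo)
```

The distributed files are typically gzip-compressed, so in practice the bytes would first be read through `gzip.open` before being passed to a parser like this one.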
XOWA is a free, open-source application that helps download Wikipedia to a computer. Access all of Wikipedia offline, without an internet connection!
In 2021, ImageNet-1k was updated by annotating faces appearing in the 997 non-person categories. Training models on the dataset with these faces blurred was found to cause minimal loss in performance. [31] ImageNetV2 was a new dataset containing three test sets of 10,000 images each, constructed by the same methodology as the original ImageNet. [32]
CIFAR-10 is a set of images that can be used to teach a computer how to recognize objects. Since the images in CIFAR-10 are low-resolution (32x32), this dataset can allow researchers to quickly try different algorithms to see what works. CIFAR-10 is a labeled subset of the 80 Million Tiny Images dataset from 2008, published in 2009. When the ...
Caltech 256 is another image data set, created in 2007 as a successor to Caltech 101 and intended to address some of its weaknesses. Overall, it is a more difficult data set than Caltech 101, but it suffers from comparable problems. It includes 30,607 images [3], covering a larger number of categories than Caltech 101.
Various plots of the multivariate Iris flower data set, introduced by Ronald Fisher (1936). [1] A data set (or dataset) is a collection of data. In the case of tabular data, a data set corresponds to one or more database tables, where every column of a table represents a particular variable, and each row corresponds to a given record of the data set in question.
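The row-and-column reading of a tabular data set can be sketched in a few lines of Python. The column names and the two records below are illustrative values in the style of the Iris flower data set, not an excerpt checked against Fisher's table; `load_records` and `column` are hypothetical helpers.

```python
import csv
import io

# Illustrative CSV text: the header names the variables, each later line is one record.
# The values are assumptions chosen for the example.
CSV_TEXT = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
7.0,3.2,4.7,1.4,versicolor
"""

def load_records(text):
    """Parse CSV text into a list of dicts, one dict per row (record)."""
    return list(csv.DictReader(io.StringIO(text)))

def column(records, name):
    """Return every value of one variable (column) across all records."""
    return [row[name] for row in records]

records = load_records(CSV_TEXT)
variables = list(records[0].keys())  # the column names, i.e. the variables
```

Here `records[0]` is a single record (one row), while `column(records, "species")` collects one variable across all rows. Note that `csv.DictReader` returns every value as a string, so numeric columns would need an explicit conversion such as `float(...)`.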