Search results
MoCap pre-processing | 660 action samples | 8 PhaseSpace Motion Capture, 2 Stereo Cameras, 4 Quad Cameras, 6 accelerometers, 4 microphones | Action classification | 2013 | [118] | Ofli, F. et al.
THUMOS Dataset | Large video dataset for action classification. | Actions classified and labeled. | 45M frames of video | Video, images, text | Classification, action detection
Covertype Dataset | Data for predicting forest cover type strictly from cartographic variables. | Many geographical features given. | 581,012 | Text | Classification | 1998 | [310] [311] | J. Blackard et al.
Abscisic Acid Signaling Network Dataset | Data for a plant signaling network. Goal is to determine set of rules that governs the network. | None. | 300 | Text
The NTU RGB-D (Nanyang Technological University's Red Green Blue and Depth information) dataset is a large dataset containing recordings of labeled human activities. [1] It consists of 56,880 action samples, each recorded in 4 modalities: RGB video, depth map sequences, 3D skeletal data, and infrared video.
The CIFAR-10 dataset (Canadian Institute For Advanced Research) is a collection of images that are commonly used to train machine learning and computer vision algorithms. It is one of the most widely used datasets for machine learning research. [1] [2] The CIFAR-10 dataset contains 60,000 32x32 color images in 10 different classes. [3]
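As a rough illustration of what those figures imply, the arithmetic can be sketched in a few lines. This is a back-of-the-envelope sketch, not part of the dataset's tooling: the 3 color channels at 1 byte each and the even split across classes are assumptions not stated in the snippet above, and all names are hypothetical.

```python
# Back-of-the-envelope size estimate for CIFAR-10 from the figures quoted above.
# Assumptions (not in the snippet): 3 color channels, 8 bits per channel,
# and an equal number of images in every class.
NUM_IMAGES = 60_000
HEIGHT, WIDTH = 32, 32
NUM_CLASSES = 10
CHANNELS = 3            # assumed: RGB
BYTES_PER_CHANNEL = 1   # assumed: 8-bit values

bytes_per_image = HEIGHT * WIDTH * CHANNELS * BYTES_PER_CHANNEL
total_bytes = NUM_IMAGES * bytes_per_image
images_per_class = NUM_IMAGES // NUM_CLASSES  # only valid if classes are balanced

print(f"bytes per image:  {bytes_per_image}")
print(f"total raw bytes:  {total_bytes}")
print(f"images per class: {images_per_class}")
```

Under these assumptions the uncompressed pixel data comes to roughly 184 MB, small enough to hold in memory on most machines.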
Previously, NIST released two datasets: Special Database 1 (NIST Test Data I, or SD-1) and Special Database 3 (or SD-3). They were released on two CD-ROMs. SD-1 was the test set, and it contained digits written by high school students: 58,646 images written by 500 different writers.
A training data set is a data set of examples used during the learning process and is used to fit the parameters (e.g., weights) of, for example, a classifier. [9] [10] For classification tasks, a supervised learning algorithm looks at the training data set to determine, or learn, the optimal combinations of variables that will generate a good predictive model. [11]
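To make "fitting the parameters" concrete, here is a minimal sketch of a perceptron, one simple kind of classifier, learning its weights from a tiny training set. The toy data (a logical AND), the function names, and the choice of the perceptron rule are all assumptions for illustration; the snippet above does not prescribe any particular algorithm.

```python
# Minimal perceptron sketch: the training set is used to fit weights and a bias.
# All names and the toy data below are hypothetical, chosen for illustration.

def predict(weights, bias, features):
    """Return class 1 if the weighted sum is positive, else class 0."""
    activation = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if activation > 0 else 0

def train_perceptron(training_set, epochs=20, learning_rate=1.0):
    """Fit weights and bias by repeatedly correcting mistakes on the training set."""
    weights = [0.0] * len(training_set[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in training_set:
            error = label - predict(weights, bias, features)
            # Nudge each parameter in the direction that reduces the error.
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias += learning_rate * error
    return weights, bias

# Toy training set: (features, label) pairs for a logical AND.
training_set = [
    ((0.0, 0.0), 0),
    ((0.0, 1.0), 0),
    ((1.0, 0.0), 0),
    ((1.0, 1.0), 1),
]

weights, bias = train_perceptron(training_set)
predictions = [predict(weights, bias, features) for features, _ in training_set]
print("learned weights:", weights, "bias:", bias)
print("predictions on training set:", predictions)
```

Because this toy data is linearly separable, the perceptron rule settles on parameters that classify every training example correctly. Real training sets are rarely this clean, which is why held-out data is normally used to judge the fitted model.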
Data Commons is an open-source platform [1] created by Google [2] that provides an open knowledge graph, combining economic, scientific and other public datasets into a unified view. [3] Ramanathan V. Guha, a creator of web standards including RDF, [4] RSS, and Schema.org, [5] founded the project, [6] which is now led by Prem Ramaswami.
The Pile is an 886.03 GB diverse, open-source dataset of English text created as a training dataset for large language models (LLMs). It was constructed by EleutherAI in 2020 and publicly released on December 31 of that year. [1] [2] It is composed of 22 smaller datasets, including 14 new ones. [1]