Search results
Covertype Dataset: data for predicting forest cover type strictly from cartographic variables. Preprocessing: many geographical features given. Instances: 581,012. Format: text. Default task: classification. Created: 1998. References: [310] [311]. Creator: J. Blackard et al.
Abscisic Acid Signaling Network Dataset: data for a plant signaling network; the goal is to determine the set of rules that governs the network. Preprocessing: none. Instances: 300. Format: text.
The Pile is an 886.03 GB diverse, open-source dataset of English text created as a training dataset for large language models (LLMs). It was constructed by EleutherAI in 2020 and publicly released on December 31 of that year. [1] [2] It is composed of 22 smaller datasets, including 14 new ones. [1]
The Donald Bren School of Information and Computer Sciences, also known colloquially as UCI's School of ICS or simply the Bren School, is an academic unit of the University of California, Irvine (UCI), and the only dedicated school of computer science in the University of California system.
The CIFAR-10 dataset (Canadian Institute For Advanced Research) is a collection of images that are commonly used to train machine learning and computer vision algorithms. It is one of the most widely used datasets for machine learning research. [1] [2] The CIFAR-10 dataset contains 60,000 32x32 color images in 10 different classes. [3]
Record linkage (also known as data matching, data linkage, entity resolution, and many other terms) is the task of finding records in a data set that refer to the same entity across different data sources (e.g., data files, books, websites, and databases).
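As a rough illustration of the record linkage task described above, the sketch below matches names across two small sources by string similarity. It is only one simple approach, not a standard algorithm from the source: the function names (`normalize`, `similarity`, `link_records`), the 0.85 threshold, and the sample records are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase the name and replace punctuation with spaces, then collapse whitespace."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in name.lower())
    return " ".join(cleaned.split())


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def link_records(source_a, source_b, threshold=0.85):
    """For each record in source_a, keep its best match in source_b if the score reaches the threshold.

    The threshold is a hypothetical choice; real systems tune it on labeled pairs.
    """
    links = []
    for id_a, name_a in source_a.items():
        best_id, best_score = None, 0.0
        for id_b, name_b in source_b.items():
            score = similarity(name_a, name_b)
            if score > best_score:
                best_id, best_score = id_b, score
        if best_id is not None and best_score >= threshold:
            links.append((id_a, best_id, best_score))
    return links


# Hypothetical data: the same people recorded slightly differently in two sources.
source_a = {"a1": "Jane Q. Doe", "a2": "Jon Smith", "a3": "Wei Chen"}
source_b = {"b1": "jane q doe", "b2": "John Smith", "b3": "Maria Garcia"}

links = link_records(source_a, source_b)
for id_a, id_b, score in links:
    print(f"{id_a} <-> {id_b} (similarity {score:.2f})")
```

Comparing every pair is quadratic in the number of records, so a larger job would normally add a blocking step (for example, comparing only records that share a postcode or initial) before scoring.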
The Iris flower data set or Fisher's Iris data set is a multivariate data set used and made famous by the British statistician and biologist Ronald Fisher in his 1936 paper The use of multiple measurements in taxonomic problems as an example of linear discriminant analysis. [1]
For example, my daughter wrote in her homework, "I went to the osen," rather than "I went to the ocean." The teacher hadn't corrected the mistake because the emphasis was on visual cues — a ...
The USGS Gap Analysis Program maintains four primary data sets: land cover, protected areas, species, and aquatic. The GAP Land Cover Data Set is the most complete map ever produced of vegetative associations for the US. It is classified into 551 ecological systems and 32 modified ecological systems (where human impacts have had an effect).