Search results
Results from the WOW.Com Content Network
Covertype Dataset: Data for predicting forest cover type strictly from cartographic variables; many geographical features given; 581,012; Text; Classification; 1998; [310] [311]; J. Blackard et al. Abscisic Acid Signaling Network Dataset: Data for a plant signaling network. Goal is to determine set of rules that governs the network; None; 300; Text
The Pile is an 886.03 GB diverse, open-source dataset of English text created as a training dataset for large language models (LLMs). It was constructed by EleutherAI in 2020 and publicly released on December 31 of that year. [1] [2] It is composed of 22 smaller datasets, including 14 new ones. [1]
The following tree was constructed using JBoost on the spambase dataset [3] (available from the UCI Machine Learning Repository). [4] In this example, spam is coded as 1 and regular email is coded as −1. An ADTree for 6 iterations on the Spambase dataset. The following table contains part of the information for a single instance.
Various plots of the multivariate data set Iris flower data set introduced by Ronald Fisher (1936). [1] A data set (or dataset) is a collection of data. In the case of tabular data, a data set corresponds to one or more database tables, where every column of a table represents a particular variable, and each row corresponds to a given record of the data set in question.
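The row and column convention described above can be sketched in a few lines of Python. This is only an illustration: the column names and values are made up and are not taken from the Iris data set, and `column` is a hypothetical helper.

```python
# A tiny tabular data set: each dict is one record (row),
# each key is one variable (column). All values are illustrative only.
records = [
    {"length_cm": 5.0, "width_cm": 3.0, "label": "a"},
    {"length_cm": 6.5, "width_cm": 2.8, "label": "b"},
    {"length_cm": 4.7, "width_cm": 3.2, "label": "a"},
]


def column(rows, name):
    """Return the values of one variable across all records."""
    return [row[name] for row in rows]


variables = list(records[0].keys())   # the columns of the table
n_records = len(records)              # the rows of the table
lengths = column(records, "length_cm")
```

Reading down one key gives a variable; reading one dict gives a record.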
The CIFAR-10 dataset (Canadian Institute For Advanced Research) is a collection of images that are commonly used to train machine learning and computer vision algorithms. It is one of the most widely used datasets for machine learning research. [1] [2] The CIFAR-10 dataset contains 60,000 32x32 color images in 10 different classes. [3]
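As a rough sketch of what those figures imply for storage, the arithmetic below works out the array shape and raw size. It assumes three 8-bit color channels per pixel and an even split across classes; neither assumption is stated in the snippet above.

```python
# Figures from the description above.
n_images, height, width = 60_000, 32, 32
n_classes = 10

# Assumption: "color" means 3 channels (RGB) at 1 byte per channel.
channels = 3

shape = (n_images, height, width, channels)
bytes_per_image = height * width * channels
total_bytes = n_images * bytes_per_image

# Assumption: classes are the same size.
images_per_class = n_images // n_classes
```

Under these assumptions the uncompressed pixel data comes to about 184 MB.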
The standard 78-edge network data set for Zachary's karate club is publicly available on the internet. [3] The data can be summarized as a list of integer pairs. Each integer represents one karate club member and a pair indicates the two members interacted. The data set is summarized below and also in the adjoining image. Node 1 stands for the ...
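A list of integer pairs like this maps directly onto an undirected graph. The sketch below shows one way to turn such pairs into an adjacency structure; the four sample pairs are hypothetical and are not the real 78-edge data, and `build_adjacency` is an illustrative helper name.

```python
from collections import defaultdict

# Hypothetical sample pairs for illustration, NOT the actual karate club data.
edges = [(1, 2), (1, 3), (2, 3), (3, 4)]


def build_adjacency(pairs):
    """Map each member to the set of members they interacted with."""
    adjacency = defaultdict(set)
    for a, b in pairs:
        # An interaction is symmetric, so record it in both directions.
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


adjacency = build_adjacency(edges)
# Degree = number of distinct members each member interacted with.
degree = {node: len(neighbours) for node, neighbours in adjacency.items()}
```

Using sets means a pair listed twice does not inflate a member's degree.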
In version 3.7.2, a package manager was added to allow the easier installation of extension packages. [6] Some functionality that used to be included with Weka prior to this version has since been moved into such extension packages, but this change also makes it easier for others to contribute extensions to Weka and to maintain the software, as this modular architecture allows independent ...
server and repository for protein structure models (Protein model databases). AAindex: database of amino acid indices, amino acid mutation matrices, and pair-wise contact potentials (Protein model databases). BioGRID (Samuel Lunenfeld Research Institute): general repository for interaction datasets (Protein-protein and other molecular interactions).