80 high-resolution aerial images with spatial resolution ranging from 0.3 to 1.0 m, manually segmented. 80 images; aerial classification and object detection; created 2013. [156] [157] J. Yuan et al. KIT AIS Data Set: multiple labeled training and evaluation datasets of aerial images of crowds, with images manually labeled to show paths of individuals through ...
PANGAEA - Data Publisher for Earth & Environmental Science is a digital data library and a data publisher for earth system science. Data can be georeferenced in time (date/time or geological age) and space (latitude, longitude, depth/height). Scientific data are archived with related metainformation in a relational database through an editorial ...
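To illustrate the georeferencing described above, here is a minimal sketch of the kind of record PANGAEA serves, loaded as a tab-delimited table with pandas. All column names and values are invented for illustration and are not taken from a real PANGAEA dataset.

```python
# Illustrative sketch: a georeferenced record tied to time (date/time) and
# space (latitude, longitude, depth). Columns and values are made up.
import io
import pandas as pd

tab_delimited = (
    "Date/Time\tLatitude\tLongitude\tDepth water [m]\tTemp [deg C]\n"
    "2013-06-01T12:00\t78.9200\t11.9300\t10\t3.4\n"
    "2013-06-01T12:00\t78.9200\t11.9300\t50\t2.1\n"
)

df = pd.read_csv(io.StringIO(tab_delimited), sep="\t", parse_dates=["Date/Time"])
print(df.dtypes)
print(df)
```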
A 3.1 TB dataset consisting of permissively licensed source code in 30 programming languages, filtered through license detection and deduplication. 6 TB and 51.76B files prior to deduplication; 3 TB and 5.28B files after; 358 programming languages overall. Distributed as Parquet. Language modeling, autocompletion, program synthesis. 2022. [402] [403]
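The entry above describes a multi-terabyte source-code corpus distributed as Parquet files. The sketch below shows one way such shards can be streamed for language-modeling preprocessing without loading them fully into memory; the shard path and the "content" column name are assumptions, not details taken from the entry.

```python
# Sketch: stream source-code strings out of one Parquet shard of a large
# code corpus. Shard path and column name are assumed placeholders.
import pyarrow.parquet as pq

def iter_source_files(shard_path, content_column="content", batch_size=1024):
    """Yield source-code strings from one Parquet shard, batch by batch."""
    pf = pq.ParquetFile(shard_path)
    for batch in pf.iter_batches(batch_size=batch_size, columns=[content_column]):
        # Only one column was requested, so it is at index 0.
        for value in batch.column(0):
            yield value.as_py()

if __name__ == "__main__":
    # Hypothetical local shard name; large corpora ship as many such files.
    for i, code in enumerate(iter_source_files("data/python/train-00000-of-00206.parquet")):
        if i >= 3:
            break
        print(code[:80])
```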
Kaggle is a data science competition platform and online community for data scientists and machine learning practitioners under Google LLC. Kaggle enables users to find and publish datasets, explore and build models in a web-based data science environment, work with other data scientists and machine learning engineers, and enter competitions to solve data science challenges.
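Since Kaggle is described above as a place to find and publish datasets, here is a hedged sketch of downloading one with the official kaggle Python package. It assumes API credentials already exist in ~/.kaggle/kaggle.json, and the owner/dataset slug is a hypothetical placeholder.

```python
# Sketch: download a Kaggle dataset with the official `kaggle` package.
# Requires an API token in ~/.kaggle/kaggle.json; the slug is a placeholder.
from kaggle.api.kaggle_api_extended import KaggleApi

api = KaggleApi()
api.authenticate()  # reads credentials from ~/.kaggle/kaggle.json or env vars
api.dataset_download_files(
    "some-owner/some-dataset",  # hypothetical <owner>/<dataset-slug>
    path="data/",
    unzip=True,
)
```

The same operation is available from the command line as `kaggle datasets download -d <owner>/<dataset-slug>`.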
Common Crawl is a nonprofit 501(c)(3) organization that crawls the web and freely provides its archives and datasets to the public. [1] [2] Common Crawl's web archive consists of petabytes of data collected since 2008. [3]
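Common Crawl's archives are usually accessed by first querying a per-crawl CDX index for a URL and then fetching only the matching WARC record with an HTTP range request. The sketch below follows that pattern; the crawl label CC-MAIN-2024-10 and the looked-up URL are assumptions, so substitute a current crawl and a URL of interest.

```python
# Sketch: look up one URL in a Common Crawl CDX index, then fetch only the
# matching WARC record via an HTTP range request and parse it with warcio.
# The crawl label and target URL are placeholders.
import io
import json
import requests
from warcio.archiveiterator import ArchiveIterator

INDEX = "https://index.commoncrawl.org/CC-MAIN-2024-10-index"  # assumed crawl label
resp = requests.get(INDEX, params={"url": "example.com", "output": "json"}, timeout=60)
resp.raise_for_status()
record_info = json.loads(resp.text.splitlines()[0])  # first index hit

offset = int(record_info["offset"])
length = int(record_info["length"])
warc_url = "https://data.commoncrawl.org/" + record_info["filename"]
byte_range = f"bytes={offset}-{offset + length - 1}"
warc_resp = requests.get(warc_url, headers={"Range": byte_range}, timeout=60)

for record in ArchiveIterator(io.BytesIO(warc_resp.content)):
    if record.rec_type == "response":
        print(record.rec_headers.get_header("WARC-Target-URI"))
        print(record.content_stream().read()[:200])
```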
re3data.org is a global registry of research data repositories from all academic disciplines. It provides an overview of existing research data repositories to help researchers identify a suitable repository for their data and thus comply with requirements set out in data policies. [1] [2] The registry went live in autumn 2012. [3]
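re3data also exposes the registry programmatically; the sketch below lists a few repository names via its public REST API. The /api/v1/repositories endpoint and the XML element names are assumptions drawn from the API documentation rather than from the description above, so verify them before relying on this.

```python
# Hedged sketch: list a few repositories from the re3data registry API.
# Endpoint path and XML element names are assumptions; verify against the
# live API documentation.
import xml.etree.ElementTree as ET
import requests

resp = requests.get("https://www.re3data.org/api/v1/repositories", timeout=30)
resp.raise_for_status()
root = ET.fromstring(resp.content)

count = 0
for repo in root.iter():
    if repo.tag.endswith("repository"):
        # Tolerate minor schema differences by matching on the tag suffix.
        name = next((c.text for c in repo if c.tag.endswith("name")), None)
        if name:
            print(name)
            count += 1
    if count >= 5:
        break
```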
DVC is a free and open-source, platform-agnostic version control system for data, machine learning models, and experiments. [1] It is designed to make ML models shareable, experiments reproducible, [2] and to track versions of models, data, and pipelines. [3] [4] [5] DVC works on top of Git repositories [6] and cloud storage. [7]
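As a concrete illustration of DVC sitting on top of Git and cloud storage, the sketch below reads a DVC-tracked file directly from a Git repository with the dvc.api Python module; the repository URL, file path, and revision are hypothetical placeholders, not an existing project.

```python
# Sketch: read a DVC-tracked file straight out of a Git repository.
# DVC keeps small .dvc pointer files in Git and pushes the actual data to a
# remote (e.g. cloud storage); dvc.api resolves the pointer and returns the
# data. Repo URL, path, and revision below are hypothetical.
import dvc.api

text = dvc.api.read(
    "data/train.csv",                               # path tracked by DVC in that repo
    repo="https://github.com/example/ml-project",   # Git repo containing the .dvc files
    rev="v1.0",                                     # Git tag/branch/commit pinning the data version
)
print(text[:200])
```

Tracking new data locally follows the same split: `dvc add data/train.csv` writes the small pointer file that Git versions, and `dvc push` uploads the actual content to the configured remote.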