Search results
The datasets are classified by license as open data or non-open data. Datasets from various governmental bodies are presented in List of open government data sites. The datasets are hosted on open data portals, where they can be searched, deposited, and accessed through interfaces such as Open API. The datasets are ...
RAWPED is a dataset for detection of pedestrians in the context of railways. The dataset is labeled box-wise. Instances: 26,000 images. Task: object recognition and classification. Year: 2020. [70] [71] Creators: Tugce Toprak, Burak Belenlioglu, Burak Aydın, Cuneyt Guzelis, M. Alper Selver. OSDaR23 is a multi-sensory dataset for detection of objects in the context of railways.
Kaggle is a data science competition platform and online community for data scientists and machine learning practitioners under Google LLC. Kaggle enables users to find and publish datasets, explore and build models in a web-based data science environment, work with other data scientists and machine learning engineers, and enter competitions to solve data science challenges.
Dataset from NASA's Socioeconomic Data and Applications Center that includes raw population and population density figures (historic, current, and predicted). Global Rural-Urban Mapping Project (GRUMP): dataset from NASA's Socioeconomic Data and Applications Center, based on the above data but also including information on rural and urban population balances.
Zenodo is a general-purpose open repository developed under the European OpenAIRE program and operated by CERN. [1] [2] [3] It allows researchers to deposit research papers, data sets, research software, reports, and any other research-related digital artefacts.
Data Commons is an open-source platform [1] created by Google [2] that provides an open knowledge graph, combining economic, scientific and other public datasets into a unified view. [3] Ramanathan V. Guha, a creator of web standards including RDF, [4] RSS, and Schema.org, [5] founded the project, [6] which is now led by Prem Ramaswami. [7]
AMiner has published several datasets for academic research purposes, including Open Academic Graph, [6] DBLP+citation [7] (a data set that adds citations to the DBLP data from the Digital Bibliography & Library Project), Name Disambiguation, [8] and Social Tie Analysis. [9] For more available datasets and source code for research, see [10].
The Pile is an 886.03 GB diverse, open-source dataset of English text created as a training dataset for large language models (LLMs). It was constructed by EleutherAI in 2020 and publicly released on December 31 of that year. [1] [2] It is composed of 22 smaller datasets, including 14 new ones. [1]