enow.com Web Search

Search results

  1. Results from the WOW.Com Content Network
  2. List of datasets for machine-learning research - Wikipedia

    en.wikipedia.org/wiki/List_of_datasets_for...

    A dataset for NLP and climate change media researchers The dataset is made up of a number of data artifacts (JSON, JSONL & CSV text files & SQLite database) Climate news DB, Project's GitHub repository [394] ADGEfficiency Climatext Climatext is a dataset for sentence-based climate change topic detection. HF dataset [395] University of Zurich ...

  3. List of datasets in computer vision and image processing

    en.wikipedia.org/wiki/List_of_datasets_in...

    The dataset is labeled with semantic labels for 32 semantic classes. over 700 images Images Object recognition and classification 2008 [56] [57] [58] Gabriel J. Brostow, Jamie Shotton, Julien Fauqueur, Roberto Cipolla RailSem19 RailSem19 is a dataset for understanding scenes for vision systems on railways. The dataset is labeled semanticly and ...

  4. GitHub - Wikipedia

    en.wikipedia.org/wiki/Github

    GitHub (/ ˈ ɡ ɪ t h ʌ b /) is a proprietary developer platform that allows developers to create, store, manage, and share their code. It uses Git to provide distributed version control and GitHub itself provides access control, bug tracking, software feature requests, task management, continuous integration, and wikis for every project. [8]

  5. HuffPost Data

    data.huffingtonpost.com

    HuffPost Data Visualization, analysis, interactive maps and real-time graphics. Browse, copy and fork our open-source software.; Remix thousands of aggregated polling results.

  6. Enron Corpus - Wikipedia

    en.wikipedia.org/wiki/Enron_Corpus

    The Enron Corpus is a database of over 600,000 emails generated by 158 employees [1] of the Enron Corporation in the years leading up to the company's collapse in December 2001. The corpus was generated from Enron email servers by the Federal Energy Regulatory Commission (FERC) during its subsequent investigation. [ 2 ]

  7. MNIST database - Wikipedia

    en.wikipedia.org/wiki/MNIST_database

    Extended MNIST (EMNIST) is a newer dataset developed and released by NIST to be the (final) successor to MNIST. [ 15 ] [ 16 ] MNIST included images only of handwritten digits. EMNIST includes all the images from NIST Special Database 19 (SD 19), which is a large database of 814,255 handwritten uppercase and lower case letters and digits.

  8. The Pile (dataset) - Wikipedia

    en.wikipedia.org/wiki/The_Pile_(dataset)

    The Pile is an 886.03 GB diverse, open-source dataset of English text created as a training dataset for large language models (LLMs). It was constructed by EleutherAI in 2020 and publicly released on December 31 of that year. [1] [2] It is composed of 22 smaller datasets, including 14 new ones. [1]

  9. ScyllaDB - Wikipedia

    en.wikipedia.org/wiki/ScyllaDB

    Employees of ScyllaDB Inc. remain the primary coders behind Scylla, but its development is open to the public and uses public GitHub repositories and public mailing lists. References [ edit ]