• gene expression cancer RNA-Seq

    Update frequency: irregular
    This collection of data is part of the RNA-Seq (HiSeq) PANCAN data set. It is a random extraction of gene expressions of patients with different types of tumor.
  • Multiple Features

    Update frequency: irregular
    This dataset consists of features of handwritten numerals ('0'-'9') extracted from a collection of Dutch utility maps.
  • Predict keywords activities in a online social media

    Update frequency: irregular
    The Twitter data was collected over 360 consecutive days by querying 1497 English keywords sampled from Wikipedia. This dataset is proposed in a Learning to...
  • Mobile Robots

    Update frequency: irregular
    Learning concepts from sensor data of a mobile robot; set of data sets
  • Australian Sign Language signs

    Update frequency: irregular
    This data consists of samples of Auslan (Australian Sign Language) signs. Examples of 95 signs were collected from five signers, for a total of 6650 sign samples.
  • Entree Chicago Recommendation Data

    Update frequency: irregular
    This data contains a record of user interactions with the Entree Chicago restaurant recommendation system.
  • CMU Face Images

    Update frequency: irregular
    This data consists of 640 black and white face images of people taken with varying pose (straight, left, right, up), expression (neutral, happy, sad, angry), eyes (wearing...
  • First-order theorem proving

    Update frequency: irregular
    Given a theorem, predict which of five heuristics will give the fastest proof when used by a first-order prover. A sixth prediction declines to attempt a proof, should the...
  • URL Reputation

    Update frequency: irregular
    Anonymized 120-day subset of the ICML-09 URL data containing 2.4 million examples and 3.2 million features.
  • Insurance Company Benchmark (COIL 2000)

    Update frequency: irregular
    This data set used in the CoIL 2000 Challenge contains information on customers of an insurance company. The data consists of 86 variables and includes product usage data and...
  • Australian Sign Language signs (High Quality)

    Update frequency: irregular
    This data consists of samples of Auslan (Australian Sign Language) signs. 27 examples of each of 95 Auslan signs were captured from a native signer using high-quality position...
  • Reuters RCV1 RCV2 Multilingual, Multiview Text Categorization Test collection

    Update frequency: irregular
    This test collection contains feature characteristics of documents originally written in five different languages and their translations, over a common set of 6 categories.
  • UNIX User Data

    Update frequency: irregular
    This file contains 9 sets of sanitized user data drawn from the command histories of 8 UNIX computer users at Purdue over the course of up to 2 years.
  • Buzz in social media

    Update frequency: irregular
    This data-set contains examples of buzz events from two different social networks
  • Volcanoes on Venus - JARtool experiment

    Update frequency: irregular
    The JARtool project was a pioneering effort to develop an automatic system for cataloging small volcanoes in the large set of Venus images returned by the Magellan spacecraft.
  • KASANDR

    Update frequency: irregular
    KASANDR is a novel, publicly available collection for recommendation systems that records the behavior of customers of the European leader in e-Commerce advertising, Kelkoo.
  • Census-Income (KDD)

    Update frequency: irregular
    This data set contains weighted census data extracted from the 1994 and 1995 current population surveys conducted by the U.S. Census Bureau.
  • Twenty Newsgroups

    Update frequency: irregular
    This data set consists of 20000 messages taken from 20 newsgroups.
  • Amazon Access Samples

    Update frequency: irregular
    Amazon's InfoSec is getting smarter about the way Access data is leveraged. This is an anonymized sample of access provisioned within the company.
  • EEG Database

    Update frequency: irregular
    This data arises from a large study to examine EEG correlates of genetic predisposition to alcoholism. It contains measurements from 64 electrodes placed on the scalp sampled at...