Found 31 datasets

Format: HTML

Filtered results
  • Syskill and Webert Web Page Ratings

    This database contains the HTML source of web pages plus the ratings of a single user on these web pages. Web pages are on four separate subjects (Bands- recording artists; Goats;...
  • Reuters-21578 Text Categorization Collection

    This is a collection of documents that appeared on Reuters newswire in 1987. The documents were assembled and indexed with categories.
  • KDD Cup 1998 Data

    This is the data set used for The Second International Knowledge Discovery and Data Mining Tools Competition, which was held in conjunction with KDD-98.
  • Movie

    This data set contains a list of over 10,000 films, including many older, odd, and cult films. There is information on actors, casts, directors, producers, studios, etc.
  • EEG Database

    This data arises from a large study to examine EEG correlates of genetic predisposition to alcoholism. It contains measurements from 64 electrodes placed on the scalp sampled at...
  • E. Coli Genes

    Data giving characteristics of each ORF (potential gene) in the E. coli genome. Sequence, homology (similarity to other genes) and structural information, and function (if...
  • Corel Image Features

    This dataset contains image features extracted from a Corel image collection. Four sets of features are available based on the color histogram, color histogram layout, color...
  • KDD Cup 1999 Data

    This is the data set used for The Third International Knowledge Discovery and Data Mining Tools Competition, which was held in conjunction with KDD-99.
  • Internet Usage Data

    This data contains general demographic information on internet users in 1997.
  • Pseudo Periodic Synthetic Time Series

    This data set is designed for testing indexing schemes in time series databases. The data appears highly periodic, but never exactly repeats itself.
  • Coil 1999 Competition Data

    This data set is from the 1999 Computational Intelligence and Learning (COIL) competition. The data contains measurements of river chemical concentrations and algae densities.
  • MSNBC.com Anonymous Web Data

    This data describes the page visits of users who visited msnbc.com on September 28, 1999. Visits are recorded at the level of URL category (see description) and are recorded in...
  • Synthetic Control Chart Time Series

    This data consists of synthetically generated control charts.
  • Twenty Newsgroups

    This data set consists of 20,000 messages taken from 20 newsgroups.
  • Census-Income (KDD)

    This data set contains weighted census data extracted from the 1994 and 1995 Current Population Surveys conducted by the U.S. Census Bureau.
  • NSF Research Award Abstracts 1990-2003

    This data set consists of (a) 129,000 abstracts describing NSF awards for basic research, (b) bag-of-word data files extracted from the abstracts, (c) a list of words used for...
  • Pioneer-1 Mobile Robot Data

    This dataset contains time series sensor readings of the Pioneer-1 mobile robot. The data is broken into "experiences" in which the robot takes action for some period of time...
  • IPUMS Census Database

    This data set contains unweighted PUMS census data from the Los Angeles and Long Beach areas for the years 1970, 1980, and 1990.
  • Japanese Vowels

    This dataset records 640 time series of 12 LPC cepstrum coefficients taken from nine male speakers.
  • Australian Sign Language signs

    This data consists of samples of Auslan (Australian Sign Language) signs. Examples of 95 signs were collected from five signers, for a total of 6650 sign samples.