• Dorothea

    Update frequency: irregular | Views: 916 | Downloads: 88
    DOROTHEA is a drug discovery dataset. Chemical compounds represented by structural molecular features must be classified as active (binding to thrombin) or inactive. This is one...
  • Arcene

    Update frequency: irregular | Views: 996 | Downloads: 72
    ARCENE's task is to distinguish cancer versus normal patterns from mass-spectrometric data. This is a two-class classification problem with continuous input variables. This...
  • Madelon

    Update frequency: irregular | Views: 758 | Downloads: 21
    MADELON is an artificial dataset, which was part of the NIPS 2003 feature selection challenge. This is a two-class classification problem with continuous input variables. The...
  • Demospongiae

    Update frequency: irregular | Views: 820 | Downloads: 53
    Marine sponges of the Demospongiae class classification domain.
  • Dexter

    Update frequency: irregular | Views: 739 | Downloads: 52
    DEXTER is a text classification problem in a bag-of-words representation. This is a two-class classification problem with sparse continuous input variables. This dataset is one...
  • PAMAP2 Physical Activity Monitoring

    Update frequency: irregular | Views: 945 | Downloads: 41
    The PAMAP2 Physical Activity Monitoring dataset contains data of 18 different physical activities, performed by 9 subjects wearing 3 inertial measurement units and a heart rate...
  • Victorian Era Authorship Attribution

    Update frequency: irregular | Views: 598 | Downloads: 14
    To create the largest authorship attribution dataset, we extracted works of 50 well-known authors. To have a non-exhaustive learning, in training there are 45 authors whereas,...
  • OpinRank Review Dataset

    Update frequency: irregular | Views: 691 | Downloads: 22
    This data set contains user reviews of cars and hotels collected from Tripadvisor (~259,000 reviews) and Edmunds (~42,230 reviews).
  • Folio

    Update frequency: irregular | Views: 636 | Downloads: 25
    20 photos of leaves for each of 32 different species.
  • AutoUniv

    Update frequency: irregular | Views: 593 | Downloads: 19
    AutoUniv is an advanced data generator for classification tasks. The aim is to reflect the nuances and heterogeneity of real data. Data can be generated in .csv, ARFF or C4.5...
  • Gisette

    Update frequency: irregular | Views: 2085 | Downloads: 67
    GISETTE is a handwritten digit recognition problem. The problem is to separate the highly confusible digits '4' and '9'. This dataset is one of five datasets of the NIPS 2003...