Datasets

  • gene expression cancer RNA-Seq

    Update frequency: irregular | Views: 927 | Downloads: 87
    This collection of data is part of the RNA-Seq (HiSeq) PANCAN data set; it is a random extraction of gene expressions of patients having different types of tumor.
  • NSF Research Award Abstracts 1990-2003

    Update frequency: irregular | Views: 1247 | Downloads: 153
    This data set consists of (a) 129,000 abstracts describing NSF awards for basic research, (b) bag-of-word data files extracted from the abstracts, (c) a list of words used for...
  • CalIt2 Building People Counts

    Update frequency: irregular | Views: 779 | Downloads: 83
    This data comes from the main door of the CalIt2 building at UCI.
  • Activity recognition with healthy older people using a batteryless wearable s...

    Update frequency: irregular | Views: 1259 | Downloads: 32
    Sequential motion data from 14 healthy older people aged 66 to 86 years old using a batteryless, wearable sensor on top of their clothing for the recognition of activities in...
  • Pioneer-1 Mobile Robot Data

    Update frequency: irregular | Views: 685 | Downloads: 56
    This dataset contains time series sensor readings of the Pioneer-1 mobile robot. The data is broken into "experiences" in which the robot takes action for some period of time...
  • BLOGGER

    Update frequency: irregular | Views: 718 | Downloads: 21
    In this paper, we seek to identify the reasons users are drawn to cyberspace in Kohkiloye and Boyer Ahmad Province in Iran.
  • Document Understanding

    Update frequency: irregular | Views: 1341 | Downloads: 180
    Five concepts, expressed as predicates, to be learned
  • Wearable Computing: Classification of Body Postures and Movements (PUC-Rio)

    Update frequency: irregular | Views: 624 | Downloads: 22
    A dataset with 5 classes (sitting-down, standing-up, standing, walking, and sitting) collected over 8 hours of activities of 4 healthy subjects. We also established a baseline...
  • Discrete Tone Image Dataset

    Update frequency: irregular | Views: 629 | Downloads: 22
    Discrete Tone Images (DTI) are available and need to be analyzed in detail. We created this dataset for those who do research on DTI.
  • UJIIndoorLoc

    Update frequency: irregular | Views: 673 | Downloads: 25
    UJIIndoorLoc is a multi-building, multi-floor indoor localization database for testing indoor positioning systems that rely on WLAN/WiFi fingerprints.
  • Spambase

    Update frequency: irregular | Views: 680 | Downloads: 34
    Classifying Email as Spam or Non-Spam
  • Undocumented

    Update frequency: irregular | Views: 674 | Downloads: 37
    Various datasets without documentation (feel free to explore!)
  • OCT data & Color Fundus Images of Left & Right Eyes

    Update frequency: irregular | Views: 780 | Downloads: 33
    This dataset contains OCT data (in mat format) and color fundus data (in jpg format) of left & right eyes of 50 healthy persons.
  • Activity Recognition system based on Multisensor data fusion (AReM)

    Update frequency: irregular | Views: 1172 | Downloads: 30
    This dataset contains temporal data from a Wireless Sensor Network worn by an actor performing the activities
  • Cardiotocography

    Update frequency: irregular | Views: 826 | Downloads: 24
    The dataset consists of measurements of fetal heart rate (FHR) and uterine contraction (UC) features on cardiotocograms classified by expert obstetricians.
  • Polish companies bankruptcy data

    Update frequency: irregular | Views: 717 | Downloads: 35
    The dataset is about bankruptcy prediction of Polish companies. The bankrupt companies were analyzed in the period 2000-2012, while the still operating companies were evaluated...
  • Multiple Features

    Update frequency: irregular | Views: 826 | Downloads: 111
    This dataset consists of features of handwritten numerals ('0'-'9') extracted from a collection of Dutch utility maps.
  • Dorothea

    Update frequency: irregular | Views: 866 | Downloads: 87
    DOROTHEA is a drug discovery dataset. Chemical compounds represented by structural molecular features must be classified as active (binding to thrombin) or inactive. This is one...
  • Climate Model Simulation Crashes

    Update frequency: irregular | Views: 630 | Downloads: 26
    Given Latin hypercube samples of 18 climate model input parameter values, predict climate model simulation crashes and determine the parameter value combinations that cause the...
  • IPUMS Census Database

    Update frequency: irregular | Views: 808 | Downloads: 115
    This data set contains unweighted PUMS census data from the Los Angeles and Long Beach areas for the years 1970, 1980, and 1990.
  • Sentiment Labelled Sentences

    Update frequency: irregular | Views: 574 | Downloads: 18
    The dataset contains sentences labelled with positive or negative sentiment.
  • DrivFace

    Update frequency: irregular | Views: 528 | Downloads: 5
    The DrivFace dataset contains image sequences of subjects while driving in real scenarios. It is composed of 606 samples of 640×480 pixels, acquired over different days from 4 drivers with...
  • Gas sensor array under flow modulation

    Update frequency: irregular | Views: 561 | Downloads: 20
    The data set contains 58 time series acquired from 16 chemical sensors under gas flow modulation conditions. The sensors were exposed to different gaseous binary mixtures of...
  • Appliances energy prediction

    Update frequency: irregular | Views: 726 | Downloads: 32
    Experimental data used to create regression models of appliance energy use in a low-energy building.
  • Nomao

    Update frequency: irregular | Views: 558 | Downloads: 12
    Nomao collects data about places (name, phone, localization...) from many sources. Deduplication consists of detecting which data refer to the same place. Instances in the...
  • Dishonest Internet users Dataset

    Update frequency: irregular | Views: 554 | Downloads: 13
    The dataset was used to test an architecture based on a trust model capable of evaluating the trustworthiness of users interacting in pervasive environments.
  • MiniBooNE particle identification

    Update frequency: irregular | Views: 552 | Downloads: 11
    This dataset is taken from the MiniBooNE experiment and is used to distinguish electron neutrinos (signal) from muon neutrinos (background).
  • BLE RSSI Dataset for Indoor localization and Navigation

    Update frequency: irregular | Views: 740 | Downloads: 35
    This dataset contains RSSI readings gathered from an array of Bluetooth Low Energy (BLE) iBeacons in a real-world and operational indoor environment for localization and...
  • Soybean (Small)

    Update frequency: irregular | Views: 1121 | Downloads: 113
    Michalski's famous soybean disease database
  • Predict keywords activities in a online social media

    Update frequency: irregular | Views: 546 | Downloads: 10
    The data was collected from Twitter over 360 consecutive days by querying 1497 English keywords sampled from Wikipedia. This dataset is proposed in a Learning to...
  • Arcene

    Update frequency: irregular | Views: 928 | Downloads: 66
    ARCENE's task is to distinguish cancer versus normal patterns from mass-spectrometric data. This is a two-class classification problem with continuous input variables. This...
  • Daily Demand Forecasting Orders

    Update frequency: irregular | Views: 1118 | Downloads: 23
    The dataset was collected over 60 days; it is a real database of a Brazilian logistics company.
  • Forest Fires

    Update frequency: irregular | Views: 578 | Downloads: 21
    This is a difficult regression task, where the aim is to predict the burned area of forest fires, in the northeast region of Portugal, by using meteorological and other data...
  • Condition monitoring of hydraulic systems

    Update frequency: irregular | Views: 539 | Downloads: 18
    The data set addresses the condition assessment of a hydraulic test rig based on multi-sensor data. Four fault types are superimposed with several severity grades impeding...
  • Diabetes

    Update frequency: irregular | Views: 584 | Downloads: 17
    This diabetes dataset is from AIM '94
  • Hayes-Roth

    Update frequency: irregular | Views: 545 | Downloads: 27
    Topic
  • Open University Learning Analytics dataset

    Update frequency: irregular | Views: 553 | Downloads: 13
    The Open University Learning Analytics Dataset contains data about courses, students and their interactions with the Virtual Learning Environment for seven selected courses and more...
  • Gesture Phase Segmentation

    Update frequency: irregular | Views: 590 | Downloads: 17
    The dataset is composed of features extracted from 7 videos of people gesticulating, aimed at studying Gesture Phase Segmentation. It contains 50 attributes divided into two...
  • DBWorld e-mails

    Update frequency: irregular | Views: 604 | Downloads: 13
    It contains 64 e-mails which I have manually collected from the DBWorld mailing list. They are classified in
  • Twin gas sensor arrays

    Update frequency: irregular | Views: 595 | Downloads: 20
    5 replicates of an 8-MOX gas sensor array were exposed to different gas conditions (4 volatiles at 10 concentration levels each).
  • Ultrasonic flowmeter diagnostics

    Update frequency: irregular | Views: 535 | Downloads: 10
    Fault diagnosis of four liquid ultrasonic flowmeters
  • Physicochemical Properties of Protein Tertiary Structure

    Update frequency: irregular | Views: 846 | Downloads: 13
    This is a data set of Physicochemical Properties of Protein Tertiary Structure. The data set is taken from CASP 5-9. There are 45730 decoys, with size varying from 0 to 21 angstroms.
  • Blood Transfusion Service Center

    Update frequency: irregular | Views: 553 | Downloads: 8
    Data taken from the Blood Transfusion Service Center in Hsin-Chu City in Taiwan -- this is a classification problem.
  • EMG dataset in Lower Limb

    Update frequency: irregular | Views: 714 | Downloads: 20
    3 different exercises
  • Activities of Daily Living (ADLs) Recognition Using Binary Sensors

    Update frequency: irregular | Views: 543 | Downloads: 12
    This dataset comprises information regarding the ADLs performed by two users on a daily basis in their own homes.
  • Tennis Major Tournament Match Statistics

    Update frequency: irregular | Views: 537 | Downloads: 11
    This is a collection of 8 files containing the match statistics for both women and men at the four major tennis tournaments of the year 2013. Each file has 42 columns and a...
  • Parkinson Disease Spiral Drawings Using Digitized Graphics Tablet

    Update frequency: irregular | Views: 702 | Downloads: 15
    The handwriting database consists of 62 PWP (People with Parkinson) and 15 healthy individuals. Three types of recordings (Static Spiral Test, Dynamic Spiral Test and Stability Test)...
  • PubChem Bioassay Data

    Update frequency: irregular | Views: 712 | Downloads: 10
    These highly imbalanced bioassay datasets are from the differing types of screening that can be performed using HTS technology. 21 datasets were created from 12 bioassays.
  • Auto MPG

    Update frequency: irregular | Views: 611 | Downloads: 30
    Revised from the CMU StatLib library; the data concerns city-cycle fuel consumption.
  • Function Finding

    Update frequency: irregular | Views: 547 | Downloads: 7
    Cases collected mostly from investigations in physical science; intention is to evaluate function-finding algorithms