• Daily and Sports Activities

    Update frequency: Irregular
    The dataset comprises motion sensor data of 19 daily and sports activities, each performed by 8 subjects in their own style for 5 minutes. Five Xsens MTx units are used on the...
  • Restaurant & consumer data

    Update frequency: Irregular
    The dataset was obtained from a recommender system prototype. The task was to generate a top-n list of restaurants according to the consumer preferences.
  • Absenteeism at work

    Update frequency: Irregular
    The database was created with records of absenteeism at work from July 2007 to July 2010 at a courier company in Brazil.
  • Dynamic Features of VirusShare Executables

    Update frequency: Irregular
    This dataset contains the dynamic features of 107,888 executables, collected by VirusShare from Nov/2010 to Jul/2014.
  • Record Linkage Comparison Patterns

    Update frequency: Irregular
    Element-wise comparison of records with personal data from a record linkage setting. The task is to decide from a comparison pattern whether the underlying records belong to one...
  • Smartphone-Based Recognition of Human Activities and Postural Transitions

    Update frequency: Irregular
    Activity recognition data set built from the recordings of 30 subjects performing basic activities and postural transitions while carrying a waist-mounted smartphone with...
  • YouTube Comedy Slam Preference Data

    Update frequency: Irregular
    This dataset provides user votes, collected on YouTube Comedy Slam, on which video in a pair is funnier. The task is to automatically predict this preference based...
  • HCC Survival

    Update frequency: Irregular
    The Hepatocellular Carcinoma dataset (HCC dataset) was collected at a university hospital in Portugal. It contains real clinical data of 165 patients diagnosed with HCC.
  • Bank Marketing

    Update frequency: Irregular
    The data relate to direct marketing campaigns (phone calls) of a Portuguese banking institution. The classification goal is to predict if the client will subscribe to a term...
  • DeliciousMIL: A Data Set for Multi-Label Multi-Instance Learning with Instanc...

    Update frequency: Irregular
    This dataset includes 1) 12234 documents (8251 training, 3983 test) extracted from the DeliciousT140 dataset, 2) class labels for all documents, 3) labels for a subset of sentences...
  • Urban Land Cover

    Update frequency: Irregular
    Classification of urban land cover using high-resolution aerial imagery. Intended to assist sustainable urban planning efforts.
  • Condition Based Maintenance of Naval Propulsion Plants

    Update frequency: Irregular
    Data have been generated from a sophisticated simulator of a gas turbine (GT) mounted on a frigate characterized by a COmbined Diesel eLectric And Gas (CODLAG) propulsion...
  • Internet Advertisements

    Update frequency: Irregular
    This dataset represents a set of possible advertisements on Internet pages.
  • User Identification From Walking Activity

    Update frequency: Irregular
    The dataset collects data from an Android smartphone positioned in the chest pocket of each of 22 participants walking in the wild over a predefined path.
  • NYSK

    Update frequency: Irregular
    NYSK (New York v. Strauss-Kahn) is a collection of English news articles about the case relating to allegations of sexual assault against the former IMF director Dominique...
  • Sports articles for objectivity analysis

    Update frequency: Irregular
    1000 sports articles were labeled using Amazon Mechanical Turk as objective or subjective. The raw texts, extracted features, and the URLs from which the articles were retrieved...
  • AutoUniv

    Update frequency: Irregular
    AutoUniv is an advanced data generator for classification tasks. The aim is to reflect the nuances and heterogeneity of real data. Data can be generated in .csv, ARFF or C4.5...
  • ElectricityLoadDiagrams20112014

    Update frequency: Irregular
    This data set contains electricity consumption of 370 points/clients.
  • Crowdsourced Mapping

    Update frequency: Irregular
    Crowdsourced data from OpenStreetMap is used to automate the classification of satellite images into different land cover classes (impervious, farm, forest, grass, orchard, water).
  • IDA2016Challenge

    Update frequency: Irregular
    The dataset consists of data collected from heavy Scania trucks in everyday usage.