TTC-3600: Benchmark dataset for Turkish text categorization

The TTC-3600 data set is a collection of 3,600 categorized Turkish news items and articles drawn from 6 well-known portals in Turkey. It is provided in 4 different forms, all in Weka ARFF format.
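Since the files are in ARFF format, a rough sketch of how a dense ARFF file is laid out and read may help. The reader below is a minimal illustration only: the function name `read_dense_arff`, the attribute names, and the sample rows are all hypothetical and not taken from TTC-3600, and it assumes dense rows and attribute names without spaces or quoting. Real files may use features it does not handle (for example sparse rows), so an established ARFF reader is preferable in practice.

```python
import io

def read_dense_arff(stream):
    """Parse a small dense ARFF stream into (attribute_names, rows).

    Hypothetical helper for illustration: handles only unquoted attribute
    names and comma-separated dense data rows.
    """
    names, rows, in_data = [], [], False
    for raw in stream:
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        lower = line.lower()
        if not in_data:
            if lower.startswith("@attribute"):
                # "@attribute <name> <type>": keep only the name
                names.append(line.split(None, 2)[1])
            elif lower.startswith("@data"):
                in_data = True  # everything after this line is data
            continue
        rows.append([value.strip() for value in line.split(",")])
    return names, rows

# Hypothetical miniature sample; the terms and labels are made up.
sample = io.StringIO(
    "@relation demo\n"
    "@attribute term_a numeric\n"
    "@attribute term_b numeric\n"
    "@attribute class {ekonomi,spor}\n"
    "@data\n"
    "2,0,ekonomi\n"
    "0,3,spor\n"
)
names, rows = read_dense_arff(sample)
print(names)
print(rows)
```

Values are returned as strings; converting the integer term counts and separating out the class column is left to the caller.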

Data and Resources

Additional Information

Field
Author UCI Machine Learning Repository
Last Updated October 17, 2018, 14:35 (CST)
Created September 7, 2018, 10:23 (CST)
Area "Computer"
Associated Tasks "Classification"
Attribute Characteristics "Integer"
Data Set Characteristics "Text"
Date Donated "2017-02-08"
Missing Values "N/A"
Number of Instances "3600"
Number of Web Hits "5755"
Number of Attributes "4814"