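6.3 Real world datasets
scikit-learn provides tools to load larger datasets, downloading them first if necessary.
These datasets can be loaded with the following functions: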
Function | Description |
---|---|
fetch_olivetti_faces([data_home, shuffle, …]) | Load the Olivetti faces data-set from AT&T (classification). |
fetch_20newsgroups([data_home, subset, …]) | Load the filenames and data from the 20 newsgroups dataset (classification). |
fetch_20newsgroups_vectorized([subset, …]) | Load the 20 newsgroups dataset and vectorize it into token counts (classification). |
fetch_lfw_people([data_home, funneled, …]) | Load the Labeled Faces in the Wild (LFW) people dataset (classification). |
fetch_lfw_pairs([subset, data_home, …]) | Load the Labeled Faces in the Wild (LFW) pairs dataset (classification). |
fetch_covtype([data_home, …]) | Load the covertype dataset (classification). |
fetch_rcv1([data_home, subset, …]) | Load the RCV1 multilabel dataset (classification). |
fetch_kddcup99([subset, data_home, shuffle, …]) | Load the kddcup99 dataset (classification). |
fetch_california_housing([data_home, …]) | Load the California housing dataset (regression). |
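Translator's note: as before, the descriptions of the individual datasets are left in their original English; for details, follow the links to the English documentation.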
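As a minimal sketch of how one of the fetchers listed above might be called, the example below wraps `fetch_california_housing`. The helper name `load_housing` and its `allow_download` argument are hypothetical and exist only for this illustration. `data_home` and `download_if_missing` are parameters of the fetcher itself. The sketch assumes scikit-learn is installed and treats any `OSError` (missing cache or failed download) as "dataset unavailable".

```python
from sklearn.datasets import fetch_california_housing


def load_housing(data_home=None, allow_download=True):
    """Return (X, y) for the California housing data, or None if unavailable.

    `load_housing` and `allow_download` are illustrative names, not part of
    scikit-learn. With data_home=None the fetcher uses its default cache
    directory.
    """
    try:
        bunch = fetch_california_housing(
            data_home=data_home,
            # When False, the fetcher raises OSError instead of downloading.
            download_if_missing=allow_download,
        )
    except OSError:
        # Raised when the data is not cached and downloading is disabled;
        # network errors from urllib are also OSError subclasses.
        return None
    # The returned Bunch exposes the feature matrix and the regression target.
    return bunch.data, bunch.target
```

Calling `load_housing()` downloads the data on first use and reads it from the local cache afterwards. Calling `load_housing(allow_download=False)` only succeeds if the data is already cached. The other `fetch_*` functions in the table take a `data_home` argument in the same way, though their remaining parameters differ.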