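Model evaluation: cross-validation
K-fold cross-validation: split the samples into K folds of roughly equal size. Use one fold as the test set and the remaining K-1 folds as the training set, which yields a model, a set of predictions, and an evaluation score. Repeat this K times so that each fold serves as the test set exactly once. This gives K evaluation scores, and their mean is the final evaluation result.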
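The procedure described above can be written out by hand, which may make clearer what cross_val_score does internally. This is only a minimal sketch, not the library's exact implementation: it assumes the iris dataset and the linear SVC used in the rest of these notes, and it uses KFold with shuffle=True (an assumption made here because the iris samples are stored sorted by class). The names K, fold_scores and mean_score are illustrative.

```python
import numpy as np
from sklearn import datasets, svm
from sklearn.model_selection import KFold

iris = datasets.load_iris()
X, y = iris.data, iris.target

K = 5  # number of folds (illustrative choice)
# Shuffle before splitting because iris is ordered by class label.
kf = KFold(n_splits=K, shuffle=True, random_state=0)

fold_scores = []
for train_idx, test_idx in kf.split(X):
    # Train a fresh model on the K-1 training folds.
    model = svm.SVC(kernel='linear', C=1)
    model.fit(X[train_idx], y[train_idx])
    # Evaluate on the single held-out fold (accuracy for classifiers).
    fold_scores.append(model.score(X[test_idx], y[test_idx]))

# The final estimate is the mean of the K per-fold scores.
mean_score = np.mean(fold_scores)
print(fold_scores, mean_score)
```

Each sample appears in a test fold exactly once, so every observation contributes to the final estimate.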
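from sklearn import datasets, svm
from sklearn.model_selection import cross_val_score
iris = datasets.load_iris()  # iris dataset used in all examples below
clf = svm.SVC(kernel='linear', C=1)
scores = cross_val_score(clf, iris.data, iris.target, cv=5)
scores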
>>>
# Scores of the 5 cross-validation folds
array([ 0.96..., 1. ..., 0.96..., 0.96..., 1. ])
# ShuffleSplit shuffles the samples before splitting them
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import ShuffleSplit
cv = ShuffleSplit(n_splits=3, test_size=0.3, random_state=0)
cross_val_score(clf, iris.data, iris.target, cv=cv)
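To see what this splitter actually produces, it can help to print the index arrays it generates. The sketch below uses a small hypothetical array of 10 samples (not the iris data) with the same ShuffleSplit settings as above.

```python
import numpy as np
from sklearn.model_selection import ShuffleSplit

X = np.arange(20).reshape(10, 2)  # 10 toy samples, for illustration only
cv = ShuffleSplit(n_splits=3, test_size=0.3, random_state=0)

splits = list(cv.split(X))
for train_idx, test_idx in splits:
    # Each iteration is an independent random train/test partition.
    print("train:", train_idx, "test:", test_idx)
```

Unlike K-fold, the test sets of different iterations are drawn independently, so the same sample may be tested more than once or not at all.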
Obtaining predictions via cross-validation
from sklearn import metrics
from sklearn.model_selection import cross_val_predict
predicted = cross_val_predict(clf, iris.data, iris.target, cv=10)
metrics.accuracy_score(iris.target, predicted)
>>>
0.966
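The accuracy above is computed once over the pooled predictions, whereas cross_val_score averages one score per fold. The following sketch, assuming the same iris data and linear SVC, puts the two side by side. The names pooled_accuracy and fold_scores are illustrative.

```python
from sklearn import datasets, metrics, svm
from sklearn.model_selection import cross_val_predict, cross_val_score

iris = datasets.load_iris()
clf = svm.SVC(kernel='linear', C=1)

# One out-of-fold prediction per sample, made by a model
# that never saw that sample during training.
predicted = cross_val_predict(clf, iris.data, iris.target, cv=10)
pooled_accuracy = metrics.accuracy_score(iris.target, predicted)

# Per-fold accuracies from the same 10-fold split.
fold_scores = cross_val_score(clf, iris.data, iris.target, cv=10)

print(pooled_accuracy, fold_scores.mean())
```

The two numbers agree when all folds have the same size, but they can differ when fold sizes are unequal or when the metric is not a simple per-sample average.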