1 Model Selection and Evaluation.ppt
What is Model Selection?
Model selection refers to the process of optimizing a model (e.g., a classifier or a regression model). It encompasses both the choice of a model (e.g., C4.5 versus Naïve Bayes) and the adjustment of a particular model's parameters (e.g., the number of hidden units in a neural network).

What are potential issues with Model Selection?
It is usually possible to improve a model's fit to the data, up to a certain point (e.g., more hidden units allow a neural network to fit its training data better). We want the model to use enough information from the data set to be as unbiased as possible, but we also want it to discard as much information as necessary to generalize as well as it can (i.e., to fare as well as possible across a variety of contexts). As such, model selection is tightly linked to the bias/variance tradeoff.

Bias, Variance, and Model Complexity

Goals
Model Selection: estimating the performance of different models in order to choose the best one.
Model Assessment: having chosen a final model, estimating its generalization error on new data.

Splitting the data
Split the dataset into three parts:
Training set: used to fit the models.
Validation set: used to estimate prediction error for model selection.
Test set: used to assess the generalization error of the final chosen model.

Supervised machine learning: point estimation of the error rate
Supervised machine learning: interval estimation of the error rate with finite samples

Model Selection and Evaluation
Book: The Elements of Statistical Learning (Second Edition), Chapter 7, Model Selection and Evaluation

Performance Assessment: Loss Function
Regression. Typical choices for a quantitative response Y:
$L(Y, \hat{f}(X)) = (Y - \hat{f}(X))^2$ (squared error)
$L(Y, \hat{f}(X)) = |Y - \hat{f}(X)|$ (absolute error)
Classification. Typical choices for a categorical response G:
$L(G, \hat{G}(X)) = I(G \neq \hat{G}(X))$ (0-1 loss)
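The three-way split and the loss functions named above can be put together in a small sketch. Everything concrete in it is an assumption made for illustration and does not come from the slides: the synthetic data (y = sin(x) plus noise), the 50/25/25 split proportions, the use of k-nearest-neighbour regression as the model family, and the helper names `split_three_ways`, `knn_predict`, and `mean_loss`. The validation set is used only to choose k (model selection), and the test set is used once at the end (model assessment).

```python
import math
import random

# Loss functions named on the slide.
def squared_error(y, y_hat):
    return (y - y_hat) ** 2

def absolute_error(y, y_hat):
    return abs(y - y_hat)

def zero_one_loss(g, g_hat):
    return 0 if g == g_hat else 1

def split_three_ways(data, frac_train=0.5, frac_val=0.25, seed=0):
    """Shuffle, then split into training, validation, and test sets.
    The 50/25/25 proportions are an illustrative assumption."""
    rows = list(data)
    random.Random(seed).shuffle(rows)
    n_train = int(frac_train * len(rows))
    n_val = int(frac_val * len(rows))
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]

def knn_predict(train, x, k):
    """Average the targets of the k training points nearest to x."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def mean_loss(train, eval_set, k, loss=squared_error):
    """Average loss of a k-NN model fit on `train`, evaluated on `eval_set`."""
    return sum(loss(y, knn_predict(train, x, k)) for x, y in eval_set) / len(eval_set)

# Synthetic data (hypothetical): y = sin(x) + Gaussian noise.
rng = random.Random(42)
data = []
for _ in range(200):
    x = rng.uniform(0.0, 6.0)
    data.append((x, math.sin(x) + rng.gauss(0.0, 0.3)))

train, val, test = split_three_ways(data)

# Model selection: pick the k with the lowest validation error.
candidate_ks = [1, 3, 5, 9, 15, 25, 51]
val_errors = {k: mean_loss(train, val, k) for k in candidate_ks}
best_k = min(val_errors, key=val_errors.get)

# Model assessment: estimate generalization error on the untouched test set.
test_error = mean_loss(train, test, best_k)

print("validation errors:", val_errors)
print("chosen k:", best_k)
print("test squared error:", test_error)
```

Small k fits the training data closely (low bias, high variance) and large k smooths heavily (high bias, low variance), so the validation error is typically lowest somewhere in between. Because the test set plays no part in choosing k, its error is a fairer estimate of performance on new data than the validation error of the winning model.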