The machine learning model that we use to make predictions on new data is called the final model.
There can be confusion in applied machine learning about how to train a final model.
This confusion is common among beginners to the field, who ask questions such as:
- How do I predict with cross-validation?
- Which model do I choose from cross-validation?
- Do I use the model after preparing it on the training dataset?
This post will clear up the confusion.
In this post, you will discover how to finalize your machine learning model in order to make predictions on new data.
Let’s get started.
Table of Contents
- What is a Final Model?
- The Purpose of Train/Test Sets
- Let’s unpack this further
- The Purpose of k-fold Cross Validation
- Why do we use Resampling Methods?
- How to Finalize a Model?
- Common Questions
- Why not keep the model trained on the training dataset?
- Why not keep the best model from the cross-validation?
- Won’t the performance of the model trained on all of the data be different?
- Each time I train the model, I get a different performance score; should I pick the model with the best score?
- Summary
What is a Final Model?
A final machine learning model is a model that you use to make predictions on new data.
That is, given new examples of input data, you want to use the model to predict the expected output. This may be classification (assigning a label) or regression (predicting a real value).
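The two kinds of output can be sketched in a few lines of Python. This is only a toy illustration, not a recommended modelling approach: the data are made up, the `fit` and `predict` helpers are hypothetical names, and a simple 1-nearest-neighbour rule stands in for whatever algorithm you would actually use.

```python
# Toy illustration with made-up data: a 1-nearest-neighbour "model" used to
# predict on new inputs. All names and numbers here are hypothetical.

def fit(X, y):
    """'Training' for nearest neighbour just stores the input/output pairs."""
    return list(zip(X, y))

def predict(model, x_new):
    """Return the stored output whose input is closest to x_new."""
    def squared_distance(pair):
        x, _ = pair
        return sum((a - b) ** 2 for a, b in zip(x, x_new))
    _, nearest_output = min(model, key=squared_distance)
    return nearest_output

# Classification: the predicted output is a label.
X_photos = [(0.9, 0.1), (0.8, 0.3), (0.2, 0.9), (0.1, 0.7)]  # invented features
y_labels = ["dog", "dog", "cat", "cat"]
classifier = fit(X_photos, y_labels)
label = predict(classifier, (0.85, 0.2))

# Regression: the predicted output is a real value.
X_inputs = [(50.0,), (80.0,), (120.0,)]  # invented single-feature inputs
y_values = [150.0, 240.0, 360.0]
regressor = fit(X_inputs, y_values)
value = predict(regressor, (75.0,))

print(label, value)
```

In both cases the model is given an input it has not seen before and returns an output; only the type of that output differs.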
For example, whether the photo is a picture of a dog or a cat, or the estimated nu