Today, I explored validation techniques for smaller data sets, namely K-fold cross-validation.
To start, the linear regression model was retrained using 70% of the data as the training set and 30% as the test set. Here are the results obtained:
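As a minimal sketch of this step, assuming scikit-learn and substituting synthetic data for the actual data set (the real features and target are not shown here):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

# Synthetic stand-in data (hypothetical) -- a noisy linear relationship
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 2.0 * X[:, 0] + rng.normal(0, 5, size=200)

# 70% training / 30% test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Fit on the training set, then report r-squared on the held-out test set
lr = LinearRegression().fit(X_train, y_train)
test_r2 = lr.score(X_test, y_test)
print(f"test r-squared: {test_r2:.2f}")
```

The exact score depends on the data and the split; fixing `random_state` makes the split reproducible.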
As we can see, the model shows similar performance, with an r-squared value of approximately 0.38.
Next, the same model was evaluated using K-fold cross-validation with 5 folds. Here are the results for the linear and polynomial regression models:
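A sketch of the 5-fold evaluation for both models, again assuming scikit-learn and the same synthetic stand-in data (the polynomial degree of 2 is an assumption; the original does not state the degree used):

```python
import numpy as np
from sklearn.model_selection import cross_val_score, KFold
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline

# Synthetic stand-in data (hypothetical)
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 2.0 * X[:, 0] + rng.normal(0, 5, size=200)

# 5 folds; shuffling avoids any ordering effects in the data
kf = KFold(n_splits=5, shuffle=True, random_state=0)

models = {
    "linear": LinearRegression(),
    "polynomial (deg 2)": make_pipeline(PolynomialFeatures(degree=2),
                                        LinearRegression()),
}

results = {}
for name, model in models.items():
    # Each fold is held out once; the mean r-squared summarizes all 5 folds
    scores = cross_val_score(model, X, y, cv=kf, scoring="r2")
    results[name] = scores.mean()
    print(f"{name}: mean r-squared = {results[name]:.2f}")
```

Because every observation is used for validation exactly once, the mean score is usually a more conservative (and more honest) estimate than a single train/test split.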
Both the linear and polynomial models show similar mean r-squared values of about 0.30, which is lower than the score obtained without cross-validation.
The polynomial regression score will tend to increase with higher polynomial degrees if we evaluate on the same data used for fitting, since higher degrees lead to overfitting; cross-validation guards against this by always scoring on held-out folds.
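The overfitting effect can be demonstrated with a short sketch (synthetic data and the chosen degrees are assumptions): the r-squared computed on the fitting data itself never decreases as the degree grows, because each higher-degree model nests the lower-degree one.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline

# Synthetic stand-in data (hypothetical)
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 2.0 * X[:, 0] + rng.normal(0, 5, size=200)

train_scores = []
for degree in (1, 3, 9):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X, y)
    # Scoring on the SAME data used for fitting -- this is the misleading case
    r2 = model.score(X, y)
    train_scores.append(r2)
    print(f"degree {degree}: fit-data r-squared = {r2:.3f}")
```

The scores climb with degree even though the extra flexibility is mostly fitting noise, which is exactly why a held-out or cross-validated score is the one to trust.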