Evaluating and Improving Model Robustness Using Scikit-learn

Let’s learn how to evaluate and improve model robustness with Scikit-Learn.

Preparation

This tutorial requires the Scikit-Learn and Pandas packages. If you haven’t already, install them with:

pip install -U pandas scikit-learn

We will use Scikit-Learn’s built-in Iris dataset as our example data.

from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target

With the packages installed and the dataset ready, let’s get into the central part of the tutorial.

Evaluate and Improve Model Robustness

When we talk about model robustness, we are discussing how well a machine learning model maintains its performance when the data it faces changes. Robustness matters because it tells us how well our model generalizes and how reliable it will be in real-world situations.
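
To make this concrete, here is a minimal sketch of the idea, using the Iris data loaded above: we fit a simple classifier and compare its accuracy on the original features against the same features perturbed with small Gaussian noise. This snippet is our own illustration, and the noise scale of 0.2 is an arbitrary choice for demonstration.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Fit a simple classifier on the Iris data loaded above
model = LogisticRegression(max_iter=1000).fit(X, y)

# Accuracy on the original features
clean_acc = accuracy_score(y, model.predict(X))

# Accuracy on the same features perturbed with small Gaussian noise;
# a robust model's accuracy should not collapse under this shift
rng = np.random.default_rng(42)
noisy_acc = accuracy_score(y, model.predict(X + rng.normal(scale=0.2, size=X.shape)))

print(f"Clean accuracy: {clean_acc:.3f}, noisy accuracy: {noisy_acc:.3f}")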

There are many techniques for evaluating and improving model robustness. Let’s start with evaluation.

The simplest way to evaluate robustness is a train-test split: we divide the dataset into training and testing sets and measure how well the model performs on data it has never seen.

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Hold out 20% of the data as an unseen test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Raise max_iter so the lbfgs solver converges on the raw (unscaled) features
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Evaluate on the held-out data
y_pred = model.predict(X_test)
print(f"Accuracy on test data: {accuracy_score(y_test, y_pred)}")

The output:

Accuracy on test data: 1.0

The rationale behind a train-test split is that the model should perform adequately well on both the training data and the test data. A model isn’t robust if it only performs well on the data it was trained on.
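
As a quick check, continuing from the code above, we can compare the two accuracies directly; a large gap between them is a common sign of overfitting, i.e., a model that is not robust to unseen data.

# A large gap between training and test accuracy suggests overfitting
train_acc = accuracy_score(y_train, model.predict(X_train))
test_acc = accuracy_score(y_test, model.predict(X_test))
print(f"Train accuracy: {train_acc:.3f}, test accuracy: {test_acc:.3f}")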

Another method to evaluate the model’s robustness is Cross-Validation. It’s similar to the train-test split, but it splits the dataset into multiple folds and rotates which fold serves as the test set while the rest are used for training.

from sklearn.model_selection import cross_val_score

# Evaluate the model on 5 different train/test splits (folds)
cv_scores = cross_val_score(model, X, y, cv=5, scoring='accuracy')
print(f"Cross-validation scores: {cv_scores}")
print(f"Mean accuracy: {cv_scores.mean()}")

The output:

Cross-validation scores: [0.96666667 1.         0.93333333 0.96666667 1.        ]
Mean accuracy: 0.9733333333333334

Cross-validation gives a more reliable estimate of the model’s robustness because it evaluates the model on several different splits of the data.
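
The spread of the fold scores is itself a useful signal: continuing from the code above, a small standard deviation means the model performs consistently across different subsets of the data.

# Consistent scores across folds indicate stable, robust performance
print(f"Standard deviation of CV scores: {cv_scores.std():.4f}")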

Now let’s discuss how to improve robustness. One technique we can employ is hyperparameter optimization.

from sklearn.model_selection import GridSearchCV

# Search over the regularization strength C and the solver, scoring each
# combination with 5-fold cross-validation
param_grid = {'C': [0.1, 1, 10, 100], 'solver': ['liblinear', 'lbfgs']}
grid_search = GridSearchCV(LogisticRegression(), param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train, y_train)

best_model = grid_search.best_estimator_
print(f"Best hyperparameters: {grid_search.best_params_}")
print(f"Best cross-validation accuracy: {grid_search.best_score_}")

The output:

Best hyperparameters: {'C': 1, 'solver': 'lbfgs'}
Best cross-validation accuracy: 0.9666666666666666

A hyperparameter is a model parameter we set before training. By tuning hyperparameters, we can find the combination that performs best. Combining hyperparameter optimization with robustness evaluation helps us arrive at the most robust model.
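
For example, continuing from the grid search above, we can cross-validate the tuned estimator on the full dataset to confirm its performance is stable. This extra check is our own addition to the workflow.

# Cross-validate the tuned model to confirm its performance is stable
best_cv_scores = cross_val_score(best_model, X, y, cv=5, scoring='accuracy')
print(f"Tuned model mean CV accuracy: {best_cv_scores.mean():.4f}")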

Another method to improve robustness is to scale the dataset. Scaling helps many models perform better, especially those that rely on distances. For example, below we scale the data with RobustScaler, which centers and scales using the median and interquartile range, statistics that are resistant to outliers.

from sklearn.preprocessing import RobustScaler

# Fit the scaler on the training data only, then apply the same
# transformation to the test data
scaler = RobustScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
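
Note that the scaler is fit on the training data only and then applied to the test data, so no information from the test set leaks into the preprocessing. One convenient way to guarantee this, sketched below as our own addition rather than part of the original code, is to wrap the scaler and the model in a Pipeline, which refits the scaler inside each cross-validation fold:

from sklearn.pipeline import Pipeline

# The pipeline fits the scaler on each training fold only, so the
# scaling step never sees the fold used for evaluation
pipeline = Pipeline([
    ('scaler', RobustScaler()),
    ('model', LogisticRegression(max_iter=1000)),
])
pipeline_scores = cross_val_score(pipeline, X, y, cv=5, scoring='accuracy')
print(f"Scaled pipeline mean CV accuracy: {pipeline_scores.mean():.4f}")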

A more advanced method is ensembling, which combines multiple individual models into a single, stronger one. For example, we can use a VotingClassifier to combine them.

from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression

# Three diverse base models
clf1 = LogisticRegression(max_iter=1000)
clf2 = RandomForestClassifier()
clf3 = GradientBoostingClassifier()

# Soft voting averages the predicted class probabilities of the base models
ensemble = VotingClassifier(estimators=[('lr', clf1), ('rf', clf2), ('gb', clf3)], voting='soft')
ensemble.fit(X_train, y_train)

y_pred = ensemble.predict(X_test)
print(f"Accuracy on test data: {accuracy_score(y_test, y_pred)}")

The output:

Accuracy on test data: 1.0

An individual model might not be strong on its own, but by combining models we can improve robustness.
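
To verify that the ensemble actually helps, we can compare each base model’s cross-validated accuracy against the ensemble’s. This comparison is our own addition, continuing from the code above.

# Compare each base model against the ensemble using 5-fold CV
for name, clf in [('lr', clf1), ('rf', clf2), ('gb', clf3), ('ensemble', ensemble)]:
    scores = cross_val_score(clf, X, y, cv=5, scoring='accuracy')
    print(f"{name}: mean CV accuracy = {scores.mean():.4f}")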

Master these techniques for evaluating and improving robustness, and you will be able to build the most reliable model possible.

