Data Poisoning

Setup #

For this lab, you will need Python installed. You can install it either on your host machine or on the virtual machine.

NOTE: Your Python installation may vary, so use python or python3 as required on your machine.

Start by cloning the repository

git clone https://github.com/jastardev/CISC350-Data-Poisoning-Lab.git

Navigate into the folder

cd CISC350-Data-Poisoning-Lab

Create a virtual environment to keep the packages isolated

python -m venv venv

Activate the virtual environment

# On Mac/Linux
source venv/bin/activate

# On Windows
venv\Scripts\activate

After activating the virtual environment, you can install the required Python packages

pip install -r requirements.txt

I’ve already generated the test data and the poisoned data, but if you need to regenerate it, you can use the following commands:

# The valid data
python generate_credit_card_data.py

# Poisoning the data
python poison_dataset.py

Walkthrough #

This machine learning model is a classifier that takes in data on credit card applicants and predicts whether or not they will be approved.

It’s a simple LogisticRegression model, and it runs very quickly.

The script can be run using the following command:

python credit_card_classifier.py credit_card_data.csv

When executed, the script takes the following actions:

  1. Loads the data from the provided CSV file
  2. Preprocesses the data by checking for missing values and handling any that it finds
  3. Splits the data into a training dataset and a testing dataset
  4. Trains the model on the training dataset
  5. Makes predictions on the testing dataset
  6. Calculates accuracy and precision
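
As a rough illustration of those steps, here is a minimal sketch using scikit-learn. The column name "approved" and the preprocessing choices are assumptions made for this example; the repo’s credit_card_classifier.py may differ in its details.

# Hypothetical sketch of the classifier workflow (column names assumed)
import sys

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score
from sklearn.model_selection import train_test_split

# 1. Load the data from the CSV file passed on the command line
df = pd.read_csv(sys.argv[1])

# 2. Handle any missing values (here: drop incomplete rows)
df = df.dropna()

# 3. Split features and target, then create train/test sets
X = df.drop(columns=["approved"])  # "approved" is an assumed target column
y = df["approved"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# 4. Train a logistic regression classifier
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# 5. Make predictions on the held-out test set
y_pred = model.predict(X_test)

# 6. Calculate accuracy and precision
print("Accuracy: ", accuracy_score(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred))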

Executing the script on the clean dataset shows that it achieves accuracy and precision scores of 81% and 80%, respectively.

(Screenshots: data-poisoning-clean-command.png, data-poisoning-clean-results.png)

Poisoning the dataset #

The poisoning script takes 5% of the clean dataset and flips the targets in one of three ways (a rough sketch follows the list):

  • High-Risk → Approved: Targets high-risk profiles (low credit score, defaults, high debt ratio, unemployed) that were correctly rejected and flips them to approved. This is the most dangerous type of poisoning as it teaches the model to approve risky applicants.
  • Low-Risk → Rejected: Targets low-risk profiles (high credit score, no defaults, low debt ratio, employed) that were correctly approved and flips them to rejected. This creates confusion and degrades overall model performance.
  • Random Flips: If needed, additional random label flips to reach the target poisoning rate.
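
As a rough sketch of the idea, here is what label flipping might look like. The column names (credit_score, approved) and thresholds are assumptions for illustration; the repo’s poison_dataset.py may use different criteria.

# Hypothetical sketch of label flipping (column names and thresholds assumed)
import pandas as pd

POISON_RATE = 0.05  # flip roughly 5% of the labels

df = pd.read_csv("credit_card_data.csv")
n_to_flip = int(len(df) * POISON_RATE)

# High-Risk -> Approved: risky profiles that were correctly rejected
high_risk = df[(df["credit_score"] < 580) & (df["approved"] == 0)]
# Low-Risk -> Rejected: safe profiles that were correctly approved
low_risk = df[(df["credit_score"] > 740) & (df["approved"] == 1)]

targets = pd.concat([high_risk, low_risk]).index[:n_to_flip]
df.loc[targets, "approved"] = 1 - df.loc[targets, "approved"]

# Random flips: top up to the target poisoning rate if needed
remaining = n_to_flip - len(targets)
if remaining > 0:
    extra = df.drop(index=targets).sample(n=remaining, random_state=42).index
    df.loc[extra, "approved"] = 1 - df.loc[extra, "approved"]

df.to_csv("credit_card_data_poisoned.csv", index=False)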

Using the poisoned dataset, we can rerun the classifier script

python credit_card_classifier.py credit_card_data_poisoned.csv

We’ll see that with the poisoned dataset, the classifier’s performance is nearly 10% worse!

(Screenshots: data-poisoning-poisoned-command.png, data-poisoning-poisoned-results.png)

So how could this actually happen? #

Data Collection Via Web Scraping #

If you’re getting your data from public locations such as forums, social media, or stock image sites, attackers can post poisoned data to those sites. Your web scraper then collects the poisoned content, the pipeline processes it like any other data, and the model is trained on it.

Rogue Data Labelers #

Many types of machine learning and AI algorithms require data to be labeled with its intended classification. Because this is a time-consuming effort, developers will often hire contractors to perform the labeling for them (e.g., Amazon Mechanical Turk, Scale AI), or they will obtain pre-labeled datasets from sources like Kaggle. During the labeling process, attackers could intentionally mislabel the data to degrade the model’s performance.

Data Storage Hacking #

Hackers could gain access to the location where the training data is stored and simply alter or replace it with poisoned data.

Continuous Learning #

Many production machine learning systems operate a model that learns continuously. Developers train the model initially, then as the production system runs, it takes in more data and continues to learn. Attackers may be able to inject poisoned data into that continuous learning cycle to influence the model. The data could even be poisoned accidentally by faulty sensors or bad programming that introduces unintended errors.
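
As a minimal sketch of how such a loop might look (assuming scikit-learn’s SGDClassifier and placeholder data; this is not part of the lab’s code), every new batch is folded directly into the model, so a poisoned batch is learned just like legitimate data:

# Hypothetical sketch of a continuous-learning loop
import numpy as np
from sklearn.linear_model import SGDClassifier

# Initial training on a vetted dataset (placeholder data for illustration)
model = SGDClassifier(loss="log_loss")  # logistic regression trained with SGD
X_initial = np.random.rand(100, 4)
y_initial = np.random.randint(0, 2, size=100)
model.partial_fit(X_initial, y_initial, classes=[0, 1])

# Production loop: each incoming batch updates the model in place.
# If an attacker (or a faulty sensor) supplies mislabeled rows,
# they are learned exactly like legitimate data.
def update_model(new_X, new_y):
    model.partial_fit(new_X, new_y)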