Introduction to Deep Learning

PyTorch for Solving the XOR Problem

Dr. Greg Chism

College of Information Science

Objective

To survey core neural network concepts and implement a neural network that solves the XOR problem in PyTorch.

Code today 😬

Overview of topics

  1. Key concepts
  2. Intro to PyTorch
  3. Intro to the XOR problem
  4. Applied DL with PyTorch

Key concepts

Artificial Intelligence

Why deep learning in health sciences?

  • DL models achieve diagnostic accuracy that matches or exceeds earlier machine-learning approaches on many tasks.

  • It’s used in critical healthcare applications like disease detection.

  • High-performance GPUs and cloud computing now power deep learning efficiently in healthcare.

Machine Learning vs. Deep Learning

Deep Learning Networks

DL Components: Neurons

  • The building blocks of neural networks.

  • Each neuron performs a weighted sum of inputs, followed by an activation function.
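
A minimal sketch of one neuron's computation in plain Python (inputs, weights, and bias are arbitrary illustrative values):

import math

# Hypothetical inputs, weights, and bias for a single neuron
inputs = [0.5, -1.2, 0.8]
weights = [0.9, 0.3, -0.5]
bias = 0.1

# Weighted sum of inputs plus the bias term
z = sum(w * x for w, x in zip(weights, inputs)) + bias

# Activation function (sigmoid here) squashes z into (0, 1)
output = 1 / (1 + math.exp(-z))
print(output)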

DL Components: Layers

[Diagram: data enters at the input layer \(\rightarrow\) is processed in the hidden layers \(\rightarrow\) the output layer returns the result / prediction]

DL Components: Activation functions

  • ReLU (Rectified Linear Unit) characteristics:

    • Simple and fast to compute.

    • Passes positive values through unchanged and zeroes out negatives, helping the network learn efficiently.

    • Widely used in hidden layers of neural networks.

  • Sigmoid characteristics:

    • Produces values between 0 and 1, ideal for binary decisions.

    • Saturates for extreme inputs (very high or low values), which can slow down learning.

    • Often used in the final layer for binary classification tasks.

DL Components: Weights

  • Weights transform input data in the network’s hidden layers.

  • Each weight indicates the connection strength between nodes.

  • Weights start as small random values and are updated during training (see Stochastic Gradient Descent).
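
A one-liner to see this in PyTorch: a freshly created layer already holds small random weights (exact values differ on every run):

import torch.nn as nn

layer = nn.Linear(2, 2)
print(layer.weight)  # small random values, roughly within (-0.71, 0.71) for 2 inputs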

DL Components: Biases

  • Biases are learnable offsets added to each neuron’s weighted sum.

  • They shift the activation function left or right, so a neuron can produce non-zero output even when all inputs are zero.

Training progression

Cost function

A cost function is a mathematical function that calculates the difference between the actual target values (ground truth) and the values predicted by the model.

  • Functionally similar to a model-evaluation metric, but computed during training to drive learning.
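
For example, mean squared error (the loss used later in this lecture) is \(\text{MSE} = \frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2\). A quick check with made-up predictions:

import torch
import torch.nn as nn

y_true = torch.tensor([1.0, 0.0, 1.0])
y_pred = torch.tensor([0.9, 0.2, 0.6])

# (0.1^2 + 0.2^2 + 0.4^2) / 3 = 0.07
print(nn.MSELoss()(y_pred, y_true))  # tensor(0.0700)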

Backpropagation
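
Backpropagation applies the chain rule to carry the loss gradient backward through every layer; the optimizer then moves each weight downhill: \(w \leftarrow w - \eta \, \partial L / \partial w\). A one-weight sketch using PyTorch's autograd (illustrative values):

import torch

w = torch.tensor(0.5, requires_grad=True)  # a single weight
x, target, lr = 2.0, 3.0, 0.1

loss = (w * x - target) ** 2  # forward pass: squared error
loss.backward()               # backward pass: dL/dw = 2 * (w*x - target) * x
print(w.grad)                 # tensor(-8.)

with torch.no_grad():
    w -= lr * w.grad          # gradient descent step: 0.5 - 0.1 * (-8.0) = 1.3
print(w)                      # tensor(1.3000, requires_grad=True)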

Intro to PyTorch

What is PyTorch?

  • PyTorch is an open-source deep learning framework from Meta AI (formerly Facebook AI), popular for its flexibility and ease of use.

  • It supports dynamic computation graphs, ideal for complex model architectures.

Why PyTorch?

  • User-Friendly: Pythonic and easy to learn, making it accessible for both beginners and experts.

  • Dynamic Graphs: Allows on-the-fly changes, enabling flexible model design and debugging (see the sketch after this list).

  • Widespread Use: Preferred in research and industry due to its simplicity and power.

  • Strong Ecosystem: Extensive community support with many resources, libraries, and tutorials.
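
Because the graph is rebuilt on every call, ordinary Python control flow can live inside a model. A minimal sketch (DynamicNet is a hypothetical example, not from the lecture):

import torch
import torch.nn as nn

class DynamicNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 4)

    def forward(self, x):
        # Plain Python loop: the number of layer applications
        # can change from one forward pass to the next.
        for _ in range(torch.randint(1, 4, (1,)).item()):
            x = torch.relu(self.fc(x))
        return x

net = DynamicNet()
print(net(torch.rand(1, 4)).shape)  # torch.Size([1, 4])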

PyTorch key components

Tensors: Core data structure, similar to NumPy arrays but with GPU support.

import torch
x = torch.tensor([1, 2, 3])
if torch.cuda.is_available():
    x = x.to('cuda')  # move to GPU only when one is available

Autograd: Automatic differentiation for gradient-based optimization.

x = torch.tensor(1.0, requires_grad=True)
y = x**2
y.backward()
print(x.grad)  # Outputs 2.0

torch.nn: Tools for building and training neural networks.

import torch.nn as nn
class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.fc = nn.Linear(2, 1)
    def forward(self, x):
        return self.fc(x)

torch.optim: Optimization algorithms like SGD and Adam.

import torch.optim as optim
model = SimpleNN()
optimizer = optim.SGD(model.parameters(), lr=0.01)
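
Adam, mentioned above, is a common drop-in alternative that adapts the learning rate per parameter:

optimizer = optim.Adam(model.parameters(), lr=0.01)  # adaptive alternative to SGD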

The XOR Problem

What is the XOR Problem?

  • The XOR (Exclusive OR) gate is a logic gate that outputs true (1) only when the inputs differ (one is true, the other is false).

  • XOR is a classic problem in neural networks because it highlights the limitations of simple linear models.
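
The full XOR truth table:

x1  x2  |  XOR(x1, x2)
0   0   |  0
0   1   |  1
1   0   |  1
1   1   |  0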

Why Linear Models Fail

  • Linear models can only separate data points with a single straight line (a linear boundary).

  • The XOR problem is not linearly separable because you cannot draw a straight line to separate the two classes (0 and 1) correctly.

  • To solve XOR, we need a model that can capture non-linear patterns, which is where neural networks come into play.
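
A small demonstration of the failure (a sketch, not from the lecture): a single linear layer with a sigmoid is logistic regression, and its straight-line boundary can classify at most 3 of the 4 canonical XOR points:

import torch
import torch.nn as nn

# The four canonical XOR points and their labels
X_gate = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y_gate = torch.tensor([[0.], [1.], [1.], [0.]])

linear = nn.Linear(2, 1)  # no hidden layer: a purely linear decision boundary
optimizer = torch.optim.SGD(linear.parameters(), lr=0.1)
criterion = nn.MSELoss()

for _ in range(1000):
    optimizer.zero_grad()
    loss = criterion(torch.sigmoid(linear(X_gate)), y_gate)
    loss.backward()
    optimizer.step()

preds = (torch.sigmoid(linear(X_gate)) > 0.5).float()
accuracy = (preds == y_gate).float().mean().item()
print(f'Accuracy: {accuracy:.0%}')  # never reaches 100%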

Applied DL with the XOR Problem

Data preparation

XOR Dataset

import numpy as np
import pandas as pd

# Load the XOR dataset from the course materials
xor = pd.read_csv("data/xor.csv")

# Features (x1, x2) and binary class labels
X = xor[['x1', 'x2']].values
y = xor['class label'].values

  • Load and normalize the data to ensure features are on a similar scale.

  • Convert the data into PyTorch tensors for model training.

  • Use DataLoader to handle batching, which improves training efficiency and stability.

import torch
from sklearn.preprocessing import StandardScaler
from torch.utils.data import DataLoader, TensorDataset

scaler = StandardScaler()
X_normalized = scaler.fit_transform(X)

X_tensor = torch.tensor(X_normalized, dtype=torch.float32)
y_tensor = torch.tensor(y.reshape(-1, 1), dtype=torch.float32)

dataset = TensorDataset(X_tensor, y_tensor)
dataloader = DataLoader(dataset, batch_size=4, shuffle=True)

Building the XOR Neural Network

Defining the Neural Network Architecture:

  • A simple network with one hidden layer, using sigmoid activation functions.

  • The first layer maps the 2 input features to 2 hidden neurons.

  • The second layer outputs a single value, predicting the class.

import torch.nn as nn

class XORNet(nn.Module):
    def __init__(self):
        super(XORNet, self).__init__()
        self.fc1 = nn.Linear(2, 2)
        self.fc2 = nn.Linear(2, 1)
    
    def forward(self, x):
        x = torch.sigmoid(self.fc1(x))
        x = torch.sigmoid(self.fc2(x))
        return x

model = XORNet()
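
A quick sanity check before training (assuming X_tensor from the data-preparation step): the untrained model already returns one value per sample, but the outputs hover around 0.5 because the weights are still random:

with torch.no_grad():
    print(model(X_tensor[:4]))  # e.g. values near 0.5 — no learning has happened yet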

Training the Network

Implementing the Training Loop:

  • Perform a forward pass, compute the loss, and backpropagate the error.

  • Update weights and biases using the optimizer.

  • Monitor loss to ensure the model is learning correctly.

Training
import torch.optim as optim

criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)

for epoch in range(100):
    for batch_X, batch_y in dataloader:
        optimizer.zero_grad()
        output = model(batch_X)
        loss = criterion(output, batch_y)
        loss.backward()
        optimizer.step()

    if (epoch+1) % 10 == 0:
        print(f'Epoch {epoch+1}, Loss: {loss.item()}')
Epoch 10, Loss: 0.25745782256126404
Epoch 20, Loss: 0.1755615770816803
Epoch 30, Loss: 0.019272323697805405
Epoch 40, Loss: 0.005365592427551746
Epoch 50, Loss: 0.0035495259799063206
Epoch 60, Loss: 0.004271871410310268
Epoch 70, Loss: 0.0018818906974047422
Epoch 80, Loss: 0.0015647737309336662
Epoch 90, Loss: 0.0022734408266842365
Epoch 100, Loss: 0.03850968927145004

Testing + Evaluation

Testing the Model on XOR Dataset:

  • Evaluate the model on the entire dataset after training.

  • Calculate accuracy by comparing predicted values to actual labels.

with torch.no_grad():
    predicted = model(X_tensor)
    predicted = (predicted > 0.5).float()
    accuracy = (predicted == y_tensor).sum().item() / y_tensor.size(0)
    print(f'Accuracy: {accuracy * 100}%')
Accuracy: 96.8%

Summary

Summary

  • Deep learning is revolutionizing diagnostics and personalized treatments.

  • Success depends on understanding neural networks, architectures, and overcoming challenges.

Thank You 😊