This blog teaches you how to implement linear regression with TensorFlow and apply it to a simple dataset of Boston housing prices.
1. Introduction
In this blog, you will learn how to implement linear regression with TensorFlow and apply it to a simple dataset of Boston housing prices. Linear regression is one of the most basic and widely used machine learning algorithms, which can be used to predict a continuous value based on one or more input features. For example, you can use linear regression to predict the price of a house based on its size, location, number of rooms, etc.
TensorFlow is an open-source framework for developing and deploying machine learning applications. It provides a high-level API called Keras, which allows you to create and train models with ease. You will use TensorFlow and Keras to build and train a linear regression model from the ground up, without relying on a ready-made regression estimator.
The dataset you will use is the Boston Housing Prices dataset, which contains information about 506 neighborhoods in the Boston area, along with the median value of their homes. You will use this dataset to train your model and evaluate its performance on unseen data.
By the end of this blog, you will be able to:
- Understand the theory and implementation of linear regression
- Use TensorFlow and Keras to create and train a linear regression model
- Load and preprocess a dataset for machine learning
- Visualize and analyze the data and the model
- Evaluate the model’s performance and accuracy
Are you ready to dive into deep learning from scratch? Let’s get started!
2. Linear Regression: Theory and Implementation
In this section, you will learn the theory and implementation of linear regression, one of the most basic and widely used machine learning algorithms. Linear regression can be used to predict a continuous value based on one or more input features. For example, you can use linear regression to predict the price of a house based on its size, location, number of rooms, etc.
But what is linear regression exactly? How does it work? And how can you implement it with TensorFlow? These are the questions that you will answer in this section. You will first learn the mathematical formulation of linear regression, and then you will see how to code it with TensorFlow and Keras.
Let’s start with the basics: what is linear regression?
2.1. What is Linear Regression?
Linear regression is a machine learning algorithm that can be used to predict a continuous value based on one or more input features. For example, you can use linear regression to predict the price of a house based on its size, location, number of rooms, etc.
The basic idea of linear regression is to find a linear relationship between the input features and the output value. This means that you can express the output value as a weighted sum of the input features, plus a constant term called the bias. For example, if you have one input feature x, and one output value y, you can write:
$$y = w x + b$$
where w is the weight and b is the bias. The weight and the bias are the parameters of the linear regression model, which determine how the input feature affects the output value. The goal of linear regression is to find the optimal values of these parameters that minimize the error between the predicted and the actual output values.
How do you find the optimal values of the parameters? One common method is to use a loss function that measures the difference between the predicted and the actual output values. For example, you can use the mean squared error (MSE) loss function, which is defined as:
$$MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
where n is the number of samples, $y_i$ is the actual output value for the i-th sample, and $\hat{y}_i$ is the predicted output value for the i-th sample. The MSE loss function gives a higher value when the prediction is far from the actual value, and a lower value when the prediction is close to the actual value. Therefore, by minimizing the MSE loss function, you can find the parameters that make the prediction more accurate.
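To make the formula concrete, here is a tiny NumPy sketch of the MSE calculation; the sample values are made up purely for illustration:

```python
import numpy as np

# Hypothetical actual and predicted output values for n = 4 samples
y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.8, 5.4, 2.9, 6.5])

# MSE: the average of the squared residuals
mse = np.mean((y_true - y_pred) ** 2)
print(mse)  # 0.1525
```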
How do you minimize the loss function? One common method is to use an optimization algorithm called gradient descent, which updates the parameters in the direction that reduces the loss function. For example, you can update the weight and the bias as follows:
$$w = w - \alpha \frac{\partial MSE}{\partial w}$$
$$b = b - \alpha \frac{\partial MSE}{\partial b}$$
where $\alpha$ is the learning rate, which controls how much the parameters change in each update. The partial derivatives of the MSE loss function with respect to the weight and the bias can be calculated as follows:
$$\frac{\partial MSE}{\partial w} = -\frac{2}{n} \sum_{i=1}^{n} x_i (y_i - \hat{y}_i)$$
$$\frac{\partial MSE}{\partial b} = -\frac{2}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)$$
By repeating this update process for a number of iterations, you can find the optimal values of the weight and the bias that minimize the MSE loss function.
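To tie the formulas together, here is a minimal from-scratch sketch of gradient descent for a single input feature, written in plain NumPy; the synthetic data, the learning rate of 0.1, and the 1000 iterations are illustrative assumptions, not values from the tutorial:

```python
import numpy as np

# Synthetic data that roughly follows y = 2x + 1, plus a little noise
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=100)
y = 2 * x + 1 + rng.normal(0, 0.1, size=100)

w, b = 0.0, 0.0   # initial parameters
alpha = 0.1       # learning rate
n = len(x)

for step in range(1000):
    y_hat = w * x + b                        # current predictions
    dw = -(2 / n) * np.sum(x * (y - y_hat))  # dMSE/dw
    db = -(2 / n) * np.sum(y - y_hat)        # dMSE/db
    w -= alpha * dw                          # gradient descent update
    b -= alpha * db

print(w, b)  # should end up close to 2 and 1
```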
Now that you have learned the theory of linear regression, how can you implement it with TensorFlow? That’s what you will learn in the next section.
2.2. How to Implement Linear Regression with TensorFlow?
In this section, you will learn how to implement linear regression with TensorFlow and Keras. You will use the high-level Keras API to create, compile, train, and evaluate a linear regression model step by step.
The first step is to import the necessary modules and libraries. You will need TensorFlow, Keras, NumPy, and Matplotlib for this tutorial. You can import them as follows:
```python
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt
```
The next step is to create the linear regression model. You can use the keras.Sequential class to create a sequential model, which is a stack of layers. For linear regression, you only need a single keras.layers.Dense layer, which takes the input features and produces the output value by applying a linear transformation. You pass the number of output units and the shape of the input as arguments to the Dense layer. For example, if you have one input feature and one output value, you can create the model as follows:

```python
model = keras.Sequential([
    keras.layers.Dense(1, input_shape=(1,))
])
```
The next step is to compile the model. You need to specify the loss function, the optimization algorithm, and the metrics that you want to use to evaluate the model. For linear regression, you can use the mean squared error (MSE) loss function, the stochastic gradient descent (SGD) optimizer, and the mean absolute error (MAE) metric. You can compile the model as follows:
```python
model.compile(loss='mse', optimizer='sgd', metrics=['mae'])
```
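If you want explicit control over the learning rate α from the previous section, you can pass an optimizer object instead of the 'sgd' string shortcut. This sketch reuses the model and imports defined above; the learning rate of 0.01 is just an illustrative choice:

```python
model.compile(loss='mse',
              optimizer=keras.optimizers.SGD(learning_rate=0.01),
              metrics=['mae'])
```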
The next step is to train the model. You need to provide the input features and the output values of the training set to the model. You can also specify the number of epochs, which is the number of times the model goes through the entire dataset, and the batch size, which is the number of samples that the model processes at once. You can train the model as follows:
```python
history = model.fit(x_train, y_train, epochs=10, batch_size=32)
```
The fit method returns a history object, which contains information about the training process, such as the loss and the metric values for each epoch. You can use this object to plot the learning curves and visualize the model's performance:

```python
plt.plot(history.history['loss'], label='loss')
plt.plot(history.history['mae'], label='mae')
plt.xlabel('Epoch')
plt.ylabel('Value')
plt.legend()
plt.show()
```
The final step is to evaluate the model on unseen data. You provide the input features and the output values of the test set to the evaluate method, which calculates the loss and the metric values:

```python
test_loss, test_mae = model.evaluate(x_test, y_test)
print('Test loss:', test_loss)
print('Test mae:', test_mae)
```
You can also use the predict method to make predictions for new input features:

```python
y_pred = model.predict(x_new)
print('Predictions:', y_pred)
```
Congratulations! You have successfully implemented linear regression with TensorFlow and Keras. You have learned how to create, compile, train, and evaluate a linear regression model, and how to use it to make predictions.
3. Dataset: Boston Housing Prices
In this section, you will learn about the dataset that you will use to train and test your linear regression model. The dataset is the Boston Housing Prices dataset, which contains information about 506 neighborhoods in the Boston area, along with the median value of their homes. You can use this dataset to predict the median home value based on features such as the crime rate, the average number of rooms, the distance to employment centres, etc.
The Boston Housing Prices dataset is a classic dataset that has been widely used for machine learning and data analysis. It was first published in 1978 by Harrison and Rubinfeld, based on U.S. Census data, and it has 14 attributes: 13 input features and one output value. The input features are:
- CRIM: per capita crime rate by town
- ZN: proportion of residential land zoned for lots over 25,000 sq.ft.
- INDUS: proportion of non-retail business acres per town
- CHAS: Charles River dummy variable (1 if tract bounds river; 0 otherwise)
- NOX: nitric oxides concentration (parts per 10 million)
- RM: average number of rooms per dwelling
- AGE: proportion of owner-occupied units built prior to 1940
- DIS: weighted distances to five Boston employment centres
- RAD: index of accessibility to radial highways
- TAX: full-value property-tax rate per $10,000
- PTRATIO: pupil-teacher ratio by town
- B: 1000(Bk - 0.63)^2, where Bk is the proportion of Black residents by town
- LSTAT: % lower status of the population
The output value is:
- MEDV: Median value of owner-occupied homes in $1000’s
The Boston Housing Prices dataset is available in the sklearn.datasets module, which is a collection of datasets for machine learning. You can load the dataset as follows:

```python
from sklearn.datasets import load_boston

boston = load_boston()
```

The load_boston function returns a Bunch object: boston.data holds the input features, boston.target holds the output values, and boston.feature_names holds the names of the 13 input features. Note that load_boston was removed in scikit-learn 1.2 because of ethical concerns about the B feature, so on newer versions you need to fetch the raw data from its original source instead.
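If you are on scikit-learn 1.2 or later, here is a sketch of the alternative loading recipe that scikit-learn's own deprecation notice suggests; it downloads the raw data from the original StatLib source, so it requires an internet connection and the pandas library:

```python
import numpy as np
import pandas as pd

# Original StatLib source of the Boston housing data
data_url = "http://lib.stat.cmu.edu/datasets/boston"
raw_df = pd.read_csv(data_url, sep=r"\s+", skiprows=22, header=None)

# Each sample is spread across two rows in the raw file
data = np.hstack([raw_df.values[::2, :], raw_df.values[1::2, :2]])
target = raw_df.values[1::2, 2]
```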
4. Data Preprocessing and Visualization
In this section, you will learn how to preprocess and visualize the dataset that you will use to train and test your linear regression model. Data preprocessing is an important step in machine learning, as it can improve the quality and performance of the model. Data visualization is a useful technique to explore and understand the data, as well as to communicate the results and insights.
The first step in data preprocessing is to split the dataset into training and test sets. You will use the training set to train the model and the test set to evaluate it. You can use the train_test_split function from the sklearn.model_selection module to randomly split the dataset in a given ratio. For example, you can split the dataset into 80% training and 20% test sets as follows:

```python
from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(
    boston.data, boston.target, test_size=0.2, random_state=42)
```
The next step in data preprocessing is to scale the input features. Scaling means transforming the features to a similar range of values, such as between 0 and 1 or between -1 and 1, which helps the model converge faster and avoids numerical issues. You can use the MinMaxScaler class from the sklearn.preprocessing module to scale the input features to the range 0 to 1. You fit the scaler on the training set and then apply it to both the training and test sets:

```python
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)
```
The final step in data preprocessing is to make sure the input and output arrays have the right shapes. The input arrays should be two-dimensional, with the samples along the first axis and the 13 features along the second; the output arrays should also be two-dimensional, with one output value per sample. The input arrays already come out of train_test_split in this shape, so strictly only the output arrays need reshaping, but reshaping all four is harmless. You can use NumPy's reshape method as follows:

```python
x_train = x_train.reshape(-1, 13)
x_test = x_test.reshape(-1, 13)
y_train = y_train.reshape(-1, 1)
y_test = y_test.reshape(-1, 1)
```
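A quick sanity check of the resulting shapes; the sample counts below assume the 80/20 split of the 506 rows shown earlier:

```python
print(x_train.shape, y_train.shape)  # (404, 13) (404, 1)
print(x_test.shape, y_test.shape)    # (102, 13) (102, 1)
```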
Now that you have preprocessed the dataset, you can visualize it to explore and understand it better. You can use the matplotlib.pyplot module to plot the data and the seaborn module to create statistical and relational plots. For example, you can plot the distribution of the output values as follows:

```python
import matplotlib.pyplot as plt
import seaborn as sns

plt.hist(y_train, bins=20, label='train')
plt.hist(y_test, bins=20, label='test')
plt.xlabel("Median value of owner-occupied homes in $1000's")
plt.ylabel('Frequency')
plt.legend()
plt.show()
```
You can also plot the correlation matrix of the input features as follows:
```python
corr_matrix = np.corrcoef(x_train.T)
sns.heatmap(corr_matrix, annot=True,
            xticklabels=boston.feature_names,
            yticklabels=boston.feature_names)
plt.show()
```
You can also plot the scatter plots of the input features and the output values as follows:
```python
fig, axes = plt.subplots(3, 5, figsize=(15, 10))
for i, ax in enumerate(axes.flat):
    if i < 13:
        ax.scatter(x_train[:, i], y_train, alpha=0.5)
        ax.set_xlabel(boston.feature_names[i])
        ax.set_ylabel("Median value of owner-occupied homes in $1000's")
    else:
        ax.set_visible(False)
plt.tight_layout()
plt.show()
```
By visualizing the data, you can gain some insights and intuitions about the data, such as the distribution, the correlation, and the relationship of the input features and the output values. You can also identify some outliers and anomalies in the data, which you can further investigate or remove.
You have completed the data preprocessing and visualization section. You have learned how to split, scale, and reshape the dataset, as well as how to plot the data using matplotlib and seaborn. You are now ready to train and evaluate your linear regression model.
5. Model Training and Evaluation
In this section, you will learn how to train and evaluate your linear regression model using the dataset that you have preprocessed and visualized. You will use the TensorFlow and Keras modules that you have imported and the model that you have created and compiled. You will also use some metrics and plots to measure and visualize the model’s performance and accuracy.
The first step is to train the model using the training set. You can use the fit method of the model to feed the input features and the output values to the model and update its parameters. Note that for the Boston data the Dense layer must be created with input_shape=(13,), since there are now 13 input features instead of one. You can also specify the number of epochs, which is the number of times the model goes through the entire training set, and the batch size, which is the number of samples that the model processes at once. For example, you can train the model for 10 epochs with a batch size of 32 as follows:

```python
history = model.fit(x_train, y_train, epochs=10, batch_size=32)
```
The fit method returns a history object, which contains information about the training process, such as the loss and the metric values for each epoch. You can use this object to plot the learning curves and visualize the model's performance during training:

```python
plt.plot(history.history['loss'], label='loss')
plt.plot(history.history['mae'], label='mae')
plt.xlabel('Epoch')
plt.ylabel('Value')
plt.legend()
plt.show()
```
The learning curves show how the loss and the mean absolute error (MAE) change over the epochs. Ideally, you want the loss and the MAE to decrease as the epochs increase, which means that the model is learning and improving. However, you also want to avoid overfitting, which is when the model performs well on the training set but poorly on the test set. Overfitting can happen when the model learns too much from the training set and fails to generalize to new data. One way to detect overfitting is to compare the training and test loss and MAE values. If the training values are much lower than the test values, it may indicate overfitting.
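A convenient way to monitor this during training is Keras's validation_split argument, which holds out a fraction of the training data and reports a validation loss at each epoch. This is a sketch of the idea, reusing the model and imports from above, with 0.2 as an illustrative fraction:

```python
history = model.fit(x_train, y_train, epochs=10, batch_size=32,
                    validation_split=0.2)

# If the training loss keeps falling while the validation loss rises,
# the model is starting to overfit
plt.plot(history.history['loss'], label='train loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()
```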
The next step is to evaluate the model using the test set. You can use the evaluate method of the model to feed the test input features and output values to the model and calculate the loss and the metric values:

```python
test_loss, test_mae = model.evaluate(x_test, y_test)
print('Test loss:', test_loss)
print('Test mae:', test_mae)
```
The test loss and the test MAE are the measures of the model’s performance and accuracy on the unseen data. You want these values to be as low as possible, which means that the model’s predictions are close to the actual values. However, you also want these values to be similar to the training loss and the training MAE, which means that the model is consistent and robust.
The final step is to make predictions using the model. You can use the predict method of the model to feed new input features to the model and get the output values. Keep in mind that new inputs must be on the same scale as the training data, so in practice you would transform them with the scaler fitted earlier; the illustrative values below are already in the 0 to 1 range. For example:

```python
x_new = np.array([[0.1, 0.2, 0.3, 0, 0.4, 0.5, 0.6,
                   0.7, 0.8, 0.9, 1, 0.1, 0.2]])
y_pred = model.predict(x_new)
print('Predictions:', y_pred)
```
The predictions are the output values that the model generates for the new input features. You can compare these predictions with the actual values, if available, or with your own expectations. You can also plot the model's predictions for the test set against the actual test values to visualize its performance and accuracy (note that this uses predictions for x_test, not for x_new):

```python
y_pred = model.predict(x_test)
plt.scatter(y_test, y_pred, alpha=0.5)
plt.xlabel('Actual values')
plt.ylabel('Predicted values')
plt.plot([0, 50], [0, 50], 'r')
plt.show()
```
The scatter plot shows the relationship between the actual values and the predicted values. Ideally, you want the points to be close to the red line, which means that the predictions are equal to the actual values. You can also calculate the coefficient of determination (R-squared) to measure how well the model fits the data. The R-squared value is at most 1: a value of 1 means a perfect fit, a value of 0 means the model does no better than always predicting the mean, and the value can even be negative for a very poor fit. You can calculate the R-squared value as follows:
```python
from sklearn.metrics import r2_score

r2 = r2_score(y_test, y_pred)
print('R-squared:', r2)
```
You have completed the model training and evaluation section. You have learned how to train, evaluate, and predict with your linear regression model using TensorFlow and Keras. You have also learned how to use some metrics and plots to measure and visualize the model’s performance and accuracy. You are now ready to conclude your blog and summarize your main findings and future work.
6. Conclusion and Future Work
In this blog, you have learned how to implement linear regression with TensorFlow and Keras from scratch. You have covered the following topics:
- The theory and implementation of linear regression, including the mathematical formulation, the loss function, the optimization algorithm, and the model creation and compilation.
- The dataset of Boston housing prices, including its attributes, its loading, its splitting, its scaling, and its reshaping.
- The data preprocessing and visualization, including the use of matplotlib and seaborn to plot the distribution, the correlation, and the relationship of the input features and the output values.
- The model training and evaluation, including the use of the fit, evaluate, and predict methods, the learning curves, the test loss and MAE, the predictions, and the R-squared value.
By following this blog, you have gained a solid understanding of the basics of linear regression and how to apply it to a simple dataset using TensorFlow and Keras. You have also learned how to use some tools and techniques to preprocess and visualize the data, as well as to measure and visualize the model’s performance and accuracy.
However, this blog is not the end of your learning journey. There are many ways to improve and extend your knowledge and skills in linear regression and deep learning. Here are some suggestions for future work:
- Try different input features and output values to see how they affect the model’s performance and accuracy. For example, you can use different combinations of the 13 input features, or you can use a different output value, such as the crime rate or the pupil-teacher ratio.
- Try different loss functions and optimization algorithms to see how they affect the model’s convergence and robustness. For example, you can use the mean absolute error (MAE) or the root mean squared error (RMSE) loss functions, or you can use the Adam or the RMSprop optimizers.
- Try different hyperparameters and regularization techniques to see how they affect the model's overfitting and generalization. For example, you can change the number of epochs, the batch size, or the learning rate, or you can add some dropout or L2 regularization to the model (see the sketch after this list).
- Try different datasets and problems to see how linear regression can be applied to different domains and scenarios. For example, you can use the diabetes dataset or the California housing dataset, which are also available in the sklearn.datasets module, or the wine quality dataset from the UCI repository.
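As a starting point for the optimizer and regularization experiments above, here is a hedged sketch of an alternative model configuration; the L2 factor of 0.01 and the choice of Adam's default settings are illustrative assumptions, not recommendations from this tutorial:

```python
from tensorflow import keras

# Linear regression with L2 (ridge) regularization on the weights,
# trained with the Adam optimizer instead of plain SGD
model = keras.Sequential([
    keras.layers.Dense(1, input_shape=(13,),
                       kernel_regularizer=keras.regularizers.l2(0.01))
])
model.compile(loss='mse', optimizer='adam', metrics=['mae'])
```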
We hope you enjoyed this blog and learned something new and useful. Thank you for reading and happy coding!