The following code demonstrates supervised machine learning using linear regression, implemented with scikit-learn and plotted with matplotlib.
Note: Linear regression is a statistical method that models the relationship between a dependent variable and one or more independent variables by fitting a linear equation to the observed data. It is used both for analyzing the strength of the relationship between variables and for making predictions. For example, it can predict house prices based on the size of a home.
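Before turning to the scikit-learn example, here is a minimal sketch of what a fitted linear model actually computes. The slope, intercept, and house size below are made-up illustrative numbers, not values from any real dataset:

```python
# A linear model predicts the dependent variable from the independent one
# via: prediction = slope * feature + intercept.
# All numbers here are hypothetical, chosen only to illustrate the formula.
slope = 150.0        # hypothetical price increase per square metre
intercept = 50000.0  # hypothetical base price
size = 120.0         # size of the home in square metres

price = slope * size + intercept
print(price)  # 68000.0
```

Training a linear regression model simply means finding the slope and intercept that best fit the observed data; the code below does exactly that.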
# Jupyter-specific magic command that ensures that matplotlib
# plots appear inline (i.e., within the notebook).
%matplotlib inline
# import the necessary libraries
import matplotlib.pyplot as plt # for plotting
import numpy as np # for numerical operations
from sklearn.linear_model import LinearRegression # to create the linear regression model
#
# generate synthetic training data based on a linear relationship (with some random noise)
#
pool = np.random.RandomState(10) # create a random number generator
# (with a fixed seed for reproducibility)
x = 5 * pool.rand(30) # generate 30 random values between 0 and 5.
y = 3 * x - 2 + pool.rand(30) # the true underlying function is y = 3x - 2; we add uniform
                              # noise drawn from [0, 1) using another random array to make
                              # the data more realistic (like real-world measurements)
#
# create a Linear Regression model.
#
lregr = LinearRegression(fit_intercept=True) # fit_intercept=True (the default) tells the model
                                             # to learn the intercept as well as the slope; the
                                             # data was generated from y = 3x - 2 plus noise, so
                                             # the model needs an intercept term to fit it well
                                             # (fit_intercept=False would force the line through
                                             # the origin and bias the fit).
X = x[:, np.newaxis] # reshape the 1D array x into a 2D array (as required by scikit-learn)
lregr.fit(X, y) # train the model (fits the line) using the input features X and target values y
lspace = np.linspace(0, 5) # create 50 evenly spaced numbers between 0
# and 5 to use for plotting the prediction line
X_regr = lspace[:, np.newaxis] # reshape lspace into a 2D array
y_regr = lregr.predict(X_regr) # use the trained model to predict y
# values for these evenly spaced x values
plt.scatter(x, y); # plots the original noisy training data
plt.plot(X_regr, y_regr); # overlays the learned regression line on top of the data (prediction)
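After fitting, it is worth checking the learned parameters against the values used to generate the data. The sketch below repeats the fit in a self-contained script (the plotting calls are omitted); note that because the noise is drawn uniformly from [0, 1), its mean of roughly 0.5 shifts the estimated intercept from the true -2 toward about -1.5:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# regenerate the same synthetic data as above
pool = np.random.RandomState(10)
x = 5 * pool.rand(30)
y = 3 * x - 2 + pool.rand(30)

lregr = LinearRegression(fit_intercept=True)
lregr.fit(x[:, np.newaxis], y)

# coef_ holds the learned slope(s); intercept_ holds the learned intercept
print("slope:", lregr.coef_[0])       # close to the true slope of 3
print("intercept:", lregr.intercept_) # close to -2 + 0.5 = -1.5 (noise mean)
```

Comparing `coef_` and `intercept_` with the known generating function is a quick sanity check that the model has recovered the underlying linear relationship.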