# Linear Regression Using TensorFlow with Examples

TensorFlow is a popular open-source library for large-scale numerical computation in machine learning and deep learning. One major reason for its wide adoption is that it offers APIs in several programming languages. These APIs make models easier to build and can speed up project execution. With the TensorFlow APIs, you can control your computations in detail and train many machine learning models efficiently. The TensorFlow APIs can be broadly classified into two groups: low-level APIs and high-level APIs.

Low-level APIs are more elementary and detailed, giving you full control over your functions and computations; they let you build and optimize a model from scratch. High-level APIs, on the other hand, are simpler and come with predefined functions. Computations that would take many lines of code with a low-level API can often be written in a single statement. Simply put, high-level APIs are easier to use.

Some common high-level APIs include Keras and the estimator toolbox (tf.estimator). In this tutorial, we will use the estimator toolbox to build, train, and evaluate a machine learning model. We will narrow our focus to the linear regression algorithm and look at the various ways to build the model.

By the end of this tutorial, you will learn:

- What linear regression is
- How to build a line of best fit with Python
- How to build a linear regression model with TensorFlow
- What feature columns are
- What an input function is and why it is necessary
- What batch size, epochs, and steps are
- How to feed data to your TensorFlow model using a pandas DataFrame
- How to feed data to your TensorFlow model using NumPy arrays and dictionaries

Let’s begin with understanding what linear regression is.

**Introduction to Linear Regression**

A linear regression model is a model that is used to show how two variables are related. The linear regression algorithm seeks to find a line that best fits the two variables and can be used to predict the output of one variable given the other variable. The variable that is being predicted is called the dependent variable whereas the variable that is needed for the prediction is called the independent variable. This kind of model with two variables is called a simple linear regression model.

A system can, however, have more than two variables. When more than one independent variable is involved, the model is called a multiple linear regression model.

Let’s say we are dealing with a simple linear regression model. The independent variable is conventionally denoted by x while the dependent variable is denoted by y. The line that best describes how x is related to y is given by the formula,

**y = mx + b**

Where

- y is the dependent variable
- x is the independent variable
- m is the weight (slope) of the regression line
- b is the bias (intercept) of the regression line

The slope m can be found from any two points (x_{1}, y_{1}) and (x_{2}, y_{2}) on the line using the formula m = (y_{2} - y_{1}) / (x_{2} - x_{1}).
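As a quick illustration of the slope formula, the sketch below computes m and b for the line through two points. The function name line_through and the sample points are made up for this example and are not part of the tutorial's code.

```python
def line_through(p1, p2):
    """Return the slope m and bias b of the line through points p1 and p2."""
    (x1, y1), (x2, y2) = p1, p2
    if x2 == x1:
        raise ValueError("vertical line: the slope is undefined")
    m = (y2 - y1) / (x2 - x1)  # rise over run
    b = y1 - m * x1            # solve y1 = m * x1 + b for b
    return m, b


# Illustrative points (1, 3) and (3, 7)
m, b = line_through((1.0, 3.0), (3.0, 7.0))
print(m, b)  # 2.0 1.0
```

Once m and b are known, the value of y for any x follows from y = mx + b.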

Sometimes, an error term β is added to the formula to account for the fact that x and y rarely have a perfectly linear relationship. The equation of the regression line then becomes

*y = mx + b + β*

Where it is omitted, the equation implies that knowing x, m, and b is sufficient to determine the value of y.

The slope of the regression line indicates whether the relationship between the dependent and independent variables is positive, negative, or non-existent.

- If the regression line is flat, so that its slope is zero, there is no linear relationship between the dependent and independent variables. In other words, an increase in one variable does not affect the other variable.
- If the regression line slopes downwards from left to right, there is a negative relationship between the dependent and independent variables. In other words, as one variable increases, the other decreases.
- If the regression line slopes upwards from left to right, there is a positive relationship between the dependent and independent variables. In other words, as one variable increases, the other increases as well.
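The three cases above can be checked numerically. The following is a minimal sketch that fits a straight line with NumPy's polyfit and reads off the sign of the slope; the helper name describe_relationship, the tolerance, and the toy data are assumptions made for illustration.

```python
import numpy as np


def describe_relationship(x, y, tol=1e-9):
    """Classify the relationship between x and y by the sign of the fitted slope."""
    slope = np.polyfit(x, y, 1)[0]  # degree-1 fit returns [slope, intercept]
    if slope > tol:
        return "positive"
    if slope < -tol:
        return "negative"
    return "none"


x = np.array([0.0, 1.0, 2.0, 3.0])
print(describe_relationship(x, 2 * x + 1))             # positive
print(describe_relationship(x, -3 * x + 5))            # negative
print(describe_relationship(x, np.full_like(x, 4.0)))  # none
```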

Now that we have a good understanding of what linear regression is about, let’s see how to build one in Python.

**Creating a Linear Regression Line in Python**

We can create the line of best fit between two variables using Python. First, we need to create some random data for the x and y axes. To do this, we need the NumPy and Matplotlib libraries.

    # import the necessary libraries
    import numpy as np
    from matplotlib import pyplot as plt

We will define a random seed so that the randomly generated numbers remain the same each time the program is run. This is good practice because it makes the results reproducible.

    # define a random seed
    np.random.seed(42)

At this point, we can now create our randomly generated data for both the x and y axis alongside some noise.

    # create random points for the x and y axis from 0 to 20
    xs = np.linspace(0, 20, 50)
    ys = np.linspace(20, 0, 50)
    # add some positive and negative noise on the y axis
    ys += np.random.uniform(-2, 2, 50)
    ys += np.random.uniform(-2, 2, 50)

Plotting the graph, we have

    # plot the graph
    plt.plot(xs, ys, '.')

Using the least-squares estimates of the slope and intercept, we can calculate the weight and bias of the regression line and plot the line of best fit.

    # define the weight of the regression line
    m = (((np.mean(xs) * np.mean(ys)) - np.mean(xs * ys)) /
         ((np.mean(xs) ** 2) - np.mean(xs ** 2)))
    # define the bias of the regression line
    b = np.mean(ys) - (m * np.mean(xs))
    # define the equation of the regression line
    regression_line = [(m * x) + b for x in xs]
    # plot the graph
    plt.plot(xs, ys, '.')
    plt.plot(xs, regression_line)

Output:

And there you have it: the regression line. The red line represents the line of best fit for the randomly generated data.

But this is a simple scenario with just one dependent and one independent variable. In many real-life situations, you will be dealing with more than one independent variable, and sometimes more than one dependent variable. For instance, the popular iris dataset has 4 independent variables (sepal length, petal length, sepal width, and petal width) that are used to determine the dependent variable (the species of the flower).

In such cases, determining the line of best fit with the above method is difficult. It is not even possible to visualize the data directly, since it has 5 variables in total (we can visualize at most 3 dimensions). Thus, a more sophisticated approach that involves training and evaluating the model is employed to determine the regression line. Let’s understand how this works.
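As a sanity check, the closed-form weight and bias can be compared with NumPy's own least-squares fit. This is a minimal sketch that regenerates the same kind of data and uses np.polyfit as the reference; the variable names m_ref and b_ref are introduced here for illustration only.

```python
import numpy as np

# Recreate the noisy, downward-sloping data used above
np.random.seed(42)
xs = np.linspace(0, 20, 50)
ys = np.linspace(20, 0, 50)
ys += np.random.uniform(-2, 2, 50)
ys += np.random.uniform(-2, 2, 50)

# Closed-form least-squares weight and bias
m = ((np.mean(xs) * np.mean(ys)) - np.mean(xs * ys)) / ((np.mean(xs) ** 2) - np.mean(xs ** 2))
b = np.mean(ys) - m * np.mean(xs)

# Reference fit: a degree-1 polynomial returns [slope, intercept]
m_ref, b_ref = np.polyfit(xs, ys, 1)

print(np.isclose(m, m_ref), np.isclose(b, b_ref))
```

Both comparisons should agree, since the two approaches solve the same least-squares problem.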

**How Training a Linear Regression Model Works**

Let’s say we are dealing with the iris dataset. The independent variables are the sepal length, petal length, sepal width, and petal width, while the dependent variable is the species of the flower. In other words, to predict the species of the flower, you need the sepal length, petal length, sepal width, and petal width. The equation for the linear regression line is therefore

y = m1(sepal length) + m2(petal length) + m3(sepal width) + m4(petal width) + b + β

Where m1, m2, m3, and m4 are the weights, b is the bias, and β is the error term.

During training, the linear regression algorithm initializes the weights and bias with random numbers and computes the predicted value for every observation in the data. The error in each prediction is then calculated by subtracting the predicted value from the actual value.
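With several independent variables, the equation above is just a weighted sum, which can be sketched as a matrix-vector product. The weights, bias, and measurements below are made-up numbers for illustration, not fitted values, and the error term is left out.

```python
import numpy as np


def predict(features, weights, bias):
    """Compute y = m1*x1 + m2*x2 + ... + b for every row of features."""
    return features @ weights + bias


# Hypothetical rows: sepal length, petal length, sepal width, petal width
features = np.array([[5.1, 1.4, 3.5, 0.2],
                     [6.2, 4.5, 2.9, 1.5]])
weights = np.array([0.1, 0.2, -0.1, 0.3])  # m1..m4 (illustrative only)
bias = 0.5                                 # b (illustrative only)

print(predict(features, weights, bias))  # one prediction per row
```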

**Error = y_{actual} - y_{pred}**

The goal is to make the error as small as possible. The function that measures this error is called the cost function. For linear regression problems, the commonly used cost function is the mean of the squared errors. This value is called the mean squared error (MSE) and is mathematically represented as

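$$\text{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(T x_i - y_i\right)^2$$

Here n is the number of observations, T x_i is the predicted value for the i-th observation, and y_i is the actual value. T is the weight, which is altered continuously until the MSE is as small as possible. But how is the weight altered?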
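Before turning to that question, the cost function itself can be sketched in a few lines of NumPy. The function name mean_squared_error and the sample values are chosen for illustration.

```python
import numpy as np


def mean_squared_error(y_actual, y_pred):
    """Average of the squared differences between predictions and actual values."""
    y_actual = np.asarray(y_actual, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_pred - y_actual) ** 2)


# Errors are -1, 0 and 2, so the squared errors are 1, 0 and 4
print(mean_squared_error([3, 5, 7], [2, 5, 9]))  # 5/3, about 1.667
```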

After the mean squared error has been computed, the weights are corrected using an optimizer. There are many optimizers, but the most common is gradient descent. Gradient descent computes the derivative (the gradient) of the cost function with respect to each weight, which measures how a change in the weight affects the error. If the gradient is positive, the weight needs to be reduced. If the gradient is negative, the weight must be increased. The process is repeated with the updated weights until the derivative is very close to zero. Each repetition is called an iteration, and the point where the derivative is approximately zero is called a local minimum.

But there’s one thing to note in this process: what determines how much the weight should change after each iteration? The idea is best explained with the analogy of a man walking down a hill. If he takes only giant strides, he will most likely never reach the lowest part of the hill, because he keeps overshooting it once he is close. On the other hand, taking only baby steps means it will take a long time to reach the bottom. The best approach is to take giant strides at the start and shorten them as he gets closer to the bottom.

Bringing it back to the machine learning algorithm, the size of each weight change is determined by the learning rate. The learning rate controls how much the weights are changed in each iteration. If the learning rate is too large, gradient descent may overshoot and never settle at the local minimum. If it is too small, it will take a long time to get there.
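The update rule described above can be sketched for a simple linear regression with one weight and one bias. This is a minimal illustration rather than TensorFlow's optimizer: the weights start at zero instead of random values so the run is repeatable, and the learning rate, iteration count, and toy data are assumptions.

```python
import numpy as np


def gradient_descent(x, y, learning_rate=0.05, iterations=2000):
    """Fit y = m*x + b by repeatedly stepping against the gradient of the MSE."""
    m, b = 0.0, 0.0
    n = len(x)
    for _ in range(iterations):
        error = (m * x + b) - y                 # predicted minus actual
        grad_m = (2.0 / n) * np.sum(error * x)  # dMSE/dm
        grad_b = (2.0 / n) * np.sum(error)      # dMSE/db
        # A positive gradient lowers the weight, a negative one raises it
        m -= learning_rate * grad_m
        b -= learning_rate * grad_b
    return m, b


# Noise-free toy data generated from y = 3x + 1
x = np.linspace(0, 2, 20)
y = 3.0 * x + 1.0
m, b = gradient_descent(x, y)
print(m, b)  # close to 3 and 1
```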
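The effect of the learning rate can be seen by running the same gradient descent loop with different values. In the sketch below, the three learning rates, the 100-iteration budget, and the toy data are all assumptions chosen to make the contrast visible.

```python
import numpy as np


def final_loss(learning_rate, iterations=100):
    """Run gradient descent on toy data and return the MSE after the last step."""
    x = np.linspace(0, 2, 20)
    y = 3.0 * x + 1.0
    m, b = 0.0, 0.0
    for _ in range(iterations):
        error = (m * x + b) - y
        m -= learning_rate * 2.0 * np.mean(error * x)
        b -= learning_rate * 2.0 * np.mean(error)
    return np.mean(((m * x + b) - y) ** 2)


# Too small, reasonable, and too large for this particular problem
losses = {lr: final_loss(lr) for lr in (0.0005, 0.1, 0.5)}
for lr, loss in losses.items():
    print(f"learning rate {lr}: MSE after 100 iterations = {loss:.6g}")
```

On this data the smallest rate barely reduces the loss within the budget, the middle rate gets close to zero, and the largest rate overshoots so badly that the loss grows instead of shrinking.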

Source: Builtin

Your learning rate must be chosen carefully so that the cost function decreases rapidly in the first few iterations and then stabilizes, as seen in the figure below.

In the figure above, we can see that the loss stabilizes after the 600th iteration. That means the algorithm found the local minimum after tweaking the weights 600 times. The model has learned from the data and is ready to make predictions on completely new data.

**Training a Linear Regression Model with TensorFlow (Example)**

In this section, we will train a linear regression model using the TensorFlow estimator API, tf.estimator. We will use the popular Boston housing dataset for this example, imported from the scikit-learn datasets module. Note that load_boston was removed in scikit-learn 1.2 and tf.estimator.inputs is a TensorFlow 1.x API, so the code below needs older versions of both libraries.

We will start by importing the necessary libraries.

    # import necessary libraries
    import pandas as pd
    import numpy as np
    import tensorflow as tf
    from sklearn.datasets import load_boston

Then we load the dataset.

    # load the dataset
    boston = load_boston()

The loaded dataset is a dictionary-like object whose keys name the pieces of information that can be extracted from it. To check what the dataset is about, we use the DESCR attribute.

    # check the description of the dataset
    print(boston.DESCR)

Output:

    Boston house prices dataset
    ---------------------------

    **Data Set Characteristics:**

        :Number of Instances: 506

        :Number of Attributes: 13 numeric/categorical predictive. Median Value (attribute 14) is usually the target.

        :Attribute Information (in order):
            - CRIM     per capita crime rate by town
            - ZN       proportion of residential land zoned for lots over 25,000 sq.ft.
            - INDUS    proportion of non-retail business acres per town
            - CHAS     Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
            - NOX      nitric oxides concentration (parts per 10 million)
            - RM       average number of rooms per dwelling
            - AGE      proportion of owner-occupied units built prior to 1940
            - DIS      weighted distances to five Boston employment centres
            - RAD      index of accessibility to radial highways
            - TAX      full-value property-tax rate per $10,000
            - PTRATIO  pupil-teacher ratio by town
            - B        1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town
            - LSTAT    % lower status of the population
            - MEDV     Median value of owner-occupied homes in $1000's

        :Missing Attribute Values: None

        :Creator: Harrison, D. and Rubinfeld, D.L.

    This is a copy of UCI ML housing dataset.
    https://archive.ics.uci.edu/ml/machine-learning-databases/housing/

As seen above, the dataset gives the median value of owner-occupied homes in $1000’s together with attributes such as the per capita crime rate by town, the proportion of residential land zoned for lots over 25,000 square feet, and others. Let’s see what the dataset looks like.

    # convert the dataset into a dataframe
    df = pd.DataFrame(boston.data, columns=boston.feature_names)
    # print the first 5 rows of the dataframe
    print(df.head())

Output:

          CRIM    ZN  INDUS  CHAS    NOX     RM   AGE     DIS  RAD    TAX  \
    0  0.00632  18.0   2.31   0.0  0.538  6.575  65.2  4.0900  1.0  296.0
    1  0.02731   0.0   7.07   0.0  0.469  6.421  78.9  4.9671  2.0  242.0
    2  0.02729   0.0   7.07   0.0  0.469  7.185  61.1  4.9671  2.0  242.0
    3  0.03237   0.0   2.18   0.0  0.458  6.998  45.8  6.0622  3.0  222.0
    4  0.06905   0.0   2.18   0.0  0.458  7.147  54.2  6.0622  3.0  222.0

       PTRATIO       B  LSTAT
    0     15.3  396.90   4.98
    1     17.8  396.90   9.14
    2     17.8  392.83   4.03
    3     18.7  394.63   2.94
    4     18.7  396.90   5.33

The target column is stored separately and needs to be added to the dataframe. This is done by assigning it to a new column.

    # add the target column to the dataframe
    df['MEDV'] = boston.target
    # print the first 5 rows of the data frame
    print(df.head())

Output:

          CRIM    ZN  INDUS  CHAS    NOX     RM   AGE     DIS  RAD    TAX  \
    0  0.00632  18.0   2.31   0.0  0.538  6.575  65.2  4.0900  1.0  296.0
    1  0.02731   0.0   7.07   0.0  0.469  6.421  78.9  4.9671  2.0  242.0
    2  0.02729   0.0   7.07   0.0  0.469  7.185  61.1  4.9671  2.0  242.0
    3  0.03237   0.0   2.18   0.0  0.458  6.998  45.8  6.0622  3.0  222.0
    4  0.06905   0.0   2.18   0.0  0.458  7.147  54.2  6.0622  3.0  222.0

       PTRATIO       B  LSTAT  MEDV
    0     15.3  396.90   4.98  24.0
    1     17.8  396.90   9.14  21.6
    2     17.8  392.83   4.03  34.7
    3     18.7  394.63   2.94  33.4
    4     18.7  396.90   5.33  36.2

You can feed your data to a tf.estimator model in various ways. In this tutorial, we will use two of them: a pandas DataFrame and NumPy arrays. There are other ways to feed data into your TensorFlow model, but we will limit this tutorial to these two. Let’s begin with pandas.

**Using Pandas**

Step 1: Specify the feature and target columns

For easy accessibility, the dataset will be split into columns containing the features and the column containing the target.

    # define the feature columns and target column
    features = df[boston.feature_names]
    target = 'MEDV'

Step 2: Define the estimator

Before you define your model or estimator, TensorFlow requires you to define what are called feature columns. Feature columns describe how the raw data should be transformed into a form the TensorFlow estimator can understand. You may see feature columns as a bridge between the raw data and the estimator. Note that only the features need to be passed, not the target column.

You can define feature columns with the tf.feature_column module. Since all the columns contain continuous numbers, the numeric_column() function will be used. To define all the columns in one line of code, you can use a list comprehension.

    # convert the feature columns into TensorFlow numeric columns
    feature_columns = [tf.feature_column.numeric_column(i) for i in features]

If, however, the dataset contains categorical features, you will need to convert them using other functions such as categorical_column_with_vocabulary_list() or indicator_column().

After defining the feature columns, you can define your estimator. The estimator takes two arguments here: the feature columns (which we just defined) and the model directory, where the model parameters and graph will be stored. We will name the model directory ‘LinRegTrain’.

Among the pre-made estimators in tf.estimator are the following six: three for regression problems and three for classification problems.
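Conceptually, combining a vocabulary list with an indicator column turns each category into a one-hot vector. The sketch below shows that idea in plain NumPy; it is not TensorFlow's implementation, and the function name one_hot and the sample labels are hypothetical.

```python
import numpy as np


def one_hot(values, vocabulary):
    """Encode each value as a row with a 1.0 in the position of its category."""
    index = {label: i for i, label in enumerate(vocabulary)}
    encoded = np.zeros((len(values), len(vocabulary)))
    for row, value in enumerate(values):
        encoded[row, index[value]] = 1.0
    return encoded


# Hypothetical categorical feature with the vocabulary ["no", "yes"]
print(one_hot(["yes", "no", "yes"], ["no", "yes"]))
```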

For regression problems, you may select

1. LinearRegressor

2. DNNRegressor

3. DNNLinearCombinedRegressor

For classification problems, you may select

1. LinearClassifier

2. DNNClassifier

3. DNNLinearCombinedClassifier

For this tutorial, we shall use the LinearRegressor estimator, which is created with tf.estimator.LinearRegressor.

    # define the linear regression estimator
    estimator = tf.estimator.LinearRegressor(feature_columns=feature_columns, model_dir='LinRegTrain')

Output:

    INFO:tensorflow:Using default config.
    INFO:tensorflow:Using config: {'_model_dir': 'LinRegTrain', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true graph_options { rewrite_options { meta_optimizer_iterations: ONE } } , '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x00000146D467B908>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}

Step 3: Split the data into train and test data

We need to split the data into train and test data. The model is trained on the training set and evaluated on the test set. First, you define a training size, which is set to 75% of the observations. The train and test data are then selected using the iloc indexer.

    # define the training size of the dataset to be 75%
    training_size = int(len(df) * 0.75)
    # define the train and test data
    train = df.iloc[:training_size, :]
    test = df.iloc[training_size:, :]
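The split above takes the first 75% of the rows in their stored order. A common variation, which this tutorial does not use, is to shuffle the row indices before splitting so that both sets are drawn from the whole dataset. The sketch below shows that variation with NumPy; the function name and the seed are assumptions.

```python
import numpy as np


def train_test_split_indices(n_rows, train_fraction=0.75, seed=42):
    """Return shuffled row indices for the train and test sets."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(n_rows)
    training_size = int(n_rows * train_fraction)
    return indices[:training_size], indices[training_size:]


# The Boston dataset has 506 rows
train_idx, test_idx = train_test_split_indices(506)
print(len(train_idx), len(test_idx))  # 379 127
```

With a DataFrame such as df, the rows could then be selected with df.iloc[train_idx] and df.iloc[test_idx].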

Step 4: Train the model.

The next step is to train the model. This is done with the estimator.train() method.

The method takes an input function, which in turn is configured with the X data (features), the y data (labels), the batch size, the number of epochs, shuffling, and so on. These terms may sound alien, so let’s demystify them one at a time.

- Input function: TensorFlow estimators take in data through what is called an input function. TensorFlow deals with tensors, so all forms of data (streaming data, in-memory data, custom data, etc.) must be converted into tensors. The input function generates tensors from the raw data and supplies them to the estimator. It also configures how the data is fed during training or evaluation, which is why you also need to define the batch size, the number of epochs, shuffling, etc. We’ll discuss these terms shortly.

The input data for the input function can be either a NumPy array or a pandas DataFrame. In this method, the input data is a pandas DataFrame. We’ll discuss how to use NumPy in the next method.

- Batch size: TensorFlow was designed to accommodate large datasets and parallel computing. When working with a large dataset, you can train the model on several computers (parallel computing) or on a single computer if you don’t have the resources. If you’re using one computer, it may be impossible to expose your model to all the data at once because the computer would run out of memory. This is where the batch size comes into play.

The batch size determines how many observations are fed to the model at a time. If you have data with 5000 observations and you define a batch size of 100, the data will be split into 50 batches. That means 100 observations will be fed to your model per iteration.

- Epoch: An epoch is one full pass of the model over all the data. If the number of epochs is set to 2, the model will be exposed to the data twice. The second time, it starts from the weights updated during the first pass, with the aim of reducing the loss further.

In the input function we define below, the number of epochs defaults to None. With None, the input function keeps cycling through the data indefinitely, so training has to be bounded in another way: the steps parameter. The number of steps is simply the number of iterations you want.

If you have 5000 observations and you set the batch size to 100, it will take 50 iterations to see all the data. Setting the number of epochs to 4 will require 4 × 50 = 200 iterations. Another way of doing this is to set the steps argument to 200 and leave the number of epochs as None. Training will then run for 200 iterations.

- Shuffling: It is a good idea to shuffle your data during training. This ensures the model does not pick up patterns that come only from the order of the rows. If it learns such patterns, the model may fit the training data well yet make poor predictions on new data. This is called overfitting and it must be avoided.

Now that we understand the parameters used when creating an input function, let’s create one. Remember that we are using pandas_input_fn here.

    def input_fn(dataset, batch_size=128, num_epochs=None, shuffle=True):
        '''This function defines the input function for the dataset'''
        return tf.estimator.inputs.pandas_input_fn(
            x=dataset[boston.feature_names],
            y=dataset['MEDV'],
            batch_size=batch_size,
            num_epochs=num_epochs,
            shuffle=shuffle
        )
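The relationship between batch size, epochs, steps, and shuffling can be sketched with a small batch generator. This is an illustration in plain NumPy, not how TensorFlow's input functions are implemented; the name iterate_batches and the seed are assumptions.

```python
import math

import numpy as np


def iterate_batches(n_rows, batch_size, num_epochs, shuffle=True, seed=0):
    """Yield arrays of row indices, one batch at a time, for each epoch."""
    rng = np.random.default_rng(seed)
    for _ in range(num_epochs):
        # Reshuffle the row order at the start of every epoch
        order = rng.permutation(n_rows) if shuffle else np.arange(n_rows)
        for start in range(0, n_rows, batch_size):
            yield order[start:start + batch_size]


n_rows, batch_size, num_epochs = 5000, 100, 4
steps = sum(1 for _ in iterate_batches(n_rows, batch_size, num_epochs))
print(steps)                                        # 200
print(math.ceil(n_rows / batch_size) * num_epochs)  # 200
```

Each epoch yields 50 batches of 100 observations, so 4 epochs amount to 200 steps.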

And then, we can finally train the model.

    # train the model with 2000 steps
    estimator.train(input_fn=input_fn(train, num_epochs=None), steps=2000)

Output:

    INFO:tensorflow:Calling model_fn.
    INFO:tensorflow:Done calling model_fn.
    INFO:tensorflow:Create CheckpointSaverHook.
    INFO:tensorflow:Graph was finalized.
    INFO:tensorflow:Restoring parameters from LinRegTrain\model.ckpt-19081
    INFO:tensorflow:Running local_init_op.
    INFO:tensorflow:Done running local_init_op.
    INFO:tensorflow:Saving checkpoints for 19081 into LinRegTrain\model.ckpt.
    INFO:tensorflow:loss = 2068.8728, step = 19082
    INFO:tensorflow:global_step/sec: 159.792
    INFO:tensorflow:loss = 3538.3533, step = 19182 (0.631 sec)
    INFO:tensorflow:global_step/sec: 200.115
    INFO:tensorflow:loss = 1659.8722, step = 19282 (0.500 sec)
    INFO:tensorflow:global_step/sec: 203.782
    INFO:tensorflow:loss = 3306.3003, step = 19382 (0.491 sec)
    INFO:tensorflow:global_step/sec: 207.158
    INFO:tensorflow:loss = 3048.774, step = 19482 (0.483 sec)
    INFO:tensorflow:global_step/sec: 206.305
    INFO:tensorflow:loss = 3164.751, step = 19582 (0.485 sec)
    INFO:tensorflow:global_step/sec: 208.453
    INFO:tensorflow:loss = 2899.857, step = 19682 (0.481 sec)
    INFO:tensorflow:global_step/sec: 206.728
    INFO:tensorflow:loss = 3106.3613, step = 19782 (0.482 sec)
    INFO:tensorflow:global_step/sec: 200.517
    INFO:tensorflow:loss = 2640.4854, step = 19882 (0.499 sec)
    INFO:tensorflow:global_step/sec: 202.545
    INFO:tensorflow:loss = 2857.7683, step = 19982 (0.494 sec)
    INFO:tensorflow:global_step/sec: 204.616
    INFO:tensorflow:loss = 2167.958, step = 20082 (0.489 sec)
    INFO:tensorflow:global_step/sec: 204.198
    INFO:tensorflow:loss = 2442.528, step = 20182 (0.491 sec)
    INFO:tensorflow:global_step/sec: 207.588
    INFO:tensorflow:loss = 2945.9646, step = 20282 (0.482 sec)
    INFO:tensorflow:global_step/sec: 210.646
    INFO:tensorflow:loss = 3567.1733, step = 20382 (0.477 sec)
    INFO:tensorflow:global_step/sec: 211.093
    INFO:tensorflow:loss = 3195.5977, step = 20482 (0.472 sec)
    INFO:tensorflow:global_step/sec: 205.035
    INFO:tensorflow:loss = 2235.641, step = 20582 (0.488 sec)
    INFO:tensorflow:global_step/sec: 208.453
    INFO:tensorflow:loss = 2183.0503, step = 20682 (0.480 sec)
    INFO:tensorflow:global_step/sec: 206.729
    INFO:tensorflow:loss = 3767.7236, step = 20782 (0.484 sec)
    INFO:tensorflow:global_step/sec: 198.135
    INFO:tensorflow:loss = 3152.8857, step = 20882 (0.507 sec)
    INFO:tensorflow:global_step/sec: 182.586
    INFO:tensorflow:loss = 2599.474, step = 20982 (0.546 sec)
    INFO:tensorflow:Saving checkpoints for 21081 into LinRegTrain\model.ckpt.
    INFO:tensorflow:Loss for final step: 2783.6982.

Step 5: Evaluate the model

Evaluating the model lets you check how well it can make predictions. Just as with training, you need an input function when evaluating the model.

It’s good practice to evaluate the model with the same input function you used for training, changing only the data passed to it: this time, the test data.

Let’s evaluate the model.

    # evaluate the model
    evaluation = estimator.evaluate(input_fn=input_fn(test, num_epochs=10, shuffle=True))

Output:

    INFO:tensorflow:Calling model_fn.
    INFO:tensorflow:Done calling model_fn.
    INFO:tensorflow:Starting evaluation at 2020-11-09T23:06:48Z
    INFO:tensorflow:Graph was finalized.
    INFO:tensorflow:Restoring parameters from LinRegTrain\model.ckpt-21081
    INFO:tensorflow:Running local_init_op.
    INFO:tensorflow:Done running local_init_op.
    INFO:tensorflow:Finished evaluation at 2020-11-09-23:06:49
    INFO:tensorflow:Saving dict for global step 21081: average_loss = 43.917557, global_step = 21081, label/mean = 14.948031, loss = 5577.53, prediction/mean = 18.892693
    INFO:tensorflow:Saving 'checkpoint_path' summary for global step 21081: LinRegTrain\model.ckpt-21081

The evaluation reports an average loss of about 43.9. This is the mean squared error per example, and since MEDV is expressed in $1000’s, its square root (about 6.6) corresponds to a typical prediction error of roughly $6,600. The larger loss value of 5577.53 is the squared error summed over each batch, not a dollar amount. To put this in perspective, let’s see the average price for a house according to the data.

    train['MEDV'].describe()

Output:

    count    379.000000
    mean      25.074406
    std        8.801969
    min       11.800000
    25%       19.400000
    50%       22.800000
    75%       28.700000
    max       50.000000
    Name: MEDV, dtype: float64

As seen above, the average price of a house in the training data is about $25,000 (MEDV is expressed in $1000’s). The model’s performance can still be improved by changing training parameters such as the number of epochs, batch size, etc.
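To relate the evaluation output to house prices, the reported average_loss can be converted into a root mean squared error. The short calculation below assumes that average_loss is the mean squared error per example, which is the default loss for LinearRegressor; the two numbers are copied from the outputs above.

```python
import math

average_loss = 43.917557  # average_loss from the evaluation output
mean_price = 25.074406    # mean of MEDV in the training data, in $1000's

rmse = math.sqrt(average_loss)  # back in the units of MEDV ($1000's)
print(round(rmse, 2))               # 6.63
print(round(rmse / mean_price, 2))  # 0.26
```

In other words, the typical prediction error is about $6,600, roughly a quarter of the average house price in the training data.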

**Using Numpy Arrays**

We can also feed the data into the TensorFlow estimator using NumPy arrays. We start by splitting the data into train and test data.

    # define the training size of the dataset to be 75%
    training_size = int(len(df) * 0.75)
    # define the train and test data
    train = df[:training_size].values
    test = df[training_size:].values

The values attribute converts the data frame into a NumPy array. The train and test data are then split into X_train, y_train, X_test, and y_test.

    def prepare_data(data):
        """This function splits the data array into X and y data"""
        X = data[:, :-1]
        y = data[:, -1]
        return X, y

    # define the X_train, y_train, X_test, y_test data
    X_train, y_train = prepare_data(train)
    X_test, y_test = prepare_data(test)

The feature columns can now be defined using the code below.

    # convert the feature columns into a TensorFlow numeric column
    feature_columns = [tf.feature_column.numeric_column('x', shape=X_train.shape[1:])]

And now, the estimator can be defined using the feature columns above.

    # define the linear regression estimator
    estimator = tf.estimator.LinearRegressor(feature_columns=feature_columns, model_dir='LinRegTrain1')

Output:

    INFO:tensorflow:Using default config.
    INFO:tensorflow:Using config: {'_model_dir': 'LinRegTrain1', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true graph_options { rewrite_options { meta_optimizer_iterations: ONE } } , '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x000001B878B46278>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}

Afterward, the input function for the train dataset is defined.

    # define the train input function
    train_input_fn = tf.estimator.inputs.numpy_input_fn(
        x={'x': X_train},
        y=y_train,
        batch_size=128,
        num_epochs=None,
        shuffle=True,
    )

Now, the model can be trained with the input function defined above.

    # train the model
    estimator.train(input_fn=train_input_fn, steps=5000)

Output:

    INFO:tensorflow:Calling model_fn.
    INFO:tensorflow:Done calling model_fn.
    INFO:tensorflow:Create CheckpointSaverHook.
    INFO:tensorflow:Graph was finalized.
    INFO:tensorflow:Restoring parameters from LinRegTrain1\model.ckpt-31500
    INFO:tensorflow:Running local_init_op.
    INFO:tensorflow:Done running local_init_op.
    INFO:tensorflow:Saving checkpoints for 31500 into LinRegTrain1\model.ckpt.
    INFO:tensorflow:loss = 2298.775, step = 31501
    INFO:tensorflow:global_step/sec: 529.409
    INFO:tensorflow:loss = 3787.5515, step = 31601 (0.192 sec)
    INFO:tensorflow:global_step/sec: 637.311
    INFO:tensorflow:loss = 4150.9785, step = 31701 (0.157 sec)
    INFO:tensorflow:global_step/sec: 515.754
    INFO:tensorflow:loss = 2896.7476, step = 31801 (0.193 sec)
    INFO:tensorflow:global_step/sec: 592.061
    INFO:tensorflow:loss = 2679.3975, step = 31901 (0.169 sec)
    INFO:tensorflow:global_step/sec: 621.468
    INFO:tensorflow:loss = 2945.3008, step = 32001 (0.162 sec)
    INFO:tensorflow:global_step/sec: 671.53
    INFO:tensorflow:loss = 3135.522, step = 32101 (0.148 sec)
    INFO:tensorflow:global_step/sec: 645.523
    INFO:tensorflow:loss = 1737.3533, step = 32201 (0.155 sec)
    INFO:tensorflow:global_step/sec: 667.063
    INFO:tensorflow:loss = 2122.1785, step = 32301 (0.150 sec)
    INFO:tensorflow:global_step/sec: 625.357
    INFO:tensorflow:loss = 2111.141, step = 32401 (0.160 sec)
    INFO:tensorflow:global_step/sec: 667.048
    INFO:tensorflow:loss = 3291.2444, step = 32501 (0.151 sec)
    INFO:tensorflow:global_step/sec: 667.05
    INFO:tensorflow:loss = 4172.6313, step = 32601 (0.150 sec)
    INFO:tensorflow:global_step/sec: 667.041
    INFO:tensorflow:loss = 3981.296, step = 32701 (0.149 sec)
    INFO:tensorflow:global_step/sec: 662.638
    INFO:tensorflow:loss = 3058.733, step = 32801 (0.151 sec)
    INFO:tensorflow:global_step/sec: 690.052
    INFO:tensorflow:loss = 2693.0422, step = 32901 (0.146 sec)
    INFO:tensorflow:global_step/sec: 667.048
    INFO:tensorflow:loss = 3583.536, step = 33001 (0.151 sec)
    INFO:tensorflow:global_step/sec: 667.046
    INFO:tensorflow:loss = 2732.2446, step = 33101 (0.150 sec)
    INFO:tensorflow:global_step/sec: 694.842
    INFO:tensorflow:loss = 1517.0491, step = 33201 (0.143 sec)
    INFO:tensorflow:global_step/sec: 699.704
    INFO:tensorflow:loss = 3136.3606, step = 33301 (0.142 sec)
    INFO:tensorflow:global_step/sec: 637.307
    INFO:tensorflow:loss = 2074.9668, step = 33401 (0.157 sec)
    INFO:tensorflow:global_step/sec: 690.049
    INFO:tensorflow:loss = 2896.9585, step = 33501 (0.145 sec)
    INFO:tensorflow:global_step/sec: 645.534
    INFO:tensorflow:loss = 3446.4941, step = 33601 (0.156 sec)
    INFO:tensorflow:global_step/sec: 595.577
    INFO:tensorflow:loss = 3641.3157, step = 33701 (0.168 sec)
    INFO:tensorflow:global_step/sec: 602.755
    INFO:tensorflow:loss = 2726.9165, step = 33801 (0.166 sec)
    INFO:tensorflow:global_step/sec: 625.359
    INFO:tensorflow:loss = 2953.2432, step = 33901 (0.163 sec)
    INFO:tensorflow:global_step/sec: 617.635
    INFO:tensorflow:loss = 1818.0906, step = 34001 (0.158 sec)
    INFO:tensorflow:global_step/sec: 671.529
    INFO:tensorflow:loss = 3186.2725, step = 34101 (0.149 sec)
    INFO:tensorflow:global_step/sec: 667.047
    INFO:tensorflow:loss = 3824.3748, step = 34201 (0.150 sec)
    INFO:tensorflow:global_step/sec: 461.072
    INFO:tensorflow:loss = 2467.5938, step = 34301 (0.221 sec)
    INFO:tensorflow:global_step/sec: 446.707
    INFO:tensorflow:loss = 4125.1543, step = 34401 (0.220 sec)
    INFO:tensorflow:global_step/sec: 483.367
    INFO:tensorflow:loss = 2444.977, step = 34501 (0.212 sec)
    INFO:tensorflow:global_step/sec: 555.873
    INFO:tensorflow:loss = 2906.3044, step = 34601 (0.176 sec)
    INFO:tensorflow:global_step/sec: 581.73
    INFO:tensorflow:loss = 2992.641, step = 34701 (0.171 sec)
    INFO:tensorflow:global_step/sec: 456.846
    INFO:tensorflow:loss = 2374.501, step = 34801 (0.222 sec)
    INFO:tensorflow:global_step/sec: 529.453
    INFO:tensorflow:loss = 2264.888, step = 34901 (0.187 sec)
    INFO:tensorflow:global_step/sec: 481.045
    INFO:tensorflow:loss = 2630.542, step = 35001 (0.208 sec)
    INFO:tensorflow:global_step/sec: 529.405
    INFO:tensorflow:loss = 3225.4219, step = 35101 (0.190 sec)
    INFO:tensorflow:global_step/sec: 492.893
    INFO:tensorflow:loss = 1567.6238, step = 35201 (0.202 sec)
    INFO:tensorflow:global_step/sec: 431.279
    INFO:tensorflow:loss = 3448.526, step = 35301 (0.234 sec)
    INFO:tensorflow:global_step/sec: 629.291
    INFO:tensorflow:loss = 2485.2834, step = 35401 (0.157 sec)
    INFO:tensorflow:global_step/sec: 653.975
    INFO:tensorflow:loss = 2805.2188, step = 35501 (0.152 sec)
    INFO:tensorflow:global_step/sec: 667.048
    INFO:tensorflow:loss = 2969.5796, step = 35601 (0.151 sec)
    INFO:tensorflow:global_step/sec: 676.062
    INFO:tensorflow:loss = 2702.0142, step = 35701 (0.147 sec)
    INFO:tensorflow:global_step/sec: 641.384
    INFO:tensorflow:loss = 2972.1235, step = 35801 (0.157 sec)
    INFO:tensorflow:global_step/sec: 575.043
    INFO:tensorflow:loss = 3671.458, step = 35901 (0.175 sec)
    INFO:tensorflow:global_step/sec: 662.639
    INFO:tensorflow:loss = 2267.0298, step = 36001 (0.150 sec)
    INFO:tensorflow:global_step/sec: 704.633
    INFO:tensorflow:loss = 2972.4639, step = 36101 (0.141 sec)
    INFO:tensorflow:global_step/sec: 694.84
    INFO:tensorflow:loss = 2400.421, step = 36201 (0.146 sec)
    INFO:tensorflow:global_step/sec: 667.038
    INFO:tensorflow:loss = 2839.9067, step = 36301 (0.148 sec)
    INFO:tensorflow:global_step/sec: 680.673
    INFO:tensorflow:loss = 1892.6975, step = 36401 (0.147 sec)
    INFO:tensorflow:Saving checkpoints for 36500 into LinRegTrain1\model.ckpt.
    INFO:tensorflow:Loss for final step: 2119.7617.

To evaluate the model, you need to define another input function, this time using the test dataset.

    # define the test input function
    test_input_fn = tf.estimator.inputs.numpy_input_fn(
        x={'x': X_test},
        y=y_test,
        batch_size=128,
        num_epochs=10,
        shuffle=True
    )

Finally, we can evaluate the model to see how it performs.

    # evaluate the model
    estimator.evaluate(input_fn=test_input_fn)

Output:

    INFO:tensorflow:Calling model_fn.
    INFO:tensorflow:Done calling model_fn.
    INFO:tensorflow:Starting evaluation at 2020-11-10T01:12:56Z
    INFO:tensorflow:Graph was finalized.
    INFO:tensorflow:Restoring parameters from LinRegTrain1\model.ckpt-36500
    INFO:tensorflow:Running local_init_op.
    INFO:tensorflow:Done running local_init_op.
    INFO:tensorflow:Finished evaluation at 2020-11-10-01:12:56
    INFO:tensorflow:Saving dict for global step 36500: average_loss = 46.48516, global_step = 36500, label/mean = 14.948031, loss = 5903.6157, prediction/mean = 19.212402
    INFO:tensorflow:Saving 'checkpoint_path' summary for global step 36500: LinRegTrain1\model.ckpt-36500
    Out[70]: {'average_loss': 46.48516, 'label/mean': 14.948031, 'loss': 5903.6157, 'prediction/mean': 19.212402, 'global_step': 36500}

To round off, we have walked through building a TensorFlow model with a TensorFlow high-level API, tf.estimator. We started by explaining the theory behind linear regression and then created a line of best fit with Python.

We then built a linear regression model and trained it on the Boston housing dataset. In the example, we outlined the steps necessary to train and evaluate your model and pointed out parameters you can change to tweak its performance. In the next tutorial, we shall introduce data preprocessing techniques and show how to improve a model built with another TensorFlow high-level API, Keras.