TensorBoard: How to Use TensorBoard for Graph Visualization

One reason TensorFlow stands out as a library for building neural networks is its built-in support for model visualization. With TensorFlow, you can get a pictorial representation of how each operation flows into the next until a final output is returned. You can also view plots that show how the model learns over the course of training. This is especially useful for large models, where inspecting or debugging the code line by line becomes difficult. TensorBoard is the interface Google provides for visualizing the computational operations in a model, and it is the subject of this tutorial.

TensorBoard is installed alongside TensorFlow. Beyond drawing graphs, TensorBoard also helps with debugging and optimization: if your model is not behaving as you expect, TensorBoard can help you find out why.

In this tutorial, you will discover how to use TensorBoard to visualize the graph of your model. We will explain, with step-by-step examples, how to get TensorBoard running on your system and how to use it. We will start by visualizing the computational graph for a simple mathematical function, then take it a step further to build a neural network and visualize its graph.

By the end of the tutorial, you will have learned about:

Graphs in TensorFlow
An overview of TensorBoard
How neural networks work
How to use TensorBoard

If you’re ready, let’s jump right into it.

First, we need to understand what a computational graph, or ‘graph’ for short, is.

Graphs in TensorFlow

In TensorFlow, all computations are represented as a dataflow graph. Every operation performs a mathematical computation on its input tensors and returns another tensor. The graph simply shows the dependencies between the computations. In other words, the dataflow graph is a pictorial representation of the computations in a TensorFlow model that lets you see how they are connected.

A computational graph is made up of nodes and edges. The edges represent the tensors that flow through the graph, either into an operation or out of it. The nodes, on the other hand, represent the operations performed. A node receives a tensor or a combination of tensors as input, operates on them, and passes its output tensor along an edge to the next node. The graph is structured to reveal the connections between these nodes, or operations.
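To make the node and edge vocabulary concrete, here is a minimal plain-Python sketch of a dataflow graph for the small expression (a + b) * c. It is not TensorFlow code: the Node class, its fields, and the list_edges helper are hypothetical names made up for illustration. It only lists how operations are wired together and computes no values.

```python
# A toy dataflow graph in plain Python (illustrative only, not the TensorFlow API).
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """An operation in the graph; `inputs` are the nodes whose outputs feed it."""
    name: str
    op: str
    inputs: List["Node"] = field(default_factory=list)


def list_edges(output):
    """Walk back from the output node and collect (producer, consumer) name pairs."""
    found = []
    visited = set()
    stack = [output]
    while stack:
        node = stack.pop()
        if node.name in visited:
            continue
        visited.add(node.name)
        for parent in node.inputs:
            # Each edge stands for a tensor flowing from one operation to the next.
            found.append((parent.name, node.name))
            stack.append(parent)
    return sorted(found)


# Build the graph for (a + b) * c.
a = Node("a", "input")
b = Node("b", "input")
c = Node("c", "input")
add = Node("add", "Add", [a, b])
mul = Node("mul", "Mul", [add, c])

graph_edges = list_edges(mul)
for producer, consumer in graph_edges:
    print(f"{producer} -> {consumer}")
```

The printed pairs describe the structure only, which is the same kind of information the Graphs view in TensorBoard presents visually.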

It is important to point out that the computational graph does not reveal the output of each operation but rather helps to visualize how the operations are linked together. Let’s take an example.

Let’s say we want to visualize how the variables in the function f(x, y) = x^2 + y^2 + xy + 2x - 15 are connected. TensorBoard creates a graph that looks like this.


X and Y are the tensors while the circular nodes represent the operators. Let’s see how we can build this function using TensorFlow.

We begin by initializing the variables and the constants. We then combine them with the appropriate operations to define the function. The code below does this.

# import the necessary library
import tensorflow as tf

#create the x and y variables 
#x is given an initial value of 1
x = tf.get_variable('x', dtype=tf.int32, initializer=tf.constant([1]))
#y is given an initial value of 2
y = tf.get_variable('y', dtype=tf.int32, initializer=tf.constant([2]))

#create the constant
c = tf.constant([15], name='constant')
two = tf.constant([2], name='two')

#create the function
function = tf.pow(x, two) + tf.pow(y, two) + tf.multiply(x, y) + tf.multiply(two, x) - c

Notice that x and y were created as variables with the tf.get_variable() method, while tf.constant() was used to create the constants. For the sake of this example, x was initialized as 1 and y as 2. This was done using the initializer parameter of the tf.get_variable() method.

To define 2x, we created a TensorFlow constant of 2 and multiplied it with the x variable. After defining all the needed variables, we can now put all these together to create the function as seen in the last line.

Next, we need to run the graph inside a session, which we create with the tf.Session() class. We also create an initializer for the variables with the tf.global_variables_initializer() function. Finally, we run the initializer, write the graph to an event file, and evaluate the function to print the result.

#create an initializer 
init = tf.global_variables_initializer()

#create a session
with tf.Session() as sess:
    #initialize the x and y variable
    init.run()
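    #create a file writer that stores the graph in the "output" directory
    writer = tf.summary.FileWriter("output", sess.graph)
    #evaluate the function inside the session
    result = function.eval()
    #close the writer so the event file is flushed to disk
    writer.close()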
#print the result
print(result)

Output:
[-6]
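As a quick sanity check on that output, the same function can be evaluated in plain Python, with no TensorFlow involved. This is only a sketch to confirm the arithmetic; the helper name f and the terms dictionary are made up for illustration.

```python
# Plain-Python check of f(x, y) = x^2 + y^2 + xy + 2x - 15 at x = 1, y = 2.
def f(x, y):
    return x ** 2 + y ** 2 + x * y + 2 * x - 15


# The individual terms, written out for x = 1 and y = 2.
terms = {"x^2": 1 ** 2, "y^2": 2 ** 2, "xy": 1 * 2, "2x": 2 * 1, "constant": -15}

result = f(1, 2)
print(terms)
print(result)  # 1 + 4 + 2 + 2 - 15 = -6
```

The value matches the [-6] returned by the session, where it comes wrapped in a list because x and y were created as one-element tensors.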

NB: If this process seems above your head, please refer to our last tutorial, where we discussed in detail how to create constants and variables and how to run sessions in TensorFlow.

An Overview of TensorBoard

As mentioned earlier, TensorBoard is used to inspect the flow of tensors in your TensorFlow model and can help in debugging and optimizing your model. Its functions can be classified into two main parts.

TensorBoard for Graph Visualization
TensorBoard for writing summaries to help visualize learning.

In the TensorBoard user interface, the functions are divided into tabs:

Scalars: Shows scalar values such as accuracy and loss, along with other important information recorded during model training.
Graphs: Here, you can visualize the computational graph of your model, such as a neural network or a simple mathematical function.
Histograms: Here, you can see the distributions of the model’s training parameters, such as the weights, drawn as histograms.
Distributions: In this tab, you can visualize how the distribution of your model’s data, such as the weights of your neural network, changes over time.
Projector: A great place to view word embeddings and to apply Principal Component Analysis for dimensionality reduction.
Images: This tab is used for visualization of image data.
Audio: This tab is used for visualization of audio data.
Text: This tab is used for visualization of textual data.

We will discuss TensorBoard for graph visualization, with a neural network example. We will also discuss how to use the other tabs in TensorBoard for writing summaries. Let’s begin by visualizing the graph with TensorBoard.

First, a quick summary of how neural networks work is in order.

How Neural Networks Work

A neural network is simply a connection of neurons, stacked up in different layers. There is an input layer which receives the data and an output layer, which returns the output data. There may be a couple of hidden layers between the input and output layers whose function is to learn patterns in the data before getting to the output layer.

Source: CS231n

When input data enters a neuron, it is multiplied by a weight, which is initialized randomly. A second linear component, the bias, is then added to the product. The output of a node (or neuron) before activation is therefore input × weight + bias. This result is passed through an activation function, which determines what is passed on to the next layer. Various activation functions can be used; common ones include sigmoid, softmax, and ReLU.
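The computation inside a single neuron can be sketched in a few lines of plain Python. The input values, weights, and bias below are arbitrary numbers chosen for illustration, and the neuron helper is a hypothetical name rather than anything TensorFlow provides.

```python
import math


def relu(z):
    """ReLU activation: pass positive values through, clamp negatives to zero."""
    return max(0.0, z)


def sigmoid(z):
    """Sigmoid activation: squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias, activation):
    """Compute activation(sum(input * weight) + bias) for one neuron."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)


inputs = [1.0, 2.0]     # illustrative input values
weights = [0.5, -1.0]   # illustrative weights (normally initialized randomly)
bias = 0.5

# Weighted sum: 1.0 * 0.5 + 2.0 * (-1.0) + 0.5 = -1.0
relu_out = neuron(inputs, weights, bias, relu)
sigmoid_out = neuron(inputs, weights, bias, sigmoid)
print(relu_out, sigmoid_out)
```

The same weighted sum produces different outputs depending on the activation function: ReLU clamps the negative value to zero, while sigmoid maps it to a number between 0 and 1.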

After a layer returns its output, the process continues through the next layers until it reaches the final, or output, layer. This is called the feedforward process. The neural network then checks how close the predicted output is to the actual output in the data. The function that measures this difference is called the loss function, and the aim is to make the loss as low as possible; you can see it as the model trying to make fewer errors. This is done by changing the weights in each neuron. The size of each change made to the weights is controlled by the learning rate. The model goes back from the output layer to the input layer, adjusting the weights of each neuron to reduce the loss. This is called backpropagation.

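A single feedforward and backpropagation pass over one batch of data is called a training step (or iteration), and one complete pass over the whole training set is called an epoch. The model keeps adjusting the weights to reduce the loss, step after step and epoch after epoch. This is how a neural network learns.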
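The feedforward, loss, backpropagation, and update cycle can be illustrated with a deliberately tiny model: one weight, no bias, and two data points. This is a plain-Python sketch under those simplifying assumptions, not what TensorFlow does internally; the data, the learning rate of 0.1, and the starting weight are all made up for the example.

```python
# Toy training loop in plain Python: one weight, no bias, mean squared error.
xs = [1.0, 2.0]
ys = [3.0, 6.0]          # generated by y = 3x, so the ideal weight is 3
learning_rate = 0.1      # hypothetical value chosen for this toy problem
w = 0.0                  # starting weight (a real network would start randomly)
losses = []

for step in range(20):
    # Feedforward: compute predictions with the current weight.
    preds = [w * x for x in xs]
    # Loss: mean squared error between predictions and targets.
    loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(xs)
    losses.append(loss)
    # Backpropagation: gradient of the loss with respect to w.
    grad = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / len(xs)
    # Update: move the weight against the gradient, scaled by the learning rate.
    w -= learning_rate * grad

print(round(w, 4), losses[0], losses[-1])
```

The recorded losses shrink at every pass while the weight moves toward 3, which is the behaviour a healthy loss curve in TensorBoard reflects.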

Now, back to TensorBoard.

TensorBoard can be likened to a flashlight for checking what is going on under the hood of your neural network. You can see how your weights change during training, view a plot of the loss, and gain many other insights that guide the steps you take to improve your model. Suppose your learning rate is too large: the model may then overshoot and fail to converge to a minimum, so the loss never settles. Seeing this in TensorBoard tells you to reduce the learning rate, and lets you observe the effect of that change. The loss curve, for its part, shows how well the model is learning as training progresses, so you can determine very quickly whether your model is performing as you expect.
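The effect of an oversized learning rate is easy to reproduce on a toy problem. The sketch below minimizes loss(w) = w^2 with plain gradient descent, once with a small learning rate and once with a large one. The function name and both learning rates are assumptions picked for illustration.

```python
def run_gradient_descent(learning_rate, steps=10, w=1.0):
    """Minimise loss(w) = w**2 and return the loss recorded at every step."""
    losses = []
    for _ in range(steps):
        losses.append(w ** 2)
        w -= learning_rate * 2 * w   # the gradient of w**2 is 2w
    return losses


small_lr_losses = run_gradient_descent(0.1)   # each step shrinks w by a factor of 0.8
large_lr_losses = run_gradient_descent(1.1)   # each step multiplies w by -1.2

print("small learning rate:", small_lr_losses[0], "->", small_lr_losses[-1])
print("large learning rate:", large_lr_losses[0], "->", large_lr_losses[-1])
```

With the small rate the loss falls steadily, while with the large rate it grows at every step. A loss curve in the Scalars tab that climbs or oscillates like the second run is a common hint that the learning rate should be lowered.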

How to Use TensorBoard

We start by making sure TensorBoard is up and running on your machine. TensorBoard is installed alongside TensorFlow. If you do not have TensorFlow on your machine, please refer to our tutorial on how to install TensorFlow on your PC.

To visualize any model in TensorBoard, you need to save the model data on your PC. These files are called event files, and the data saved into them is called summary data. Shortly, we will create a TensorFlow model and save summary data into an event file. For now, it is enough to know that once the files have been saved, TensorBoard can read them when you type this command in your Anaconda Prompt.

tensorboard --logdir output

where output is the name of the directory in which you saved the event files. You may choose any other name, but make sure you pass that same name as the log directory. When you run the above command, you should get a message that looks like this:

TensorBoard 1.13.1 at http://dayvheed:6006 (Press CTRL+C to quit)

Copy the http://<url>:6006 address into any browser on your PC and open it. The TensorBoard dashboard should appear.

Now that you know how to launch TensorBoard, let’s create a neural network and visualize the model using TensorBoard.

We start by importing the necessary libraries. In this example, we will be using TensorFlow and NumPy. Afterward, we create the dataset by generating random numbers for the features and labels, otherwise called x and y. The random numbers are created using the np.random.sample() function.

#import the necessary libraries
import tensorflow as tf
import numpy as np
#make sure the random numbers created remains unchanged
np.random.seed(42)
#create a set of random numbers for your data features and target
X_train = np.random.sample((8000, 7))
y_train = np.random.sample((8000, 1))

We set a seed so that the randomly generated numbers stay the same across runs.

#check the shape of the X and Y data
X_train.shape, y_train.shape

Output:
((8000, 7), (8000, 1))

We need to put the data in a form that the model can work with. In TensorFlow, the tf.feature_column module helps to convert your input data into a form that can be used by a neural network or regression model. For categorical data, tf.feature_column.indicator_column is used to convert the categorical variable to a dummy (one-hot) variable. For numerical data, we use tf.feature_column.numeric_column to represent the numeric data in the form our model demands.

Since our data is numeric, we use the tf.feature_column.numeric_column() function for this process.

#store the data using the feature column method of TensorFlow
column_data = [tf.feature_column.numeric_column('x', shape=X_train.shape[1:])]

We will use the DNNRegressor to train our model. It is available as the tf.estimator.DNNRegressor class, whose signature is shown below.

tf.estimator.DNNRegressor(
    hidden_units,
    feature_columns,
    model_dir=None,
    label_dimension=1,
    weight_column=None,
    optimizer='Adagrad',
    activation_fn=tf.nn.relu,
    dropout=None,
    input_layer_partitioner=None,
    config=None,
    warm_start_from=None,
    loss_reduction='weighted_sum',
    batch_norm=False,
)
Docstring:     
A regressor for TensorFlow DNN models.

The required parameters are the hidden units and the feature column data. The hidden_units parameter lists the number of units (or nodes) in each hidden layer. If we want 3 hidden layers with 10, 4, and 1 units respectively, we pass in the list [10, 4, 1].
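To get a feel for what a hidden_units list implies, the plain-Python sketch below works out the weight-matrix shape of every fully connected layer and the total number of trainable parameters. It assumes 7 input features (the width of the dataset used in this tutorial) and one extra output layer of size 1 added after the hidden layers, matching label_dimension=1. Both helper functions are hypothetical names and not part of the TensorFlow API.

```python
def dense_layer_shapes(n_features, hidden_units, n_outputs=1):
    """Return the (inputs, outputs) weight shape of every fully connected layer."""
    sizes = [n_features] + list(hidden_units) + [n_outputs]
    return [(sizes[i], sizes[i + 1]) for i in range(len(sizes) - 1)]


def count_parameters(shapes):
    """Count weights plus one bias per output unit across all layers."""
    return sum(rows * cols + cols for rows, cols in shapes)


# The [10, 4, 1] example from the text, with 7 input features.
small_shapes = dense_layer_shapes(7, [10, 4, 1])
small_total = count_parameters(small_shapes)

# A larger configuration: three hidden layers of 400, 300 and 200 units.
large_total = count_parameters(dense_layer_shapes(7, [400, 300, 200]))

print(small_shapes, small_total)
print(large_total)
```

Under these assumptions the small network has 131 parameters while the larger one has 183,901, which shows how quickly the hidden unit counts dominate the size of the model.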

The other parameters, such as the optimizer, activation function, loss reduction, and model directory, have default values that you can change according to your preference.

In our example, we will create a network with three hidden layers of 400, 300, and 200 units respectively, and we will store the event files in a directory named ‘train’ in our current working directory.

#initialize the estimator
model = tf.estimator.DNNRegressor(hidden_units=[400, 300, 200],
                          feature_columns=column_data,
                          #define the directory to store the log files, named train
                          model_dir='train')

Output:

INFO:TensorFlow:Using default config. INFO:TensorFlow:Using config: {'_model_dir': 'output', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true graph_options { rewrite_options { meta_optimizer_iterations: ONE } } , '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <TensorFlow.python.training.server_lib.ClusterSpec object at 0x000002583494B5C0>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}

Finally, we train the model using the train() method. This method takes the training data as a parameter, but note that the data must be supplied in the form of an input function. We can turn our NumPy arrays into such a function with the tf.estimator.inputs.numpy_input_fn() function from TensorFlow.

#create an input function from the data
train_input = tf.estimator.inputs.numpy_input_fn(
     x={"x": X_train},
     y=y_train, shuffle=False, num_epochs=None)
#train the estimator created above
model.train(train_input, steps=3000)

Output:
INFO:TensorFlow:Done calling model_fn. INFO:TensorFlow:Create CheckpointSaverHook. INFO:TensorFlow:Graph was finalized. INFO:TensorFlow:Running local_init_op. INFO:TensorFlow:Done running local_init_op. WARNING:TensorFlow:From C:\Anaconda3\lib\site-packages\TensorFlow\python\training\monitored_session.py:809: start_queue_runners (from TensorFlow.python.training.queue_runner_impl) is deprecated and will be removed in a future version. Instructions for updating: To construct input pipelines, use the `tf.data` module. INFO:TensorFlow:Saving checkpoints for 0 into train/linreg\model.ckpt. INFO:TensorFlow:loss = 44.715683, step = 1 INFO:TensorFlow:global_step/sec: 115.542 INFO:TensorFlow:loss = 10.235999, step = 101 (0.894 sec) INFO:TensorFlow:global_step/sec: 127.951 INFO:TensorFlow:loss = 11.152407, step = 201 (0.755 sec) INFO:TensorFlow:global_step/sec: 132.35 INFO:TensorFlow:loss = 11.163857, step = 301 (0.756 sec) INFO:TensorFlow:global_step/sec: 130.795 INFO:TensorFlow:loss = 11.949917, step = 401 (0.766 sec)
INFO:TensorFlow:global_step/sec: 134.486 INFO:TensorFlow:loss = 11.08456, step = 501 (0.743 sec) INFO:TensorFlow:global_step/sec: 132.176 INFO:TensorFlow:loss = 10.155247, step = 601 (0.758 sec) INFO:TensorFlow:global_step/sec: 122.77 INFO:TensorFlow:loss = 11.169417, step = 701 (0.817 sec) INFO:TensorFlow:global_step/sec: 119.4 INFO:TensorFlow:loss = 9.743032, step = 801 (0.835 sec) INFO:TensorFlow:global_step/sec: 132.176 INFO:TensorFlow:loss = 11.074186, step = 901 (0.758 sec) INFO:TensorFlow:global_step/sec: 133.41 INFO:TensorFlow:loss = 10.8108225, step = 1001 (0.748 sec) INFO:TensorFlow:global_step/sec: 133.409 INFO:TensorFlow:loss = 10.393364, step = 1101 (0.751 sec) INFO:TensorFlow:global_step/sec: 135.03 INFO:TensorFlow:loss = 11.3132515, step = 1201 (0.743 sec) INFO:TensorFlow:global_step/sec: 131.137 INFO:TensorFlow:loss = 8.995283, step = 1301 (0.761 sec) INFO:TensorFlow:global_step/sec: 133.945 INFO:TensorFlow:loss = 11.550421, step = 1401 (0.748 sec) INFO:TensorFlow:global_step/sec: 129.608 INFO:TensorFlow:loss = 10.767655, step = 1501 (0.771 sec) INFO:TensorFlow:global_step/sec: 132.351 INFO:TensorFlow:loss = 11.780325, step = 1601 (0.756 sec) INFO:TensorFlow:global_step/sec: 136.133 INFO:TensorFlow:loss = 10.346275, step = 1701 (0.735 sec) INFO:TensorFlow:global_step/sec: 137.253 INFO:TensorFlow:loss = 10.402365, step = 1801 (0.730 sec) INFO:TensorFlow:global_step/sec: 133.232 INFO:TensorFlow:loss = 10.282148, step = 1901 (0.751 sec) INFO:TensorFlow:global_step/sec: 136.877
INFO:TensorFlow:loss = 9.809787, step = 2001 (0.731 sec) INFO:TensorFlow:global_step/sec: 124.295 INFO:TensorFlow:loss = 10.485967, step = 2101 (0.805 sec) INFO:TensorFlow:global_step/sec: 113.961 INFO:TensorFlow:loss = 12.129381, step = 2201 (0.876 sec) INFO:TensorFlow:global_step/sec: 117.3 INFO:TensorFlow:loss = 11.338371, step = 2301 (0.854 sec) INFO:TensorFlow:global_step/sec: 120.696 INFO:TensorFlow:loss = 11.465523, step = 2401 (0.829 sec) INFO:TensorFlow:global_step/sec: 130.623 INFO:TensorFlow:loss = 11.812074, step = 2501 (0.766 sec) INFO:TensorFlow:global_step/sec: 128.608 INFO:TensorFlow:loss = 10.032124, step = 2601 (0.777 sec) INFO:TensorFlow:global_step/sec: 125.542 INFO:TensorFlow:loss = 11.300846, step = 2701 (0.798 sec) INFO:TensorFlow:global_step/sec: 115.673 INFO:TensorFlow:loss = 11.281919, step = 2801 (0.867 sec) INFO:TensorFlow:global_step/sec: 113.96 INFO:TensorFlow:loss = 11.891347, step = 2901 (0.874 sec) INFO:TensorFlow:Saving checkpoints for 3000 into train/linreg\model.ckpt. INFO:TensorFlow:Loss for final step: 10.888451.

We have successfully trained our model. Let’s now visualize the training process using TensorBoard.

Remember that we saved the log files in a directory named ‘train’ in our current working directory. This is what the folder looks like on my PC.
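If you would rather inspect the folder from Python than from a file browser, a minimal sketch using only the standard library is shown below. It assumes the event files follow the usual events.out.tfevents.* naming pattern; the two helper names are made up for illustration, and the exact files you see will depend on your run.

```python
from pathlib import Path


def list_log_files(logdir):
    """Return the names of all files found anywhere under `logdir`, sorted."""
    return sorted(p.name for p in Path(logdir).rglob("*") if p.is_file())


def find_event_files(logdir):
    """Return only the files that look like TensorBoard event files."""
    # Assumption: event files are named events.out.tfevents.<suffix>.
    return sorted(p.name for p in Path(logdir).rglob("events.out.tfevents.*"))


# 'train' is the model_dir used in this tutorial; both lists are empty
# if the directory does not exist yet.
all_files = list_log_files("train")
event_files = find_event_files("train")
print(all_files)
print(event_files)
```

If the second list is empty after training, TensorBoard will have nothing to display, so it is worth checking that the log directory you pass on the command line is the one the model wrote to.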

Go to your Anaconda Prompt and type:

tensorboard --logdir train

When you run the above command, you should get a message that looks like this:

TensorBoard 1.13.1 at http://dayvheed:6006 (Press CTRL+C to quit)

Copy the http://<url>:6006 address into any browser on your PC and open it. The TensorBoard dashboard will appear, and it should look like this.

On the dashboard, we can see the value of the loss at each recorded training step. The loss drops very quickly and stabilizes within the first 500 steps, which suggests that the model has stopped improving and has converged.

We can also check how the tensors are connected by clicking on the Graphs tab.
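The horizontal axis of the loss plot counts training steps rather than epochs, so it can help to convert between the two. The sketch below does the arithmetic for this run. It assumes a batch size of 128, which is the default of numpy_input_fn when no batch_size is passed, as in the code above; the helper name is made up for illustration.

```python
def steps_to_epochs(steps, batch_size, n_examples):
    """Convert a number of training steps into passes over the dataset."""
    return steps * batch_size / n_examples


n_examples = 8000   # rows in X_train
batch_size = 128    # assumption: numpy_input_fn default, no batch_size was passed
steps = 3000        # the steps argument given to train()

epochs = steps_to_epochs(steps, batch_size, n_examples)
steps_per_epoch = n_examples / batch_size

print(epochs)           # 48.0 passes over the data
print(steps_per_epoch)  # 62.5 steps per pass
```

Under that assumption, the 3000 steps correspond to 48 passes over the 8000 training rows, and the first 500 steps cover 8 of them.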

In conclusion, we have seen that TensorBoard is a fantastic visualization tool for neural networks. It can display important metrics such as loss and accuracy, as well as the model’s weights and its computational graph, and we saw how to set this up step by step. Understanding how to use TensorBoard is especially important if you are working on big projects.