Keras is an open-source deep learning API and Python library that runs on top of TensorFlow and is used to build neural networks. It is one of the most popular Python libraries for deep learning because it is easy to use, modular, and fast. It was developed by François Chollet, an engineer at Google, who also runs a blog that publishes articles on how to use the Keras library.
Keras is a high-level API that requires a backend for its computations.
Keras can use backends such as TensorFlow, Theano, CNTK, etc. If you don’t know what a high-level API is, it is an interface that hides the low-level details of a system so that you can use its functionality easily. Using Keras, you do not need to worry about tensor computations, implementing the loss function, or implementing the activation functions. All the hard stuff, such as building the computational graphs and creating and wrangling tensors, is handled in the background by the backend Keras runs on. In this tutorial, you will get a rounded overview of Keras and why it is worth using. We will also build an artificial neural network using Keras.
In specific terms, you will learn:
- What is a backend?
- Keras vs Tensorflow Comparison
- Why Keras
- Limitations of Keras
- How to Install Keras
- Creating a neural network with Keras
- Creating a Sequential Model in Keras
- Training and Evaluating the Model with Keras
- Solving a Linear Regression Problem with Keras
Let’s begin by understanding what a backend is.
What is a backend?
A backend is a term used to specify what powers a system. In Keras, the backend is what does the tensor computations, convolution building, and many other things. TensorFlow is the default backend used for Keras. In other words, upon installing Keras on your machine, it automatically uses Tensorflow as the backend (engine for operation). These settings can however be changed if you desire.
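To make the idea of a changeable backend concrete, here is a minimal sketch of how the multi-backend releases of Keras decide which backend to load: the KERAS_BACKEND environment variable takes precedence, then the "backend" key in the keras.json config file, then the TensorFlow default. The helper name resolve_backend is hypothetical and written only for illustration, and whether your installed Keras version honors these settings depends on the release.

```python
import json
import os

# Default location of the Keras config file (assumption: a standard install).
DEFAULT_CONFIG = os.path.expanduser(os.path.join("~", ".keras", "keras.json"))


def resolve_backend(config_path=DEFAULT_CONFIG, environ=None):
    """Return the backend name that would be selected.

    Hypothetical helper: the KERAS_BACKEND environment variable wins,
    then the "backend" key of keras.json, then the TensorFlow default.
    """
    environ = os.environ if environ is None else environ
    if environ.get("KERAS_BACKEND"):
        return environ["KERAS_BACKEND"]
    if os.path.isfile(config_path):
        with open(config_path) as handle:
            return json.load(handle).get("backend", "tensorflow")
    return "tensorflow"


print(resolve_backend(environ={"KERAS_BACKEND": "theano"}))  # theano
```

In practice you would set the environment variable or edit the config file before importing Keras, since the backend is chosen at import time.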
Let’s discuss some of the popular backends for Keras.
- Theano
Theano is an open-source Python library that was built by the Montreal Institute for Learning Algorithms (MILA) at the University of Montreal, Quebec. The library works with other Python libraries such as NumPy and SciPy to perform mathematical operations on multidimensional arrays (tensors) and other machine learning operations. Theano runs on CPU architectures but also allows the use of a GPU for quicker computations. It can be used on popular platforms such as Windows, macOS, and Linux.
- Cognitive Toolkit (CNTK)
CNTK is another backend that can be used for Keras. It was built by a team at Microsoft but is open-source for everyone. Microsoft Cognitive Toolkit is great for training models on multiple GPUs and at a large scale. The library also allows you to combine popular neural network architectures such as feedforward deep neural networks, convolutional neural networks, and recurrent neural networks. There are reports that CNTK can run faster than Theano and TensorFlow in some situations.
- Tensorflow
TensorFlow is the most popular backend for Keras. As mentioned earlier, it is the default backend once you install Keras on your machine. TensorFlow was developed by a team at Google. The library also has its own APIs for building deep neural networks.
Keras vs Tensorflow Comparison
Parameter | Keras | TensorFlow |
Architecture | Simple and readable | Slightly complex and less readable |
API | High-level API | Low-level API |
Debugging | Simple networks rarely need serious debugging because of the easy interface | Debugging can be complex and difficult to do |
Performance | Relatively slower | Relatively faster |
Purpose | Good for quickly building models ready for deployment | Good for building computational graphs or models by yourself |
Community | Large community with a great blog | Large community as well |
Why Keras
- Ease of use
Keras is a great tool for building deep learning models. The library can be used to build neural networks in just a few lines. Moreover, Keras code is easy to read and understand. In the example below, you will see a fully connected network built using Keras.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

#build the neural network architecture
model = Sequential()
model.add(Dense(64, input_dim=10, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dropout(0.2))

#final layer: a single sigmoid unit outputs a probability for binary classification
model.add(Dense(1, activation='sigmoid'))
We will discuss what each of these means in depth later, but the example above is a simple neural network that can be trained on binary classification data with 10 features.
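To get a feel for the size of the network above, the sketch below counts its trainable parameters by hand. It assumes the layer sizes from the example (10 input features, then 64, 32, and 1 units); the helper dense_params is a hypothetical name used only for illustration.

```python
def dense_params(n_inputs, units):
    """Weights (n_inputs * units) plus one bias per unit."""
    return n_inputs * units + units


# Layer sizes from the example above: 10 features -> 64 -> 32 -> 1.
# Dropout has no trainable parameters, so it is left out.
layer_sizes = [10, 64, 32, 1]
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]

print(per_layer)       # [704, 2080, 33]
print(sum(per_layer))  # 2817
```

These are the kind of numbers you should expect to see in the parameter column when you inspect a model of this shape.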
- Large Community
Keras has a large community. In fact, the library has an official blog run by its creator, François Chollet. Aside from that, there are many public repositories and discussions on platforms such as GitHub and Stack Overflow where you can find code to learn from and reuse.
- It is cross-platform
It can be used on various platforms such as Windows, macOS, Linux, etc. It can also be deployed on various devices and environments, including Android, cloud engines, Raspberry Pi, web browsers that support JavaScript, and many more.
- Can be run on multiple backends
With Keras, you have the luxury of selecting which backend to use depending on your project’s needs. Theano, CNTK, and TensorFlow are popular backends that can be used, each with its unique advantage.
- Multiple GPU Support
Keras works not only on a single GPU but on multiple GPUs as well. This is because Keras was built to support data parallelism, where each GPU processes a different slice of every batch. Large volumes of data can therefore be passed into your model and trained on multiple GPUs at once to produce a much quicker result.
- Allows Pre-trained Models
Keras allows you to use models that have already been trained to produce pretty decent results. These can be used directly with their weights for feature extraction or for making predictions. For instance, the weights of powerful models trained on the ImageNet dataset are available for use. Because you do not have to train the model again, you can use a powerful model without needing a great CPU or GPU. Examples of those models include VGG16, VGG19, ResNet, DenseNet, MobileNet, InceptionV3, etc.
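Going back to the multiple GPU point in the list above, data parallelism can be illustrated without any GPU at all. The sketch below is a toy illustration, not how Keras implements it: it splits a batch into equal shards, computes the gradient on each shard as a separate "worker" would, and averages the results. The one-weight model y = w * x and both function names are assumptions made for the example.

```python
def mse_gradient(w, xs, ys):
    """Gradient of the mean squared error for the one-weight model y = w * x."""
    n = len(xs)
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def data_parallel_gradient(w, xs, ys, n_workers):
    """Split the batch into equal shards, compute per-shard gradients, average them.

    Assumes len(xs) is divisible by n_workers.
    """
    shard = len(xs) // n_workers
    grads = [
        mse_gradient(w, xs[i * shard:(i + 1) * shard], ys[i * shard:(i + 1) * shard])
        for i in range(n_workers)
    ]
    return sum(grads) / n_workers


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
print(mse_gradient(0.5, xs, ys))                         # -22.5
print(data_parallel_gradient(0.5, xs, ys, n_workers=2))  # -22.5
```

Because the averaged shard gradients match the full-batch gradient, the workers can share the load without changing the update the model receives.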
Limitations of Keras
- Limited low-level control
Because Keras is a high-level API, there are limitations to its operations. You are not given as much flexibility as you would have with a low-level API. For instance, building unconventional layers for research purposes takes more effort than using the pre-configured Keras layers. Attempting operations that Keras was not designed for can result in errors, and debugging error logs in Keras is not the easiest thing to do.
- Relatively slower
Generally speaking, Keras is relatively slower than the backend it uses for its operation. Its use is more of a tradeoff between user-friendliness and computational speed.
How to Install Keras
Step 1: Create a virtual environment
When working on different projects in Python, it is good practice to create a virtual environment. In layman’s terms, a virtual environment can be seen as an isolated working space where you can work with different versions of Python packages without conflict from versions in other environments.
For instance, you may have worked on a project that uses Python 3.6 with Tensorflow 1. If you work on another project that requires Tensorflow 2.0, creating a virtual environment would allow you to install Tensorflow 2.0 without causing conflicts with your other projects that were built with Tensorflow 1.
So the first step is to create a virtual environment and install Keras there.
For Windows OS
Navigate to your project root folder and type in the following command
py -m venv name_of_venv
Make sure to replace name_of_venv with the name you wish to call your virtual environment. After running the command, folders named Include, Lib, and Scripts are created in that directory.
For Linux/Mac OS
Navigate to the root directory where the project resides and type the following command.
python3 -m venv name_of_venv
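If you prefer to script this step, the standard library exposes the same functionality through the venv module. The sketch below is one possible way to do it; the function name create_env is hypothetical, and name_of_venv is the same placeholder used in the commands above.

```python
import venv
from pathlib import Path


def create_env(path, with_pip=True):
    """Create a virtual environment at `path`, much like `python -m venv path`."""
    venv.create(path, with_pip=with_pip)
    return Path(path)


# Example usage (uncomment to run in your project root):
# env_dir = create_env("name_of_venv")
# print(sorted(p.name for p in env_dir.iterdir()))
# The activation scripts live in Scripts/ on Windows and bin/ on Linux and macOS.
```

The command-line form is usually all you need; the scripted form is mainly handy when setting up several projects at once.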
Step 2: Activate the environment.
The next step is to activate the created virtual environment.
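For Windows
Go to the path where the virtual environment was created and type the following command, replacing name_of_venv with the name you chose.
.\name_of_venv\Scripts\activate
For Linux/Mac OS
From the directory where the virtual environment was created, type the following command.
source name_of_venv/bin/activate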
Step 3: Install the necessary dependencies
Before installing Keras, you will need to have some libraries installed on your machine. The libraries include NumPy, pandas, matplotlib, scikit-learn, seaborn, and SciPy. If you use a Jupyter notebook and, by extension, an Anaconda environment, these libraries come pre-installed with Anaconda. If, however, you do not have them, you can install them by typing the following commands.
Make sure to run each line and wait for it to install successfully before running the next.
pip install numpy
pip install pandas
pip install matplotlib
pip install scipy
pip install seaborn
pip install scikit-learn
Step 4: Install Keras
Now that you have all the dependencies installed on your PC, you can install Keras with the following command.
pip install keras
You can confirm Keras is installed successfully by running the following statement in a Python interpreter. If no error is raised, the installation worked.
import keras
You can determine the version of Keras installed by typing the following command.
print(keras.__version__)
Output:
'2.4.3'
So I have Keras version 2.4.3 installed on my machine.
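If you want to check Keras and all the dependencies from Step 3 in one go, the standard library can report installed versions without importing each package. This is a small sketch assuming Python 3.8 or newer; the helper name installed_versions is hypothetical.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


required = ["numpy", "pandas", "matplotlib", "scipy", "seaborn", "scikit-learn", "keras"]
for name, version in installed_versions(required).items():
    print(f"{name}: {version or 'not installed'}")
```

Any package reported as not installed can be added with the corresponding pip command from the steps above.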
Creating a neural network with Keras
There are two different ways of creating a neural network in Keras.
- Sequential models: This is a popular way of creating neural network architecture in Keras. The model allows you to add layers upon layers in a sequential manner. It is typically used for simple models.
- Functional API: This is a more powerful way of creating neural network architectures in Keras. Each layer is called like a function on the output of the previous layer, so in cases where you want to reuse a layer alongside its weights and biases, you can simply call it again.
Here we will focus on creating sequential models, as they are the most commonly used among learners.
Creating a Sequential Model in Keras
As explained earlier a sequential model is a model in Keras that allows you to stack layers on top of each other until you are satisfied with the neural network architecture. To create a sequential model in Keras, you begin by importing the class from Keras and instantiating it.
#import the model
from keras.models import Sequential

#instantiate the Sequential class
model = Sequential()
After creating an object for the Sequential model, the next step is to add layers on top of each other. The layers could be a fully connected layer, an activation layer, a convolution layer, a dropout layer, an LSTM layer, a batch normalization layer, and many more. To use any layer, you must first import them as well. Let’s import all the layers and discuss their use.
from keras.layers import Dense, Activation, Dropout, Conv2D, LSTM, MaxPooling2D, Flatten, BatchNormalization
- Fully Connected Layer: This is called a Dense layer in Keras. It takes in the number of nodes (also called units/neurons) in the layer. When creating a fully connected layer, you should specify the activation that you wish to use. Alternatively, you could create an activation layer separately if you do not wish to specify the activation alongside the fully connected layer.
Say we wish to add a fully connected layer alongside a relu activation function, you can type
model.add(Dense(units=32, activation='relu'))
Note that if the fully connected layer is your first layer, you are required to pass the shape of a single input sample. If the data has 1000 samples with 12 features each, the input has 12 dimensions; the number of samples is not part of the shape. This is defined using the input_dim parameter.
model.add(Dense(units=32, input_dim=12, activation='relu'))
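To see what a fully connected layer computes for a single sample, here is a small pure-Python sketch of its forward pass: each unit takes a weighted sum of the input features, adds a bias, and applies the activation. The function names and the tiny weight values are made up for illustration; Keras performs the same arithmetic on whole batches through its backend.

```python
def relu(value):
    return max(0.0, value)


def dense_forward(inputs, weights, biases, activation=relu):
    """Compute activation(inputs . weights + biases) for one sample.

    weights[i][j] connects input feature i to unit j.
    """
    outputs = []
    for j, bias in enumerate(biases):
        total = bias + sum(x * row[j] for x, row in zip(inputs, weights))
        outputs.append(activation(total))
    return outputs


# One sample with 3 features fed into a layer with 2 units.
sample = [1.0, 2.0, 3.0]
weights = [[0.5, -1.0],
           [0.25, 0.0],
           [0.5, 0.5]]
biases = [0.5, -1.0]
print(dense_forward(sample, weights, biases))  # [3.0, 0.0]
```

Notice that the layer only needs to know how many features each sample has (3 here), which is exactly the information the first layer asks for.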
- Activation Layer: This layer specifies the activation function to be used on the preceding layer. Another way of creating the fully connected layer above is shown below.
model.add(Dense(32))
model.add(Activation('relu'))
As seen, the fully connected layer and the activation layer are now separate. Note that this is seldom done. Most times, the activation function is passed as a parameter to the dense layer.
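For intuition about what an activation does, the sketch below implements three common ones in plain Python. These are textbook definitions written for illustration, not the Keras source. One detail worth noticing is that softmax over a single value always returns 1.0, which is why a single-unit output layer for binary classification is normally paired with sigmoid instead.

```python
import math


def relu(x):
    """Pass positive values through, clamp negatives to zero."""
    return max(0.0, x)


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(values):
    """Normalize a list of scores into probabilities that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


print(relu(-2.0), relu(3.0))  # 0.0 3.0
print(sigmoid(0.0))           # 0.5
print(softmax([-4.2]))        # [1.0]
```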
- Dropout layer: The dropout layer is typically added to avoid overfitting in the model. During training, this layer randomly shuts off some nodes in the architecture in a bid to ensure that the model generalizes well. The dropout layer takes the fraction of nodes to be shut off as an argument. To add a dropout of 20%, you can write the following code.
model.add(Dropout(0.2))
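The behaviour of dropout can be sketched in a few lines of plain Python. This is an illustrative version of the usual "inverted dropout" scheme, in which the surviving values are scaled up by 1 / (1 - rate) during training so that the expected sum stays the same, and nothing is dropped at inference time. The function name and the rng parameter are assumptions made for the example.

```python
import random


def dropout(values, rate, training=True, rng=random):
    """Zero each value with probability `rate`; rescale survivors by 1 / (1 - rate)."""
    if not training or rate == 0.0:
        return list(values)
    keep_scale = 1.0 / (1.0 - rate)
    return [v * keep_scale if rng.random() >= rate else 0.0 for v in values]


rng = random.Random(0)
activations = [1.0] * 10
print(dropout(activations, rate=0.2, rng=rng))         # some entries zeroed, the rest scaled up
print(dropout(activations, rate=0.2, training=False))  # unchanged at inference time
```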
- Convolutional layer: A convolutional layer in Keras is used to create filters (or kernels) for the convolution operation on images, videos, or audio. The layer requires the number of filters you want in the layer and the size of each filter. Let’s assume you wish to create 32 filters of size 3 by 3; you can write the code below.
model.add(Conv2D(32, (3, 3), activation='relu'))
Note that if the convolutional layer is the first layer of your network, you will need to define the shape of the image that will be passed in. If the image is a 256 by 256 RGB image, the shape is (256, 256, 3). The image shape is specified using the input_shape parameter.
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(256, 256, 3)))
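It helps to know what comes out of a convolutional layer before stacking more layers on top. The sketch below works out the output shape and the number of trainable parameters for the layer above, assuming no padding and a stride of 1 (the Keras defaults). Both helper names are hypothetical.

```python
def conv2d_output_shape(height, width, kernel, filters, stride=1):
    """Output shape of a 2D convolution with no padding ('valid')."""
    out_h = (height - kernel) // stride + 1
    out_w = (width - kernel) // stride + 1
    return (out_h, out_w, filters)


def conv2d_params(kernel, in_channels, filters):
    """One kernel x kernel x in_channels weight block plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


# 256 by 256 RGB image, 32 filters of size 3 by 3.
print(conv2d_output_shape(256, 256, kernel=3, filters=32))  # (254, 254, 32)
print(conv2d_params(kernel=3, in_channels=3, filters=32))   # 896
```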
- MaxPooling2D Layer: A pooling layer is typically added after a convolutional layer. It downsamples the feature maps by keeping only the largest value in each pooling window, which retains the strongest features while reducing noise and computation. Like the convolutional layer, it takes in the size of the pooling window.
model.add(MaxPooling2D(pool_size=(2, 2)))
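Here is a small plain-Python sketch of what a 2 by 2 max pooling step does to a single feature map. It assumes non-overlapping windows (the stride equals the pool size), and the function name and sample values are made up for illustration.

```python
def max_pool_2d(image, pool=2):
    """Non-overlapping max pooling over a 2D list (stride equals pool size)."""
    rows = len(image) // pool
    cols = len(image[0]) // pool
    return [
        [
            max(image[r * pool + i][c * pool + j] for i in range(pool) for j in range(pool))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 3],
]
print(max_pool_2d(feature_map))  # [[4, 2], [2, 7]]
```

The 4 by 4 input shrinks to 2 by 2, and each output value is the largest entry of its window.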
- Batch Normalization Layer: The batch normalization layer normalizes the inputs to a layer over each mini-batch so that they have a mean close to 0 and a standard deviation close to 1. Models generally train faster and more stably on normalized data, which is why batch normalization can improve the performance of your model. This layer can be added before or after a fully connected layer, a convolutional layer, or a recurrent layer such as LSTM.
It is important to note that batch normalization also has a mild regularizing effect, so adding a dropout layer after a batch normalization layer is often unnecessary.
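The normalization step itself is simple enough to sketch in plain Python. The version below covers only the training-time computation for one feature across a mini-batch; gamma and beta stand in for the scale and offset that the layer learns, and the epsilon default is an assumption chosen for the example. At inference time, a real layer uses running averages instead of the batch statistics.

```python
def batch_norm(batch, gamma=1.0, beta=0.0, epsilon=1e-3):
    """Normalize one feature across a mini-batch, then scale and shift."""
    mean = sum(batch) / len(batch)
    variance = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [gamma * (x - mean) / (variance + epsilon) ** 0.5 + beta for x in batch]


normalized = batch_norm([10.0, 12.0, 14.0, 16.0])
print([round(v, 3) for v in normalized])
```

The outputs are centred on 0 with a spread close to 1, regardless of the scale of the original values.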
Training and Evaluating the Model with Keras
After building the neural network architecture with different layers, the next step is to train the model on your data. The architecture first needs to be compiled using the .compile() method. The method takes the loss function to minimize, the optimizer used to update the weights, and the metrics used to report model performance. Let’s say it’s a binary classification problem: we can use binary cross-entropy for the loss, accuracy for the metrics, and Adam for the optimizer. See the code below.
model.compile(loss='binary_crossentropy', metrics=['accuracy'], optimizer='adam')
Now, we can train the model and evaluate it. Let’s assume that the data has been split into train, validation, and test sets. The model will be trained using the fit() method. It takes X_train, y_train, epochs, and batch_size as key arguments.
model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=20, batch_size=32)
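The code above assumes the data has already been split. As a rough sketch of one way to do that without extra libraries, the helper below shuffles the samples once and slices them into train, validation, and test lists. The function name and the 70/15/15 proportions are assumptions for the example; in practice you may prefer a library utility such as scikit-learn's train_test_split.

```python
import math
import random


def train_val_test_split(samples, val_fraction=0.15, test_fraction=0.15, seed=42):
    """Shuffle once, then slice into train, validation, and test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    n_val = round(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


train, val, test = train_val_test_split(range(1000))
print(len(train), len(val), len(test))  # 700 150 150

# Number of weight updates per epoch when training with batch_size=32.
print(math.ceil(len(train) / 32))       # 22
```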
Finally, the model’s performance can be evaluated on the test data, which the model has not seen during training. The evaluate method is used for this purpose.
model.evaluate(X_test, y_test)
And that’s it. You can check how well your model has performed. Let’s now take an example.
Solving a Linear Regression Problem with Keras
In this example, we will create data points that mimic a straight line with some noise. We will train a neural network on this data and see how well it can learn the relationship. We will also draw its line of best fit to visualize the result. Let’s get started by creating the data points.
Step 1: Create the dataset.
#import the necessary libraries
import numpy as np
from matplotlib import pyplot as plt

#define a random seed
np.random.seed(42)

#create evenly spaced points for the x and y axis between 0 and 20
xs = np.linspace(0, 20, 50)
ys = np.linspace(20, 0, 50)

#add some positive and negative noise on the y axis (applied twice)
ys += np.random.uniform(-2, 2, 50)
ys += np.random.uniform(-2, 2, 50)

#plot the graph
plt.plot(xs, ys, '.')
plt.show()
Output:
Step 2: Creating the model
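from keras.models import Sequential
from keras.layers import Dense

#two hidden layers followed by a single linear output for regression
model = Sequential()
model.add(Dense(64, input_dim=1, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(1, activation='linear'))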
Step 3: Compile the model
model.compile(optimizer='adam', loss='mse')
Step 4: Train the model
history = model.fit(xs, ys, epochs=200)
Step 5: Check its prediction
y_pred = model.predict(xs)
Step 6: Visualize the result
plt.plot(xs, y_pred, 'r', label='Predicted')
plt.scatter(xs, ys, label='True')
plt.legend()
plt.xlabel('X Axis')
plt.ylabel('Y Axis')
plt.title('A plot to show the prediction of the model')
plt.show()
Output:
As seen, the simple model makes a decent prediction. That is the power of deep learning.
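As a sanity check on what the network should have learned, you can compare it against the classical least-squares line for the same kind of data. The sketch below is not part of the Keras workflow: it rebuilds a similar dataset with the standard library's random module (so the points are not identical to the NumPy ones above) and fits a line in closed form. The function name fit_line is hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Same recipe as the dataset above: a line from 20 down to 0 plus uniform noise.
rng = random.Random(42)
xs = [20 * i / 49 for i in range(50)]
ys = [20 - x + rng.uniform(-2, 2) + rng.uniform(-2, 2) for x in xs]

slope, intercept = fit_line(xs, ys)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")  # close to -1 and 20
```

A well-trained network on this data should produce predictions that sit close to this line.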
Conclusion
In this tutorial, you have learned what Keras is and why it is a great choice for building deep learning models. We also discussed its disadvantages and compared it with TensorFlow. In the later part of this tutorial, you learned how to build a Sequential model with Keras and worked through an example for a practical demonstration. If you have any questions, feel free to leave them in the comment section and I’ll do my best to answer them.