{"id":8483,"date":"2021-02-19T16:24:01","date_gmt":"2021-02-19T10:54:01","guid":{"rendered":"https:\/\/www.h2kinfosys.com\/blog\/?p=8483"},"modified":"2022-09-11T13:50:34","modified_gmt":"2022-09-11T08:20:34","slug":"kernel-methods-with-tensorflow","status":"publish","type":"post","link":"https:\/\/www.h2kinfosys.com\/blog\/kernel-methods-with-tensorflow\/","title":{"rendered":"Kernel Methods with Tensorflow"},"content":{"rendered":"\n<p>In most real-life classification problems, datasets are linearly non-separable. That is to say, the classes can not be separated by a straight line. But a linear classifier built with the LinearClassifier class of Tensorflow\u2019s estimator API attempts to learn the data with the assumption that it can be classified with a straight line. Other popular machine learning algorithms such as Support Machine Machines (SVM) also hold this assumption. While these models can produce impressive results, the reality remains that they will struggle to learn when hit by linearly non-separable data.<\/p>\n\n\n\n<p>The question then is, how do we classify models with features that are not linearly separable? You guessed right. Kernels!&nbsp;<\/p>\n\n\n\n<p>Using Kernels allows you to make data that are not linearly separable, linearly separable. In the course of this tutorial, you&#8217;ll learn how Kernels work and exactly why it is a pretty good method to use. 
Furthermore, we&#8217;d build a <a href=\"https:\/\/www.h2kinfosys.com\/blog\/linear-classifier-with-tensorflow-keras\/\" class=\"rank-math-link\">Tensorflow classifier <\/a>as a base model and then a second classifier, using Kernel methods.<\/p>\n\n\n\n<p>Specifically, these are what you&#8217;d learn by the end of this tutorial.&nbsp;<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>The Problem of Classification in lower Dimensional Space<\/li><li>What are Kernels and why Kernels<\/li><li>Types of Kernel Methods<\/li><li>Training a Kernel Classifier with Tensorflow.estimator<\/li><li>Building a Baseline Linear Classifier&nbsp;<\/li><li>Split Data into train and test data<\/li><li>Creating the Feature Columns<\/li><li>Defining the train input function and training the model<\/li><li>Defining the Test input function and Evaluating the model<\/li><li>Building the kernel classifier<\/li><li>Improving the Performance of the Kernel Classifier<\/li><\/ul>\n\n\n\n<p>Let\u2019s begin.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>The Problem of Classification in lower Dimensional Space<\/strong><\/h2>\n\n\n\n<p>When you build a classifier, its job is to predict the class of an object correctly. A logistic regression model is decent for classification problems in which the classes are not intertwined. However, if the data points are interwoven, the logistic regression model will struggle to capture the classes fully.&nbsp;<\/p>\n\n\n\n<p>A way to solve this problem of non-linearly separable data is by increasing the dimensions of the data. In other words, a classifier can more easily separate the data after increasing its dimensions from, say, 2 to 3. 
To understand this better, let&#8217;s take an example.&nbsp;<\/p>\n\n\n\n<p>This is an example of a dataset that can be classified with a straight line.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><strong>import<\/strong> <strong>numpy<\/strong> <strong>as<\/strong> <strong>np<\/strong>\n<strong>import<\/strong> <strong>matplotlib.pyplot<\/strong> <strong>as<\/strong> <strong>plt<\/strong>\n<strong>from<\/strong> <strong>mpl_toolkits.mplot3d<\/strong> <strong>import<\/strong> Axes3D\n<em>#create some data and plot the graph<\/em>\nx = [1, 2, 3, 6, 7, 8]\ny = [2, 4, 6, 8, 10, 12]\nlabels = [2, 2, 2, 1, 1, 1]\nplt.scatter(x, y, c=labels)\n<em>#plot a line that splits the data into 2 classes<\/em>\nplt.plot([11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1])<\/pre>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh5.googleusercontent.com\/8OxNhZ4QQX2N6gqNI6l0D8gcTsOECbX_N9EMt94XR3aUJeEKvwihPmJkh3eEuAxZtXYdpZEpV-3ivx8TrfiI4BBzU3DdxmgIXH0nMTQTUzNoaM4pZM7pW5Ib8fbjmCTDMwipbiQ\" alt=\"\" title=\"\"><\/figure>\n\n\n\n<p>What about non-linearly separable data?<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em>#create some dataset<\/em>\nx = [0, 1, 2, 3, 4, 5, 6, 6, 7, 8, 9]\ny = [6, 6, 5, 3, 3, 4, 4, 6, 8, 8, 9]\nlabels = [0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0]\n\n<em>#plot the graph<\/em>\nplt.scatter(x, y, c=labels)\n<\/pre>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh6.googleusercontent.com\/dpwmb2s0FSn6ftv28TjTzqRAQDbJrtq6BOh0Ynxqlntrt9-AF6uRL1bC6zW5kkxkhi7_EhsevTaWvc1tAAld49TvRNrJRCB36w6ZdnRfbBkfWQhH4c0gToiZS66l6BdjKZuWI2c\" alt=\"\" title=\"\"><\/figure>\n\n\n\n<p>You will observe that a straight line cannot cleanly solve this classification problem. 
We can however overturn this situation by increasing the dimension of the data.&nbsp;<\/p>\n\n\n\n<p>For the data above, let&#8217;s attempt to map the 2D data into 3D data using the <\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large is-resized\"><img decoding=\"async\" src=\"https:\/\/www.h2kinfosys.com\/blog\/wp-content\/uploads\/2021\/02\/image.png\" alt=\"\" class=\"wp-image-8488\" width=\"173\" height=\"42\" title=\"\"><\/figure>\n<\/div>\n\n\n<pre class=\"wp-block-preformatted\"><strong>def<\/strong> tranformation_function(x, y):\n&nbsp;&nbsp;&nbsp;&nbsp;<em>\"\"\"This function converts the 2D data into 3D\"\"\"<\/em>\n&nbsp;&nbsp;&nbsp;&nbsp;data = np.c_[(x, y)] <em>#zips the x and y value<\/em>\n&nbsp;&nbsp;&nbsp;&nbsp;<em>#check if the data has more than 2 observations<\/em>\n&nbsp;&nbsp;&nbsp;&nbsp;<strong>if<\/strong> len(data) &gt; 2:\n&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;x1 = data[:, 0] ** 2\n&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;x2 = np.sqrt(2) * data[:, 0] * data[:, 1]\n&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;x3 = data[:, 1] ** 2\n&nbsp;&nbsp;&nbsp;&nbsp;<strong>else<\/strong>:\n&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;x1 = data[0] ** 2\n&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;x2 = np.sqrt(2) * data[0] * data[1]\n&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;x3 =&nbsp; data[1] ** 2\n&nbsp;&nbsp;&nbsp;&nbsp;translated_data = np.array([x1, x2, x3])\n&nbsp;&nbsp;&nbsp;&nbsp;\n&nbsp;&nbsp;&nbsp;&nbsp;<strong>return<\/strong> translated_data<\/pre>\n\n\n\n<p>To check if this function really works, let\u2019s print the dimension of the data before and after calling the function.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><strong>print<\/strong>(f'The shape of the data before transformation is {np.c_[(x, y)].shape}')\n<em>#call the transformation function on the data<\/em>\ndata_3d = tranformation_function(x, y)\n<em>#check the dimension of the 
data<\/em>\n<strong>print<\/strong>(f'The shape of the data after transformation is {data_3d.shape}')<\/pre>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>The shape of the data before transformation is (11, 2)\nThe shape of the data after transformation is (3, 11)\n<\/code><\/pre>\n\n\n\n<p>As seen, the data was transformed to 3-dimensional space. We can go ahead to graph the 3D data.&nbsp;<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em># graph the 3D data in 3D space<\/em>\n%matplotlib notebook\n<em>#create a figure with a 3D axis<\/em>\nfig = plt.figure()\nax = fig.add_subplot(111, projection='3d')\nax.scatter(data_3d[0], data_3d[1], data_3d[2], c=labels)\n&nbsp;\nax.set_xlabel('X axis')\nax.set_ylabel('Y axis')\nax.set_zlabel('Z axis')\n&nbsp;\nplt.show()\n<\/pre>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter is-resized\"><img fetchpriority=\"high\" decoding=\"async\" src=\"https:\/\/lh6.googleusercontent.com\/aYan2Ywp_P7OaN6PHpnYlM_Cx19Gg8V5FJZTFf-jEZ4gMHvZKEdSTbuqgks4rrAKXpGR1dbj3k340AIDhsS17il7ZWAP2qq8LX5xfqmKBQBAVX6ECZ65qQCiZJYEu2zjmGZBvW0\" alt=\"\" width=\"420\" height=\"318\" title=\"\"><\/figure>\n<\/div>\n\n\n<p>From the 3D figure above, you&#8217;d begin to see that the data can be classified linearly by just increasing the dimension by 1.&nbsp;<\/p>\n\n\n\n<p>Problem solved right? Not yet. Notice the amount of work needed to increase the dimension by just 1. Imagine we want to increase the dimension by 1000. As the data gets larger and larger, it becomes computationally intensive to increase its dimensions. Even with a fast processor, it would take a long time for your model to be trained. In some cases, it can run out of memory. 
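To get a feel for how quickly explicit feature maps blow up, we can count how many features a full polynomial expansion would produce. The helper below is our own back-of-the-envelope illustration (not part of the tutorial's code); it uses the standard combinatorial count of monomials up to a given degree:

```python
from math import comb

def poly_feature_count(n_features, degree):
    """Size of an explicit polynomial feature map: the number of monomials
    of total degree <= degree in n_features variables (bias term included)."""
    return comb(n_features + degree, degree)

# 2 features expanded to degree 2 is harmless...
print(poly_feature_count(2, 2))    # 6
# ...but 100 features expanded to degree 5 is ~96.5 million features per sample
print(poly_feature_count(100, 5))  # 96560646
```

Materializing tens of millions of features per sample is exactly the memory blow-up described above.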
This is where the Kernel method becomes useful.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>What are Kernels and why Kernels<\/strong><\/h2>\n\n\n\n<p>Kernels provide a way of converting non-linearly separable data to linearly separable data. A kernel does not perform the transformation to a higher dimension per se. Instead, it computes the quantity the classifier actually needs from that higher-dimensional space, irrespective of the number of dimensions.&nbsp;<\/p>\n\n\n\n<p>The catch is that, just like in SVM, the classifier only requires the inner product of the vectors, which is a scalar. This implies that whether the function takes the data to the 3rd dimension, the 1000th dimension, or even the 1000000th dimension, the kernel returns the inner product in that vector space. That\u2019s all we need from the higher vector space.&nbsp; And for any number of dimensions whatsoever, the inner product returns a scalar. Kernels, therefore, help you calculate the inner product in the higher vector space without ever constructing that space. Since this is the case, kernels are not just accurate, they are also efficient in their operation.&nbsp;<\/p>\n\n\n\n<p>Let&#8217;s see an example. First, we convert 2D data into 3D and compute the inner product.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em>#create data<\/em>\na = [3, 5]\nb = [7, 5]\n\n<em>#transform the data to 3D<\/em>\ndata_transfomed = tranformation_function(a, b)\n<em>#carry out the inner product<\/em>\n<strong>print<\/strong>(np.dot(data_transfomed[:, 0], data_transfomed[:, 1]))\n<\/pre>\n\n\n\n<p>Output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>2116.0<\/code><\/pre>\n\n\n\n<p>Now, we perform the second-degree polynomial kernel on the data and carry out the inner product. 
Note that we didn\u2019t have to increase the dimension of the data.&nbsp;<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em>#compute the polynomial kernel of the data and perform the dot operation<\/em>\n(np.dot(a, b)) ** 2\n<\/pre>\n\n\n\n<p>Output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>2116<\/code><\/pre>\n\n\n\n<p>As seen, the inner product is the same. Kernels can thus be seen as a way of arriving at the end without caring about the means.&nbsp;<\/p>\n\n\n\n<p>In a nutshell, kernels allow you to obtain the inner product in a higher-dimensional space as though the dimension transformation had actually taken place. This process is computationally less demanding. This is why it&#8217;s advisable to employ kernels for non-linearly separable data.&nbsp;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Types of Kernel Methods<\/strong><\/h2>\n\n\n\n<p>There are myriad kernel methods. Some of the common ones include&nbsp;<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Linear Kernel: This is simply the inner product of both vectors.&nbsp;<\/li><\/ul>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large is-resized\"><img decoding=\"async\" src=\"https:\/\/www.h2kinfosys.com\/blog\/wp-content\/uploads\/2021\/02\/image-1.png\" alt=\"\" class=\"wp-image-8489\" width=\"151\" height=\"39\" title=\"\"><\/figure>\n<\/div>\n\n\n<p>Where x and y are the two vectors<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Polynomial kernel<\/li><\/ul>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.h2kinfosys.com\/blog\/wp-content\/uploads\/2021\/02\/image-2.png\" alt=\"\" class=\"wp-image-8490\" width=\"175\" height=\"44\" title=\"\"><\/figure>\n<\/div>\n\n\n<p>Where d is the degree of the polynomial<\/p>\n\n\n\n<p>Other types of kernels include&nbsp;<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Exponential 
kernel<\/li><li>Gaussian kernel<\/li><li>Laplacian kernel<\/li><li>Anova radial basis kernel<\/li><li>Hyperbolic or sigmoid kernel&nbsp;<\/li><\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Training a Kernel Classifier with Tensorflow.estimator<\/strong><\/h2>\n\n\n\n<p>Tensorflow ships a built-in class, in the tf.contrib.kernel_methods module, that can be used to map features into a higher-dimensional space. The class, RandomFourierFeatureMapper, is largely an approximation of the Gaussian kernel. It will be used in this tutorial to build the Kernel classifier.&nbsp;<\/p>\n\n\n\n<p>We will start by creating a baseline model using the Tensorflow LinearClassifier class. The model will simply classify whether or not a person has a credit card. Afterward, we will build and train a second model using the Gaussian Kernel in Tensorflow. Let&#8217;s begin with the baseline model.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Building a Baseline Linear Classifier&nbsp;<\/strong><\/h2>\n\n\n\n<p>The dataset obtained from the UCI Machine Learning Repository can be downloaded <a class=\"rank-math-link\" href=\"https:\/\/archive.ics.uci.edu\/ml\/datasets\/Credit+Approval\" rel=\"nofollow noopener\" target=\"_blank\">here<\/a>.&nbsp; Here\u2019s a brief description of the data features.&nbsp;<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td>ID<\/td><td>Customer ID<\/td><\/tr><tr><td>Age<\/td><td>Customer&#8217;s age in completed years<\/td><\/tr><tr><td>Experience<\/td><td>#years of professional experience<\/td><\/tr><tr><td>Income<\/td><td>The annual income of the customer ($000)<\/td><\/tr><tr><td>ZIPCode<\/td><td>Home Address ZIP code.<\/td><\/tr><tr><td>Family<\/td><td>Family size of the customer<\/td><\/tr><tr><td>CCAvg<\/td><td>Avg. spending on credit cards per month ($000)<\/td><\/tr><tr><td>Education<\/td><td>Education Level. 1: Undergrad; 2: Graduate; 3: Advanced\/Professional<\/td><\/tr><tr><td>Mortgage<\/td><td>Value of house mortgage if any. 
($000)<\/td><\/tr><tr><td>Personal Loan<\/td><td>Did this customer accept the personal loan offered in the last campaign?<\/td><\/tr><tr><td>Securities Account<\/td><td>Does the customer have a securities account with the bank?<\/td><\/tr><tr><td>CD Account<\/td><td>Does the customer have a certificate of deposit (CD) account with the bank?<\/td><\/tr><tr><td>Online<\/td><td>Does the customer use internet banking facilities?<\/td><\/tr><tr><td>CreditCard<\/td><td>Does the customer use a credit card issued by UniversalBank?<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>We begin by importing the necessary libraries.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><strong>import<\/strong> <strong>pandas<\/strong> <strong>as<\/strong> <strong>pd<\/strong>\n<strong>import<\/strong> <strong>numpy<\/strong> <strong>as<\/strong> <strong>np<\/strong>\n<strong>import<\/strong> <strong>matplotlib.pyplot<\/strong> <strong>as<\/strong> <strong>plt<\/strong>\n<strong>import<\/strong> <strong>tensorflow<\/strong> <strong>as<\/strong> <strong>tf<\/strong>\n<\/pre>\n\n\n\n<p>Let\u2019s now import the dataset using the read_csv() method of pandas.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em># load the dataset<\/em>\ndf = pd.read_csv(\"Bank_Personal_Loan_Modelling.csv\")\n<em>#print the first 5 rows of the data<\/em>\n\n<strong>print<\/strong>(df.head())\n<\/pre>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ID  Age  Experience  Income  ZIP Code  Family  CCAvg  Education  Mortgage  \\\n0   1   25           1      49     91107       4    1.6          1         0   \n1   2   45          19      34     90089       3    1.5          1         0   \n2   3   39          15      11     94720       1    1.0          1         0   \n3   4   35           9     100     94112       1    2.7          2         0   \n4   5   35           8      45     91330       4    1.0          2         0   \n\n   Personal Loan  Securities Account  CD Account  
Online  CreditCard  \n0              0                   1           0       0           0  \n1              0                   1           0       0           0  \n2              0                   0           0       0           0  \n3              0                   0           0       0           0  \n4              0                   0           0       0           1  <\/code><\/pre>\n\n\n\n<p>We can check the number of observations and features of the dataset using the shape attribute of a dataframe.&nbsp;<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em>#check the number of samples and features<\/em>\ndf.shape\n<\/pre>\n\n\n\n<p>Output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>(5000, 14)\n<\/code><\/pre>\n\n\n\n<p>Next, we check if missing values exist in the data.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em>#check missing values in the data<\/em>\ndf.isnull().sum()\n<\/pre>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ID                    0\nAge                   0\nExperience            0\nIncome                0\nZIP Code              0\nFamily                0\nCCAvg                 0\nEducation             0\nMortgage              0\nPersonal Loan         0\nSecurities Account    0\nCD Account            0\nOnline                0\nCreditCard            0\ndtype: int64\n<\/code><\/pre>\n\n\n\n<p>It appears that the data is clean, with no missing values at all.<\/p>\n\n\n\n<p>Going forward, we must separate the features from the target. For this particular dataset, CreditCard is the target variable while the other columns are the independent variables. However, we take out the ID column from the features. This is because every entry in the ID column is a unique number. Therefore, this column does not help the model learn any pattern. 
It is a dummy feature and thus must be removed.&nbsp;<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em>#split the data into targets(y) and features (X)<\/em>\ntarget = df.CreditCard\nfeatures = df.drop(['ID', 'CreditCard'], axis=1)<\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Splitting Data into train and test data<\/strong><\/h2>\n\n\n\n<p>Next, we will need to split the data into train and test data. The model is trained on the train data after which its performance is evaluated on the test data.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><strong>from<\/strong> <strong>sklearn.preprocessing<\/strong> <strong>import<\/strong> StandardScaler\n<strong>from<\/strong> <strong>sklearn.preprocessing<\/strong> <strong>import<\/strong> LabelEncoder\n<strong>from<\/strong> <strong>sklearn.model_selection<\/strong> <strong>import<\/strong> train_test_split\n\n<em>#split the data into train and test data<\/em>\nX_train, X_test, y_train, y_test = train_test_split(features,\n&nbsp;&nbsp;&nbsp;&nbsp;target, test_size=0.2, random_state=42)\n<\/pre>\n\n\n\n<p>Finally, we will need to standardize the data. This is an important step before feeding your data into the model. Since the columns are on different scales, the model will find it difficult to train on the divergent values. Standardization rescales each feature to have zero mean and unit variance. This is done using the StandardScaler class. 
Also, the labels were encoded using the LabelEncoder class.&nbsp;<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em>#instantiate the Standard Scaler and Label Encoder class<\/em>\nscaler = StandardScaler()\nencoder = LabelEncoder()\n<em>#encode the dependent variable (label)<\/em>\n\ntarget = encoder.fit_transform(target)\n<em>#standardize the independent features<\/em>\nX_train = scaler.fit_transform(X_train).astype(np.float32)\nX_test = scaler.transform(X_test).astype(np.float32)\n<strong>print<\/strong>(X_train.shape)\n<strong>print<\/strong>(X_test.shape)\n<\/pre>\n\n\n\n<p>Note that to avoid data leakage of any sort, the scaling parameters are found with the train data only. The parameters are then used to transform the test data.&nbsp;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>(4000, 12)\n(1000, 12)\n<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Creating the Feature Columns<\/strong><\/h2>\n\n\n\n<p>Real-world data can come in different forms. It could be strings, images, videos, numeric values, categorical values, etc. Tensorflow, however, works with tensors alone. This implies that the data to be fed into TensorFlow models must be converted to tensors.<\/p>\n\n\n\n<p>There are various approaches to convert a column to a feature column depending on the type of data the column holds. 
Since all the columns in our data are numeric, we can use the real_valued_column method to represent them as a single 12-dimensional feature column.&nbsp;<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em>#create a feature column<\/em>\nfeature_columns = tf.contrib.layers.real_valued_column('x', dimension=12)\nfeature_columns\n<\/pre>\n\n\n\n<p>Output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>_RealValuedColumn(column_name='x', dimension=12, default_value=None, dtype=tf.float32, normalizer=None)\n\n<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Instantiating the Model<\/strong><\/h2>\n\n\n\n<p>After defining the feature columns, the model can then be instantiated. Recall that we are using the LinearClassifier for this example. We pass the feature column, the number of classes in the target data, and the model directory as arguments when calling the class.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em>#instantiate the linear classifier<\/em>\nestimator = tf.estimator.LinearClassifier(feature_columns=[feature_columns],\n&nbsp;&nbsp;&nbsp;&nbsp;n_classes=2,\n&nbsp;&nbsp;&nbsp;&nbsp;model_dir=\"base_model1\")\n<\/pre>\n\n\n\n<p>Output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>INFO:tensorflow:Using default config.\nINFO:tensorflow:Using config: 
{'_model_dir': 'base_model1', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true\ngraph_options {\n  rewrite_options {\n    meta_optimizer_iterations: ONE\n  }\n}\n, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': &lt;tensorflow.python.training.server_lib.ClusterSpec object at 0x0000021D56338630&gt;, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}\n<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Defining the train input function and training the model<\/strong><\/h2>\n\n\n\n<p>To pass the data into the TensorFlow model, you need to pass the features and target using a defined function. This function is called an input function.&nbsp;<\/p>\n\n\n\n<p>The function takes parameters such as the features, target, batch size, number of epochs, and whether or not the data should be shuffled. The input function can be defined with Tensorflow&#8217;s numpy_input_fn or pandas_input_fn. Here, we&#8217;d use the numpy_input_fn since the train and test data are already in NumPy arrays. We can therefore define an input function for the train data with a batch size of 32 so the machine is not overworked. 
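Under the hood, an input function of this kind just shuffles the arrays and serves them in fixed-size chunks. The generator below is our own NumPy-only sketch of that behaviour, independent of the estimator API:

```python
import numpy as np

def minibatches(X, y, batch_size=32, shuffle=True, seed=0):
    """Yield (features, labels) pairs in fixed-size chunks, mimicking
    what an input function does with its batch_size and shuffle arguments."""
    idx = np.arange(len(X))
    if shuffle:
        np.random.default_rng(seed).shuffle(idx)
    for start in range(0, len(idx), batch_size):
        take = idx[start:start + batch_size]
        yield X[take], y[take]

# tiny demo: 10 samples, 2 features, batches of 4
X = np.arange(20, dtype=np.float32).reshape(10, 2)
y = np.arange(10)
batches = list(minibatches(X, y, batch_size=4))
print(len(batches))         # 3 batches: sizes 4, 4 and 2
print(batches[0][0].shape)  # (4, 2)
```

The real input function adds queueing and epoch handling on top, but the batching and shuffling logic is the same idea.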
In addition, the shuffle argument is set to true so that the model does not learn patterns in the train data verbatim.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em>#define the training input function&nbsp;<\/em>\ntrain_input_fn = tf.estimator.inputs.numpy_input_fn(\n&nbsp;&nbsp;&nbsp;&nbsp;x = {'x': X_train},\n&nbsp;&nbsp;&nbsp;&nbsp;y = y_train,\n&nbsp;&nbsp;&nbsp;&nbsp;batch_size=32,\n&nbsp;&nbsp;&nbsp;&nbsp;num_epochs=None,\n&nbsp;&nbsp;&nbsp;&nbsp;shuffle=True,\n&nbsp;&nbsp;&nbsp;&nbsp;)\n<\/pre>\n\n\n\n<p>Notice that the epoch was set to None. This allows the model to see the data for as many iterations as defined by the number of steps.&nbsp;<\/p>\n\n\n\n<p>It&#8217;s finally time to train the model. The model is trained using the train method of the estimator. We define a start and end time to check how long the model spends during training.&nbsp;<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><strong>import<\/strong> <strong>time<\/strong>\n<em>#set the start timer<\/em>\nstart_time = time.time()\n<em>#train the model on the training data<\/em>\nestimator.train(input_fn=train_input_fn, steps=2000)\n<em>#set the stop timer<\/em>\nend_time = time.time()\ntimetaken = end_time - start_time\n<strong>print<\/strong>()\n<strong>print<\/strong>(f\"The model gets trained in {timetaken} seconds\")\n<\/pre>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>WARNING:tensorflow:From C:\\Anaconda3\\lib\\site-packages\\tensorflow\\python\\framework\\op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.\nInstructions for updating:\nCollocations handled automatically by placer.\nWARNING:tensorflow:From C:\\Anaconda3\\lib\\site-packages\\tensorflow_estimator\\python\\estimator\\inputs\\queues\\feeding_queue_runner.py:62: QueueRunner.__init__ (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future 
version.\nInstructions for updating:\nTo construct input pipelines, use the `tf.data` module.\nWARNING:tensorflow:From C:\\Anaconda3\\lib\\site-packages\\tensorflow_estimator\\python\\estimator\\inputs\\queues\\feeding_functions.py:500: add_queue_runner (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.\nInstructions for updating:\nTo construct input pipelines, use the `tf.data` module.\nINFO:tensorflow:Calling model_fn.\nWARNING:tensorflow:From C:\\Anaconda3\\lib\\site-packages\\tensorflow\\contrib\\layers\\python\\layers\\feature_column.py:1901: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.\nInstructions for updating:\nUse tf.cast instead.\nINFO:tensorflow:Done calling model_fn.\nINFO:tensorflow:Create CheckpointSaverHook.\nINFO:tensorflow:Graph was finalized.\nINFO:tensorflow:Running local_init_op.\nINFO:tensorflow:Done running local_init_op.\nWARNING:tensorflow:From C:\\Anaconda3\\lib\\site-packages\\tensorflow\\python\\training\\monitored_session.py:809: start_queue_runners (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.\nInstructions for updating:\nTo construct input pipelines, use the `tf.data` module.\nINFO:tensorflow:Saving checkpoints for 0 into base_model1\\model.ckpt.\nINFO:tensorflow:loss = 22.18071, step = 1\nINFO:tensorflow:global_step\/sec: 280.272\nINFO:tensorflow:loss = 19.401627, step = 101 (0.359 sec)\nINFO:tensorflow:global_step\/sec: 510.493\nINFO:tensorflow:loss = 14.620773, step = 201 (0.196 sec)\nINFO:tensorflow:global_step\/sec: 521.133\nINFO:tensorflow:loss = 22.846588, step = 301 (0.192 sec)\nINFO:tensorflow:global_step\/sec: 510.495\nINFO:tensorflow:loss = 18.471878, step = 401 (0.197 sec)\nINFO:tensorflow:global_step\/sec: 546.763\nINFO:tensorflow:loss = 19.982285, step = 501 (0.181 sec)\nINFO:tensorflow:global_step\/sec: 523.86\nINFO:tensorflow:loss = 19.685211, step 
= 601 (0.191 sec)\nINFO:tensorflow:global_step\/sec: 483.367\nINFO:tensorflow:loss = 15.895845, step = 701 (0.208 sec)\nINFO:tensorflow:global_step\/sec: 465.385\nINFO:tensorflow:loss = 19.809559, step = 801 (0.215 sec)\nINFO:tensorflow:global_step\/sec: 483.369\nINFO:tensorflow:loss = 18.505947, step = 901 (0.207 sec)\nINFO:tensorflow:global_step\/sec: 502.799\nINFO:tensorflow:loss = 14.306513, step = 1001 (0.199 sec)\nINFO:tensorflow:global_step\/sec: 546.759\nINFO:tensorflow:loss = 21.479156, step = 1101 (0.183 sec)\nINFO:tensorflow:global_step\/sec: 361.218\nINFO:tensorflow:loss = 14.587541, step = 1201 (0.284 sec)\nINFO:tensorflow:global_step\/sec: 318.655\nINFO:tensorflow:loss = 16.127178, step = 1301 (0.307 sec)\nINFO:tensorflow:global_step\/sec: 489.912\nINFO:tensorflow:loss = 18.31077, step = 1401 (0.206 sec)\nINFO:tensorflow:global_step\/sec: 334.272\nINFO:tensorflow:loss = 18.162086, step = 1501 (0.307 sec)\nINFO:tensorflow:global_step\/sec: 300.79\nINFO:tensorflow:loss = 15.54518, step = 1601 (0.324 sec)\nINFO:tensorflow:global_step\/sec: 264.306\nINFO:tensorflow:loss = 20.591423, step = 1701 (0.377 sec)\nINFO:tensorflow:global_step\/sec: 328.056\nINFO:tensorflow:loss = 15.356109, step = 1801 (0.309 sec)\nINFO:tensorflow:global_step\/sec: 332.416\nINFO:tensorflow:loss = 18.233725, step = 1901 (0.297 sec)\nINFO:tensorflow:Saving checkpoints for 2000 into base_model1\\model.ckpt.\nINFO:tensorflow:Loss for final step: 18.70794.\n\nThe model gets trained in 9.365769863128662 seconds\n<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Defining the Test input function and Evaluating the model<\/strong><\/h2>\n\n\n\n<p>Before proceeding to evaluate the model, we must define another input function to take in the test data. The input function is defined with a batch size of 128 and 1 epoch, since each test sample only needs to be seen once. 
The shuffle argument was also set to False because it was not necessary to shuffle the data this time.&nbsp;<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em>#define the test input data<\/em>\ntest_input_fn = tf.estimator.inputs.numpy_input_fn(\n&nbsp;&nbsp;&nbsp;&nbsp;x = {'x': X_test},\n&nbsp;&nbsp;&nbsp;&nbsp;y = y_test,\n&nbsp;&nbsp;&nbsp;&nbsp;batch_size=128,\n&nbsp;&nbsp;&nbsp;&nbsp;num_epochs=1,\n&nbsp;&nbsp;&nbsp;&nbsp;shuffle=False\n&nbsp;&nbsp;&nbsp;&nbsp;)\n<\/pre>\n\n\n\n<p>Now we can see how well the model learns. We check its performance by evaluating the model on the test data.&nbsp;<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em>#evaluate the model\u2019s performance on the test data<\/em>\nestimator.evaluate(input_fn=test_input_fn, steps=1)<\/pre>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>INFO:tensorflow:Calling model_fn.\nWARNING:tensorflow:From C:\\Anaconda3\\lib\\site-packages\\tensorflow\\python\\ops\\metrics_impl.py:2002: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.\nInstructions for updating:\nDeprecated in favor of operator or tf.math.divide.\nWARNING:tensorflow:Trapezoidal rule is known to produce incorrect PR-AUCs; please switch to \"careful_interpolation\" instead.\nWARNING:tensorflow:Trapezoidal rule is known to produce incorrect PR-AUCs; please switch to \"careful_interpolation\" instead.\nINFO:tensorflow:Done calling model_fn.\nINFO:tensorflow:Starting evaluation at 2020-12-12T17:01:44Z\nINFO:tensorflow:Graph was finalized.\nWARNING:tensorflow:From C:\\Anaconda3\\lib\\site-packages\\tensorflow\\python\\training\\saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.\nInstructions for updating:\nUse standard file APIs to check for files with this prefix.\nINFO:tensorflow:Restoring parameters from 
base_model1\\model.ckpt-2000\nINFO:tensorflow:Running local_init_op.\nINFO:tensorflow:Done running local_init_op.\nINFO:tensorflow:Evaluation &#91;1\/1]\nINFO:tensorflow:Finished evaluation at 2020-12-12-17:01:46\nINFO:tensorflow:Saving dict for global step 2000: accuracy = 0.7890625, accuracy_baseline = 0.7265625, auc = 0.6102919, auc_precision_recall = 0.4829452, average_loss = 0.52276087, global_step = 2000, label\/mean = 0.2734375, loss = 66.91339, precision = 1.0, prediction\/mean = 0.2957184, recall = 0.22857143\nINFO:tensorflow:Saving 'checkpoint_path' summary for global step 2000: base_model1\\model.ckpt-2000\nOut&#91;20]:\n{'accuracy': 0.7890625,\n 'accuracy_baseline': 0.7265625,\n 'auc': 0.6102919,\n 'auc_precision_recall': 0.4829452,\n 'average_loss': 0.52276087,\n 'label\/mean': 0.2734375,\n 'loss': 66.91339,\n 'precision': 1.0,\n 'prediction\/mean': 0.2957184,\n 'recall': 0.22857143,\n 'global_step': 2000}\n<\/code><\/pre>\n\n\n\n<p>As seen, the model has an accuracy of 78.9%, which is fairly okay. The loss of 66.9, however, is quite high, and other metrics, such as the recall of 0.23, are low.<\/p>\n\n\n\n<p>We now turn our attention to the kernel classifier and see whether we can improve these numbers with a kernel model.&nbsp;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Building the Kernel Classifier<\/strong><\/h2>\n\n\n\n<p>The preprocessing steps for building the kernel model are virtually the same as for the linear classifier. In fact, the train and test input functions remain unchanged. The major difference is in building the kernel model itself.&nbsp;<\/p>\n\n\n\n<p>To build the kernel model, we first must define a kernel mapper. We use the RandomFourierFeatureMapper class from the tf.contrib.kernel_methods module to define the mapper. The mapper takes in the input dimension of the data as well as the expected output dimension. 
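<\/p>\n\n\n\n<p>To make what the mapper does concrete, here is a minimal plain-Python sketch of random Fourier features (an illustration only, not the tf.contrib implementation; the helper names are made up). Frequencies are drawn from a Gaussian whose width is controlled by stddev, so that the inner product of two mapped vectors approximates an RBF kernel value.<\/p>\n\n\n\n

```python
import math
import random

def make_mapper(input_dim, output_dim, stddev, seed=0):
    # Frequencies w ~ N(0, 1/stddev^2) and phases b ~ U[0, 2*pi]
    # correspond to an RBF kernel of width stddev.
    rng = random.Random(seed)
    ws = [[rng.gauss(0.0, 1.0 / stddev) for _ in range(input_dim)]
          for _ in range(output_dim)]
    bs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(output_dim)]
    return ws, bs

def rff_map(x, ws, bs):
    # z(x)_i = sqrt(2/D) * cos(w_i . x + b_i)
    d = len(ws)
    return [math.sqrt(2.0 / d) * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + b)
            for w, b in zip(ws, bs)]

x, y = [1.0, 0.5], [0.8, 0.3]
ws, bs = make_mapper(input_dim=2, output_dim=5000, stddev=1.0)
zx, zy = rff_map(x, ws, bs), rff_map(y, ws, bs)
approx = sum(a * b for a, b in zip(zx, zy))  # inner product in the 5000-D space
exact = math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / 2.0)  # true RBF value
```

<p>With output_dim=5000, the two numbers should agree closely, which is why a linear model trained on the mapped features behaves like a kernelized one.<\/p>\n\n\n\n<p>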
So if you wish to increase the data dimension to 200, the output_dim is set to 200.&nbsp;<\/p>\n\n\n\n<p>Once that&#8217;s done, the Kernel model can be built using the KernelLinearClassifier.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em>#define a kernel<\/em>\nkernel_mapper = tf.contrib.kernel_methods.RandomFourierFeatureMapper(input_dim=12, output_dim=5000, stddev=4.5, name='k_mapper1')\n<em>#map the kernels to the feature columns<\/em>\nkernel_mappers = {feature_columns: [kernel_mapper]}\n\n<em>#define an optimizer<\/em>\noptimizer = tf.train.FtrlOptimizer(learning_rate=50, l2_regularization_strength=0.001)\n\n<em>#instantiate the kernel classifier<\/em>\nkernel_estimator = tf.contrib.kernel_methods.KernelLinearClassifier(\n&nbsp;&nbsp;&nbsp;&nbsp;n_classes=3,\n&nbsp;&nbsp;&nbsp;&nbsp;optimizer=optimizer,\n&nbsp;&nbsp;&nbsp;&nbsp;kernel_mappers=kernel_mappers,&nbsp;\n&nbsp;&nbsp;&nbsp;&nbsp;model_dir=\"Kernel_model1\")\n<\/pre>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>WARNING:tensorflow:From C:\\Anaconda3\\lib\\site-packages\\tensorflow\\contrib\\kernel_methods\\python\\kernel_estimators.py:305: multi_class_head (from tensorflow.contrib.learn.python.learn.estimators.head) is deprecated and will be removed in a future version.\nInstructions for updating:\nPlease switch to tf.contrib.estimator.*_head.\nWARNING:tensorflow:From C:\\Anaconda3\\lib\\site-packages\\tensorflow\\contrib\\learn\\python\\learn\\estimators\\estimator.py:1179: BaseEstimator.__init__ (from tensorflow.contrib.learn.python.learn.estimators.estimator) is deprecated and will be removed in a future version.\nInstructions for updating:\nPlease replace uses of any Estimator from tf.contrib.learn with an Estimator from tf.estimator.*\nWARNING:tensorflow:From C:\\Anaconda3\\lib\\site-packages\\tensorflow\\contrib\\learn\\python\\learn\\estimators\\estimator.py:427: RunConfig.__init__ (from tensorflow.contrib.learn.python.learn.estimators.run_config) is 
deprecated and will be removed in a future version.\nInstructions for updating:\nWhen switching to tf.estimator.Estimator, use tf.estimator.RunConfig instead.\nINFO:tensorflow:Using default config.\nINFO:tensorflow:Using config: {'_task_type': None, '_task_id': 0, '_cluster_spec': &lt;tensorflow.python.training.server_lib.ClusterSpec object at 0x000002738D99C0B8&gt;, '_master': '', '_num_ps_replicas': 0, '_num_worker_replicas': 0, '_environment': 'local', '_is_chief': True, '_evaluation_master': '', '_train_distribute': None, '_eval_distribute': None, '_device_fn': None, '_tf_config': gpu_options {\n  per_process_gpu_memory_fraction: 1.0\n}\n, '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_secs': 600, '_log_step_count_steps': 100, '_protocol': None, '_session_config': None, '_save_checkpoints_steps': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_model_dir': 'Kernel_model1'}\n<\/code><\/pre>\n\n\n\n<p>The model can then be trained and evaluated. The training is done with the fit method of the instantiated kernel model, which similarly takes the input function and the number of steps as arguments. To keep the comparison fair, the same input function that was used for the base model is passed here. 
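<\/p>\n\n\n\n<p>A quick aside on what the steps argument means: when the train input function cycles through the data (as is typical when a step count is given), training length is measured in batches, not epochs. The training-set size below is an assumed figure, purely for illustration.<\/p>\n\n\n\n

```python
# Relation between steps, batch size, and epochs (illustrative numbers).
batch_size = 128          # as in the train input function
steps = 2000              # as passed to fit()
n_train = 5000            # hypothetical training-set size

examples_seen = steps * batch_size        # total examples processed
approx_epochs = examples_seen / n_train   # full passes over the data

print(examples_seen, round(approx_epochs, 1))
```

<p>So 2000 steps at a batch size of 128 corresponds to 256,000 processed examples, however many epochs that works out to for your dataset.<\/p>\n\n\n\n<p>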
Also, the model was set to be trained for 2000 iterations.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><strong>import<\/strong> <strong>time<\/strong>\nstart_time = time.time()\n<em>#train the kernel classifier&nbsp;<\/em>\nkernel_estimator.fit(input_fn=train_input_fn, steps=2000)\nend_time = time.time()\n\ntimetaken = end_time - start_time\n<strong>print<\/strong>(f\"The model gets trained in {timetaken} seconds\")<\/pre>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>INFO:tensorflow:Create CheckpointSaverHook.\nINFO:tensorflow:Graph was finalized.\nWARNING:tensorflow:From C:\\Anaconda3\\lib\\site-packages\\tensorflow\\python\\training\\saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.\nInstructions for updating:\nUse standard file APIs to check for files with this prefix.\nINFO:tensorflow:Restoring parameters from Kernel_model1\\model.ckpt-2000\nWARNING:tensorflow:From C:\\Anaconda3\\lib\\site-packages\\tensorflow\\python\\training\\saver.py:1070: get_checkpoint_mtimes (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.\nInstructions for updating:\nUse standard file utilities to get mtimes.\nINFO:tensorflow:Running local_init_op.\nINFO:tensorflow:Done running local_init_op.\nINFO:tensorflow:Saving checkpoints for 2000 into Kernel_model1\\model.ckpt.\nINFO:tensorflow:loss = 10.088472, step = 2001\nINFO:tensorflow:global_step\/sec: 205.505\nINFO:tensorflow:loss = 5.561243, step = 2101 (0.490 sec)\nINFO:tensorflow:global_step\/sec: 250.77\nINFO:tensorflow:loss = 1.7062492, step = 2201 (0.399 sec)\nINFO:tensorflow:global_step\/sec: 267.533\nINFO:tensorflow:loss = 14.24986, step = 2301 (0.374 sec)\nINFO:tensorflow:global_step\/sec: 263.168\nINFO:tensorflow:loss = 5.7791023, step = 2401 (0.380 sec)\nINFO:tensorflow:global_step\/sec: 275.64\nINFO:tensorflow:loss = 1.8302681, 
step = 2501 (0.364 sec)\nINFO:tensorflow:global_step\/sec: 191.289\nINFO:tensorflow:loss = 6.062792, step = 2601 (0.522 sec)\nINFO:tensorflow:global_step\/sec: 215.641\nINFO:tensorflow:loss = 1.9746958, step = 2701 (0.465 sec)\nINFO:tensorflow:global_step\/sec: 268.869\nINFO:tensorflow:loss = 3.9464078, step = 2801 (0.371 sec)\nINFO:tensorflow:global_step\/sec: 249.517\nINFO:tensorflow:loss = 27.863808, step = 2901 (0.401 sec)\nINFO:tensorflow:global_step\/sec: 282.651\nINFO:tensorflow:loss = 1.5290673, step = 3001 (0.354 sec)\nINFO:tensorflow:global_step\/sec: 265.404\nINFO:tensorflow:loss = 14.24245, step = 3101 (0.378 sec)\nINFO:tensorflow:global_step\/sec: 271.895\nINFO:tensorflow:loss = 10.880495, step = 3201 (0.367 sec)\nINFO:tensorflow:global_step\/sec: 277.167\nINFO:tensorflow:loss = 22.52391, step = 3301 (0.361 sec)\nINFO:tensorflow:global_step\/sec: 272.635\nINFO:tensorflow:loss = 19.375069, step = 3401 (0.368 sec)\nINFO:tensorflow:global_step\/sec: 249.52\nINFO:tensorflow:loss = 3.5978289, step = 3501 (0.400 sec)\nINFO:tensorflow:global_step\/sec: 281.852\nINFO:tensorflow:loss = 4.502479, step = 3601 (0.355 sec)\nINFO:tensorflow:global_step\/sec: 272.635\nINFO:tensorflow:loss = 13.293575, step = 3701 (0.367 sec)\nINFO:tensorflow:global_step\/sec: 282.648\nINFO:tensorflow:loss = 16.384726, step = 3801 (0.354 sec)\nINFO:tensorflow:global_step\/sec: 280.465\nINFO:tensorflow:loss = 17.880123, step = 3901 (0.357 sec)\nINFO:tensorflow:Saving checkpoints for 4000 into Kernel_model1\\model.ckpt.\nINFO:tensorflow:Loss for final step: 10.802145.\nThe model gets trained in 10.8771390914917 seconds<\/code><\/pre>\n\n\n\n<p>This time, the model gets trained in 10.8 seconds. This small increase is understandable since the kernel model does more work: recall that it computes inner products in a 5000-dimensional feature space. 
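<\/p>\n\n\n\n<p>The payoff of working in a higher dimension is easy to see on a toy dataset: points inside and outside a circle cannot be split by a straight line in 2-D, but one extra squared-radius feature makes them separable by a simple threshold. The numbers below are made up for illustration.<\/p>\n\n\n\n

```python
# Points inside vs outside the unit circle: not linearly separable in 2-D.
inner = [(0.1, 0.2), (-0.3, 0.1), (0.2, -0.4)]
outer = [(1.5, 0.0), (-1.2, 1.0), (0.0, -2.0)]

def lift(point):
    # Add a third feature r2 = x^2 + y^2; the plane r2 = 1 now separates the classes.
    x, y = point
    return (x, y, x * x + y * y)

print([lift(p)[2] < 1.0 for p in inner])  # all True
print([lift(p)[2] > 1.0 for p in outer])  # all True
```

<p>Kernel methods exploit exactly this effect, except that the lift is implicit and the new space can have thousands of dimensions.<\/p>\n\n\n\n<p>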
Finally, we check the performance of the model by evaluating it on the test data.&nbsp;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Evaluate the kernel classifier\neval_metrics = kernel_estimator.evaluate(input_fn=test_input_fn, steps=1)<\/code><\/pre>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>INFO:tensorflow:Starting evaluation at 2020-12-12T17:55:02Z\nINFO:tensorflow:Graph was finalized.\nINFO:tensorflow:Restoring parameters from Kernel_model1\\model.ckpt-4000\nINFO:tensorflow:Running local_init_op.\nINFO:tensorflow:Done running local_init_op.\nINFO:tensorflow:Evaluation &#91;1\/1]\nINFO:tensorflow:Finished evaluation at 2020-12-12-17:55:03\nINFO:tensorflow:Saving dict for global step 4000: accuracy = 0.78125, global_step = 4000, loss = 10.940855<\/code><\/pre>\n\n\n\n<p>The kernel model has managed to reduce the loss from 66.9 to 10.9 in the same number of training steps. This goes to show that it outperforms the base model we created earlier. We can, in fact, attempt to improve this number by tweaking some of its training parameters.&nbsp;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Improving the Performance of the Kernel Classifier<\/strong><\/h2>\n\n\n\n<p>The kernel classifier is sensitive to the chosen stddev and output dimension. In the earlier model, the stddev was set to 4.5 and the output dimension to 5000. A higher output dimension makes the mapped inner product a closer approximation of the true kernel value. Thus, increasing the dimension gives the model more degrees of freedom, until the benefit saturates at some point.&nbsp;<\/p>\n\n\n\n<p>Here, we will tweak these parameters, setting the stddev to 5 and the output dimension to 4000. 
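<\/p>\n\n\n\n<p>The sensitivity to output_dim can be checked numerically. The sketch below (plain Python, not the tf.contrib mapper; the helper name is made up) estimates the average error of the random-Fourier-feature approximation to an RBF kernel at a low and a high output dimension; the error should shrink as the dimension grows.<\/p>\n\n\n\n

```python
import math
import random

def avg_rff_error(output_dim, stddev=1.0, n_pairs=20, seed=1):
    # Average |z(x).z(y) - k(x, y)| over random 2-D point pairs, where z is the
    # random Fourier feature map and k the RBF kernel of width stddev.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        x = [rng.uniform(-1.0, 1.0) for _ in range(2)]
        y = [rng.uniform(-1.0, 1.0) for _ in range(2)]
        ws = [[rng.gauss(0.0, 1.0 / stddev) for _ in range(2)] for _ in range(output_dim)]
        bs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(output_dim)]

        def z(v):
            return [math.sqrt(2.0 / output_dim) * math.cos(w[0] * v[0] + w[1] * v[1] + b)
                    for w, b in zip(ws, bs)]

        approx = sum(a * b for a, b in zip(z(x), z(y)))
        exact = math.exp(-((x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2) / (2.0 * stddev ** 2))
        total += abs(approx - exact)
    return total / n_pairs

low_dim_err, high_dim_err = avg_rff_error(50), avg_rff_error(4000)
```

<p>On a typical run the high-dimensional mapping shows a noticeably smaller error, though the improvement flattens out as the dimension keeps growing, which is why 4000 dimensions can be a reasonable trade-off against 5000.<\/p>\n\n\n\n<p>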
We also increase the L2 regularization strength of the optimizer from 0.001 to 0.01 to prevent overfitting, while keeping the learning rate at 50.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em>#define a kernel<\/em>\nkernel_mapper = tf.contrib.kernel_methods.RandomFourierFeatureMapper(input_dim=12, output_dim=4000, stddev=5, name='k_mapper')\n<em>#map the kernels to the feature columns<\/em>\nkernel_mappers = {feature_columns: [kernel_mapper]}\n\n<em>#define an optimizer<\/em>\noptimizer = tf.train.FtrlOptimizer(learning_rate=50, l2_regularization_strength=0.01)\n\n<em>#instantiate the kernel classifier<\/em>\nkernel_estimator = tf.contrib.kernel_methods.KernelLinearClassifier(\n&nbsp;&nbsp;&nbsp;&nbsp;n_classes=3,\n&nbsp;&nbsp;&nbsp;&nbsp;optimizer=optimizer,\n&nbsp;&nbsp;&nbsp;&nbsp;kernel_mappers=kernel_mappers,&nbsp;\n&nbsp;&nbsp;&nbsp;&nbsp;model_dir=\"Kernel_model\")<\/pre>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>WARNING:tensorflow:From C:\\Anaconda3\\lib\\site-packages\\tensorflow\\contrib\\kernel_methods\\python\\kernel_estimators.py:305: multi_class_head (from tensorflow.contrib.learn.python.learn.estimators.head) is deprecated and will be removed in a future version.\nInstructions for updating:\nPlease switch to tf.contrib.estimator.*_head.\nWARNING:tensorflow:From C:\\Anaconda3\\lib\\site-packages\\tensorflow\\contrib\\learn\\python\\learn\\estimators\\estimator.py:1179: BaseEstimator.__init__ (from tensorflow.contrib.learn.python.learn.estimators.estimator) is deprecated and will be removed in a future version.\nInstructions for updating:\nPlease replace uses of any Estimator from tf.contrib.learn with an Estimator from tf.estimator.*\nWARNING:tensorflow:From C:\\Anaconda3\\lib\\site-packages\\tensorflow\\contrib\\learn\\python\\learn\\estimators\\estimator.py:427: RunConfig.__init__ (from tensorflow.contrib.learn.python.learn.estimators.run_config) is deprecated and will be removed in a future 
version.\nInstructions for updating:\nWhen switching to tf.estimator.Estimator, use tf.estimator.RunConfig instead.\nINFO:tensorflow:Using default config.\nINFO:tensorflow:Using config: {'_task_type': None, '_task_id': 0, '_cluster_spec': &lt;tensorflow.python.training.server_lib.ClusterSpec object at 0x0000021D599E5E48&gt;, '_master': '', '_num_ps_replicas': 0, '_num_worker_replicas': 0, '_environment': 'local', '_is_chief': True, '_evaluation_master': '', '_train_distribute': None, '_eval_distribute': None, '_device_fn': None, '_tf_config': gpu_options {\n  per_process_gpu_memory_fraction: 1.0\n}\n, '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_secs': 600, '_log_step_count_steps': 100, '_protocol': None, '_session_config': None, '_save_checkpoints_steps': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_model_dir': 'Kernel_model'}\n<\/code><\/pre>\n\n\n\n<p>After defining the parameters, let&#8217;s train the model.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">start_time = time.time()\n<em>#train the kernel classifier&nbsp;<\/em>\nkernel_estimator.fit(input_fn=train_input_fn, steps=2000)\nend_time = time.time()\n\ntimetaken = end_time - start_time\n<strong>print<\/strong>(f\"The model gets trained in {timetaken} seconds\")\n<\/pre>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>WARNING:tensorflow:From C:\\Anaconda3\\lib\\site-packages\\tensorflow\\contrib\\learn\\python\\learn\\estimators\\head.py:677: ModelFnOps.__new__ (from tensorflow.contrib.learn.python.learn.estimators.model_fn) is deprecated and will be removed in a future version.\nInstructions for updating:\nWhen switching to tf.estimator.Estimator, use tf.estimator.EstimatorSpec. 
You can use the `estimator_spec` method to create an equivalent one.\nINFO:tensorflow:Create CheckpointSaverHook.\nINFO:tensorflow:Graph was finalized.\nINFO:tensorflow:Running local_init_op.\nINFO:tensorflow:Done running local_init_op.\nINFO:tensorflow:Saving checkpoints for 0 into Kernel_model\\model.ckpt.\nINFO:tensorflow:loss = 1.0986123, step = 1\nINFO:tensorflow:global_step\/sec: 139.152\nINFO:tensorflow:loss = 0.9671518, step = 101 (0.727 sec)\nINFO:tensorflow:global_step\/sec: 237.121\nINFO:tensorflow:loss = 1.6124909, step = 201 (0.415 sec)\nINFO:tensorflow:global_step\/sec: 255.145\nINFO:tensorflow:loss = 8.096902, step = 301 (0.390 sec)\nINFO:tensorflow:global_step\/sec: 267.797\nINFO:tensorflow:loss = 2.1676967, step = 401 (0.373 sec)\nINFO:tensorflow:global_step\/sec: 271.93\nINFO:tensorflow:loss = 2.9555616, step = 501 (0.368 sec)\nINFO:tensorflow:global_step\/sec: 239.576\nINFO:tensorflow:loss = 4.1158886, step = 601 (0.422 sec)\nINFO:tensorflow:global_step\/sec: 257.063\nINFO:tensorflow:loss = 1.1513939, step = 701 (0.386 sec)\nINFO:tensorflow:global_step\/sec: 275.379\nINFO:tensorflow:loss = 1.4807884, step = 801 (0.366 sec)\nINFO:tensorflow:global_step\/sec: 259.215\nINFO:tensorflow:loss = 3.0109026, step = 901 (0.383 sec)\nINFO:tensorflow:global_step\/sec: 176.894\nINFO:tensorflow:loss = 3.5773954, step = 1001 (0.573 sec)\nINFO:tensorflow:global_step\/sec: 159.864\nINFO:tensorflow:loss = 5.449258, step = 1101 (0.620 sec)\nINFO:tensorflow:global_step\/sec: 176.148\nINFO:tensorflow:loss = 0.9399394, step = 1201 (0.563 sec)\nINFO:tensorflow:global_step\/sec: 198.512\nINFO:tensorflow:loss = 6.211024, step = 1301 (0.506 sec)\nINFO:tensorflow:global_step\/sec: 188.923\nINFO:tensorflow:loss = 10.430393, step = 1401 (0.532 sec)\nINFO:tensorflow:global_step\/sec: 172.681\nINFO:tensorflow:loss = 2.431583, step = 1501 (0.577 sec)\nINFO:tensorflow:global_step\/sec: 180.112\nINFO:tensorflow:loss = 1.4712481, step = 1601 (0.552 
sec)\nINFO:tensorflow:global_step\/sec: 189.118\nINFO:tensorflow:loss = 3.9613595, step = 1701 (0.532 sec)\nINFO:tensorflow:global_step\/sec: 209.762\nINFO:tensorflow:loss = 3.3975558, step = 1801 (0.479 sec)\nINFO:tensorflow:global_step\/sec: 255.382\nINFO:tensorflow:loss = 7.3193703, step = 1901 (0.387 sec)\nINFO:tensorflow:Saving checkpoints for 2000 into Kernel_model\\model.ckpt.\nINFO:tensorflow:Loss for final step: 3.1488423.\nThe model gets trained in 12.914040088653564 seconds<\/code><\/pre>\n\n\n\n<p>Finally, we evaluate its performance.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><em># Evaluate the kernel classifier<\/em>\neval_metrics = kernel_estimator.evaluate(input_fn=test_input_fn, steps=1)<\/pre>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>INFO:tensorflow:Starting evaluation at 2020-12-12T17:02:21Z\nINFO:tensorflow:Graph was finalized.\nINFO:tensorflow:Restoring parameters from Kernel_model\\model.ckpt-2000\nINFO:tensorflow:Running local_init_op.\nINFO:tensorflow:Done running local_init_op.\nINFO:tensorflow:Evaluation &#91;1\/1]\nINFO:tensorflow:Finished evaluation at 2020-12-12-17:02:21\nINFO:tensorflow:Saving dict for global step 2000: accuracy = 0.7890625, global_step = 2000, loss = 4.6506495\n<\/code><\/pre>\n\n\n\n<p>As seen, the loss has been reduced to 4.65. This is a decent improvement on the first kernel classifier, which recorded a loss of 10.94.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Conclusion<\/strong><\/h2>\n\n\n\n<p>In this tutorial, we have seen how to build a linear classifier with TensorFlow and a kernel classifier. We saw that the kernel method improves the model&#8217;s performance by mapping the features into a higher-dimensional space and computing inner products there. 
This approach has proven to work well for data that is not linearly separable, which is the case for most real-life data.&nbsp;<\/p>\n\n\n\n<p>We then went ahead to build a kernel classifier and compared the result with the linear classifier.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In most real-life classification problems, datasets are linearly non-separable. That is to say, the classes can not be separated by a straight line. But a linear classifier built with the LinearClassifier class of Tensorflow\u2019s estimator API attempts to learn the data with the assumption that it can be classified with a straight line. Other popular [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":8497,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[498],"tags":[],"class_list":["post-8483","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-artificial-intelligence-tutorials"],"_links":{"self":[{"href":"https:\/\/www.h2kinfosys.com\/blog\/wp-json\/wp\/v2\/posts\/8483","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.h2kinfosys.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.h2kinfosys.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.h2kinfosys.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.h2kinfosys.com\/blog\/wp-json\/wp\/v2\/comments?post=8483"}],"version-history":[{"count":0,"href":"https:\/\/www.h2kinfosys.com\/blog\/wp-json\/wp\/v2\/posts\/8483\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.h2kinfosys.com\/blog\/wp-json\/wp\/v2\/media\/8497"}],"wp:attachment":[{"href":"https:\/\/www.h2kinfosys.com\/blog\/wp-json\/wp\/v2\/media?parent=8483"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.h2kinfosys.com\/blog\/wp-json\/wp\/v2\/categories?post=8
483"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.h2kinfosys.com\/blog\/wp-json\/wp\/v2\/tags?post=8483"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}