{"id":4868,"date":"2020-09-16T18:19:30","date_gmt":"2020-09-16T12:49:30","guid":{"rendered":"https:\/\/www.h2kinfosys.com\/blog\/?p=4868"},"modified":"2020-09-16T18:40:56","modified_gmt":"2020-09-16T13:10:56","slug":"data-reading-and-data-inspection-using-pandas","status":"publish","type":"post","link":"https:\/\/www.h2kinfosys.com\/blog\/data-reading-and-data-inspection-using-pandas\/","title":{"rendered":"Data Reading and Data Inspection Using Pandas"},"content":{"rendered":"\n<p>In the previous article, we discussed what pandas is, its importance in data science, how to install it, and how to perform basic operations such as adding and deleting indexes, rows, and columns in a DataFrame. Now we will dive deeper into how pandas is used in real-world situations: reading data and inspecting it.<\/p>\n\n\n\n<p>As a data scientist or an analyst, you\u2019ll probably come across many file types to import and use in your Python scripts. Some analysts use Microsoft Excel, but the application limits what you can do with large data imports. For that kind of work pandas is often the better option: it is a powerful analysis toolkit that\u2019s much more intuitive for a data scientist.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>What file formats can pandas use?<\/strong><\/h2>\n\n\n\n<p>Python can handle far more data file formats than Microsoft Excel. That\u2019s the strength of Python. 
Python is <a href=\"https:\/\/opensource.com\/article\/18\/3\/what-open-source-programming\" rel=\"nofollow noopener\" target=\"_blank\">open-source<\/a>, and there\u2019s probably a library out there for whatever format you need, so you get a vastly more compatible system.<\/p>\n\n\n\n<p>These are the most common file types you will come across:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Comma-separated values (CSV)<\/li><li>XLSX<\/li><li>JSON<\/li><li>XML<\/li><li>HTML<\/li><li>Images<\/li><li>PDF<\/li><li>DOCX<\/li><li>SQL<\/li><\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>How to read and write tabular data?<\/strong><img fetchpriority=\"high\" decoding=\"async\" width=\"719\" height=\"195\" src=\"https:\/\/lh4.googleusercontent.com\/zZWurH8JJ5jbogsbRbay6gYyvwVLL9A8_dFFTR8wQxQzRIrx8B4NGxyFpBunHoYNdutVIk0hpJDxMGRo9EcHKe2ugj86nTcG84s2x9M0Fq742FmxkYc0_t0YhSgVWwZHWc6NhAZb9VfDuU4xoQ\" alt=\"\" title=\"\"><\/h2>\n\n\n\n<p>Now we will learn to read and write data using pandas. We will start with the read_csv() function and the .to_csv() method.<\/p>\n\n\n\n<p>A comma-separated values (CSV) file is a plaintext file with a .csv extension that holds tabular data. This is one of the most popular file formats for storing large amounts of data. Each row of the CSV file represents a single table row. 
The values in the same row are separated with commas by default, but you can change the separator to a semicolon, tab, space, or some other character with the sep parameter.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Write a CSV File<\/strong><\/h3>\n\n\n\n<p>You can save your pandas DataFrame as a CSV file with <strong>.to_csv()<\/strong>:<\/p>\n\n\n\n<p><code>df.to_csv('data.csv')<\/code><\/p>\n\n\n\n<p>That\u2019s it! You\u2019ve created the file data.csv in your current working directory.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Read a CSV File<\/strong><\/h3>\n\n\n\n<p>Once your data is saved in a CSV file, you\u2019ll likely want to load and use it from time to time. You can do that with the pandas read_csv() function:<\/p>\n\n\n\n<p><code>df = pd.read_csv('data.csv', index_col=0)<\/code><\/p>\n\n\n\n<p><code>df<\/code><\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td><strong>COUNTRY<\/strong><\/td><td><strong>POP<\/strong><\/td><td><strong>AREA<\/strong><\/td><td><strong>
CONT<\/strong><\/td><td><strong>IND_DAY<\/strong><\/td><\/tr><tr><td><strong>China<\/strong><\/td><td>1398.72<\/td><td>9596.96<\/td><td>Asia<\/td><td>NaN<\/td><\/tr><tr><td><strong>India<\/strong><\/td><td>1351.16<\/td><td>3287.26<\/td><td>Asia<\/td><td>1947-08-15<\/td><\/tr><tr><td><strong>US<\/strong><\/td><td>329.74<\/td><td>9833.52<\/td><td>N.America<\/td><td>1776-07-04<\/td><\/tr><tr><td><strong>Indonesia<\/strong><\/td><td>268.07<\/td><td>1910.93<\/td><td>Asia<\/td><td>1945-08-17<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Write an Excel File<\/strong><\/h3>\n\n\n\n<p>You can save your pandas DataFrame as an Excel file with <strong>.to_excel()<\/strong>:<\/p>\n\n\n\n<p><code>df.to_excel('data.xlsx')<\/code><\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Read an Excel File<\/strong><\/h3>\n\n\n\n<p>You can load the file back with the pandas read_excel() function:<\/p>\n\n\n\n<p><code>df = pd.read_excel('data.xlsx', index_col=0)<\/code><\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Write a JSON File<\/strong><\/h3>\n\n\n\n<p>You can save your pandas DataFrame as a JSON file with <strong>.to_json()<\/strong>:<\/p>\n\n\n\n<p><code>df.to_json('data-index.json', orient='index')<\/code><\/p>\n\n\n\n<h3 
class=\"wp-block-heading\"><strong>Read a JSON File<\/strong><\/h3>\n\n\n\n<p>You can load the file back with the pandas read_json() function, passing the same orient value that was used to write it:<\/p>\n\n\n\n<p><code>df = pd.read_json('data-index.json', orient='index')<\/code><\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Write Files<\/strong><\/h3>\n\n\n\n<p>Series and <a href=\"https:\/\/www.h2kinfosys.com\/blog\/getting-started-with-pandas\/\">DataFrame objects<\/a> have methods that enable writing data and labels to the clipboard or files. They\u2019re named with the pattern <strong>.to_&lt;file-type&gt;()<\/strong>, where &lt;file-type&gt; is the type of the target file.<\/p>\n\n\n\n<p>Besides .to_csv() and .to_excel(), the methods that follow this pattern include:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>.to_json()<\/li><li>.to_html()<\/li><li>.to_sql()<\/li><li>.to_pickle()<\/li><\/ul>\n\n\n\n<p>There are still more file types that you can write to, so this list is not exhaustive.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Read Files<\/strong><\/h3>\n\n\n\n<p>Pandas functions for reading the contents of files are named using the pattern <strong>read_&lt;file-type&gt;()<\/strong>, where &lt;file-type&gt; indicates the type of the file to read. Besides read_csv() and read_excel(), they include:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>read_json()<\/li><li>read_html()<\/li><li>read_sql()<\/li><li>read_pickle()<\/li><\/ul>\n\n\n\n<p>Most of these functions take a first parameter that specifies the file to read. It can be any valid string that represents the path, either on a local machine or in a URL. 
Other objects are also acceptable depending on the file type.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>How to view and inspect data in a DataFrame?<\/strong><\/h2>\n\n\n\n<p>To check the contents of a pandas.DataFrame or pandas.Series with many rows and columns, the <strong>head()<\/strong> and <strong>tail()<\/strong> methods are useful.<\/p>\n\n\n\n<p>We will use the Iris data set for this tutorial. It is available on Kaggle at the link below; the example loads it through seaborn\u2019s load_dataset() function.<\/p>\n\n\n\n<p><a href=\"https:\/\/www.kaggle.com\/uciml\/iris\" rel=\"nofollow noopener\" target=\"_blank\">https:\/\/www.kaggle.com\/uciml\/iris<\/a><\/p>\n\n\n\n<p><code>import pandas as pd<\/code><\/p>\n\n\n\n<p><code>import seaborn as sns<\/code><\/p>\n\n\n\n<p><code>df = sns.load_dataset(\"iris\")<\/code><\/p>\n\n\n\n<p><strong>Get first n rows of DataFrame:<\/strong> head()<\/p>\n\n\n\n<p>The head() method returns the first n rows (5 by default).<\/p>\n\n\n\n<p><code>print(df.head(5))<\/code><\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td><strong>&nbsp;<\/strong><\/td><td><strong>sepal_length<\/strong><\/td><td><strong>sepal_width<\/strong><\/td><td><strong>petal_length<\/strong><\/td><td><strong>petal_width<\/strong><\/td><td><strong>species<\/strong><\/td><\/tr><tr><td><strong>0<\/strong><\/td><td>5.1<\/td><td>3.5<\/td><td>1.4<\/td><td>0.2<\/td><td>setosa<\/td><\/tr><tr><td><strong>1<\/strong><\/td><td>4.9<\/td><td>3.0<\/td><td>1.4<\/td><td>0.2<\/td><td>setosa<\/td><\/tr><tr><td><strong>2<\/strong><\/td><td>4.7<\/td><td>3.2<\/td><td>1.3<\/td><td>0.2<\/td><td>setosa<\/td><\/tr><tr><td><strong>3<\/strong><\/td><td>4.6<\/td><td>3.1<\/td><td>1.5<\/td><td>0.2<\/td><td>setosa<\/td><\/tr><tr><td><strong>4<\/strong><\/td><td>5.0<\/td><td>3.6<\/td><td>1.4<\/td><td>0.2<\/td><td>setosa<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p><strong>Get last n rows of DataFrame:<\/strong> tail()<\/p>\n\n\n\n<p>The 
tail() method returns the last n rows (5 by default).<\/p>\n\n\n\n<p><code>print(df.tail(5))<\/code><\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td><strong>&nbsp;<\/strong><\/td><td><strong>sepal_length<\/strong><\/td><td><strong>sepal_width<\/strong><\/td><td><strong>petal_length<\/strong><\/td><td><strong>petal_width<\/strong><\/td><td><strong>species<\/strong><\/td><\/tr><tr><td><strong>145<\/strong><\/td><td>6.7<\/td><td>3.0<\/td><td>5.2<\/td><td>2.3<\/td><td>virginica<\/td><\/tr><tr><td><strong>146<\/strong><\/td><td>6.3<\/td><td>2.5<\/td><td>5.0<\/td><td>1.9<\/td><td>virginica<\/td><\/tr><tr><td><strong>147<\/strong><\/td><td>6.5<\/td><td>3.0<\/td><td>5.2<\/td><td>2.0<\/td><td>virginica<\/td><\/tr><tr><td><strong>148<\/strong><\/td><td>6.2<\/td><td>3.4<\/td><td>5.4<\/td><td>2.3<\/td><td>virginica<\/td><\/tr><tr><td><strong>149<\/strong><\/td><td>5.9<\/td><td>3.0<\/td><td>5.1<\/td><td>1.8<\/td><td>virginica<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>The pandas attributes <strong>.shape<\/strong>, <strong>.size<\/strong> and <strong>.ndim<\/strong> return the shape, the number of elements, and the number of dimensions of DataFrames and Series.<\/p>\n\n\n\n<p>Create a DataFrame<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">import pandas as pd\n\n# Build each column as a Series, keyed by column name\nd = {'Name': pd.Series(['Tom', 'James', 'Ricky', 'Vin', 'Steve']),\n     'Age': pd.Series([25, 26, 25, 23, 30]),\n     'Rating': pd.Series([4.23, 3.24, 3.98, 2.56, 3.20])}\n\n# Create a DataFrame\ndf = pd.DataFrame(d)\nprint(df)<\/pre>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<figure 
class=\"wp-block-table\"><table><tbody><tr><td><strong>&nbsp;<\/strong><\/td><td>Name<\/td><td>Age<\/td><td>Rating<\/td><\/tr><tr><td><strong>0<\/strong><\/td><td>Tom<\/td><td>25<\/td><td>4.23<\/td><\/tr><tr><td><strong>1<\/strong><\/td><td>James<\/td><td>26<\/td><td>3.24<\/td><\/tr><tr><td><strong>2<\/strong><\/td><td>Ricky<\/td><td>25<\/td><td>3.98<\/td><\/tr><tr><td><strong>3<\/strong><\/td><td>Vin<\/td><td>23<\/td><td>2.56<\/td><\/tr><tr><td><strong>4<\/strong><\/td><td>Steve<\/td><td>30<\/td><td>3.20<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p><strong>.shape <\/strong>returns a tuple representing the dimensionality of the DataFrame: (a, b), where <strong>a<\/strong> is the number of rows and <strong>b<\/strong> is the number of columns.<\/p>\n\n\n\n<p><code>df.shape<\/code><\/p>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<p>(5, 3) # 5 rows and 3 columns<\/p>\n\n\n\n<p><strong>.size <\/strong>returns the number of elements in the DataFrame.<\/p>\n\n\n\n<p><code>df.size<\/code><\/p>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<p>15 # total number of elements (5 rows x 3 columns)<\/p>\n\n\n\n<p><strong>.ndim <\/strong>returns the number of dimensions of the object. By definition, a DataFrame is a 2D object.<\/p>\n\n\n\n<p><code>df.ndim<\/code><\/p>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<p>2 # number of dimensions<\/p>\n\n\n\n<p>The pandas <strong>.info()<\/strong> method prints a concise summary of a DataFrame. 
It reports the index dtype, the column dtypes, the number of non-null values in each column, and memory usage.<\/p>\n\n\n\n<p>Consider the following DataFrame df:<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td><strong>&nbsp;<\/strong><\/td><td><strong>int_col<\/strong><\/td><td><strong>text_col<\/strong><\/td><td><strong>float_col<\/strong><\/td><\/tr><tr><td><strong>0<\/strong><\/td><td>1<\/td><td>alpha<\/td><td>0.00<\/td><\/tr><tr><td><strong>1<\/strong><\/td><td>2<\/td><td>beta<\/td><td>0.25<\/td><\/tr><tr><td><strong>2<\/strong><\/td><td>3<\/td><td>gamma<\/td><td>0.50<\/td><\/tr><tr><td><strong>3<\/strong><\/td><td>4<\/td><td>delta<\/td><td>0.75<\/td><\/tr><tr><td><strong>4<\/strong><\/td><td>5<\/td><td>Epsilon<\/td><td>1.00<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p><code>df.info()<\/code><\/p>\n\n\n\n<p><strong>Output<\/strong> (the exact layout varies between pandas versions):<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">&lt;class 'pandas.core.frame.DataFrame'&gt;\nRangeIndex: 5 entries, 0 to 4\nData columns (total 3 columns):\nfloat_col    5 non-null float64\nint_col      5 non-null int64\ntext_col     5 non-null object\ndtypes: float64(1), int64(1), object(1)\nmemory usage: 192.0+ bytes<\/pre>\n\n\n\n<p>The pandas <strong>.describe()<\/strong> method computes summary statistics for the DataFrame columns. It reports the count, mean, standard deviation, minimum, quartiles (25%, 50%, 75%) and maximum. 
By default it excludes non-numeric (text) columns and summarizes only the numeric ones.<\/p>\n\n\n\n<p><code>df.describe()<\/code><\/p>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td><strong>&nbsp;<\/strong><\/td><td><strong>float_col<\/strong><\/td><td><strong>int_col<\/strong><\/td><\/tr><tr><td><strong>count<\/strong><\/td><td>5.000000<\/td><td>5.000000<\/td><\/tr><tr><td><strong>mean<\/strong><\/td><td>0.500000<\/td><td>3.000000<\/td><\/tr><tr><td><strong>std<\/strong><\/td><td>0.395285<\/td><td>1.581139<\/td><\/tr><tr><td><strong>min<\/strong><\/td><td>0.000000<\/td><td>1.000000<\/td><\/tr><tr><td><strong>25%<\/strong><\/td><td>0.250000<\/td><td>2.000000<\/td><\/tr><tr><td><strong>50%<\/strong><\/td><td>0.500000<\/td><td>3.000000<\/td><\/tr><tr><td><strong>75%<\/strong><\/td><td>0.750000<\/td><td>4.000000<\/td><\/tr><tr><td><strong>max<\/strong><\/td><td>1.000000<\/td><td>5.000000<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>The pandas .value_counts() method returns a Series containing counts of unique values. The result is sorted in descending order so that the first element is the most frequently occurring value. 
NA values are excluded by default.<\/p>\n\n\n\n<p>Consider a DataFrame <code>df<\/code> with a single <code>Student<\/code> column:<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td><strong>&nbsp;<\/strong><\/td><td>Student<\/td><\/tr><tr><td><strong>0<\/strong><\/td><td>Harry<\/td><\/tr><tr><td><strong>1<\/strong><\/td><td>Mike<\/td><\/tr><tr><td><strong>2<\/strong><\/td><td>Arther<\/td><\/tr><tr><td><strong>3<\/strong><\/td><td>Harry<\/td><\/tr><tr><td><strong>4<\/strong><\/td><td>Arther<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p><strong>Output<\/strong> of <code>df['Student'].value_counts()<\/code>:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">Harry     2\nArther    2\nMike      1\nName: Student, dtype: int64<\/pre>\n\n\n\n<p>Pandas is a large library, and we will cover its components step by step. In the next article we will discuss data selection, data cleaning, filtering, sorting, group-by, and joining and combining datasets.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In the previous article, we discussed what pandas is, its importance in data science, how to install it, and how to perform basic operations such as adding and deleting indexes, rows, and columns in a DataFrame. 
Now we will dive deeper into the applications of pandas in real-time situations like Data Reading and Data Inspection Using Pandas [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":4896,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[500],"tags":[1371,1370,1372],"class_list":["post-4868","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-science-using-python-tutorials","tag-data-inspection-using-pandas","tag-data-reading","tag-read-and-write-tabular-data"],"_links":{"self":[{"href":"https:\/\/www.h2kinfosys.com\/blog\/wp-json\/wp\/v2\/posts\/4868","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.h2kinfosys.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.h2kinfosys.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.h2kinfosys.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.h2kinfosys.com\/blog\/wp-json\/wp\/v2\/comments?post=4868"}],"version-history":[{"count":0,"href":"https:\/\/www.h2kinfosys.com\/blog\/wp-json\/wp\/v2\/posts\/4868\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.h2kinfosys.com\/blog\/wp-json\/wp\/v2\/media\/4896"}],"wp:attachment":[{"href":"https:\/\/www.h2kinfosys.com\/blog\/wp-json\/wp\/v2\/media?parent=4868"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.h2kinfosys.com\/blog\/wp-json\/wp\/v2\/categories?post=4868"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.h2kinfosys.com\/blog\/wp-json\/wp\/v2\/tags?post=4868"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}