{"id":6737,"date":"2020-11-11T16:05:30","date_gmt":"2020-11-11T10:35:30","guid":{"rendered":"https:\/\/www.h2kinfosys.com\/blog\/?p=6737"},"modified":"2020-11-11T16:05:32","modified_gmt":"2020-11-11T10:35:32","slug":"introduction-to-structured-multi-plot-grids","status":"publish","type":"post","link":"https:\/\/www.h2kinfosys.com\/blog\/introduction-to-structured-multi-plot-grids\/","title":{"rendered":"Introduction to structured multi-plot grids"},"content":{"rendered":"\n<p>The FacetGrid class is useful when you want to visualize the distribution of a variable or the relationship between multiple variables separately within subsets of your dataset. A FacetGrid can be drawn with up to three dimensions: row, col, and hue. The first two have an obvious correspondence with the resulting array of axes.<\/p>\n\n\n\n<p>The hue parameter represents levels of a third variable by plotting different subsets of data in different colors. This uses color to resolve elements on a third dimension, but it only draws the subsets on top of each other and will not tailor the hue parameter to the specific visualization the way that <a href=\"https:\/\/community.insaid.co\/hc\/en-us\/community\/posts\/360042368114-Axes-level-function\" rel=\"nofollow noopener\" target=\"_blank\">axes-level functions<\/a> that accept hue will.<\/p>\n\n\n\n<p>This class maps a dataset onto multiple axes arrayed in a grid of rows and columns that correspond to levels of variables in the dataset. The plots it produces are often called \u201clattice\u201d, \u201ctrellis\u201d, or \u201csmall-multiple\u201d graphics.<\/p>\n\n\n\n<p>The basic workflow is to initialize the FacetGrid object with the dataset and the variables that are used to structure the grid. 
Then one or more plotting functions can be applied to each subset by calling FacetGrid.map() or FacetGrid.map_dataframe().<\/p>\n\n\n\n<p>Finally, the plot can be tweaked with other methods to do things like change the axis labels, use different ticks, or add a legend. See the code examples below for details. We will use the tips dataset for these examples.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">import seaborn as sns<br>tips = sns.load_dataset(\"tips\")<br>tips.head()<\/pre>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td><\/td><td><strong>total_bill<\/strong><\/td><td><strong>tip<\/strong><\/td><td><strong>sex<\/strong><\/td><td><strong>smoker<\/strong><\/td><td><strong>day<\/strong><\/td><td><strong>time<\/strong><\/td><td><strong>size<\/strong><\/td><\/tr><tr><td>0<\/td><td>16.99<\/td><td>1.01<\/td><td>Female<\/td><td>No<\/td><td>Sun<\/td><td>Dinner<\/td><td>2<\/td><\/tr><tr><td>1<\/td><td>10.34<\/td><td>1.66<\/td><td>Male<\/td><td>No<\/td><td>Sun<\/td><td>Dinner<\/td><td>3<\/td><\/tr><tr><td>2<\/td><td>21.01<\/td><td>3.50<\/td><td>Male<\/td><td>No<\/td><td>Sun<\/td><td>Dinner<\/td><td>3<\/td><\/tr><tr><td>3<\/td><td>23.68<\/td><td>3.31<\/td><td>Male<\/td><td>No<\/td><td>Sun<\/td><td>Dinner<\/td><td>2<\/td><\/tr><tr><td>4<\/td><td>24.59<\/td><td>3.61<\/td><td>Female<\/td><td>No<\/td><td>Sun<\/td><td>Dinner<\/td><td>4<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>Initializing the grid with only the dataset sets up the matplotlib figure and axes but does not draw anything on them:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">sns.FacetGrid(tips)<\/pre>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh6.googleusercontent.com\/ZA5-Ft3v0b1aF0i_mERFyim_I_r3Uo5fymV8P0IfU3tCxBcgXumNOkpm7_gNly3lK2cntOw9B3K2HDn1PmSqHNT--Ip8XWla5ailHgA2c8h2HZXUG1ToxCyqfiiniqw37ztqcRoh\" alt=\"\" title=\"\"><\/figure>\n\n\n\n<p>To draw a plot on every facet, pass a function and the name of one or more columns in the dataframe to 
<strong>FacetGrid.map()<\/strong>.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">g = sns.FacetGrid(tips, col=\"time\", row=\"sex\")<br>g.map(sns.scatterplot, \"total_bill\", \"tip\")<\/pre>\n\n\n\n<p>The variable specification in <strong>FacetGrid.map()<\/strong> requires positional arguments, but if the function has a data parameter and accepts named variable assignments, you can also use <strong>FacetGrid.map_dataframe()<\/strong>.<\/p>\n\n\n\n<p>One difference between the two methods is that <strong>FacetGrid.map_dataframe()<\/strong> does not add axis labels.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">g = sns.FacetGrid(tips, col=\"time\", row=\"sex\")<br>g.map_dataframe(sns.histplot, x=\"total_bill\")<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img fetchpriority=\"high\" decoding=\"async\" src=\"https:\/\/lh6.googleusercontent.com\/IgA-7NUNaUfmzlF9IZnocIb4RnmZzDf9eAxz_8oKHcX40CIf3JprCAIug0fYEFvHWhqZ4YEOF14Ul2aLOXnBYYIZQLa5DhVum3tXSOC9CaZ44Uj6gaH7cWJG9s74GxpNX1jbWswX\" width=\"454\" height=\"454\" alt=\"\" title=\"\"><\/figure>\n\n\n\n<p>The <strong>FacetGrid<\/strong> constructor accepts a hue parameter. Setting this will condition the data on another variable and make multiple plots in different colors. 
Where possible, label information is tracked so that a single legend can be drawn with <strong>FacetGrid.add_legend()<\/strong>.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">g = sns.FacetGrid(tips, col=\"time\", hue=\"sex\")<br>g.map_dataframe(sns.scatterplot, x=\"total_bill\", y=\"tip\")<br>g.add_legend()<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh5.googleusercontent.com\/ze5mQ1ldcWvgCypX0U7I6wNm0Uf3bTb0KUhHYUMYqFd-i8MJwSgCwUMJnvByEBVUanXngc858LR7oagtz8Qi0HSS6cRzHjfznA4rprRgQSdl2pec6K9ajhjhJFuYuiCBhrT1Cjbd\" alt=\"structured multi-plot grids\" title=\"\"><\/figure>\n\n\n\n<p>The size and shape of the plot are specified at the level of each subplot using the height and aspect parameters. The example below changes the height and aspect ratio of each facet.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">g = sns.FacetGrid(tips, col=\"day\", height=3.5, aspect=.65)<br>g.map(sns.histplot, \"total_bill\")<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh4.googleusercontent.com\/85M22WYonU-fx8QGOoeEQLWA3_rRbyPLgtvG26ARAPXJAvOVlpCrHJ5ao6R6xl6U-y3pd3hc9EnI08QHCfYIOHiQg3OSNj0XSvxE-5QAh3Hua_x6XHoNiV7C9yRdHYNpqOeB7QAU\" alt=\"structured multi-plot grids\" title=\"\"><\/figure>\n\n\n\n<p>Note that the margin_titles option of the FacetGrid constructor isn\u2019t formally supported by the <a href=\"https:\/\/www.h2kinfosys.com\/blog\/introduction-to-data-visualization-using-matplot\/\">matplotlib<\/a> API, and may not work well in all cases. 
In particular, it currently can\u2019t be used with a legend that lies outside of the plot.<\/p>\n\n\n\n<p>The size of the figure is set by providing the height of each facet, along with the aspect ratio:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">g = sns.FacetGrid(tips, col=\"day\", height=4, aspect=.5)<br>g.map(sns.barplot, \"sex\", \"total_bill\", order=[\"Male\", \"Female\"])<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh3.googleusercontent.com\/rrJgBpsm94upK5uzNe1xkpGHqu1C75vxtm-SB_ThuR62aVxq6U3DOtafNuPa0MdD3jnhwEHs73q8T0cd19-htg2CDR_P6i42KArruTz0qG72xxfvKlWJku_If8YmYkzkpXbAbWrZ\" alt=\"multi-plot grids\" title=\"\"><\/figure>\n\n\n\n<p>The default ordering of the facets is derived from the information in the DataFrame. If the variable used to define facets has a categorical type, then the order of the categories is used.<\/p>\n\n\n\n<p>Otherwise, the facets will be in the order of appearance of the category levels. It is possible, however, to specify an ordering of any facet dimension with the appropriate *_order parameter:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"># Order the rows by how often each level of sex occurs<br>sex_order = tips.sex.value_counts().index<br>g = sns.FacetGrid(tips, row=\"sex\", row_order=sex_order, height=1.7, aspect=4)<br>g.map(sns.kdeplot, \"total_bill\")<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh6.googleusercontent.com\/pp30OKi_5JYnIcRx8oK8gN7wvWDNJr7DPShH4yN-xw6fRIIuRX5JwKVSe3DNrUTbkSGm8YDcAR1Zt-K6I9Hqp7X4xkyJs_AjAcnZhcwT83nVqIY-tCTg7dqCM1-CGKT0yDGi2Ly1\" alt=\"multi-plot grids\" title=\"\"><\/figure>\n\n\n\n<p>If you have many levels of one variable, you can plot it along the columns but \u201cwrap\u201d them so that they span multiple rows. 
When doing this, you cannot use a row variable.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">attend = sns.load_dataset(\"attention\").query(\"subject &lt;= 12\")<br>attend.head()<\/pre>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td><\/td><td><strong>Unnamed: 0<\/strong><\/td><td><strong>subject<\/strong><\/td><td><strong>attention<\/strong><\/td><td><strong>solutions<\/strong><\/td><td><strong>score<\/strong><\/td><\/tr><tr><td>0<\/td><td>0<\/td><td>1<\/td><td>divided<\/td><td>1<\/td><td>2.0<\/td><\/tr><tr><td>1<\/td><td>1<\/td><td>2<\/td><td>divided<\/td><td>1<\/td><td>3.0<\/td><\/tr><tr><td>2<\/td><td>2<\/td><td>3<\/td><td>divided<\/td><td>1<\/td><td>3.0<\/td><\/tr><tr><td>3<\/td><td>3<\/td><td>4<\/td><td>divided<\/td><td>1<\/td><td>5.0<\/td><\/tr><tr><td>4<\/td><td>4<\/td><td>5<\/td><td>divided<\/td><td>1<\/td><td>4.0<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<pre class=\"wp-block-preformatted\">g = sns.FacetGrid(attend, col=\"subject\", col_wrap=4, height=2, ylim=(0, 10))<br>g.map(sns.pointplot, \"solutions\", \"score\", order=[1, 2, 3], color=\".3\", ci=None)<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh4.googleusercontent.com\/j3pEjar8lbArbPvQQzMESV-OqHDD80gyL75rjU62zjqZtDev1nQMnAPNe26pkwkqvSChv71fkEI1I3FgEtukueJHKS_2y4GP6P6Sukz78ClRDwUW6lyep2FZn5wkhFyyUPFzT_o3\" alt=\"multi-plot grids\" title=\"\"><\/figure>\n\n\n\n<p><strong>Using custom functions<\/strong><\/p>\n\n\n\n<p>You\u2019re not limited to existing matplotlib and seaborn functions when using <strong>FacetGrid<\/strong>. However, to work properly, any function you use must follow a few rules:<\/p>\n\n\n\n<p>1. It must plot onto the \u201ccurrently active\u201d matplotlib Axes. 
This will be true of functions in the matplotlib.pyplot namespace, and you can call <strong>matplotlib.pyplot.gca()<\/strong> to get a reference to the current Axes if you want to work directly with its methods.<\/p>\n\n\n\n<p>2. It must accept the data that it plots in positional arguments. Internally, <strong>FacetGrid<\/strong> will pass a Series of data for each of the named positional arguments passed to <strong>FacetGrid.map()<\/strong>.<\/p>\n\n\n\n<p>3. It must be able to accept color and label keyword arguments, and, ideally, it will do something useful with them. In most cases, it\u2019s easiest to catch a generic dictionary of **kwargs and pass it along to the underlying plotting function.<\/p>\n\n\n\n<p>Let\u2019s look at a minimal example of a function you can plot with. This function will just take a single vector of data for each facet:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">import matplotlib.pyplot as plt<br>from scipy import stats<br><br>def quantile_plot(x, **kwargs):<br>    # With fit=False, probplot returns the theoretical quantiles and the ordered data<br>    quantiles, xr = stats.probplot(x, fit=False)<br>    plt.scatter(xr, quantiles, **kwargs)<br><br>g = sns.FacetGrid(tips, col=\"sex\", height=4)<br>g.map(quantile_plot, \"total_bill\")<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh6.googleusercontent.com\/gZ-AQ8dkGDL_Cl0PUCI1AZdkq3fDHPhhmcgTD7aozZdkRmEatKp-KPmVmAqoRKltvXmNjhEyCJUizEmPl_54Q6Gtsqv4NoHP0R9Vd-V57X3XjkzwiCgOGF3ZFFKoJ5OFgmL8VKbr\" alt=\"\" title=\"\"><\/figure>\n\n\n\n<p><strong>Plotting pairwise data relationships<\/strong><\/p>\n\n\n\n<p>PairGrid also allows you to quickly draw a grid of small subplots using the same plot type to visualize data in each. In a PairGrid, each row and column is assigned to a different variable, so the resulting plot shows each pairwise relationship in the dataset. 
This style of plot is sometimes called a \u201cscatterplot matrix\u201d, as this is the most common way to show each relationship, but PairGrid is not limited to scatterplots.<\/p>\n\n\n\n<p>It\u2019s important to understand the differences between a FacetGrid and a PairGrid. In the former, each facet shows the same relationship conditioned on different levels of other variables. In the latter, each plot shows a different relationship (although the upper and lower triangles will have mirrored plots). Using PairGrid can give you a very quick, very high-level summary of interesting relationships in your dataset.<\/p>\n\n\n\n<p>The basic usage of the class is very similar to FacetGrid. First, you initialize the grid, then you pass the plotting function to a map method and it will be called on each subplot. There is also a companion function, pairplot(), that trades off some flexibility for faster plotting.<\/p>\n\n\n\n<p>We will use the iris dataset for this example.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">iris = sns.load_dataset(\"iris\")<br>iris.head()<\/pre>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td><\/td><td><strong>sepal_length<\/strong><\/td><td><strong>sepal_width<\/strong><\/td><td><strong>petal_length<\/strong><\/td><td><strong>petal_width<\/strong><\/td><td><strong>species<\/strong><\/td><\/tr><tr><td>0<\/td><td>5.1<\/td><td>3.5<\/td><td>1.4<\/td><td>0.2<\/td><td>setosa<\/td><\/tr><tr><td>1<\/td><td>4.9<\/td><td>3.0<\/td><td>1.4<\/td><td>0.2<\/td><td>setosa<\/td><\/tr><tr><td>2<\/td><td>4.7<\/td><td>3.2<\/td><td>1.3<\/td><td>0.2<\/td><td>setosa<\/td><\/tr><tr><td>3<\/td><td>4.6<\/td><td>3.1<\/td><td>1.5<\/td><td>0.2<\/td><td>setosa<\/td><\/tr><tr><td>4<\/td><td>5.0<\/td><td>3.6<\/td><td>1.4<\/td><td>0.2<\/td><td>setosa<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>Now we draw a PairGrid for the iris dataset and map a scatterplot onto every subplot:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">g = sns.PairGrid(iris)<br>g.map(sns.scatterplot)<\/pre>\n\n\n\n<figure 
class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh3.googleusercontent.com\/AN2zlfToAgLd0JUg31xo02GAGdQ7eNdS0uew9WM_ywJ9tH4boO1pYIie-dITk5Heq2U-5cRG_uQJS8uRgqxKFg1_pH3-d8Iwm-xA0PHM61-lUkShV40y8IGTV74OxOgcSDF_mmal\" alt=\"\" title=\"\"><\/figure>\n\n\n\n<p>By default every numeric column in the dataset is used, but you can focus on particular relationships if you want.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">g = sns.PairGrid(iris, vars=[\"sepal_length\", \"sepal_width\"], hue=\"species\")<br>g.map(sns.scatterplot)<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh6.googleusercontent.com\/PYZ4N1cAE01Xgu-7hLifw2BEB7ace3uKPsW5LXtliMLPk065i5lfERA6S4xt0mo1FTkIWb8WQL6PWdDt5gZcTNArMvXxb0YXaC6VpyuBsYifWOjzlrdqG_F4tYhR4X9xBAdXtnJ_\" alt=\"\" title=\"\"><\/figure>\n\n\n\n<p>The square grid with identity relationships on the diagonal is actually just a special case, and you can plot with different variables in the rows and columns.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">g = sns.PairGrid(tips, y_vars=[\"tip\"], x_vars=[\"total_bill\", \"size\"], height=4)<br>g.map(sns.regplot, color=\".3\")<br>g.set(ylim=(-1, 11), yticks=[0, 5, 10])<\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh3.googleusercontent.com\/5wJHfMKth8skp7F3eC9n-XtCVcnM3Dl1h7TbwNvww9N0g-g362Ys7osgIQsmNfQ8IonfiAEbrTIvhyLxJJHgrLnOjkwKNc7FF1tUB8OnL_dPKu6UxybsLjJ0B40luB63W3vpTv_l\" alt=\"\" title=\"\"><\/figure>\n","protected":false},"excerpt":{"rendered":"<p>The FacetGrid class is useful when you want to visualize the distribution of a variable or the relationship between multiple variables separately within subsets of your dataset. A FacetGrid can be drawn with up to three dimensions: row, col, and hue. 
The first two have obvious correspondence with the resulting array of axes.&nbsp; It can [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":6788,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[500],"tags":[],"class_list":["post-6737","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-science-using-python-tutorials"],"_links":{"self":[{"href":"https:\/\/www.h2kinfosys.com\/blog\/wp-json\/wp\/v2\/posts\/6737","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.h2kinfosys.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.h2kinfosys.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.h2kinfosys.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.h2kinfosys.com\/blog\/wp-json\/wp\/v2\/comments?post=6737"}],"version-history":[{"count":0,"href":"https:\/\/www.h2kinfosys.com\/blog\/wp-json\/wp\/v2\/posts\/6737\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.h2kinfosys.com\/blog\/wp-json\/wp\/v2\/media\/6788"}],"wp:attachment":[{"href":"https:\/\/www.h2kinfosys.com\/blog\/wp-json\/wp\/v2\/media?parent=6737"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.h2kinfosys.com\/blog\/wp-json\/wp\/v2\/categories?post=6737"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.h2kinfosys.com\/blog\/wp-json\/wp\/v2\/tags?post=6737"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}