<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Data Sources Archives - relataly.com</title>
	<atom:link href="https://www.relataly.com/category/data-sources/feed/" rel="self" type="application/rss+xml" />
	<link>https://www.relataly.com/category/data-sources/</link>
	<description>The Business AI Blog</description>
	<lastBuildDate>Sat, 27 May 2023 10:38:22 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=7.0</generator>

<image>
	<url>https://www.relataly.com/wp-content/uploads/2023/04/cropped-AI-cat-Icon-White.png</url>
	<title>Data Sources Archives - relataly.com</title>
	<link>https://www.relataly.com/category/data-sources/</link>
	<width>32</width>
	<height>32</height>
</image> 
<site xmlns="com-wordpress:feed-additions:1">175977316</site>	<item>
		<title>Using Pandas DataReader to Access Online Data Sources in Python</title>
		<link>https://www.relataly.com/using-pandas-datareader-in-python/10934/</link>
					<comments>https://www.relataly.com/using-pandas-datareader-in-python/10934/#respond</comments>
		
		<dc:creator><![CDATA[Florian Follonier]]></dc:creator>
		<pubDate>Sat, 15 Oct 2022 20:14:00 +0000</pubDate>
				<category><![CDATA[Data Science]]></category>
		<category><![CDATA[Data Sources]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Tech]]></category>
		<category><![CDATA[Yahoo Finance API]]></category>
		<category><![CDATA[API Tutorials]]></category>
		<category><![CDATA[Beginner Tutorials]]></category>
		<category><![CDATA[Requesting Data via REST APIs]]></category>
		<guid isPermaLink="false">https://www.relataly.com/?p=10934</guid>

					<description><![CDATA[<p>Pandas DataReader is a library that allows data scientists to easily read data from a variety of sources into a Pandas DataFrame. This is especially useful for accessing data that resides outside of their local development environment and needs to be accessed via APIs. The Pandas DataReader provides functions for loading data from various online ... <a title="Using Pandas DataReader to Access Online Data Sources in Python" class="read-more" href="https://www.relataly.com/using-pandas-datareader-in-python/10934/" aria-label="Read more about Using Pandas DataReader to Access Online Data Sources in Python">Read more</a></p>
<p>The post <a href="https://www.relataly.com/using-pandas-datareader-in-python/10934/">Using Pandas DataReader to Access Online Data Sources in Python</a> appeared first on <a href="https://www.relataly.com">relataly.com</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">Pandas DataReader is a library that allows data scientists to easily read data from a variety of sources into a Pandas DataFrame. This is especially useful for accessing data that resides outside of their local development environment and needs to be accessed via APIs. The Pandas DataReader provides functions for loading data from various online sources, including Yahoo Finance and the NASDAQ. This can be incredibly helpful for tasks such as financial analysis, data visualization, and machine learning. In this tutorial, we will give a brief overview of the library and show how to use it in Python to access financial data from the Yahoo Finance API.</p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%"></div>
</div>



<h2 class="wp-block-heading">What is Pandas DataReader?</h2>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">The Pandas DataReader library provides functions that extract data from various Internet sources into a pandas DataFrame. It supports several remote data providers, including <a href="https://pandas-datareader.readthedocs.io/en/latest/remote_data.html#remote-data-alphavantage" target="_blank" rel="noreferrer noopener">Alpha Vantage</a>, the <a href="https://pandas-datareader.readthedocs.io/en/latest/remote_data.html#remote-data-wb" target="_blank" rel="noreferrer noopener">World Bank</a>, <a href="https://pandas-datareader.readthedocs.io/en/latest/remote_data.html#eurostat" target="_blank" rel="noreferrer noopener">Eurostat</a>, the <a href="https://pandas-datareader.readthedocs.io/en/latest/remote_data.html#oecd" target="_blank" rel="noreferrer noopener">OECD</a>, and several financial market data sources such as <a href="https://pandas-datareader.readthedocs.io/en/latest/remote_data.html#nasdaq-trader-symbol-definitions" target="_blank" rel="noreferrer noopener">NASDAQ</a>, <a href="https://pandas-datareader.readthedocs.io/en/latest/remote_data.html#remote-data-yahoo" target="_blank" rel="noreferrer noopener">Yahoo Finance</a>, and <a href="https://pandas-datareader.readthedocs.io/en/latest/remote_data.html#remote-data-naver" target="_blank" rel="noreferrer noopener">Naver Finance</a>. A complete list of supported sources can be found in the pandas DataReader <a href="https://pandas-datareader.readthedocs.io/en/latest/remote_data.html" target="_blank" rel="noreferrer noopener">API documentation</a>.</p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%">
<figure class="wp-block-image size-large"><img fetchpriority="high" decoding="async" width="1024" height="252" data-attachment-id="11792" data-permalink="https://www.relataly.com/using-pandas-datareader-in-python/10934/image-10/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2023/01/image.png" data-orig-size="1246,307" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2023/01/image.png" src="https://www.relataly.com/wp-content/uploads/2023/01/image-1024x252.png" alt="Pandas datareader is a useful Python library for accessing remote data via an API" class="wp-image-11792" srcset="https://www.relataly.com/wp-content/uploads/2023/01/image.png 1024w, https://www.relataly.com/wp-content/uploads/2023/01/image.png 300w, https://www.relataly.com/wp-content/uploads/2023/01/image.png 768w, https://www.relataly.com/wp-content/uploads/2023/01/image.png 1246w" sizes="(max-width: 1024px) 100vw, 1024px" /><figcaption class="wp-element-caption">Pandas DataReader is a useful Python library for accessing remote data via an API</figcaption></figure>
</div>
</div>



<p class="wp-block-paragraph"></p>



<div style="height:24px" aria-hidden="true" class="wp-block-spacer"></div>



<h2 class="wp-block-heading" id="h-access-financial-data-using-pandas-datareader-and-the-yahoo-finance-rest-api-in-python">Access Financial Data using Pandas DataReader and the Yahoo Finance REST API in Python</h2>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">In this tutorial, we will learn how to retrieve data for the German stock market index DAX from the Yahoo Finance API. Specifically, we will use the pandas_datareader package, which provides a convenient interface for accessing various online data sources. We will carry out the following steps:</p>



<ol class="wp-block-list">
<li>Install the pandas_datareader package.</li>



<li>Import the necessary libraries in our Python script.</li>



<li>Use the DataReader function to request data for the DAX index from the Yahoo Finance API, specifying the start and end dates of the period to retrieve. The returned data is stored in a pandas DataFrame.</li>



<li>Visualize the data using the plot() function from the matplotlib library.</li>
</ol>



<p class="wp-block-paragraph">The code is available in the relataly GitHub repository.</p>



<div class="wp-block-kadence-advancedbtn kb-buttons-wrap kb-btns_368b60-d7"><a class="kb-button kt-button button kb-btn_1809f5-70 kt-btn-size-standard kt-btn-width-type-full kb-btn-global-inherit kt-btn-has-text-true kt-btn-has-svg-true wp-block-button__link wp-block-kadence-singlebtn" href="https://github.com/flo7up/relataly-public-python-API-tutorials/blob/main/101%20Pulling%20COVID-19%20Data%20via%20the%20Statworx%20API%20to%20a%20DataFrame.ipynb" target="_blank" rel="noreferrer noopener"><span class="kb-svg-icon-wrap kb-svg-icon-fe_eye kt-btn-icon-side-left"><svg viewBox="0 0 24 24"  fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"  aria-hidden="true"><path d="M1 12s4-8 11-8 11 8 11 8-4 8-11 8-11-8-11-8z"/><circle cx="12" cy="12" r="3"/></svg></span><span class="kt-btn-inner-text">View on GitHub </span></a>

<a class="kb-button kt-button button kb-btn_ad4d40-fa kt-btn-size-standard kt-btn-width-type-full kb-btn-global-inherit kt-btn-has-text-true kt-btn-has-svg-true wp-block-button__link wp-block-kadence-singlebtn" href="https://github.com/flo7up/relataly-public-python-API-tutorials" target="_blank" rel="noreferrer noopener"><span class="kb-svg-icon-wrap kb-svg-icon-fa_github kt-btn-icon-side-left"><svg viewBox="0 0 496 512"  fill="currentColor" xmlns="http://www.w3.org/2000/svg"  aria-hidden="true"><path d="M165.9 397.4c0 2-2.3 3.6-5.2 3.6-3.3.3-5.6-1.3-5.6-3.6 0-2 2.3-3.6 5.2-3.6 3-.3 5.6 1.3 5.6 3.6zm-31.1-4.5c-.7 2 1.3 4.3 4.3 4.9 2.6 1 5.6 0 6.2-2s-1.3-4.3-4.3-5.2c-2.6-.7-5.5.3-6.2 2.3zm44.2-1.7c-2.9.7-4.9 2.6-4.6 4.9.3 2 2.9 3.3 5.9 2.6 2.9-.7 4.9-2.6 4.6-4.6-.3-1.9-3-3.2-5.9-2.9zM244.8 8C106.1 8 0 113.3 0 252c0 110.9 69.8 205.8 169.5 239.2 12.8 2.3 17.3-5.6 17.3-12.1 0-6.2-.3-40.4-.3-61.4 0 0-70 15-84.7-29.8 0 0-11.4-29.1-27.8-36.6 0 0-22.9-15.7 1.6-15.4 0 0 24.9 2 38.6 25.8 21.9 38.6 58.6 27.5 72.9 20.9 2.3-16 8.8-27.1 16-33.7-55.9-6.2-112.3-14.3-112.3-110.5 0-27.5 7.6-41.3 23.6-58.9-2.6-6.5-11.1-33.3 2.6-67.9 20.9-6.5 69 27 69 27 20-5.6 41.5-8.5 62.8-8.5s42.8 2.9 62.8 8.5c0 0 48.1-33.6 69-27 13.7 34.7 5.2 61.4 2.6 67.9 16 17.7 25.8 31.5 25.8 58.9 0 96.5-58.9 104.2-114.8 110.5 9.2 7.9 17 22.9 17 46.4 0 33.7-.3 75.4-.3 83.6 0 6.5 4.6 14.4 17.3 12.1C428.2 457.8 496 362.9 496 252 496 113.3 383.5 8 244.8 8zM97.2 352.9c-1.3 1-1 3.3.7 5.2 1.6 1.6 3.9 2.3 5.2 1 1.3-1 1-3.3-.7-5.2-1.6-1.6-3.9-2.3-5.2-1zm-10.8-8.1c-.7 1.3.3 2.9 2.3 3.9 1.6 1 3.6.7 4.3-.7.7-1.3-.3-2.9-2.3-3.9-2-.6-3.6-.3-4.3.7zm32.4 35.6c-1.6 1.3-1 4.3 1.3 6.2 2.3 2.3 5.2 2.6 6.5 1 1.3-1.3.7-4.3-1.3-6.2-2.2-2.3-5.2-2.6-6.5-1zm-11.4-14.7c-1.6 1-1.6 3.6 0 5.9 1.6 2.3 4.3 3.3 5.6 2.3 1.6-1.3 1.6-3.9 0-6.2-1.4-2.3-4-3.3-5.6-2z"/></svg></span><span class="kt-btn-inner-text">Relataly GitHub Repo </span></a></div>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%">
<figure class="wp-block-image size-large is-resized"><img decoding="async" data-attachment-id="11808" data-permalink="https://www.relataly.com/using-pandas-datareader-in-python/10934/image-2-4/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2023/01/image-2.png" data-orig-size="1442,1216" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-2" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2023/01/image-2.png" src="https://www.relataly.com/wp-content/uploads/2023/01/image-2-1024x864.png" alt="Pandas DataReader provides access to a wide range of public data sources. " class="wp-image-11808" width="382" height="323" srcset="https://www.relataly.com/wp-content/uploads/2023/01/image-2.png 1024w, https://www.relataly.com/wp-content/uploads/2023/01/image-2.png 300w, https://www.relataly.com/wp-content/uploads/2023/01/image-2.png 768w, https://www.relataly.com/wp-content/uploads/2023/01/image-2.png 1442w" sizes="(max-width: 382px) 100vw, 382px" /><figcaption class="wp-element-caption">Pandas DataReader provides access to a wide range of public data sources.</figcaption></figure>
</div>
</div>



<div style="height:29px" aria-hidden="true" class="wp-block-spacer"></div>



<h3 class="wp-block-heading" id="h-prerequisites">Prerequisites</h3>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">Before starting the coding part, ensure you have set up your <a href="https://www.python.org/downloads/" target="_blank" rel="noreferrer noopener">Python 3</a> environment and required libraries. If you don&#8217;t have an environment, consider following&nbsp;the steps in <a href="https://www.relataly.com/anaconda-python-environment-machine-learning/1663/" target="_blank" rel="noreferrer noopener">this tutorial</a>&nbsp;to set up the&nbsp;<a href="https://www.anaconda.com/products/individual" target="_blank" rel="noreferrer noopener">Anaconda environment</a>.</p>



<p class="wp-block-paragraph">Also, make sure you install all required packages. In this tutorial, we will be working with the following standard packages:&nbsp;</p>



<ul class="wp-block-list">
<li><em><a href="https://pandas.pydata.org/" target="_blank" rel="noreferrer noopener">pandas</a></em></li>



<li><em><a href="https://numpy.org/" target="_blank" rel="noreferrer noopener">NumPy</a></em></li>



<li><a href="https://docs.python.org/3/library/math.html" target="_blank" rel="noreferrer noopener">math</a></li>



<li><em><a href="https://matplotlib.org/" target="_blank" rel="noreferrer noopener">matplotlib</a></em></li>
</ul>



<p class="wp-block-paragraph">You can install packages using console commands:</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:false,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;null&quot;,&quot;mime&quot;:&quot;text/plain&quot;,&quot;theme&quot;:&quot;3024-day&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Plain Text&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;text&quot;}">pip install &lt;package name&gt;
conda install &lt;package name&gt; (if you are using the Anaconda package manager)</pre></div>



<p class="wp-block-paragraph">In addition, we will be using the pandas DataReader library, which you can install with the following command:</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:false,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;null&quot;,&quot;mime&quot;:&quot;text/plain&quot;,&quot;theme&quot;:&quot;3024-day&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;disableCopy&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Plain Text&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;text&quot;}">pip install pandas-datareader</pre></div>
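<p class="wp-block-paragraph">If you are unsure whether the installation worked, a quick check along the following lines can help. This is only a minimal sketch: the helper name missing_packages is hypothetical, and it merely tests whether each top-level module can be found, not which version is installed.</p>

```python
import importlib.util


def missing_packages(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# Import names used in this tutorial (note: the pip package "pandas-datareader"
# is imported under the name "pandas_datareader")
required = ["pandas", "numpy", "matplotlib", "pandas_datareader"]
print("Missing packages:", missing_packages(required))
```

<p class="wp-block-paragraph">An empty list means that all modules were found. Any name that is printed still needs to be installed with pip or conda.</p>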
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%"></div>
</div>



<div style="height:24px" aria-hidden="true" class="wp-block-spacer"></div>



<h3 class="wp-block-heading" id="h-step-1-define-the-api-request-parameters">Step #1: Define the API Request Parameters</h3>



<p class="wp-block-paragraph">We begin by setting up imports and adjusting the request parameters. The parameters in an API request will depend on the API and the library used for making the request. Also, some parameters may be optional, while others are mandatory.</p>



<p class="wp-block-paragraph">The Yahoo Finance API lets us restrict the period for which we retrieve price data; the date range is an example of an optional parameter. In contrast, the ticker symbol of the financial instrument whose prices we want to request is mandatory.</p>



<p class="wp-block-paragraph">The ticker symbol for the German stock market index DAX is <strong>^GDAXI</strong>. If you want to retrieve price data for other stocks or indices, you can search for the respective ticker symbols on <a href="https://de.finance.yahoo.com/quote/%5EGDAXI/?guccounter=1&amp;guce_referrer=aHR0cHM6Ly93d3cuZ29vZ2xlLmNvbS8&amp;guce_referrer_sig=AQAAAJH4Onl0hOq1uOzUs7uiZWttbmU1Lw3XWqfXzhqMIUNypCiocw3d_hbUWI92G9TMZn3_M9q4RnaoNYjbWte3RM2iyGc1U_iPquEwan_ezsgKxiLDidFUB2R3zuF46IOvGIqueLikt8Znl-4yDCn_o_50qCUmCr3uZTJ8p8Eaf-MI" target="_blank" rel="noreferrer noopener">Yahoo Finance</a>. </p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}">import pandas_datareader as webreader
import pandas as pd
import matplotlib.pyplot as plt

# Set the data source
data_source = &quot;yahoo&quot;

# Set the API parameters
date_start = &quot;2010-01-01&quot; # period start date
date_today = &quot;2020-01-01&quot; # period end date
symbol = &quot;^GDAXI&quot; # asset symbol - for more symbols, see finance.yahoo.com</pre></div>
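<p class="wp-block-paragraph">To make the distinction between mandatory and optional parameters more concrete, here is a minimal sketch of how the parameters could be checked before sending a request. The helper build_request_params is hypothetical and not part of pandas_datareader; it assumes that dates are given as ISO strings and that a missing end date should default to today.</p>

```python
from datetime import date


def build_request_params(symbol, start, end=None):
    """Validate the request parameters and return them as a dict.

    symbol and start are mandatory; end is optional and defaults to today.
    """
    if not symbol:
        raise ValueError("A ticker symbol is required")
    start_date = date.fromisoformat(start)
    end_date = date.fromisoformat(end) if end else date.today()
    if start_date >= end_date:
        raise ValueError("The start date must lie before the end date")
    return {"symbol": symbol, "start": start_date, "end": end_date}


params = build_request_params("^GDAXI", "2010-01-01", "2020-01-01")
print(params)
```

<p class="wp-block-paragraph">A check like this catches accidentally swapped start and end dates before any request is sent.</p>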



<h3 class="wp-block-heading" id="h-step-2-send-the-request-to-the-rest-api-endpoint">Step #2: Send the Request to the REST API Endpoint</h3>



<p class="wp-block-paragraph">Once we have defined the request parameters, we can make the request via the DataReader function and print out the result. REST APIs typically return their response in JSON format. However, DataReader directly converts the API response into a DataFrame, which makes working with the API much simpler.</p>



<p class="wp-block-paragraph">The following code retrieves the DAX price data from Yahoo Finance and displays the first five rows of the resulting DataFrame. We specify a date range to limit the data to a specific period. </p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Send the request to the yahoo finance api endpoint
df = webreader.DataReader(symbol, start=date_start, end=date_today, data_source=data_source)
df.head(5)</pre></div>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:false,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;null&quot;,&quot;mime&quot;:&quot;text/plain&quot;,&quot;theme&quot;:&quot;3024-day&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Plain Text&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;text&quot;}">			High		Low			Open		Close		Volume		Adj Close
Date						
2010-01-04	6048.299805	5974.430176	5975.520020	6048.299805	104344400.0	6048.299805
2010-01-05	6058.020020	6015.669922	6043.939941	6031.859863	117572100.0	6031.859863
2010-01-06	6047.569824	5997.089844	6032.390137	6034.330078	108742400.0	6034.330078
2010-01-07	6037.569824	5961.250000	6016.799805	6019.359863	133704300.0	6019.359863
2010-01-08	6053.040039	5972.240234	6028.620117	6037.609863	126099000.0	6037.609863</pre></div>



<p class="wp-block-paragraph">DataFrame with the price data from Yahoo Finance.</p>
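<p class="wp-block-paragraph">To see what DataReader spares us, here is a minimal sketch of converting a JSON response into a DataFrame by hand. The payload is a made-up structure for illustration only (it is not the actual Yahoo Finance response format, and the field names are assumptions); its values are taken from the first two rows of the output above.</p>

```python
import json

import pandas as pd

# Hypothetical JSON payload; the structure and field names are assumptions
raw_response = """
[
  {"date": "2010-01-04", "open": 5975.520020, "close": 6048.299805},
  {"date": "2010-01-05", "open": 6043.939941, "close": 6031.859863}
]
"""

records = json.loads(raw_response)               # parse the JSON text into a list of dicts
prices = pd.DataFrame.from_records(records)      # one row per record
prices["date"] = pd.to_datetime(prices["date"])  # convert the date strings to timestamps
prices = prices.set_index("date")                # use the dates as index, as in the output above
print(prices)
```

<p class="wp-block-paragraph">With DataReader, these parsing and conversion steps happen behind a single function call.</p>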



<h3 class="wp-block-heading" id="h-step-3-plot-the-data">Step #3: Plot the Data</h3>



<p class="wp-block-paragraph">Let&#8217;s quickly plot the data to check if everything looks ok.</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Plot the closing prices
fig, ax1 = plt.subplots(figsize=(12, 8))
ax1.plot(df.index, df[&quot;Close&quot;])
plt.show()</pre></div>



<figure class="wp-block-image size-full"><img decoding="async" width="1003" height="659" data-attachment-id="10950" data-permalink="https://www.relataly.com/using-pandas-datareader-in-python/10934/image-38-3/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2022/12/image-38.png" data-orig-size="1003,659" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-38" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2022/12/image-38.png" src="https://www.relataly.com/wp-content/uploads/2022/12/image-38.png" alt="plot for financial data requested via the pandas datareader python library" class="wp-image-10950" srcset="https://www.relataly.com/wp-content/uploads/2022/12/image-38.png 1003w, https://www.relataly.com/wp-content/uploads/2022/12/image-38.png 300w, https://www.relataly.com/wp-content/uploads/2022/12/image-38.png 768w" sizes="(max-width: 1003px) 100vw, 1003px" /></figure>



<p class="wp-block-paragraph">Everything looks good, so let&#8217;s proceed.</p>



<h3 class="wp-block-heading" id="h-step-4-save-the-data-to-a-csv-file">Step #4: Save the Data to a CSV File</h3>



<p class="wp-block-paragraph">To save the data from a Pandas DataFrame to a CSV file, you can use the to_csv method. The to_csv method takes a few optional arguments that you can use to customize the output. For example, you can use the &#8220;sep&#8221; argument to specify a different delimiter to use in the CSV file or the &#8220;index&#8221; argument for including or excluding the DataFrame&#8217;s index in the output.</p>



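<p class="wp-block-paragraph">Here&#8217;s an example of how you can use the to_csv method with the index parameter set to False. Note that the dates are stored in the DataFrame&#8217;s index, so index=False omits them from the file; keep the default (index=True) if you want the dates in the CSV.</p>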



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Save the data to a CSV file
df.to_csv(&quot;price_quotes.csv&quot;, index=False)</pre></div>



<p class="wp-block-paragraph">Now you have the data on your local machine and can load it later. So unless you require more recent data, there is no need to call the API again.</p>



<figure class="wp-block-image size-full is-resized"><img decoding="async" data-attachment-id="11794" data-permalink="https://www.relataly.com/using-pandas-datareader-in-python/10934/image-1-4/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2023/01/image-1.png" data-orig-size="902,220" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-1" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2023/01/image-1.png" src="https://www.relataly.com/wp-content/uploads/2023/01/image-1.png" alt="price quotes csv file downloaded with Pandas DataReader library for Python" class="wp-image-11794" width="918" height="224" srcset="https://www.relataly.com/wp-content/uploads/2023/01/image-1.png 902w, https://www.relataly.com/wp-content/uploads/2023/01/image-1.png 300w, https://www.relataly.com/wp-content/uploads/2023/01/image-1.png 768w" sizes="(max-width: 918px) 100vw, 918px" /></figure>
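<p class="wp-block-paragraph">To load the saved data again later, the read_csv function from pandas can be used. The following is a minimal sketch under a few assumptions: it builds a tiny stand-in DataFrame from two rows of the output shown earlier instead of calling the API, writes to a temporary directory, and keeps the index when saving (index=True, the default) so that the dates survive the round trip.</p>

```python
import os
import tempfile

import pandas as pd

# Stand-in for the downloaded data (two rows taken from the output shown earlier)
df_sample = pd.DataFrame(
    {"Open": [5975.520020, 6043.939941], "Close": [6048.299805, 6031.859863]},
    index=pd.to_datetime(["2010-01-04", "2010-01-05"]),
)
df_sample.index.name = "Date"

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "price_quotes.csv")
    df_sample.to_csv(csv_path)  # index=True by default, so the Date column is written
    # Read the file back and restore the dates as a DatetimeIndex
    df_loaded = pd.read_csv(csv_path, index_col="Date", parse_dates=True)

print(df_loaded)
```

<p class="wp-block-paragraph">In a real project, you would replace the temporary path with the location of your own CSV file.</p>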



<div style="height:29px" aria-hidden="true" class="wp-block-spacer"></div>



<h2 class="wp-block-heading" id="h-summary">Summary</h2>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">This article has shown how to use the Pandas DataReader library. We learned how to use the library to request data from the Yahoo Finance API, load it into a Pandas DataFrame, plot it, and save it to a CSV file. The Pandas DataReader library is a helpful tool for importing financial data into a Pandas DataFrame and working with it in Python. You can use it to retrieve data from a wide range of sources, including stock prices from major stock exchanges, economic data from the Federal Reserve, and cryptocurrency prices. Once you have the data in a DataFrame, you can use the methods and functions provided by Pandas to analyze and manipulate it, and save the results to a CSV file using the to_csv method.</p>



<p class="wp-block-paragraph">I hope this post was helpful. If you have any remarks or questions, let me know.</p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%">
<figure class="wp-block-image size-full"><img decoding="async" width="504" height="502" data-attachment-id="12632" data-permalink="https://www.relataly.com/pandas-data-library-panda-midjourney-python-relataly-tutorial-min/" data-orig-file="https://www.relataly.com/wp-content/uploads/2023/03/pandas-data-library-panda-midjourney-python-relataly-tutorial-min.png" data-orig-size="504,502" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="pandas data library panda midjourney python relataly tutorial-min" data-image-description="&lt;p&gt;This panda just loaded a lot of data into his python project. &lt;/p&gt;
" data-image-caption="&lt;p&gt;This panda just loaded a lot of data into his python project. &lt;/p&gt;
" data-large-file="https://www.relataly.com/wp-content/uploads/2023/03/pandas-data-library-panda-midjourney-python-relataly-tutorial-min.png" src="https://www.relataly.com/wp-content/uploads/2023/03/pandas-data-library-panda-midjourney-python-relataly-tutorial-min.png" alt="This panda just loaded a lot of data into his python project. " class="wp-image-12632" srcset="https://www.relataly.com/wp-content/uploads/2023/03/pandas-data-library-panda-midjourney-python-relataly-tutorial-min.png 504w, https://www.relataly.com/wp-content/uploads/2023/03/pandas-data-library-panda-midjourney-python-relataly-tutorial-min.png 300w, https://www.relataly.com/wp-content/uploads/2023/03/pandas-data-library-panda-midjourney-python-relataly-tutorial-min.png 140w" sizes="(max-width: 504px) 100vw, 504px" /><figcaption class="wp-element-caption">This panda looks happy because it just loaded data into his python project. Image created with <a href="http://www.midjourney.com" target="_blank" rel="noreferrer noopener">Midjourney</a>.</figcaption></figure>
</div>
</div>



<p class="wp-block-paragraph"></p>



<h2 class="wp-block-heading">Sources and Further Reading</h2>



<p class="wp-block-paragraph"><a href="https://pandas-datareader.readthedocs.io/en/latest/index.html" target="_blank" rel="noreferrer noopener">pandas-datareader.readthedocs.io/</a></p>



<p class="wp-block-paragraph">Images created with Midjourney AI</p>



<h3 class="wp-block-heading">Further API Tutorials</h3>


<ul class="wp-block-kadence-posts kb-posts kadence-posts-list kb-posts-id-_69433d-4d content-wrap grid-cols kb-posts-style-boxed grid-sm-col-1 grid-lg-col-3 item-image-style-above"><li class="kb-post-list-item">
	<article class="entry content-bg loop-entry post-12143 post type-post status-publish format-standard has-post-thumbnail hentry category-language-generation category-machine-learning-marketing-automation category-natural-language-processing-nlp category-openai category-python-programming category-rest-apis tag-api-tutorials tag-beginner-tutorials tag-deep-learning">
				<a aria-hidden="true" tabindex="-1" role="presentation" class="post-thumbnail kadence-thumbnail-ratio-2-3" href="https://www.relataly.com/automated-prompt-generation-for-dall-e-using-chatgpt-in-python-a-step-by-step-api-tutorial/12143/" aria-label="Generating Detailed Images with OpenAI DALL-E and ChatGPT in Python: A Step-By-Step API Tutorial">
			<div class="post-thumbnail-inner">
				<img decoding="async" width="768" height="382" src="https://www.relataly.com/wp-content/uploads/2023/01/Flo7up_a_robot_painting_a_picture_with_data_technology_and_ai_i_5e7ffa5e-06c3-436b-b4fa-3fc5af58e546-Copy-min.png" class="attachment-medium_large size-medium_large wp-post-image" alt="OpenAI Dall-E ChatGPT Prompt Design Detailed Images Combining ChatGPT and Dall-E Midjourney" srcset="https://www.relataly.com/wp-content/uploads/2023/01/Flo7up_a_robot_painting_a_picture_with_data_technology_and_ai_i_5e7ffa5e-06c3-436b-b4fa-3fc5af58e546-Copy-min.png 1530w, https://www.relataly.com/wp-content/uploads/2023/01/Flo7up_a_robot_painting_a_picture_with_data_technology_and_ai_i_5e7ffa5e-06c3-436b-b4fa-3fc5af58e546-Copy-min.png 300w, https://www.relataly.com/wp-content/uploads/2023/01/Flo7up_a_robot_painting_a_picture_with_data_technology_and_ai_i_5e7ffa5e-06c3-436b-b4fa-3fc5af58e546-Copy-min.png 512w, https://www.relataly.com/wp-content/uploads/2023/01/Flo7up_a_robot_painting_a_picture_with_data_technology_and_ai_i_5e7ffa5e-06c3-436b-b4fa-3fc5af58e546-Copy-min.png 768w" sizes="(max-width: 768px) 100vw, 768px" data-attachment-id="13511" data-permalink="https://www.relataly.com/automated-prompt-generation-for-dall-e-using-chatgpt-in-python-a-step-by-step-api-tutorial/12143/flo7up_a_robot_painting_a_picture_with_data_technology_and_ai_i_5e7ffa5e-06c3-436b-b4fa-3fc5af58e546-copy-min/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2023/01/Flo7up_a_robot_painting_a_picture_with_data_technology_and_ai_i_5e7ffa5e-06c3-436b-b4fa-3fc5af58e546-Copy-min.png" data-orig-size="1530,762" data-comments-opened="1" 
data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="Flo7up_a_robot_painting_a_picture_with_data_technology_and_ai_i_5e7ffa5e-06c3-436b-b4fa-3fc5af58e546 &amp;#8211; Copy-min" data-image-description="&lt;p&gt;OpenAI Dall-E ChatGPT Prompt Design Detailed Images Combining ChatGPT and Dall-E Midjourney&lt;/p&gt;
" data-image-caption="&lt;p&gt;OpenAI Dall-E ChatGPT Prompt Design Detailed Images Combining ChatGPT and Dall-E Midjourney&lt;/p&gt;
" data-large-file="https://www.relataly.com/wp-content/uploads/2023/01/Flo7up_a_robot_painting_a_picture_with_data_technology_and_ai_i_5e7ffa5e-06c3-436b-b4fa-3fc5af58e546-Copy-min.png" />			</div>
		</a><!-- .post-thumbnail -->
				<div class="entry-content-wrap">
			<header class="entry-header">
	<h2 class="entry-title"><a href="https://www.relataly.com/automated-prompt-generation-for-dall-e-using-chatgpt-in-python-a-step-by-step-api-tutorial/12143/" rel="bookmark">Generating Detailed Images with OpenAI DALL-E and ChatGPT in Python: A Step-By-Step API Tutorial</a></h2></header><!-- .entry-header -->
<footer class="entry-footer">
	</footer><!-- .entry-footer -->		</div>
	</article>
</li>
<li class="kb-post-list-item">
	<article class="entry content-bg loop-entry post-12068 post type-post status-publish format-standard has-post-thumbnail hentry category-natural-language-processing-nlp category-openai category-rest-apis tag-api-tutorials tag-beginner-tutorials tag-chatgpt tag-deep-learning">
				<a aria-hidden="true" tabindex="-1" role="presentation" class="post-thumbnail kadence-thumbnail-ratio-2-3" href="https://www.relataly.com/using-chatgpt-and-other-openai-models-via-apis-in-python/12068/" aria-label="Unleashing the Power of ChatGPT and Other OpenAI GPT Language Models in Python A Guide to Using APIs">
			<div class="post-thumbnail-inner">
				<img decoding="async" width="768" height="300" src="https://www.relataly.com/wp-content/uploads/2023/02/unleashing-the-power-of-openai-super-hero-robot-gpt-python-ai-min.png" class="attachment-medium_large size-medium_large wp-post-image" alt="unleashing the power of openai super hero robot gpt python ai value proposition chatgpt" srcset="https://www.relataly.com/wp-content/uploads/2023/02/unleashing-the-power-of-openai-super-hero-robot-gpt-python-ai-min.png 1614w, https://www.relataly.com/wp-content/uploads/2023/02/unleashing-the-power-of-openai-super-hero-robot-gpt-python-ai-min.png 300w, https://www.relataly.com/wp-content/uploads/2023/02/unleashing-the-power-of-openai-super-hero-robot-gpt-python-ai-min.png 512w, https://www.relataly.com/wp-content/uploads/2023/02/unleashing-the-power-of-openai-super-hero-robot-gpt-python-ai-min.png 768w, https://www.relataly.com/wp-content/uploads/2023/02/unleashing-the-power-of-openai-super-hero-robot-gpt-python-ai-min.png 1536w" sizes="(max-width: 768px) 100vw, 768px" data-attachment-id="13197" data-permalink="https://www.relataly.com/openai-gpt-chatgpt-in-a-business-context-whats-the-value-proposition/12282/unleashing-the-power-of-openai-super-hero-robot-gpt-python-ai-min/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2023/02/unleashing-the-power-of-openai-super-hero-robot-gpt-python-ai-min.png" data-orig-size="1614,631" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="unleashing the power of openai super hero robot gpt python ai-min" data-image-description="&lt;p&gt;unleashing the power of openai super 
hero robot gpt python ai value proposition chatgpt&lt;/p&gt;
" data-image-caption="&lt;p&gt;unleashing the power of openai super hero robot gpt python ai value proposition chatgpt&lt;/p&gt;
" data-large-file="https://www.relataly.com/wp-content/uploads/2023/02/unleashing-the-power-of-openai-super-hero-robot-gpt-python-ai-min.png" />			</div>
		</a><!-- .post-thumbnail -->
				<div class="entry-content-wrap">
			<header class="entry-header">
	<h2 class="entry-title"><a href="https://www.relataly.com/using-chatgpt-and-other-openai-models-via-apis-in-python/12068/" rel="bookmark">Unleashing the Power of ChatGPT and Other OpenAI GPT Language Models in Python A Guide to Using APIs</a></h2></header><!-- .entry-header -->
<footer class="entry-footer">
	</footer><!-- .entry-footer -->		</div>
	</article>
</li>
<li class="kb-post-list-item">
	<article class="entry content-bg loop-entry post-10351 post type-post status-publish format-standard has-post-thumbnail hentry category-cryptocompare-api category-facebook-prophet category-finance category-python-programming category-rest-apis category-seaborn category-stock-market-forecasting category-time-series-forecasting category-use-case category-yahoo-finance-api tag-ai-in-finance tag-intermediate-tutorials tag-supervised-learning">
				<a aria-hidden="true" tabindex="-1" role="presentation" class="post-thumbnail kadence-thumbnail-ratio-2-3" href="https://www.relataly.com/time-series-forecasting-using-facebook-prophet-in-python/10351/" aria-label="Univariate Stock Market Forecasting using Facebook Prophet in Python">
			<div class="post-thumbnail-inner">
				<img decoding="async" width="768" height="307" src="https://www.relataly.com/wp-content/uploads/2023/03/stock-market-forecasting-python-relataly-midjourney-3-min.png" class="attachment-medium_large size-medium_large wp-post-image" alt="Univariate Stock Market Forecasting using Facebook Prophet in Python" srcset="https://www.relataly.com/wp-content/uploads/2023/03/stock-market-forecasting-python-relataly-midjourney-3-min.png 1455w, https://www.relataly.com/wp-content/uploads/2023/03/stock-market-forecasting-python-relataly-midjourney-3-min.png 300w, https://www.relataly.com/wp-content/uploads/2023/03/stock-market-forecasting-python-relataly-midjourney-3-min.png 512w, https://www.relataly.com/wp-content/uploads/2023/03/stock-market-forecasting-python-relataly-midjourney-3-min.png 768w" sizes="(max-width: 768px) 100vw, 768px" data-attachment-id="13377" data-permalink="https://www.relataly.com/stock-market-forecasting-python-relataly-midjourney-3-min/" data-orig-file="https://www.relataly.com/wp-content/uploads/2023/03/stock-market-forecasting-python-relataly-midjourney-3-min.png" data-orig-size="1455,582" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="stock market forecasting python relataly midjourney 3-min" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2023/03/stock-market-forecasting-python-relataly-midjourney-3-min.png" />			</div>
		</a><!-- .post-thumbnail -->
				<div class="entry-content-wrap">
			<header class="entry-header">
	<h2 class="entry-title"><a href="https://www.relataly.com/time-series-forecasting-using-facebook-prophet-in-python/10351/" rel="bookmark">Univariate Stock Market Forecasting using Facebook Prophet in Python</a></h2></header><!-- .entry-header -->
<footer class="entry-footer">
	</footer><!-- .entry-footer -->		</div>
	</article>
</li>
<li class="kb-post-list-item">
	<article class="entry content-bg loop-entry post-10098 post type-post status-publish format-standard has-post-thumbnail hentry category-blockchain-crypto-analytics category-correlation-machine-learning category-crypto-exchange-apis category-cryptocompare-api category-data-science category-finance category-python-programming category-rest-apis category-seaborn category-use-case tag-ai-in-finance tag-intermediate-tutorials">
				<a aria-hidden="true" tabindex="-1" role="presentation" class="post-thumbnail kadence-thumbnail-ratio-2-3" href="https://www.relataly.com/seven-metrics-for-on-chain-analysis-in-python/10098/" aria-label="On-Chain Analytics: Metrics for Analyzing Blockchains in Python">
			<div class="post-thumbnail-inner">
				<img decoding="async" width="768" height="314" src="https://www.relataly.com/wp-content/uploads/2023/02/blockchain-analysis-python-min.png" class="attachment-medium_large size-medium_large wp-post-image" alt="onchain-analysis - tutorial blockchain data in python CryptoCompare api" srcset="https://www.relataly.com/wp-content/uploads/2023/02/blockchain-analysis-python-min.png 1262w, https://www.relataly.com/wp-content/uploads/2023/02/blockchain-analysis-python-min.png 300w, https://www.relataly.com/wp-content/uploads/2023/02/blockchain-analysis-python-min.png 1024w, https://www.relataly.com/wp-content/uploads/2023/02/blockchain-analysis-python-min.png 768w" sizes="(max-width: 768px) 100vw, 768px" data-attachment-id="12339" data-permalink="https://www.relataly.com/blockchain-analysis-python-min/" data-orig-file="https://www.relataly.com/wp-content/uploads/2023/02/blockchain-analysis-python-min.png" data-orig-size="1262,516" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="blockchain analysis python-min" data-image-description="&lt;p&gt;onchain-analysis &amp;#8211; tutorial blockchain data in python  CryptoCompare api&lt;/p&gt;
" data-image-caption="&lt;p&gt;onchain-analysis &amp;#8211; tutorial blockchain data in python  CryptoCompare api&lt;/p&gt;
" data-large-file="https://www.relataly.com/wp-content/uploads/2023/02/blockchain-analysis-python-min.png" />			</div>
		</a><!-- .post-thumbnail -->
				<div class="entry-content-wrap">
			<header class="entry-header">
	<h2 class="entry-title"><a href="https://www.relataly.com/seven-metrics-for-on-chain-analysis-in-python/10098/" rel="bookmark">On-Chain Analytics: Metrics for Analyzing Blockchains in Python</a></h2></header><!-- .entry-header -->
<footer class="entry-footer">
	</footer><!-- .entry-footer -->		</div>
	</article>
</li>
<li class="kb-post-list-item">
	<article class="entry content-bg loop-entry post-3982 post type-post status-publish format-standard has-post-thumbnail hentry category-finance category-gate-io-api category-python-programming category-rest-apis tag-ai-in-finance tag-api-tutorials tag-beginner-tutorials tag-bitcoin tag-cryptocurrencies">
				<a aria-hidden="true" tabindex="-1" role="presentation" class="post-thumbnail kadence-thumbnail-ratio-2-3" href="https://www.relataly.com/streaming-crypto-prices-via-the-gate-io-api-with-python/3982/" aria-label="Requesting Crypto Price Data from the Gate.io REST API in Python">
			<div class="post-thumbnail-inner">
				<img decoding="async" width="768" height="305" src="https://www.relataly.com/wp-content/uploads/2021/05/gatio-cryptocurrency-data-api-midjourney-relataly-min.png" class="attachment-medium_large size-medium_large wp-post-image" alt="gatio cryptocurrency data api midjourney relataly-min" srcset="https://www.relataly.com/wp-content/uploads/2021/05/gatio-cryptocurrency-data-api-midjourney-relataly-min.png 1358w, https://www.relataly.com/wp-content/uploads/2021/05/gatio-cryptocurrency-data-api-midjourney-relataly-min.png 300w, https://www.relataly.com/wp-content/uploads/2021/05/gatio-cryptocurrency-data-api-midjourney-relataly-min.png 512w, https://www.relataly.com/wp-content/uploads/2021/05/gatio-cryptocurrency-data-api-midjourney-relataly-min.png 768w" sizes="(max-width: 768px) 100vw, 768px" data-attachment-id="12769" data-permalink="https://www.relataly.com/streaming-crypto-prices-via-the-gate-io-api-with-python/3982/gatio-cryptocurrency-data-api-midjourney-relataly-min/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2021/05/gatio-cryptocurrency-data-api-midjourney-relataly-min.png" data-orig-size="1358,540" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="gatio cryptocurrency data api midjourney relataly-min" data-image-description="&lt;p&gt;gatio cryptocurrency data api midjourney relataly-min&lt;/p&gt;
" data-image-caption="&lt;p&gt;gatio cryptocurrency data api midjourney relataly-min&lt;/p&gt;
" data-large-file="https://www.relataly.com/wp-content/uploads/2021/05/gatio-cryptocurrency-data-api-midjourney-relataly-min.png" />			</div>
		</a><!-- .post-thumbnail -->
				<div class="entry-content-wrap">
			<header class="entry-header">
	<h2 class="entry-title"><a href="https://www.relataly.com/streaming-crypto-prices-via-the-gate-io-api-with-python/3982/" rel="bookmark">Requesting Crypto Price Data from the Gate.io REST API in Python</a></h2></header><!-- .entry-header -->
<footer class="entry-footer">
	</footer><!-- .entry-footer -->		</div>
	</article>
</li>
<li class="kb-post-list-item">
	<article class="entry content-bg loop-entry post-3925 post type-post status-publish format-standard has-post-thumbnail hentry category-rest-apis category-twitter-api tag-ai-in-e-commerce tag-api-tutorials tag-automated-twitter-posts tag-beginner-tutorials tag-social-media-data tag-tweepy">
				<a aria-hidden="true" tabindex="-1" role="presentation" class="post-thumbnail kadence-thumbnail-ratio-2-3" href="https://www.relataly.com/posting-tweets-on-twitter-using-python-and-tweepy/3925/" aria-label="Posting Tweets On Twitter using Python and Tweepy">
			<div class="post-thumbnail-inner">
				<img decoding="async" width="768" height="306" src="https://www.relataly.com/wp-content/uploads/2023/03/twitter-api-gate-to-social-mediadata-relataly-tutorial-python-min.png" class="attachment-medium_large size-medium_large wp-post-image" alt="twitter api gate to social mediadata relataly tutorial python" srcset="https://www.relataly.com/wp-content/uploads/2023/03/twitter-api-gate-to-social-mediadata-relataly-tutorial-python-min.png 1390w, https://www.relataly.com/wp-content/uploads/2023/03/twitter-api-gate-to-social-mediadata-relataly-tutorial-python-min.png 300w, https://www.relataly.com/wp-content/uploads/2023/03/twitter-api-gate-to-social-mediadata-relataly-tutorial-python-min.png 512w, https://www.relataly.com/wp-content/uploads/2023/03/twitter-api-gate-to-social-mediadata-relataly-tutorial-python-min.png 768w" sizes="(max-width: 768px) 100vw, 768px" data-attachment-id="12599" data-permalink="https://www.relataly.com/twitter-api-gate-to-social-mediadata-relataly-tutorial-python-min/" data-orig-file="https://www.relataly.com/wp-content/uploads/2023/03/twitter-api-gate-to-social-mediadata-relataly-tutorial-python-min.png" data-orig-size="1390,554" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="twitter api gate to social mediadata relataly tutorial python" data-image-description="&lt;p&gt;twitter api gate to social mediadata relataly tutorial python&lt;/p&gt;
" data-image-caption="&lt;p&gt;twitter api gate to social mediadata relataly tutorial python&lt;/p&gt;
" data-large-file="https://www.relataly.com/wp-content/uploads/2023/03/twitter-api-gate-to-social-mediadata-relataly-tutorial-python-min.png" />			</div>
		</a><!-- .post-thumbnail -->
				<div class="entry-content-wrap">
			<header class="entry-header">
	<h2 class="entry-title"><a href="https://www.relataly.com/posting-tweets-on-twitter-using-python-and-tweepy/3925/" rel="bookmark">Posting Tweets On Twitter using Python and Tweepy</a></h2></header><!-- .entry-header -->
<footer class="entry-footer">
	</footer><!-- .entry-footer -->		</div>
	</article>
</li>
</ul><p>The post <a href="https://www.relataly.com/using-pandas-datareader-in-python/10934/">Using Pandas DataReader to Access Online Data Sources in Python</a> appeared first on <a href="https://www.relataly.com">relataly.com</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.relataly.com/using-pandas-datareader-in-python/10934/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">10934</post-id>	</item>
		<item>
		<title>Color-Coded Cryptocurrency Price Charts in Python</title>
		<link>https://www.relataly.com/cryptocurrency-price-charts-with-color-overlay-python/2820/</link>
					<comments>https://www.relataly.com/cryptocurrency-price-charts-with-color-overlay-python/2820/#respond</comments>
		
		<dc:creator><![CDATA[Florian Follonier]]></dc:creator>
		<pubDate>Tue, 19 Jan 2021 21:03:16 +0000</pubDate>
				<category><![CDATA[Coinbase API]]></category>
		<category><![CDATA[Correlation]]></category>
		<category><![CDATA[Data Science]]></category>
		<category><![CDATA[Data Sources]]></category>
		<category><![CDATA[Data Visualization]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Seaborn]]></category>
		<category><![CDATA[Stock Market Forecasting]]></category>
		<category><![CDATA[Bitcoin]]></category>
		<category><![CDATA[Chart Analysis]]></category>
		<category><![CDATA[Cryptocurrencies]]></category>
		<category><![CDATA[Financial Analysis]]></category>
		<category><![CDATA[Intermediate Tutorials]]></category>
		<guid isPermaLink="false">https://www.relataly.com/?p=2820</guid>

					<description><![CDATA[<p>Are you intrigued by the fascinating world of cryptocurrency and looking to visually decipher its price trends? Welcome aboard! In this comprehensive tutorial, we will explore creating color-coded line charts using Python and Matplotlib, a powerful tool for effective analysis of changes along a third dimension. The past few years have witnessed a meteoric rise ... <a title="Color-Coded Cryptocurrency Price Charts in Python" class="read-more" href="https://www.relataly.com/cryptocurrency-price-charts-with-color-overlay-python/2820/" aria-label="Read more about Color-Coded Cryptocurrency Price Charts in Python">Read more</a></p>
<p>The post <a href="https://www.relataly.com/cryptocurrency-price-charts-with-color-overlay-python/2820/">Color-Coded Cryptocurrency Price Charts in Python</a> appeared first on <a href="https://www.relataly.com">relataly.com</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph"><br>Are you intrigued by the fascinating world of cryptocurrency and looking to visually decipher its price trends? Welcome aboard! In this comprehensive tutorial, we will create color-coded line charts using Python and Matplotlib. Color-coding a line is an effective way to analyze how the price changes along a third dimension.</p>



<p class="wp-block-paragraph">The past few years have witnessed a meteoric rise in the prices of cryptocurrencies, underscoring the need for accurate analysis and visualization of their price trends. An outstanding illustration of this is the color-coded Bitcoin stock-to-flow chart, a popular choice in the crypto space that uses color differentiation to denote time until the next Bitcoin halving event.</p>



<p class="wp-block-paragraph">Drawing inspiration from this, our tutorial will guide you to create a similar dynamic color-coded line chart, tracing the price trends of two leading cryptocurrencies &#8211; Bitcoin and Ethereum. This visual aid will provide a deeper insight into their price trajectories over time, enabling you to make informed investment decisions.</p>



<p class="wp-block-paragraph">As we dive in, we&#8217;ll break down the process into digestible chunks, making it easier for beginners to follow along. By the end of this tutorial, you&#8217;ll not only have a profound understanding of how to create and interpret such color-coded charts but also gain valuable insights into the world of cryptocurrency price trends.</p>



<p class="has-accent-color has-blush-light-purple-gradient-background has-text-color has-background wp-block-paragraph"><strong>Disclaimer</strong>: This article does not constitute financial advice. Stock markets can be very volatile and are generally difficult to predict. Predictive models and other forms of analytics applied in this article only illustrate machine learning use cases.</p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%"></div>
</div>



<h2 class="wp-block-heading" id="h-what-are-color-coded-price-charts">What are Color-coded Price Charts?</h2>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">Color coding is useful for visualizing trading signals and statistical indicators in technical chart analysis. The idea is to create visually comprehensible charts that let the user quickly interpret how the price develops under certain conditions. A simple example is a candlestick chart, which uses color to signal whether the price moved up (green) or down (red). Candlestick charts convey more than regular line charts because they also show the opening and closing prices.</p>



<p class="wp-block-paragraph">We can use color coding in line plots to visualize conditions of various types. The conditions can be derived from the price itself, for example, to illustrate how the price develops in dependence on oscillation indicators or moving averages. Or they can be independent of the price and represent some other condition, such as the spread of COVID-19 cases worldwide. These are just a few examples, and there are no limits to your creativity in choosing the conditions.</p>
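
<p class="wp-block-paragraph">To make the idea concrete, here is a minimal sketch of a color-coded line in Matplotlib. It assumes one common approach: splitting the line into short segments and coloring each segment by a condition value via a <em>LineCollection</em>. The helper name <em>colored_line</em>, the random-walk price, and the 90-day cycle used as the condition are made up for illustration and are not part of the tutorial's data.</p>

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection


def colored_line(ax, x, y, c, cmap="plasma"):
    """Draw y over x as short segments, each colored by the condition c.

    Hypothetical helper for illustration only.
    """
    x, y, c = (np.asarray(a, dtype=float) for a in (x, y, c))
    # Build an array of shape (n - 1, 2, 2): one start and end point per segment
    points = np.column_stack([x, y]).reshape(-1, 1, 2)
    segments = np.concatenate([points[:-1], points[1:]], axis=1)
    lc = LineCollection(segments, cmap=cmap, norm=plt.Normalize(c.min(), c.max()))
    lc.set_array(c[:-1])  # one color value per segment
    ax.add_collection(lc)
    ax.set_xlim(x.min(), x.max())
    ax.set_ylim(y.min(), y.max())
    return lc


# Synthetic data: a random-walk "price" and a made-up condition (day within a 90-day cycle)
rng = np.random.default_rng(42)
days = np.arange(365)
price = 100 + np.cumsum(rng.normal(0, 1, days.size))
condition = days % 90

fig, ax = plt.subplots(figsize=(10, 4))
line = colored_line(ax, days, price, condition)
fig.colorbar(line, ax=ax, label="Condition (synthetic)")
ax.set_xlabel("Day")
ax.set_ylabel("Price (synthetic)")
```

<p class="wp-block-paragraph">Calling <em>plt.show()</em> or <em>fig.savefig()</em> afterwards displays or stores the chart. Any numeric series of the same length as the price can serve as the condition.</p>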



<p class="wp-block-paragraph">Also: <a href="https://www.relataly.com/streaming-crypto-prices-via-the-gate-io-api-with-python/3982/" target="_blank" rel="noreferrer noopener">Requesting Crypto Price Data from the Gate.io REST API in Python</a></p>



<h2 class="wp-block-heading">Use Cases for Color-coded Price Charts</h2>



<p class="wp-block-paragraph">There are various use cases for color-coded line plots in the crypto space. For example, crypto enthusiasts employ them to visualize relationships between the price of Bitcoin and statistical indicators, including momentum indicators such as the RSI. Color-coded line plots have also been used to show dependencies between the price and specific events that develop in parallel to the Bitcoin price. For example, we can use color-coding to highlight how the price develops relative to the Bitcoin halving, which occurs roughly every four years.</p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%"><div class="wp-block-image">
<figure class="aligncenter size-large is-resized"><img decoding="async" data-attachment-id="8060" data-permalink="https://www.relataly.com/cryptocurrency-price-charts-with-color-overlay-python/2820/image-3-3/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2022/05/image-3.png" data-orig-size="1410,819" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-3" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2022/05/image-3.png" src="https://www.relataly.com/wp-content/uploads/2022/05/image-3-1024x595.png" alt="Example of a color-coded line plot that shows the Bitcoin stock to flow model. In this article, we will create a similar chart using Python." class="wp-image-8060" width="416" height="241" srcset="https://www.relataly.com/wp-content/uploads/2022/05/image-3.png 1024w, https://www.relataly.com/wp-content/uploads/2022/05/image-3.png 300w, https://www.relataly.com/wp-content/uploads/2022/05/image-3.png 768w, https://www.relataly.com/wp-content/uploads/2022/05/image-3.png 1410w" sizes="(max-width: 416px) 100vw, 416px" /><figcaption class="wp-element-caption">The Stock to Flow Model is an example of a Color-coded price chart (Source: <a href="https://www.lookintobitcoin.com/charts/stock-to-flow-model/" target="_blank" rel="noreferrer noopener">lookintobitcoin.com</a>)</figcaption></figure>
</div></div>
</div>



<h2 class="wp-block-heading" id="h-implementing-color-coded-price-charts-in-python">Implementing Color-coded Price Charts in Python</h2>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">Are you ready to elevate your data visualization skills and create visually striking price charts with Python? In this tutorial, we&#8217;ll be walking you through the creation of two dynamic line charts that use color to reveal intriguing trends and patterns. The first chart will feature a color overlay on the price line to showcase how Bitcoin prices fluctuate based on RSI. The second chart will unveil the shifting correlation between Bitcoin and Ethereum over time. Buckle up, and let&#8217;s dive in!</p>



<p class="wp-block-paragraph">We&#8217;ll start by using the Coinbase Pro API to download historical price data on BTC and ETH. We&#8217;ll then calculate two well-established indicators in financial analysis: the Relative Strength Index (RSI) and the Pearson Correlation between Bitcoin and Ethereum. Finally, we&#8217;ll use Matplotlib to create stunning color-coded line charts that highlight the changes in the indicators over extended periods.</p>
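
<p class="wp-block-paragraph">As a rough preview of the two indicators, the sketch below computes an RSI and a rolling Pearson correlation with pandas. It rests on several assumptions that may differ from the tutorial's own code: the RSI uses a simple moving average of gains and losses over 14 periods (Wilder's original formulation uses exponential smoothing), the correlation is calculated on daily returns over a 30-day window, and the two price series are synthetic random walks rather than real BTC and ETH data.</p>

```python
import numpy as np
import pandas as pd


def rsi(close: pd.Series, window: int = 14) -> pd.Series:
    """Relative Strength Index based on simple moving averages of gains and losses."""
    delta = close.diff()
    avg_gain = delta.clip(lower=0).rolling(window).mean()
    avg_loss = (-delta.clip(upper=0)).rolling(window).mean()
    rs = avg_gain / avg_loss
    return 100 - 100 / (1 + rs)


# Synthetic stand-ins for daily BTC and ETH closing prices (not real market data)
rng = np.random.default_rng(0)
dates = pd.date_range("2021-01-01", periods=200, freq="D")
btc = pd.Series(30000 * np.exp(np.cumsum(rng.normal(0, 0.03, 200))), index=dates)
eth = pd.Series(1000 * np.exp(np.cumsum(rng.normal(0, 0.04, 200))), index=dates)

# Indicator 1: RSI of the first series (values between 0 and 100)
btc_rsi = rsi(btc)

# Indicator 2: 30-day rolling Pearson correlation of daily returns
rolling_corr = btc.pct_change().rolling(30).corr(eth.pct_change())

print(btc_rsi.dropna().tail(3))
print(rolling_corr.dropna().tail(3))
```

<p class="wp-block-paragraph">Both results are series aligned with the price index, so either can be used directly as the color condition for a price line.</p>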



<p class="wp-block-paragraph">Also: <a href="https://www.relataly.com/visualize-covid-19-data-on-a-geographic-heat-maps/291/" target="_blank" rel="noreferrer noopener">Geographic Heat Maps with GeoPandas: Visualizing COVID-19</a></p>



<p class="wp-block-paragraph">The code is available on the GitHub repository.</p>



<div class="wp-block-kadence-advancedbtn kb-buttons-wrap kb-btns_2cdf46-d5"><a class="kb-button kt-button button kb-btn_1609d5-b1 kt-btn-size-standard kt-btn-width-type-full kb-btn-global-inherit kt-btn-has-text-true kt-btn-has-svg-true wp-block-button__link wp-block-kadence-singlebtn" href="https://github.com/flo7up/relataly-public-python-tutorials/blob/master/00%20Data%20Visualization/071%20Color-Coded%20Cryptocurrency%20Price%20Charts.ipynb" target="_blank" rel="noreferrer noopener"><span class="kb-svg-icon-wrap kb-svg-icon-fe_eye kt-btn-icon-side-left"><svg viewBox="0 0 24 24"  fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"  aria-hidden="true"><path d="M1 12s4-8 11-8 11 8 11 8-4 8-11 8-11-8-11-8z"/><circle cx="12" cy="12" r="3"/></svg></span><span class="kt-btn-inner-text">View on GitHub </span></a>

<a class="kb-button kt-button button kb-btn_d58e18-e3 kt-btn-size-standard kt-btn-width-type-full kb-btn-global-inherit kt-btn-has-text-true kt-btn-has-svg-true wp-block-button__link wp-block-kadence-singlebtn" href="https://github.com/flo7up/relataly-public-python-API-tutorials" target="_blank" rel="noreferrer noopener"><span class="kb-svg-icon-wrap kb-svg-icon-fa_github kt-btn-icon-side-left"><svg viewBox="0 0 496 512"  fill="currentColor" xmlns="http://www.w3.org/2000/svg"  aria-hidden="true"><path d="M165.9 397.4c0 2-2.3 3.6-5.2 3.6-3.3.3-5.6-1.3-5.6-3.6 0-2 2.3-3.6 5.2-3.6 3-.3 5.6 1.3 5.6 3.6zm-31.1-4.5c-.7 2 1.3 4.3 4.3 4.9 2.6 1 5.6 0 6.2-2s-1.3-4.3-4.3-5.2c-2.6-.7-5.5.3-6.2 2.3zm44.2-1.7c-2.9.7-4.9 2.6-4.6 4.9.3 2 2.9 3.3 5.9 2.6 2.9-.7 4.9-2.6 4.6-4.6-.3-1.9-3-3.2-5.9-2.9zM244.8 8C106.1 8 0 113.3 0 252c0 110.9 69.8 205.8 169.5 239.2 12.8 2.3 17.3-5.6 17.3-12.1 0-6.2-.3-40.4-.3-61.4 0 0-70 15-84.7-29.8 0 0-11.4-29.1-27.8-36.6 0 0-22.9-15.7 1.6-15.4 0 0 24.9 2 38.6 25.8 21.9 38.6 58.6 27.5 72.9 20.9 2.3-16 8.8-27.1 16-33.7-55.9-6.2-112.3-14.3-112.3-110.5 0-27.5 7.6-41.3 23.6-58.9-2.6-6.5-11.1-33.3 2.6-67.9 20.9-6.5 69 27 69 27 20-5.6 41.5-8.5 62.8-8.5s42.8 2.9 62.8 8.5c0 0 48.1-33.6 69-27 13.7 34.7 5.2 61.4 2.6 67.9 16 17.7 25.8 31.5 25.8 58.9 0 96.5-58.9 104.2-114.8 110.5 9.2 7.9 17 22.9 17 46.4 0 33.7-.3 75.4-.3 83.6 0 6.5 4.6 14.4 17.3 12.1C428.2 457.8 496 362.9 496 252 496 113.3 383.5 8 244.8 8zM97.2 352.9c-1.3 1-1 3.3.7 5.2 1.6 1.6 3.9 2.3 5.2 1 1.3-1 1-3.3-.7-5.2-1.6-1.6-3.9-2.3-5.2-1zm-10.8-8.1c-.7 1.3.3 2.9 2.3 3.9 1.6 1 3.6.7 4.3-.7.7-1.3-.3-2.9-2.3-3.9-2-.6-3.6-.3-4.3.7zm32.4 35.6c-1.6 1.3-1 4.3 1.3 6.2 2.3 2.3 5.2 2.6 6.5 1 1.3-1.3.7-4.3-1.3-6.2-2.2-2.3-5.2-2.6-6.5-1zm-11.4-14.7c-1.6 1-1.6 3.6 0 5.9 1.6 2.3 4.3 3.3 5.6 2.3 1.6-1.3 1.6-3.9 0-6.2-1.4-2.3-4-3.3-5.6-2z"/></svg></span><span class="kt-btn-inner-text">Relataly GitHub Repo </span></a></div>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%"></div>
</div>



<h3 class="wp-block-heading" id="h-prerequisites">Prerequisites</h3>



<p class="wp-block-paragraph">Before starting the coding part, make sure that you have set up your Python 3 environment and required packages. If you don&#8217;t have an environment, you can follow&nbsp;this tutorial&nbsp;to set up the&nbsp;<a href="https://www.anaconda.com/products/individual" target="_blank" rel="noreferrer noopener">Anaconda environment</a>. Also, make sure you install all required packages. In this tutorial, we will be working with the following standard packages:&nbsp;</p>



<ul class="wp-block-list">
<li><a href="https://pandas.pydata.org/" target="_blank" rel="noreferrer noopener">pandas</a></li>



<li><a href="https://numpy.org/" target="_blank" rel="noreferrer noopener">NumPy</a></li>



<li><a href="https://docs.python.org/3/library/math.html" target="_blank" rel="noreferrer noopener">math</a></li>



<li><a href="https://matplotlib.org/" target="_blank" rel="noreferrer noopener">matplotlib</a></li>
</ul>



<p class="wp-block-paragraph">In addition, we will be using <a href="https://scipy.org/" target="_blank" rel="noreferrer noopener">SciPy</a> to calculate the Pearson correlation and the <a href="https://github.com/David-Woroniuk/Historic_Crypto" target="_blank" rel="noreferrer noopener">Historic-Crypto Python Package</a>, which lets us easily interact with the <a href="https://pro.coinbase.com/" target="_blank" rel="noreferrer noopener">Coinbase Pro</a> API.</p>



<p class="wp-block-paragraph">You can install packages using console commands:</p>



<ul class="wp-block-list">
<li><em>pip install &lt;package name&gt;</em></li>



<li><em>conda install &lt;package name&gt;</em>&nbsp;(if you are using the Anaconda package manager)</li>
</ul>



<h3 class="wp-block-heading" id="h-step-1-load-the-price-data-via-the-coinbase-api">Step #1 Load the Price Data via the Coinbase API</h3>



<p class="wp-block-paragraph">We begin by downloading the historical price data on Bitcoin (BTC-USD) and Ethereum (ETH-USD) from Coinbase Pro. Don&#8217;t worry; you don&#8217;t need to download the data manually. Instead, we will use the Historic_Crypto Python package to access the data via an API. </p>



<p class="wp-block-paragraph">Accessing the data via the Coinbase Pro API requires us to specify several API parameters. We define a frequency of 21600 seconds so that we will obtain price points on a 6-hour basis. In addition, we define a from_date of &#8220;2017-01-01&#8221; and add &#8220;ETH-USD&#8221; and &#8220;BTC-USD&#8221; to a list of coins for which we want to obtain the historical price data. </p>
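
<p class="wp-block-paragraph">To make the frequency parameter more tangible, here is a minimal sketch of the arithmetic behind it. It only uses the Python standard library, and the variable names are chosen for illustration.</p>

```python
from datetime import timedelta

# Candle length used in this tutorial, in seconds
frequency = 21600

# Convert the frequency into an interval and express it in hours
interval = timedelta(seconds=frequency)
hours_per_price_point = interval.total_seconds() / 3600

# Number of price points we can expect per day at this frequency
price_points_per_day = int(timedelta(days=1) / interval)

print(f"{hours_per_price_point:.0f} hours per price point, "
      f"{price_points_per_day} price points per day")
```

<p class="wp-block-paragraph">Switching the frequency to 86400 in the same calculation yields one price point per day.</p>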



<p class="wp-block-paragraph">We query the API separately for each of the two coins in our coin list. Depending on your internet connection, this can take several minutes. The response contains three different price values:</p>



<ul class="wp-block-list">
<li><strong>high</strong>: the highest price within each six-hour interval</li>



<li><strong>low</strong>: the lowest price within each six-hour interval</li>



<li><strong>close</strong>: the closing price of each six-hour interval</li>
</ul>



<p class="wp-block-paragraph">Later in this article, we will require all three variables to calculate the indicator values. We will therefore add the variables as columns to a new dataframe. </p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Tested with Python 3.8.8, Matplotlib 3.5, Seaborn 0.11.1, numpy 1.19.5

from Historic_Crypto import HistoricalData
import pandas as pd 
from scipy.stats import pearsonr
import matplotlib.pyplot as plt 
import matplotlib.colors as col 
import numpy as np 
import datetime

# the price frequency in seconds: 21600 = 6 hour price data, 86400 = daily price data
frequency = 21600

# The beginning of the period for which prices will be retrieved
from_date = '2017-01-01-00-00'
# The currency price pairs for which the data will be retrieved
coinlist = ['ETH-USD', 'BTC-USD']

# Query the data
for i in range(len(coinlist)):
    coinname = coinlist[i]
    pricedata = HistoricalData(coinname, frequency, from_date).retrieve_data()
    pricedf = pricedata[['close', 'low', 'high']]
    if i == 0:
        df = pd.DataFrame(pricedf.copy())
    else:
        df = pd.merge(left=df, right=pricedf, how='left', left_index=True, right_index=True)   
    df.rename(columns={&quot;close&quot;: &quot;close-&quot; + coinname}, inplace=True)
    df.rename(columns={&quot;low&quot;: &quot;low-&quot; + coinname}, inplace=True)
    df.rename(columns={&quot;high&quot;: &quot;high-&quot; + coinname}, inplace=True)
df.head()</pre></div>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:false,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;null&quot;,&quot;mime&quot;:&quot;text/plain&quot;,&quot;theme&quot;:&quot;3024-day&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Plain Text&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;text&quot;}">			time	close-ETH-USD	low-ETH-USD	high-ETH-USD	close-BTC-USD	low-BTC-USD	high-BTC-USD
2017-01-01 06:00:00	8.23			8.16		8.49			975.00			964.54		975.00
2017-01-01 12:00:00	8.33			8.20		8.44			994.42			974.01		994.97
2017-01-01 18:00:00	8.18			8.08		8.37			992.95			986.86		1000.00
2017-01-02 00:00:00	8.13			8.05		8.22			1003.64			990.52		1012.00
2017-01-02 06:00:00	8.10			8.09		8.20			1024.84			1002.92		102</pre></div>
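
<p class="wp-block-paragraph">If you want to try the merge logic without calling the API, the following sketch may help. It is an alternative to the loop above, not a replacement: the helper <em>combine_prices</em> is a hypothetical name, the price values are made up, and it assumes an inner join on the time index instead of the left merge used in the tutorial code.</p>

```python
import pandas as pd

def combine_prices(frames):
    """Join per-coin price frames on their time index.

    Each column gets the coin name as a suffix, e.g. 'close-BTC-USD'.
    """
    renamed = [
        frame[["close", "low", "high"]].add_suffix("-" + coin)
        for coin, frame in frames.items()
    ]
    # Inner join keeps only timestamps that exist for every coin (an assumption)
    return pd.concat(renamed, axis=1, join="inner")

# Synthetic stand-in data (hypothetical values, not real prices)
index = pd.date_range("2017-01-01 06:00", periods=3, freq=pd.Timedelta(hours=6))
eth = pd.DataFrame(
    {"close": [8.2, 8.3, 8.1], "low": [8.1, 8.2, 8.0], "high": [8.4, 8.4, 8.3]},
    index=index,
)
btc = pd.DataFrame(
    {"close": [975.0, 994.0, 993.0], "low": [964.0, 974.0, 986.0], "high": [975.0, 995.0, 1000.0]},
    index=index,
)

combined = combine_prices({"ETH-USD": eth, "BTC-USD": btc})
print(combined.columns.tolist())
```

<p class="wp-block-paragraph">The resulting column names follow the same pattern as in the dataframe above, so the later steps would work with either approach.</p>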



<h3 class="wp-block-heading" id="h-step-2-visualizing-the-time-series">Step #2 Visualizing the Time Series</h3>



<p class="wp-block-paragraph">At this point, we have created a dataframe that contains the close, low, and high prices for BTC-USD and ETH-USD. Next, let&#8217;s take a quick look at the data:</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Create a Price Chart on BTC and ETH
x = df.index
fig, ax1 = plt.subplots(figsize=(16, 8), sharex=False)

# Price Chart for BTC-USD Close
color = 'tab:blue'
y = df['close-BTC-USD']
ax1.set_xlabel('Date')
ax1.set_ylabel('BTC-Close in $', color=color, fontsize=18)
ax1.plot(x, y, color=color)
ax1.tick_params(axis='y', labelcolor=color)
ax1.text(0.02, 0.95, 'BTC-USD',  transform=ax1.transAxes, color=color, fontsize=16)

# Price Chart for ETH-USD Close
color = 'tab:red'
y = df['close-ETH-USD']
ax2 = ax1.twinx()  # instantiate a second axes that shares the same x-axis
ax2.set_ylabel('ETH-Close in $', color=color, fontsize=18)  # we already handled the x-label with ax1
ax2.plot(x, y, color=color)
ax2.tick_params(axis='y', labelcolor=color)
ax2.text(0.02, 0.9, 'ETH-USD',  transform=ax2.transAxes, color=color, fontsize=16)</pre></div>



<figure class="wp-block-image size-full"><img decoding="async" width="1021" height="480" data-attachment-id="11736" data-permalink="https://www.relataly.com/cryptocurrency-price-charts-with-color-overlay-python/2820/image-1-2/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2022/12/image-1.png" data-orig-size="1021,480" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-1" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2022/12/image-1.png" src="https://www.relataly.com/wp-content/uploads/2022/12/image-1.png" alt="Price charts of Bitcoin and Ethereum created in Python" class="wp-image-11736" srcset="https://www.relataly.com/wp-content/uploads/2022/12/image-1.png 1021w, https://www.relataly.com/wp-content/uploads/2022/12/image-1.png 300w, https://www.relataly.com/wp-content/uploads/2022/12/image-1.png 768w" sizes="(max-width: 1021px) 100vw, 1021px" /></figure>



<p class="wp-block-paragraph">Next, we add two indicator values to our dataframe that we can later use to color the price chart. </p>



<h3 class="wp-block-heading" id="h-step-3-calculate-indicator-values">Step #3 Calculate Indicator Values</h3>



<p class="wp-block-paragraph">The color overlay of the price chart is typically used to illustrate the relation between price and another variable, such as a statistical indicator. To demonstrate how this works, we will calculate two indicators and add them to our dataframe:</p>



<h4 class="wp-block-heading" id="h-3-1-the-relative-strength-index">3.1 The Relative Strength Index</h4>



<p class="wp-block-paragraph">The Relative Strength Index (RSI) is a momentum indicator that signals the strength of a price trend. Its values range from 0 to 100. A value above 70 suggests that an asset is likely overbought, meaning the market is highly bullish and the price might decline. A value below 30 is typically a sign of an oversold condition, where the market is extremely bearish and the price tends to reverse to the upside.</p>
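
<p class="wp-block-paragraph">For reference, here is a minimal sketch of how the RSI is conventionally calculated from average gains and losses. Please note the assumptions: the function name <em>wilder_rsi</em> is hypothetical, the smoothing is approximated with an exponential moving average (alpha = 1/period), and this is not the calculation used in the <em>add_indicators</em> function of this tutorial, which works with rolling highs and lows instead.</p>

```python
import numpy as np
import pandas as pd

def wilder_rsi(close, period=14):
    """Conventional RSI sketch: 100 - 100 / (1 + average gain / average loss)."""
    delta = close.diff()
    gains = delta.clip(lower=0)        # positive price changes, else 0
    losses = (-delta).clip(lower=0)    # magnitude of negative changes, else 0
    # Exponential smoothing with alpha = 1/period approximates Wilder's smoothing
    avg_gain = gains.ewm(alpha=1 / period, min_periods=period, adjust=False).mean()
    avg_loss = losses.ewm(alpha=1 / period, min_periods=period, adjust=False).mean()
    rs = avg_gain / avg_loss
    return 100 - 100 / (1 + rs)

# Synthetic series: one that only rises and one that only falls
rising = pd.Series(np.arange(1.0, 31.0))
falling = rising[::-1].reset_index(drop=True)

rsi_rising = wilder_rsi(rising)
rsi_falling = wilder_rsi(falling)
print(rsi_rising.iloc[-1], rsi_falling.iloc[-1])
```

<p class="wp-block-paragraph">A series that only rises ends up at the top of the scale, and a series that only falls ends up at the bottom, which matches the overbought and oversold interpretation described above.</p>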



<h4 class="wp-block-heading" id="h-3-2-the-pearson-correlation-coefficient">3.2 The Pearson Correlation Coefficient</h4>



<p class="wp-block-paragraph">The Pearson correlation coefficient measures the strength of the linear relationship between two variables. Its values range from -1 to 1. A value of 1 implies a perfect positive correlation: whenever the price of BTC rises, the price of ETH rises proportionally. A value of -1 implies a perfect inverse correlation: whenever the price of BTC rises, the price of ETH falls proportionally. A value of 0 implies no linear correlation. To learn more about correlation, check out my article about <a href="https://www.relataly.com/category/data-science/pearson-correlation/" target="_blank" rel="noreferrer noopener">correlation in Python</a>.</p>
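
<p class="wp-block-paragraph">The two extreme cases can be reproduced with a few lines of pandas. The sketch below uses small synthetic series and the built-in rolling correlation of pandas; the variable names and the window size of 4 are chosen for illustration only.</p>

```python
import pandas as pd

x = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# A series that moves proportionally with x and one that moves against it
same_direction = 2 * x + 3
opposite_direction = -0.5 * x + 10

window = 4
corr_pos = x.rolling(window).corr(same_direction)
corr_neg = x.rolling(window).corr(opposite_direction)

# The first window - 1 rows are NaN because the window is not yet filled
print(corr_pos.round(6).tolist())
print(corr_neg.round(6).tolist())
```

<p class="wp-block-paragraph">Real price series will rarely reach these extremes, but the rolling window works the same way.</p>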



<p class="wp-block-paragraph">We embed the logic for calculating the two indicators in a separate function called &#8220;add_indicators.&#8221;</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}">def add_indicators(df):
    # Calculate the rolling Pearson correlation over the previous 30 rows
    cor_period = 30 # rows per window; at frequency = 21600 this covers 30 six-hour intervals
    columntobeadded = [0] * cor_period
    df = df.fillna(0) 
    for i in range(len(df)-cor_period):
        btc = df['close-BTC-USD'][i:i+cor_period]
        eth = df['close-ETH-USD'][i:i+cor_period]
        corr, _ = pearsonr(btc, eth)
        columntobeadded.append(corr)    
    # insert the correlation values into the dataframe
    df.insert(2, &quot;P_Correlation&quot;, columntobeadded, True)

    # Calculate the indicator that is stored in the 'RSI' column
    # Rolling lows and highs over different periods
    df['MA200_low'] = df['low-BTC-USD'].rolling(window=200).min()
    df['MA14_low'] = df['low-BTC-USD'].rolling(window=14).min()
    df['MA200_high'] = df['high-BTC-USD'].rolling(window=200).max()
    df['MA14_high'] = df['high-BTC-USD'].rolling(window=14).max()

    # 'RSI' column: position of the close within the 14-period high-low range, smoothed over 3 periods
    df['K-ratio'] = 100*((df['close-BTC-USD'] - df['MA14_low']) / (df['MA14_high'] - df['MA14_low']) )
    df['RSI'] = df['K-ratio'].rolling(window=3).mean() 

    # Replace nas 
    #nareplace = df.at[df.index.max(), 'close-BTC-USD']    
    df.fillna(0, inplace=True)    
    return df
    
dfcr = add_indicators(df)</pre></div>



<p class="wp-block-paragraph">At this point, we have added the RSI and the Correlation Coefficient to our dataframe. Let&#8217;s quickly visualize the two indicators in a line chart. </p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Visualize measures
fig, ax1 = plt.subplots(figsize=(22, 4), sharex=False)
x = dfcr.index
# Pearson correlation on the left y-axis
ax1.set_ylabel('ETH-BTC Price Correlation', color='black')
ax1.plot(x, dfcr['P_Correlation'], color='black')
ax1.tick_params(axis='y', labelcolor='black')
# RSI on the right y-axis
ax2 = ax1.twinx()
ax2.set_ylabel('RSI', color='blue')
ax2.plot(x, dfcr['RSI'], color='blue')
ax2.tick_params(axis='y', labelcolor='blue')

plt.show()</pre></div>



<figure class="wp-block-image size-large"><img decoding="async" width="1024" height="193" data-attachment-id="2838" data-permalink="https://www.relataly.com/cryptocurrency-price-charts-with-color-overlay-python/2820/image-11-7/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2021/01/image-11.png" data-orig-size="1319,248" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-11" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2021/01/image-11.png" src="https://www.relataly.com/wp-content/uploads/2021/01/image-11-1024x193.png" alt="RSI Cryptocurrency chart analysis created with Python" class="wp-image-2838" srcset="https://www.relataly.com/wp-content/uploads/2021/01/image-11.png 1024w, https://www.relataly.com/wp-content/uploads/2021/01/image-11.png 300w, https://www.relataly.com/wp-content/uploads/2021/01/image-11.png 768w, https://www.relataly.com/wp-content/uploads/2021/01/image-11.png 1319w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<p class="wp-block-paragraph">You may have noticed that the indicators remain at 0 at the beginning of the time series. This is expected: both indicators are calculated over a window of past values, so no values are available until the first window is filled, and we replaced the missing values with 0.</p>
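
<p class="wp-block-paragraph">The following sketch illustrates this warm-up effect on a tiny synthetic series. The values and the window size of 3 are assumptions made for the example.</p>

```python
import pandas as pd

prices = pd.Series([10.0, 11.0, 9.0, 12.0, 13.0])
window = 3

# The rolling minimum is undefined until the window contains 3 values
rolling_low = prices.rolling(window=window).min()
warmup_rows = int(rolling_low.isna().sum())

# Replacing the missing values with 0 mirrors the fillna(0) call in add_indicators
filled = rolling_low.fillna(0)

print(rolling_low.tolist())
print(filled.tolist())
```

<p class="wp-block-paragraph">Keep in mind that the zeros are placeholders and not real indicator values. An alternative would be to drop the warm-up rows before plotting.</p>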



<h3 class="wp-block-heading" id="h-step-4-converting-indicator-values-to-color-codes">Step #4 Converting Indicator Values to Color Codes</h3>



<p class="wp-block-paragraph">Before creating the price charts, we have to color code the indicator values. We normalize the values and then assign a color to each indicator value using a color scale. We attach the colors to our existing dataframe to quickly access them when creating the plots.</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># function that converts a given set of indicator values to colors
def get_colors(ind, colormap):
    colorlist = []
    norm = col.Normalize(vmin=ind.min(), vmax=ind.max())
    for i in ind:
        colorlist.append(list(colormap(norm(i))))
    return colorlist

# convert the RSI                         
y = np.array(dfcr['RSI'])
colormap = plt.get_cmap('plasma')
dfcr['rsi_colors'] = get_colors(y, colormap)

# convert the Pearson Correlation
y = np.array(dfcr['P_Correlation'])
colormap = plt.get_cmap('plasma')
dfcr['cor_colors'] = get_colors(y, colormap)</pre></div>



<p class="wp-block-paragraph">Our dataframe now contains two additional columns with the color values for the two indicators. Now that we have all the data in our dataframe, the next step is creating the price charts.</p>
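
<p class="wp-block-paragraph">As a side note, Matplotlib colormaps also accept whole arrays, so the conversion can be done without a loop. The sketch below shows one possible variant; the function name <em>get_colors_vectorized</em> is hypothetical and the indicator values are synthetic.</p>

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.colors as col
import numpy as np

def get_colors_vectorized(values, cmap_name="plasma"):
    """Map indicator values to RGBA colors in a single call."""
    values = np.asarray(values, dtype=float)
    # Scale the values to the range 0..1 based on their minimum and maximum
    norm = col.Normalize(vmin=values.min(), vmax=values.max())
    colormap = plt.get_cmap(cmap_name)
    # Calling the colormap on an array returns an array of shape (n, 4)
    return colormap(norm(values))

indicator = np.array([0.0, 25.0, 50.0, 75.0, 100.0])
colors = get_colors_vectorized(indicator)
print(colors.shape)
```

<p class="wp-block-paragraph">To store the result in a dataframe column, you could convert it with <em>colors.tolist()</em>, which gives one RGBA list per row.</p>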



<h3 class="wp-block-heading" id="h-step-5-creating-color-coded-price-charts">Step #5 Creating Color-Coded Price Charts</h3>



<p class="wp-block-paragraph">Next, we use the color values to create two different color-coded price charts.</p>



<h4 class="wp-block-heading" id="h-5-1-bitcoin-price-chart-colored-by-rsi">5.1 Bitcoin Price Chart Colored by RSI</h4>



<p class="wp-block-paragraph">We start by coloring the Bitcoin price chart by the RSI. Price points with high RSI values appear in yellow, and price points with low RSI values appear in dark blue. Running the code below will create the color-coded Bitcoin chart.</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Create a Price Chart
pd.plotting.register_matplotlib_converters()
fig, ax1 = plt.subplots(figsize=(18, 10), sharex=False)
x = dfcr.index
y = dfcr['close-BTC-USD']
z = dfcr['rsi_colors']

# draw points
for i in range(len(dfcr)):
    ax1.plot(x[i], y.iloc[i], 'o',  color=z.iloc[i], alpha = 0.5, markersize=5)
ax1.set_ylabel('BTC-Close in $')
ax1.tick_params(axis='y', labelcolor='black')
ax1.set_xlabel('Date')
ax1.text(0.02, 0.95, 'BTC-USD - Colored by RSI',  transform=ax1.transAxes, fontsize=16)

# plot the color bar (the 0-100 scale approximates the min-max normalization in get_colors)
sm = plt.cm.ScalarMappable(cmap='plasma', norm=col.Normalize(vmin=0, vmax=100))
cb = fig.colorbar(sm, ax=ax1)</pre></div>



<figure class="wp-block-image size-large"><img decoding="async" width="1024" height="682" data-attachment-id="9534" data-permalink="https://www.relataly.com/cryptocurrency-price-charts-with-color-overlay-python/2820/image-35/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2022/09/image.png" data-orig-size="1073,715" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2022/09/image.png" src="https://www.relataly.com/wp-content/uploads/2022/09/image-1024x682.png" alt="color-coded bitcoin chart with halving dates; seaborn, python" class="wp-image-9534" srcset="https://www.relataly.com/wp-content/uploads/2022/09/image.png 1024w, https://www.relataly.com/wp-content/uploads/2022/09/image.png 300w, https://www.relataly.com/wp-content/uploads/2022/09/image.png 768w, https://www.relataly.com/wp-content/uploads/2022/09/image.png 1073w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<p class="wp-block-paragraph">From the color overlay in the chart, we can tell that the RSI is mainly low (dark blue) when the Bitcoin price has seen a substantial decline and high (yellow) when the price has risen. </p>
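
<p class="wp-block-paragraph">Drawing every point in a loop can become slow for long time series. As an alternative, the sketch below passes the raw indicator values to a single scatter call and lets Matplotlib handle the color mapping and the color bar. It runs on synthetic data, and all names and values in it are assumptions for illustration.</p>

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in data: a rising price and an indicator between 0 and 100
dates = pd.date_range("2021-01-01", periods=50, freq=pd.Timedelta(hours=6))
price = pd.Series(np.linspace(100.0, 200.0, 50), index=dates)
indicator = pd.Series(np.linspace(0.0, 100.0, 50), index=dates)

fig, ax = plt.subplots(figsize=(10, 5))
# c receives the indicator values; cmap, vmin, and vmax define the color scale
points = ax.scatter(
    price.index.to_pydatetime(), price.values,
    c=indicator.values, cmap="plasma", vmin=0, vmax=100, s=25, alpha=0.5,
)
ax.set_xlabel("Date")
ax.set_ylabel("Price in $")

# The color bar is derived from the scatter plot, so both use the same scale
colorbar = fig.colorbar(points, ax=ax)
colorbar.set_label("Indicator value")
```

<p class="wp-block-paragraph">With this approach, the separate color columns are not needed, because the color scale is defined directly in the plot call.</p>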






<h4 class="wp-block-heading" id="h-5-2-bitcoin-price-chart-colored-by-btc-eth-correlation">5.2 Bitcoin Price Chart Colored by BTC-ETH Correlation</h4>



<p class="wp-block-paragraph">In this section, we will create another price chart for Bitcoin. This time, we color the chart by the strength of the correlation between Bitcoin and Ethereum. Light-colored points signal phases of strong positive correlation. Points colored in dark blue indicate phases where the correlation between the price movements of the two cryptocurrencies was weak or negative.</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># create a price chart
pd.plotting.register_matplotlib_converters()
fig, ax1 = plt.subplots(figsize=(18, 10), sharex=False)
x = dfcr.index # datetime index
y = dfcr['close-BTC-USD'] # the price variable
z = dfcr['cor_colors'] # the color coded indicator values

# draw points
for i in range(len(dfcr)):
    ax1.plot(x[i], y.iloc[i], 'o',  color=z.iloc[i], alpha = 0.5, markersize=5)
ax1.set_ylabel('BTC-Close in $')
ax1.tick_params(axis='y', labelcolor='black')
ax1.set_xlabel('Date')
ax1.text(0.02, 0.95, 'BTC-USD - Colored by 30-period ETH-BTC Correlation',  transform=ax1.transAxes, fontsize=16)

# plot the color bar (same colormap as in get_colors; the -1 to 1 scale approximates its min-max normalization)
sm = plt.cm.ScalarMappable(cmap='plasma', norm=col.Normalize(vmin=-1, vmax=1))
cb = fig.colorbar(sm, ax=ax1)</pre></div>



<figure class="wp-block-image size-full"><img decoding="async" width="985" height="606" data-attachment-id="9537" data-permalink="https://www.relataly.com/cryptocurrency-price-charts-with-color-overlay-python/2820/image-2-5/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2022/09/image-2.png" data-orig-size="985,606" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-2" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2022/09/image-2.png" src="https://www.relataly.com/wp-content/uploads/2022/09/image-2.png" alt="line plot of bitcoin prices color coded by Bitcoin Ethereum correlation in Python" class="wp-image-9537" srcset="https://www.relataly.com/wp-content/uploads/2022/09/image-2.png 985w, https://www.relataly.com/wp-content/uploads/2022/09/image-2.png 300w, https://www.relataly.com/wp-content/uploads/2022/09/image-2.png 768w" sizes="(max-width: 985px) 100vw, 985px" /></figure>



<p class="wp-block-paragraph">The chart shows that the correlation between Bitcoin and Ethereum was strong (yellow) when the price of Bitcoin rose. So when Bitcoin is in a bull market, Ethereum tends to move in the same direction. In contrast, the correlation was weak (dark blue) when the Bitcoin price declined.</p>



<h2 class="wp-block-heading" id="h-summary">Summary</h2>



<p class="wp-block-paragraph">In this article, we demonstrated how to use Python and Matplotlib to create a price chart that incorporates color as a third dimension. We used the Bitcoin price as an example and created two color-coded charts: one that highlights the RSI, and another that highlights the Pearson Correlation between Bitcoin and Ethereum.</p>



<p class="wp-block-paragraph">By using color as an overlay, it is possible to highlight many interesting relationships in time-series data. A well-known example from the cryptocurrency world is the Bitcoin Rainbow Chart. This technique can be used to bring attention to various trends and patterns in the data.</p>



<p class="wp-block-paragraph">I hope this article has helped you get started with color-coded charts in Python. I always appreciate feedback from my audience, so let me know if you liked this content. If you have any questions, please post them in the comments.</p>



<h2 class="wp-block-heading">Sources and Further Reading</h2>






<p class="wp-block-paragraph">And if you are interested in stock-market prediction, check out the following articles:</p>



<ul class="wp-block-list">
<li><a href="https://www.relataly.com/stock-market-prediction-using-multivariate-time-series-in-python/1815/" target="_blank" rel="noreferrer noopener">Stock Market Prediction using Multivariate Time Series and Recurrent Neural Networks in Python</a></li>



<li><a href="https://www.relataly.com/time-series-forecasting-multi-step-regression-using-neural-networks-with-multiple-outputs-in-python/5800/" target="_blank" rel="noreferrer noopener">Stock-Market prediction using Neural Networks for Multi-Output Regression in Python</a></li>



<li><a href="https://www.relataly.com/stock-market-prediction-using-a-recurrent-neural-network/122/" target="_blank" rel="noreferrer noopener">Stock Market Prediction using Univariate Time Series Models based on Recurrent Neural Networks with Python</a></li>
</ul>
<p>The post <a href="https://www.relataly.com/cryptocurrency-price-charts-with-color-overlay-python/2820/">Color-Coded Cryptocurrency Price Charts in Python</a> appeared first on <a href="https://www.relataly.com">relataly.com</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.relataly.com/cryptocurrency-price-charts-with-color-overlay-python/2820/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">2820</post-id>	</item>
		<item>
		<title>Streaming Tweets and Images via the Twitter API in Python</title>
		<link>https://www.relataly.com/accessing-twitter-data-via-the-twitter-rest-api/1976/</link>
					<comments>https://www.relataly.com/accessing-twitter-data-via-the-twitter-rest-api/1976/#comments</comments>
		
		<dc:creator><![CDATA[Florian Follonier]]></dc:creator>
		<pubDate>Sun, 03 Jan 2021 21:46:47 +0000</pubDate>
				<category><![CDATA[Data Science]]></category>
		<category><![CDATA[Data Sources]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Twitter API]]></category>
		<category><![CDATA[API Tutorials]]></category>
		<category><![CDATA[Beginner Tutorials]]></category>
		<category><![CDATA[Requesting Data via REST APIs]]></category>
		<category><![CDATA[Social Media Data]]></category>
		<guid isPermaLink="false">https://www.relataly.com/?p=1976</guid>

					<description><![CDATA[<p>Twitter is a rich source of data that can be used to understand current and future trends. Because tweets often include hashtags, they can be easily linked to specific contexts such as political discussions or financial instruments. This makes Twitter a valuable tool for collecting and analyzing data. In this article, we&#8217;ll demonstrate how to ... <a title="Streaming Tweets and Images via the Twitter API in Python" class="read-more" href="https://www.relataly.com/accessing-twitter-data-via-the-twitter-rest-api/1976/" aria-label="Read more about Streaming Tweets and Images via the Twitter API in Python">Read more</a></p>
<p>The post <a href="https://www.relataly.com/accessing-twitter-data-via-the-twitter-rest-api/1976/">Streaming Tweets and Images via the Twitter API in Python</a> appeared first on <a href="https://www.relataly.com">relataly.com</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">Twitter is a rich source of data that can be used to understand current and future trends. Because tweets often include hashtags, they can be easily linked to specific contexts such as political discussions or financial instruments. This makes Twitter a valuable tool for collecting and analyzing data. In this article, we&#8217;ll demonstrate how to use Python to access Twitter data via the <a href="https://developer.twitter.com/en/docs/twitter-api" target="_blank" rel="noreferrer noopener">Twitter API v2</a>. We&#8217;ll show how to extract tweets, process them, and use them to gain insights and make predictions. Whether you&#8217;re a data scientist, a business analyst, or a social media enthusiast, this tutorial will provide you with the tools you need to work with Twitter data in Python.</p>



<p class="wp-block-paragraph">This article shows two specific cases:</p>



<ul class="wp-block-list">
<li>Example A: Streaming Tweets and Storing the Data in a DataFrame</li>



<li>Example B: Streaming Images for a specific channel and storing them in a local directory</li>
</ul>



<p class="wp-block-paragraph">If you are new to APIs, consider first familiarizing yourself with the <a href="https://www.relataly.com/access-data-sources-using-apis/278/" target="_blank" rel="noreferrer noopener">basics of REST APIs</a>.</p>



<p class="wp-block-paragraph">The rest of this article is structured as follows: First, we&#8217;ll look at the basics of the Twitter API and its object model. We will then sign up for the Twitter API, obtain an authentication token, and use this token in our requests. Finally, we will turn to the two examples, A &amp; B.</p>



<p class="wp-block-paragraph">Also: <a href="https://www.relataly.com/access-remote-data-sources-using-rest-apis-in-python/278/" target="_blank" rel="noreferrer noopener">Accessing Remote Data Sources via REST APIs in Python</a></p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%">
<figure class="wp-block-image size-large is-resized"><img decoding="async" data-attachment-id="3831" data-permalink="https://www.relataly.com/accessing-twitter-data-via-the-twitter-rest-api/1976/image-119/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2021/05/image.png" data-orig-size="1576,1132" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2021/05/image.png" src="https://www.relataly.com/wp-content/uploads/2021/05/image-1024x736.png" alt="twitter api python" class="wp-image-3831" width="353" height="254" srcset="https://www.relataly.com/wp-content/uploads/2021/05/image.png 1024w, https://www.relataly.com/wp-content/uploads/2021/05/image.png 300w, https://www.relataly.com/wp-content/uploads/2021/05/image.png 768w, https://www.relataly.com/wp-content/uploads/2021/05/image.png 1536w, https://www.relataly.com/wp-content/uploads/2021/05/image.png 1576w" sizes="(max-width: 353px) 100vw, 353px" /><figcaption class="wp-element-caption">Twitter data is a playground for data scientists.</figcaption></figure>
</div>
</div>



<div style="height:33px" aria-hidden="true" class="wp-block-spacer"></div>



<h2 class="wp-block-heading" id="h-basics-of-the-twitter-api">Basics of the Twitter API</h2>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">Twitter data has a variety of applications. For example, we can analyze tweets to discover trends or evaluate sentiment on a topic. Furthermore, images embedded in tweets and hashtags can train image recognition models or validate them. Thus, knowing how to obtain data via the Twitter API can be helpful if you are doing data science.</p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%"></div>
</div>



<h3 class="wp-block-heading" id="h-api-versions">API Versions</h3>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">Twitter provides two different API versions: v1.1 and v2. The two versions have their own documentation and are incompatible. While the API v1.1 is still more established, Twitter API v2 offers more options for fetching data from Twitter. For example, it allows tailoring the fields returned in the response, which can be helpful if the goal is to minimize traffic. The Twitter API v2 is currently in early access mode, but it will sooner or later become the new standard. Therefore, I decided to base this tutorial on the newer v2 version.</p>


<div class="wp-block-image">
<figure class="aligncenter size-large is-resized"><img decoding="async" data-attachment-id="2803" data-permalink="https://www.relataly.com/accessing-twitter-data-via-the-twitter-rest-api/1976/image-2-10/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2021/01/image-2.png" data-orig-size="1131,821" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-2" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2021/01/image-2.png" src="https://www.relataly.com/wp-content/uploads/2021/01/image-2-1024x743.png" alt="Twitter API specification" class="wp-image-2803" width="662" height="478"/><figcaption class="wp-element-caption">Overview of Twitter APIs (Source: Twitter)</figcaption></figure>
</div></div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%"></div>
</div>



<h3 class="wp-block-heading" id="h-twitter-api-documentation">Twitter API Documentation</h3>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">When working with the Twitter <strong><a href="https://developer.twitter.com/en/docs/twitter-api/early-access" target="_blank" rel="noreferrer noopener">API v2</a></strong>, it is vital to understand the Twitter object model. According to the API documentation, the basic building block of Twitter is the Tweet object. It has various fields attached, such as the tweet text, created_at, and tweet id. The Twitter API documentation provides a complete list of these root-level fields. The default API response does not include most of these fields. If we want to retrieve additional fields, we need to name them in the request, for example, via the tweet.fields parameter. The Tweet object also acts as the parent of four sub-objects:</p>



<ul class="wp-block-list">
<li><a href="https://developer.twitter.com/en/docs/twitter-api/data-dictionary/object-model/user" target="_blank" rel="noreferrer noopener">User object</a></li>



<li><a href="https://developer.twitter.com/en/docs/twitter-api/data-dictionary/object-model/media" target="_blank" rel="noreferrer noopener">Media object</a></li>



<li><a href="https://developer.twitter.com/en/docs/twitter-api/data-dictionary/object-model/poll" target="_blank" rel="noreferrer noopener">Poll object</a></li>



<li><a href="https://developer.twitter.com/en/docs/twitter-api/data-dictionary/object-model/place" target="_blank" rel="noreferrer noopener">Place object</a></li>
</ul>



<p class="wp-block-paragraph">Each of these objects, in turn, has multiple fields. As with the Tweet object, we specify in the request which of them the API should return. This article uses the Tweet object and the Media object, which contains all the media (e.g., images or videos) attached to a tweet.</p>
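<p class="wp-block-paragraph">To make the field selection more tangible, here is a minimal sketch of how such a request query could be assembled. The helper name build_field_query is made up for illustration; the parameter names tweet.fields, expansions, and media.fields are taken from the Twitter API v2, and the chosen field values are only examples.</p>

```python
from urllib.parse import urlencode


def build_field_query(tweet_fields, expansions=None, media_fields=None):
    """Return a query string that asks the API for additional fields.

    Hypothetical helper for illustration; it is not part of any Twitter library.
    """
    params = {"tweet.fields": ",".join(tweet_fields)}
    if expansions:
        params["expansions"] = ",".join(expansions)
    if media_fields:
        params["media.fields"] = ",".join(media_fields)
    # keep the commas readable; urlencode would otherwise escape them
    return "?" + urlencode(params, safe=",")


# request a few tweet fields plus the attached media objects
query = build_field_query(
    tweet_fields=["author_id", "created_at", "public_metrics"],
    expansions=["attachments.media_keys"],
    media_fields=["url", "type"],
)
print(query)
```

<p class="wp-block-paragraph">The resulting string can be appended to the endpoint URL. Requesting only the fields you need keeps the responses small.</p>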
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%"></div>
</div>



<h4 class="wp-block-heading" id="h-functioning-of-the-recent-search-endpoint">Functioning of the Recent Search Endpoint</h4>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">In this tutorial, we will be working with the <a href="https://developer.twitter.com/en/docs/twitter-api/tweets/search/introduction" target="_blank" rel="noreferrer noopener">Twitter Recent Search Endpoint</a> in its streaming variant, which the code in this article calls under the path /2/tweets/search/stream. There are also other API endpoints, but covering all of them would go beyond the scope of this article. One notable feature of the streaming endpoint is that we can&#8217;t simply pass a query with each GET request. Instead, we first send a POST request to the API that specifies, in the form of rules, which tweets we want to fetch. To change these rules, we first have to delete the existing ones with a POST request and then pass the new ruleset to the API with another POST request. This procedure may sound complicated, but it gives the user more control over the API.</p>
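<p class="wp-block-paragraph">The delete-then-add procedure can be sketched with two small helpers that only build the request bodies. The function names and the sample rule are assumptions made for illustration; the payload shapes with the keys add and delete follow the functions used later in this article.</p>

```python
def build_delete_payload(current_rules):
    """Build the POST body that deletes every rule currently registered."""
    if not current_rules or "data" not in current_rules:
        return None  # nothing to delete
    ids = [rule["id"] for rule in current_rules["data"]]
    return {"delete": {"ids": ids}}


def build_add_payload(new_rules):
    """Build the POST body that registers a new ruleset."""
    return {"add": new_rules}


# made-up example of what the rules endpoint might report
current = {"data": [{"id": "111", "value": "dog has:images", "tag": "dog pictures"}]}

delete_payload = build_delete_payload(current)
add_payload = build_add_payload([{"value": "cat has:images", "tag": "cat pictures"}])
print(delete_payload)
print(add_payload)
```

<p class="wp-block-paragraph">Both payloads would be sent to the same rules URL, first the delete payload and then the add payload.</p>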
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%"></div>
</div>






<h4 class="wp-block-heading" id="h-different-api-models">Different API Models</h4>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">We can use the &#8220;Recent Search Endpoint&#8221; in batch and streaming modes. In batch mode, the endpoint returns a list of tweets once. If the stream option is enabled, the API returns a continuous flow of individual tweets, plus any new tweets as they are published to Twitter. In this way, we can stream and process tweets in (almost) real-time. In this tutorial, we will work with the streaming option enabled.</p>
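<p class="wp-block-paragraph">The difference between the two modes can be illustrated without any network access. The following sketch uses made-up response data and hypothetical helper names (read_batch, read_stream); it only assumes that a batch response carries a list of tweets under the key data, while a stream delivers one JSON document per line, interleaved with empty keep-alive lines.</p>

```python
import json


def read_batch(response_body):
    """Batch mode: a single response that contains a list of tweets."""
    return json.loads(response_body)["data"]


def read_stream(lines, max_results):
    """Stream mode: yield one tweet per line until max_results is reached."""
    count = 0
    for line in lines:
        if not line:  # empty keep-alive line, nothing to parse
            continue
        yield json.loads(line)["data"]
        count += 1
        if count == max_results:
            break


# made-up sample data standing in for real API responses
batch_body = '{"data": [{"id": "1", "text": "a"}, {"id": "2", "text": "b"}]}'
stream_lines = [
    b'{"data": {"id": "1", "text": "a"}}',
    b"",
    b'{"data": {"id": "2", "text": "b"}}',
]

batch_tweets = read_batch(batch_body)
stream_tweets = list(read_stream(stream_lines, max_results=2))
print(batch_tweets)
print(stream_tweets)
```

<p class="wp-block-paragraph">Because read_stream is a generator, each tweet can be processed as soon as it arrives instead of waiting for a complete list.</p>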
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%"><div class="wp-block-image">
<figure class="aligncenter size-large is-resized"><img decoding="async" data-attachment-id="2812" data-permalink="https://www.relataly.com/accessing-twitter-data-via-the-twitter-rest-api/1976/image-5-8/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2021/01/image-5.png" data-orig-size="434,445" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-5" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2021/01/image-5.png" src="https://www.relataly.com/wp-content/uploads/2021/01/image-5.png" alt="" class="wp-image-2812" width="266" height="272" srcset="https://www.relataly.com/wp-content/uploads/2021/01/image-5.png 434w, https://www.relataly.com/wp-content/uploads/2021/01/image-5.png 293w" sizes="(max-width: 266px) 100vw, 266px" /><figcaption class="wp-element-caption">Stream Mode vs. Batch Mode</figcaption></figure>
</div></div>
</div>



<h4 class="wp-block-heading">Filters</h4>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">We can limit the tweets and fields that the API includes in the response by specifying parameters. For example, we can let the API know that we want to retrieve tweets with specific keywords or in a certain period or only those tweets with images attached. The <a href="https://developer.twitter.com/en/docs/twitter-api/tweets/filtered-stream/integrate/build-a-rule" target="_blank" rel="noreferrer noopener">API documentation</a> provides a list of all filter parameters.</p>
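<p class="wp-block-paragraph">As a rough illustration, a filter rule can be composed from individual operators. The helper build_rule and its parameters are hypothetical; the operators has:images, lang: and the minus sign for excluding keywords are part of the Twitter rule syntax.</p>

```python
def build_rule(keyword, tag, has_images=False, lang=None, exclude=()):
    """Compose a rule dictionary from a keyword and optional operators.

    Hypothetical helper for illustration only.
    """
    parts = [keyword]
    if has_images:
        parts.append("has:images")  # only tweets with images attached
    if lang:
        parts.append(f"lang:{lang}")  # only tweets in the given language
    parts.extend(f"-{word}" for word in exclude)  # negated keywords
    return {"value": " ".join(parts), "tag": tag}


rule = build_rule("cat", "cat pictures", has_images=True, lang="en", exclude=["grumpy"])
print(rule)
```

<p class="wp-block-paragraph">The tag is a free-text label that helps to tell apart which rule a streamed tweet matched.</p>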
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%"></div>
</div>



<h2 class="wp-block-heading" id="h-twitter-search-api-python-examples">Twitter Search-API Python Examples</h2>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">To stream tweets from Twitter, you will need to use the Twitter API. The API allows developers to access Twitter&#8217;s data and functionality, including the ability to stream real-time tweets. In order to stream tweets, you will need to sign up for a Twitter developer account and obtain the necessary credentials, such as a consumer key and access token. Once you have these credentials, you can use them to authenticate your API requests and access the streaming endpoint for tweets.</p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%"></div>
</div>



<h3 class="wp-block-heading" id="h-setup-a-twitter-developer-account">Set Up a Twitter Developer Account</h3>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">Using the Twitter API requires you to have your own Twitter developer account. If you don&#8217;t have an account yet, you need to create one on the Twitter <a href="https://developer.twitter.com/" target="_blank" rel="noreferrer noopener">developer page</a>. As of Jan 2021, the standard developer account is free and comes with a limit of 500,000 tweets that you can fetch per month.</p>



<p class="wp-block-paragraph">After logging into your developer account, go to the <a href="https://developer.twitter.com/en/portal/dashboard" target="_blank" rel="noreferrer noopener">developer dashboard page</a> and create a new project with a name of your choice. Once you have created a project, it will be shown in the &#8220;projects dashboard,&#8221; along with an overview of your monthly tweet usage. In the next section, you will retrieve your API key from the project.</p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%"></div>
</div>



<h4 class="wp-block-heading" id="h-obtaining-your-twitter-api-security-key">Obtaining your Twitter API Security Key</h4>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">The Twitter API accepts our requests only if we provide a personal Bearer token for authentication. Each project has its own Bearer token. You can find the Bearer token in the Developer Portal under the Authentication Token section. Copy the token and keep it at hand for the moment. In the next step, we will store it in a secure location.</p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%"><div class="wp-block-image">
<figure class="aligncenter size-large is-resized"><img decoding="async" data-attachment-id="2768" data-permalink="https://www.relataly.com/accessing-twitter-data-via-the-twitter-rest-api/1976/image-1-10/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2021/01/image-1.png" data-orig-size="1752,919" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-1" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2021/01/image-1.png" src="https://www.relataly.com/wp-content/uploads/2021/01/image-1-1024x537.png" alt="Twitter API Authentication Tokens" class="wp-image-2768" width="308" height="161" srcset="https://www.relataly.com/wp-content/uploads/2021/01/image-1.png 1024w, https://www.relataly.com/wp-content/uploads/2021/01/image-1.png 300w, https://www.relataly.com/wp-content/uploads/2021/01/image-1.png 768w, https://www.relataly.com/wp-content/uploads/2021/01/image-1.png 1536w, https://www.relataly.com/wp-content/uploads/2021/01/image-1.png 1752w" sizes="(max-width: 308px) 100vw, 308px" /><figcaption class="wp-element-caption">Twitter API Authentication Tokens</figcaption></figure>
</div></div>
</div>



<h4 class="wp-block-heading" id="h-storing-and-loading-api-tokens">Storing and Loading API Tokens</h4>



<p class="wp-block-paragraph">The Twitter API requires the user to authenticate during use by providing a secret token. It is best not to store these keys in your project but to put them separately in a safe place. In a production environment, you would, of course, want to encrypt the keys. However, for our test case, it should be sufficient to store the key in a separate Python file.</p>



<p class="wp-block-paragraph">Create a new Python file called &#8220;twitter_secrets.py&#8221; and fill in the following code. Then replace the Bearer_Key with the key you retrieved from the Twitter Developer portal in the previous step.</p>






<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}">&quot;&quot;&quot;Replace the values below with your own Twitter API Tokens&quot;&quot;&quot;

# Twitter Bearer Token
BEARER_KEY = &quot;your own BEARER KEY&quot;


class TwitterSecrets:
    &quot;&quot;&quot;Class that holds Twitter Secrets&quot;&quot;&quot;

    def __init__(self):
        self.BEARER_KEY = BEARER_KEY
        
        # Tests if keys are present
        for key, secret in self.__dict__.items():
            assert secret != &quot;&quot;, f&quot;Please provide a valid secret for: {key}&quot;

twitter_secrets = TwitterSecrets()</pre></div>






<p class="wp-block-paragraph">The file twitter_secrets.py has to be placed in the package library of your Python environment. If you use Anaconda under Windows, the path is typically: &lt;user&gt;\anaconda3\Lib. Once you have placed the file in your Python library, you can import it into your Python project and use the Bearer token from the import, as shown below:</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># imports the twitter_secrets python file in which we store the twitter API keys
from twitter_secrets import twitter_secrets as ts

bearer_token = ts.BEARER_KEY</pre></div>
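<p class="wp-block-paragraph">As an alternative to a separate Python file, the token could also be read from an environment variable. The following is only a sketch under assumptions: the variable name TWITTER_BEARER_TOKEN and the helper load_bearer_token are made up for illustration and are not prescribed by Twitter.</p>

```python
import os


def load_bearer_token(env_var="TWITTER_BEARER_TOKEN"):
    """Read the Bearer token from an environment variable (name is an assumption)."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Please set the environment variable {env_var}")
    return token


def create_headers(bearer_token):
    """Build the authorization header that is sent with each request."""
    return {"Authorization": f"Bearer {bearer_token}"}
```

<p class="wp-block-paragraph">After setting the variable in your shell, calling create_headers(load_bearer_token()) returns the header dictionary for the requests. This way, the token never ends up in a file that could be shared by accident.</p>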



<h3 class="wp-block-heading" id="h-prerequisites">Prerequisites</h3>



<p class="wp-block-paragraph">Before starting the coding part, make sure that you have set up your Python 3 environment and required packages. If you don&#8217;t have an environment set up yet, you can follow&nbsp;this tutorial&nbsp;to set up the&nbsp;<a href="https://www.anaconda.com/products/individual" target="_blank" rel="noreferrer noopener">Anaconda environment</a>.</p>



<p class="wp-block-paragraph">Also, make sure you install all required packages. In this tutorial, we will be working with the following standard packages:&nbsp;</p>



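<ul class="wp-block-list">
<li><em><a href="https://pandas.pydata.org/" target="_blank" rel="noreferrer noopener">pandas</a></em></li>
<li><em><a href="https://requests.readthedocs.io/" target="_blank" rel="noreferrer noopener">requests</a></em></li>
</ul>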



<p class="wp-block-paragraph">You can install packages using console commands:</p>



<ul class="wp-block-list">
<li><em>pip install &lt;package name&gt;</em></li>



<li><em>conda install &lt;package name&gt;</em>&nbsp;(if you are using the Anaconda package manager)</li>
</ul>



<h3 class="wp-block-heading" id="h-example-a-streaming-tweets-via-the-twitter-recent-search-endpoint">Example A: Streaming Tweets via the Twitter Recent Search Endpoint</h3>



<p class="wp-block-paragraph">In the first use case, we will define some simple filter rules and then request tweets from the API based on these rules. As a response, the API returns a stream of tweets, which we will process further. We store the text of each tweet, along with further tweet information, in a DataFrame.</p>



<p class="wp-block-paragraph">We won&#8217;t detail all the code components, but we will go through the most important functions with inline code. The code is available on the GitHub repository.</p>



<div class="wp-block-kadence-advancedbtn kb-buttons-wrap kb-btns_ea7ba1-49"><a class="kb-button kt-button button kb-btn_37eb41-5b kt-btn-size-standard kt-btn-width-type-full kb-btn-global-inherit kt-btn-has-text-true kt-btn-has-svg-true wp-block-button__link wp-block-kadence-singlebtn" href="https://github.com/flo7up/relataly-public-python-API-tutorials/blob/main/126%20Getting%20Real-Time%20Price%20Data%20via%20the%20Gate.io%20API.ipynb" target="_blank" rel="noreferrer noopener"><span class="kb-svg-icon-wrap kb-svg-icon-fe_eye kt-btn-icon-side-left"><svg viewBox="0 0 24 24"  fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"  aria-hidden="true"><path d="M1 12s4-8 11-8 11 8 11 8-4 8-11 8-11-8-11-8z"/><circle cx="12" cy="12" r="3"/></svg></span><span class="kt-btn-inner-text">View on GitHub </span></a>

<a class="kb-button kt-button button kb-btn_c92de6-61 kt-btn-size-standard kt-btn-width-type-full kb-btn-global-inherit kt-btn-has-text-true kt-btn-has-svg-true wp-block-button__link wp-block-kadence-singlebtn" href="https://github.com/flo7up/relataly-public-python-API-tutorials" target="_blank" rel="noreferrer noopener"><span class="kb-svg-icon-wrap kb-svg-icon-fa_github kt-btn-icon-side-left"><svg viewBox="0 0 496 512"  fill="currentColor" xmlns="http://www.w3.org/2000/svg"  aria-hidden="true"><path d="M165.9 397.4c0 2-2.3 3.6-5.2 3.6-3.3.3-5.6-1.3-5.6-3.6 0-2 2.3-3.6 5.2-3.6 3-.3 5.6 1.3 5.6 3.6zm-31.1-4.5c-.7 2 1.3 4.3 4.3 4.9 2.6 1 5.6 0 6.2-2s-1.3-4.3-4.3-5.2c-2.6-.7-5.5.3-6.2 2.3zm44.2-1.7c-2.9.7-4.9 2.6-4.6 4.9.3 2 2.9 3.3 5.9 2.6 2.9-.7 4.9-2.6 4.6-4.6-.3-1.9-3-3.2-5.9-2.9zM244.8 8C106.1 8 0 113.3 0 252c0 110.9 69.8 205.8 169.5 239.2 12.8 2.3 17.3-5.6 17.3-12.1 0-6.2-.3-40.4-.3-61.4 0 0-70 15-84.7-29.8 0 0-11.4-29.1-27.8-36.6 0 0-22.9-15.7 1.6-15.4 0 0 24.9 2 38.6 25.8 21.9 38.6 58.6 27.5 72.9 20.9 2.3-16 8.8-27.1 16-33.7-55.9-6.2-112.3-14.3-112.3-110.5 0-27.5 7.6-41.3 23.6-58.9-2.6-6.5-11.1-33.3 2.6-67.9 20.9-6.5 69 27 69 27 20-5.6 41.5-8.5 62.8-8.5s42.8 2.9 62.8 8.5c0 0 48.1-33.6 69-27 13.7 34.7 5.2 61.4 2.6 67.9 16 17.7 25.8 31.5 25.8 58.9 0 96.5-58.9 104.2-114.8 110.5 9.2 7.9 17 22.9 17 46.4 0 33.7-.3 75.4-.3 83.6 0 6.5 4.6 14.4 17.3 12.1C428.2 457.8 496 362.9 496 252 496 113.3 383.5 8 244.8 8zM97.2 352.9c-1.3 1-1 3.3.7 5.2 1.6 1.6 3.9 2.3 5.2 1 1.3-1 1-3.3-.7-5.2-1.6-1.6-3.9-2.3-5.2-1zm-10.8-8.1c-.7 1.3.3 2.9 2.3 3.9 1.6 1 3.6.7 4.3-.7.7-1.3-.3-2.9-2.3-3.9-2-.6-3.6-.3-4.3.7zm32.4 35.6c-1.6 1.3-1 4.3 1.3 6.2 2.3 2.3 5.2 2.6 6.5 1 1.3-1.3.7-4.3-1.3-6.2-2.2-2.3-5.2-2.6-6.5-1zm-11.4-14.7c-1.6 1-1.6 3.6 0 5.9 1.6 2.3 4.3 3.3 5.6 2.3 1.6-1.3 1.6-3.9 0-6.2-1.4-2.3-4-3.3-5.6-2z"/></svg></span><span class="kt-btn-inner-text">Relataly GitHub Repo </span></a></div>



<h4 class="wp-block-heading" id="h-step-1-define-functions-to-interact-with-the-twitter-api">Step #1: Define Functions to Interact with the Twitter API</h4>



<p class="wp-block-paragraph">We begin by defining functions to interact with the Twitter API. </p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}">import requests 
import json 
import pandas as pd

# imports the twitter_secrets python file in which we store the twitter API keys
from twitter_secrets import twitter_secrets as ts

# a function that provides a bearer token to the API
def create_headers(bearer_token):
    headers = {&quot;Authorization&quot;: &quot;Bearer {}&quot;.format(bearer_token)}
    return headers

# this function defines the rules on what tweets to pull    
def set_rules(headers, delete, bearer_token, rules):
    payload = {&quot;add&quot;: rules}
    response = requests.post(
        &quot;https://api.twitter.com/2/tweets/search/stream/rules&quot;,
        headers=headers,
        json=payload,
    )
    if response.status_code != 201:
        raise Exception(
            &quot;Cannot add rules (HTTP {}): {}&quot;.format(response.status_code, response.text)
        )
    print(json.dumps(response.json()))
    
# this function requests the current rules in place
def get_rules(headers, bearer_token):
    response = requests.get(
        &quot;https://api.twitter.com/2/tweets/search/stream/rules&quot;, headers=headers
    )
    if response.status_code != 200:
        raise Exception(
            &quot;Cannot get rules (HTTP {}): {}&quot;.format(response.status_code, response.text)
        )
    print(json.dumps(response.json()))
    return response.json()

# this function resets all rules
def delete_all_rules(headers, bearer_token, rules):
    if rules is None or &quot;data&quot; not in rules:
        return None

    ids = list(map(lambda rule: rule[&quot;id&quot;], rules[&quot;data&quot;]))
    payload = {&quot;delete&quot;: {&quot;ids&quot;: ids}}
    response = requests.post(
        &quot;https://api.twitter.com/2/tweets/search/stream/rules&quot;,
        headers=headers,
        json=payload
    )
    if response.status_code != 200:
        raise Exception(
            &quot;Cannot delete rules (HTTP {}): {}&quot;.format(
                response.status_code, response.text
            )
        )
    print(json.dumps(response.json()))

# this function starts the stream
def get_stream(headers, set, bearer_token, expansions, fields, save_to_disk, save_path):
    data = []
    response = requests.get(
        &quot;https://api.twitter.com/2/tweets/search/stream&quot; + expansions + fields, headers=headers, stream=True,
    )
    print(response.status_code)
    if response.status_code != 200:
        raise Exception(
            &quot;Cannot get stream (HTTP {}): {}&quot;.format(
                response.status_code, response.text
            )
        )
    i = 0
    for response_line in response.iter_lines():
        # the stream sends empty keep-alive lines, which we skip
        if not response_line:
            continue
        i += 1
        try:
            json_response = json.loads(response_line)
            #print(json.dumps(json_response, indent=4, sort_keys=True))
            save_tweets(json_response)
            if save_to_disk:
                # requires a save_media_to_disk function, which this example does not define
                save_media_to_disk(json_response, save_path)
        except (json.JSONDecodeError, KeyError) as err:
            # if a line fails to decode or lacks an expected field, we skip this tweet
            print(f&quot;{i}/{max_results}: ERROR: encountered a problem with a line of data: {err!r}\n&quot;)
        # stop once max_results lines have been processed
        if i == max_results:
            break

# this function appends the relevant fields of a tweet to the global tweet_list
def save_tweets(tweet):
    print(json.dumps(tweet, indent=4, sort_keys=True))
    data = tweet['data']
    public_metrics = data['public_metrics']
    tweet_list.append([data['id'], data['author_id'], data['created_at'], data['text'], public_metrics['like_count']])
</pre></div>
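<p class="wp-block-paragraph">Tweets in the stream do not always carry every field. A more tolerant variant of the row extraction could fall back to default values instead of raising a KeyError. This is only a sketch: the helper name tweet_to_row and the sample tweet are made up for illustration.</p>

```python
def tweet_to_row(tweet):
    """Flatten one streamed tweet into a list, tolerating missing fields."""
    data = tweet.get("data", {})
    metrics = data.get("public_metrics", {})
    return [
        data.get("id"),
        data.get("author_id"),
        data.get("created_at"),
        data.get("text"),
        metrics.get("like_count", 0),  # default to 0 if metrics are absent
    ]


# made-up sample tweet without public_metrics
sample = {"data": {"id": "1", "author_id": "42", "created_at": "2021-01-03T21:46:47.000Z", "text": "hello"}}
row = tweet_to_row(sample)
print(row)
```

<p class="wp-block-paragraph">Rows produced this way can be appended to a list and passed to a DataFrame in the same manner as in the functions above.</p>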



<h4 class="wp-block-heading" id="h-step-2-subscribe-to-the-tweet-streaming-service">Step #2: Subscribe to the Tweet Streaming Service</h4>



<p class="wp-block-paragraph">Next, we subscribe to a stream of tweets. Once you have subscribed to the stream, you can process the received tweets as needed, such as by filtering or storing them for further analysis.</p>



<p class="wp-block-paragraph">In this example, we collect the tweets in a list and then convert this list into a DataFrame. Tweets may have media files attached. If you would also like to save these images to disk, you can set the save_media_to_disk variable to &#8220;True.&#8221;</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># the max number of tweets that will be returned
max_results = 20

# save to disk
save_media_to_disk = False
save_path = &quot;&quot;

# You can adjust the rules if needed
search_rules = [
    {&quot;value&quot;: &quot;dog has:images&quot;, &quot;tag&quot;: &quot;dog pictures&quot;, &quot;lang&quot;: &quot;en&quot;},
    {&quot;value&quot;: &quot;cat has:images -grumpy&quot;, &quot;tag&quot;: &quot;cat pictures&quot;, &quot;lang&quot;: &quot;en&quot;},
]
tweet_fields = &quot;?tweet.fields=attachments,author_id,created_at,public_metrics&quot;
expansions = &quot;&quot;
tweet_list = []


bearer_token = ts.BEARER_TOKEN
headers = create_headers(bearer_token)
rules = get_rules(headers, bearer_token)
delete = delete_all_rules(headers, bearer_token, rules)
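# we avoid the name &quot;set&quot; here because it would shadow Python's built-in set type
rule_set = set_rules(headers, delete, bearer_token, search_rules)
get_stream(headers, rule_set, bearer_token, expansions, tweet_fields, save_media_to_disk, save_path)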

df = pd.DataFrame (tweet_list, columns = ['tweetid', 'author_id' , 'created_at', 'text', 'like_count'])
df</pre></div>
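
<p class="wp-block-paragraph">If you want to keep the tweets beyond the current session, one option is to append each tweet as one JSON line to a text file. The following is a minimal sketch of that idea, not part of the original notebook: the helper names <code>append_tweet_to_file</code> and <code>read_tweets_from_file</code>, the file name, and the sample tweet are assumptions made for illustration.</p>

```python
import json
import os
import tempfile


def append_tweet_to_file(tweet, file_path):
    """Append one tweet (a dict) as a single JSON line to a text file."""
    with open(file_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(tweet) + "\n")


def read_tweets_from_file(file_path):
    """Read all tweets back from the JSON Lines file."""
    with open(file_path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# usage with a made-up tweet payload (hypothetical values)
demo_path = os.path.join(tempfile.gettempdir(), "tweets_demo.jsonl")
if os.path.exists(demo_path):
    os.remove(demo_path)

sample_tweet = {"data": {"id": "1", "text": "hello", "public_metrics": {"like_count": 0}}}
append_tweet_to_file(sample_tweet, demo_path)
append_tweet_to_file(sample_tweet, demo_path)

stored = read_tweets_from_file(demo_path)
print(len(stored), "tweets stored in", demo_path)
```

<p class="wp-block-paragraph">In the streaming loop, such a helper could be called right next to save_tweets. Writing one JSON object per line keeps the file appendable and easy to parse later.</p>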



<p class="wp-block-paragraph"></p>



<h3 class="wp-block-heading" id="h-example-b-streaming-images-from-twitter-to-disk">Example B: Streaming Images from Twitter to Disk</h3>



<p class="wp-block-paragraph">The second use case is streaming image data from Twitter. Twitter images are useful in various machine learning use cases, e.g., training models for image recognition and classification. </p>



<p class="wp-block-paragraph">To be able to use the images later, we save them directly to our local drive. To do this, we reuse several functions from the first use case and add some functions for creating the folder structure in which we then store the images. The code for this example is also available in the GitHub repository.</p>



<div class="wp-block-kadence-advancedbtn kb-buttons-wrap kb-btns_1c9a6f-ad"><a class="kb-button kt-button button kb-btn_668636-e5 kt-btn-size-standard kt-btn-width-type-full kb-btn-global-inherit kt-btn-has-text-true kt-btn-has-svg-true wp-block-button__link wp-block-kadence-singlebtn" href="https://github.com/flo7up/relataly-public-python-API-tutorials/blob/main/111B%20Pulling%20Images%20via%20the%20Twitter%20API%20v2.0.ipynb" target="_blank" rel="noreferrer noopener"><span class="kb-svg-icon-wrap kb-svg-icon-fe_eye kt-btn-icon-side-left"><svg viewBox="0 0 24 24"  fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"  aria-hidden="true"><path d="M1 12s4-8 11-8 11 8 11 8-4 8-11 8-11-8-11-8z"/><circle cx="12" cy="12" r="3"/></svg></span><span class="kt-btn-inner-text">View on GitHub </span></a>

<a class="kb-button kt-button button kb-btn_e5363e-2d kt-btn-size-standard kt-btn-width-type-full kb-btn-global-inherit kt-btn-has-text-true kt-btn-has-svg-true wp-block-button__link wp-block-kadence-singlebtn" href="https://github.com/flo7up/relataly-public-python-API-tutorials" target="_blank" rel="noreferrer noopener"><span class="kb-svg-icon-wrap kb-svg-icon-fa_github kt-btn-icon-side-left"><svg viewBox="0 0 496 512"  fill="currentColor" xmlns="http://www.w3.org/2000/svg"  aria-hidden="true"><path d="M165.9 397.4c0 2-2.3 3.6-5.2 3.6-3.3.3-5.6-1.3-5.6-3.6 0-2 2.3-3.6 5.2-3.6 3-.3 5.6 1.3 5.6 3.6zm-31.1-4.5c-.7 2 1.3 4.3 4.3 4.9 2.6 1 5.6 0 6.2-2s-1.3-4.3-4.3-5.2c-2.6-.7-5.5.3-6.2 2.3zm44.2-1.7c-2.9.7-4.9 2.6-4.6 4.9.3 2 2.9 3.3 5.9 2.6 2.9-.7 4.9-2.6 4.6-4.6-.3-1.9-3-3.2-5.9-2.9zM244.8 8C106.1 8 0 113.3 0 252c0 110.9 69.8 205.8 169.5 239.2 12.8 2.3 17.3-5.6 17.3-12.1 0-6.2-.3-40.4-.3-61.4 0 0-70 15-84.7-29.8 0 0-11.4-29.1-27.8-36.6 0 0-22.9-15.7 1.6-15.4 0 0 24.9 2 38.6 25.8 21.9 38.6 58.6 27.5 72.9 20.9 2.3-16 8.8-27.1 16-33.7-55.9-6.2-112.3-14.3-112.3-110.5 0-27.5 7.6-41.3 23.6-58.9-2.6-6.5-11.1-33.3 2.6-67.9 20.9-6.5 69 27 69 27 20-5.6 41.5-8.5 62.8-8.5s42.8 2.9 62.8 8.5c0 0 48.1-33.6 69-27 13.7 34.7 5.2 61.4 2.6 67.9 16 17.7 25.8 31.5 25.8 58.9 0 96.5-58.9 104.2-114.8 110.5 9.2 7.9 17 22.9 17 46.4 0 33.7-.3 75.4-.3 83.6 0 6.5 4.6 14.4 17.3 12.1C428.2 457.8 496 362.9 496 252 496 113.3 383.5 8 244.8 8zM97.2 352.9c-1.3 1-1 3.3.7 5.2 1.6 1.6 3.9 2.3 5.2 1 1.3-1 1-3.3-.7-5.2-1.6-1.6-3.9-2.3-5.2-1zm-10.8-8.1c-.7 1.3.3 2.9 2.3 3.9 1.6 1 3.6.7 4.3-.7.7-1.3-.3-2.9-2.3-3.9-2-.6-3.6-.3-4.3.7zm32.4 35.6c-1.6 1.3-1 4.3 1.3 6.2 2.3 2.3 5.2 2.6 6.5 1 1.3-1.3.7-4.3-1.3-6.2-2.2-2.3-5.2-2.6-6.5-1zm-11.4-14.7c-1.6 1-1.6 3.6 0 5.9 1.6 2.3 4.3 3.3 5.6 2.3 1.6-1.3 1.6-3.9 0-6.2-1.4-2.3-4-3.3-5.6-2z"/></svg></span><span class="kt-btn-inner-text">Relataly GitHub Repo </span></a></div>



<h4 class="wp-block-heading" id="h-step-1-define-functions-to-interact-with-the-twitter-api-1">Step #1: Define Functions to Interact with the Twitter API</h4>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}">import requests 
import json 
import pandas as pd
import urllib.request  # urlopen lives in the urllib.request submodule
import os
from os import path
from datetime import datetime as dt

# imports the twitter_secrets python file in which we store the twitter API keys
from twitter_secrets import twitter_secrets as ts

def create_headers(bearer_token):
    headers = {&quot;Authorization&quot;: &quot;Bearer {}&quot;.format(bearer_token)}
    return headers
        
def set_rules(headers, delete, bearer_token, rules):
    payload = {&quot;add&quot;: rules}
    response = requests.post(
        &quot;https://api.twitter.com/2/tweets/search/stream/rules&quot;,
        headers=headers,
        json=payload,
    )
    if response.status_code != 201:
        raise Exception(
            &quot;Cannot add rules (HTTP {}): {}&quot;.format(response.status_code, response.text)
        )
    print(json.dumps(response.json()))
    
def get_rules(headers, bearer_token):
    response = requests.get(
        &quot;https://api.twitter.com/2/tweets/search/stream/rules&quot;, headers=headers
    )
    if response.status_code != 200:
        raise Exception(
            &quot;Cannot get rules (HTTP {}): {}&quot;.format(response.status_code, response.text)
        )
    print(json.dumps(response.json()))
    return response.json()

def delete_all_rules(headers, bearer_token, rules):
    if rules is None or &quot;data&quot; not in rules:
        return None

    ids = list(map(lambda rule: rule[&quot;id&quot;], rules[&quot;data&quot;]))
    payload = {&quot;delete&quot;: {&quot;ids&quot;: ids}}
    response = requests.post(
        &quot;https://api.twitter.com/2/tweets/search/stream/rules&quot;,
        headers=headers,
        json=payload
    )
    if response.status_code != 200:
        raise Exception(
            &quot;Cannot delete rules (HTTP {}): {}&quot;.format(
                response.status_code, response.text
            )
        )
    print(json.dumps(response.json()))

def get_stream(headers, set, bearer_token, expansions, fields, save_to_disk, save_path):
    data = []
    response = requests.get(
        &quot;https://api.twitter.com/2/tweets/search/stream&quot; + expansions + fields, headers=headers, stream=True,
    )
    print(response.status_code)
    if response.status_code != 200:
        raise Exception(
            &quot;Cannot get stream (HTTP {}): {}&quot;.format(
                response.status_code, response.text
            )
        )
    i = 0
    for response_line in response.iter_lines():
        # the stream sends empty keep-alive lines, which we skip
        if not response_line:
            continue
        i += 1
        if i > max_results:
            break
        try:
            # decode inside the try block so that a malformed line does not stop the stream
            json_response = json.loads(response_line)
            save_tweets(json_response)
            if save_to_disk:
                save_media_to_disk(json_response, save_path)
        except (json.JSONDecodeError, KeyError) as err:
            # in case the JSON fails to decode or a field is missing, we skip this tweet
            print(f&quot;{i}/{max_results}: ERROR: encountered a problem with a line of data ({err})\n&quot;)
            continue
                
def save_tweets(tweet):
    #print(json.dumps(tweet, indent=4, sort_keys=True))
    data = tweet['data']
    includes = tweet['includes']
    media = includes['media']
    for line in media:
        tweet_list.append([data['id'], line['url']])  
        
def save_media_to_disk(tweet, save_path):
    data = tweet['data']
    #print(json.dumps(data, indent=4, sort_keys=True))
    includes = tweet['includes']
    media = includes['media']
    for line in media:
        media_url = line['url']
        media_key = line['media_key']
        file_path = save_path + &quot;/&quot; + media_key + &quot;.jpg&quot;
        try:
            # download the image and write it to disk
            pic = urllib.request.urlopen(media_url)
            with open(file_path, 'wb') as localFile:
                localFile.write(pic.read())
            # save_tweets already records the tweet id and the URL in tweet_list
        except Exception as e:
            print('exception when saving media url ' + media_url + ' to path: ' + file_path + ' (' + str(e) + ')')
            if path.exists(file_path):
                print(&quot;path exists&quot;)
    
def createDir(save_path):
    try:
        os.makedirs(save_path)
    except OSError:
        print (&quot;Creation of the directory %s failed&quot; % save_path)
        if path.exists(save_path):
            print(&quot;directory already exists&quot;)
    else:
        print (&quot;Successfully created the directory %s &quot; % save_path)</pre></div>



<h4 class="wp-block-heading" id="h-step-2-define-the-folder-structure-to-store-the-images">Step #2: Define the Folder Structure to Store the Images</h4>



<p class="wp-block-paragraph">We want to store images contained in tweets on disk. To find these images again afterward, we create a new directory for each run. </p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># save to disk
save_to_disk = True
 
if save_to_disk == True: 
    # detect the current working directory and print it
    base_path = os.getcwd()
    print (&quot;The current working directory is %s&quot; % base_path)
    img_dir = '/twitter/downloaded_media/'
    # the write path in which the data will be stored. If it does not yet exist, it will be created
    now = dt.now()
    dt_string = now.strftime(&quot;%d%m%Y-%H%M%S&quot;)  # ddmmYYYY-HHMMSS
    save_path = base_path + img_dir + dt_string
    createDir(save_path)</pre></div>
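
<p class="wp-block-paragraph">As an alternative to concatenating path strings, the same folder structure can be built with Python's pathlib module. The sketch below is an illustration under assumptions, not the code from the notebook: the function name <code>create_run_dir</code> is hypothetical, and a temporary folder stands in for the working directory.</p>

```python
import tempfile
from datetime import datetime
from pathlib import Path


def create_run_dir(base_dir, now=None):
    """Create and return a timestamped sub-directory below base_dir."""
    now = now or datetime.now()
    run_dir = Path(base_dir) / "twitter" / "downloaded_media" / now.strftime("%d%m%Y-%H%M%S")
    # parents=True creates missing parent folders, exist_ok=True avoids an error on re-runs
    run_dir.mkdir(parents=True, exist_ok=True)
    return run_dir


# usage: a temporary folder stands in for the current working directory
demo_dir = create_run_dir(tempfile.mkdtemp(), now=datetime(2022, 10, 15, 20, 14, 0))
print(demo_dir)
```

<p class="wp-block-paragraph">Because pathlib inserts the correct separator for the operating system, this variant also works on Windows without adjusting the slashes.</p>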



<h4 class="wp-block-heading" id="h-step-3-subscribe-to-the-tweet-streaming-service">Step #3: Subscribe to the Tweet Streaming Service</h4>



<p class="wp-block-paragraph">Finally, we call the Twitter API and subscribe to the streaming service. We store the tweet ID and the image URL in a DataFrame (df).</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># the max number of tweets that will be returned
max_results = 10

# You can adjust the rules if needed
search_rules = [
    {&quot;value&quot;: &quot;dog has:images&quot;, &quot;tag&quot;: &quot;dog pictures&quot;, &quot;lang&quot;: &quot;en&quot;},
]

media_fields = &quot;&amp;media.fields=duration_ms,height,media_key,preview_image_url,public_metrics,type,url,width&quot;
expansions = &quot;?expansions=attachments.media_keys&quot;
tweet_list = []

bearer_token = ts.BEARER_TOKEN
headers = create_headers(bearer_token)
rules = get_rules(headers, bearer_token)
delete = delete_all_rules(headers, bearer_token, rules)
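# we avoid the name &quot;set&quot; here because it would shadow Python's built-in set type
rule_set = set_rules(headers, delete, bearer_token, search_rules)
get_stream(headers, rule_set, bearer_token, expansions, media_fields, save_to_disk, save_path)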

df = pd.DataFrame (tweet_list, columns = ['tweetid', 'preview_image_url'])
df</pre></div>
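
<p class="wp-block-paragraph">The code above assembles the query string by hand, which is why one variable has to start with &#8220;?&#8221; and the other with &#8220;&amp;&#8221;. A less error-prone option is to let the standard library build the query string from a dictionary. The following sketch illustrates this; the function name <code>build_stream_url</code> and the shortened field list are assumptions made for the example.</p>

```python
from urllib.parse import urlencode

STREAM_URL = "https://api.twitter.com/2/tweets/search/stream"


def build_stream_url(base_url, params):
    """Return base_url with the non-empty parameters appended as a query string."""
    # drop empty values so that no dangling "key=" ends up in the URL
    filtered = {key: value for key, value in params.items() if value}
    if not filtered:
        return base_url
    # safe="," keeps the comma-separated field lists readable
    return base_url + "?" + urlencode(filtered, safe=",")


url = build_stream_url(
    STREAM_URL,
    {
        "expansions": "attachments.media_keys",
        "media.fields": "media_key,type,url",
        "tweet.fields": "",  # empty on purpose: it is left out of the URL
    },
)
print(url)
```

<p class="wp-block-paragraph">Alternatively, requests.get accepts a params dictionary and takes care of the encoding itself.</p>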



<h2 class="wp-block-heading" id="h-summary">Summary</h2>



<p class="wp-block-paragraph">In this tutorial, you learned how to stream and process Twitter data in near real-time using the Twitter API v2, based on two use cases. In the first use case, we requested tweet text and stored it in a DataFrame. In the second, we streamed images and saved them to a local directory. There are many more ways to interact with the Twitter API, but these two cases are already enough to implement some exciting projects. </p>



<p class="wp-block-paragraph">If you liked this post, leave a comment. And if you want to learn more about using the Twitter API with Python, consider checking out my other articles:</p>



<ul class="wp-block-list">
<li><a href="https://www.relataly.com/posting-tweets-on-twitter-using-python-and-tweepy/3925/" target="_blank" rel="noreferrer noopener">Posting Tweets on Twitter via Tweepy in Python</a></li>



<li><a href="https://www.relataly.com/building-a-twitter-bot-for-trading-signals-using-python/3974/" target="_blank" rel="noreferrer noopener">Building a Twitter Bot for Trading Signals in Python</a></li>
</ul>



<h2 class="wp-block-heading" id="h-sources-and-further-reading">Sources and Further Reading</h2>



<ul class="wp-block-list">
<li><a href="https://developer.twitter.com/en/docs/twitter-api" target="_blank" rel="noreferrer noopener">https://developer.twitter.com/en/docs/twitter-api</a>: Part of the Python code presented here stems from the Twitter API documentation and has been modified to fit the purpose of this article.</li>
</ul>
<p>The post <a href="https://www.relataly.com/accessing-twitter-data-via-the-twitter-rest-api/1976/">Streaming Tweets and Images via the Twitter API in Python</a> appeared first on <a href="https://www.relataly.com">relataly.com</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.relataly.com/accessing-twitter-data-via-the-twitter-rest-api/1976/feed/</wfw:commentRss>
			<slash:comments>1</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">1976</post-id>	</item>
		<item>
		<title>Image Classification with Convolutional Neural Networks &#8211; Classifying Cats and Dogs in Python</title>
		<link>https://www.relataly.com/image-classification-with-deep-learning/2485/</link>
					<comments>https://www.relataly.com/image-classification-with-deep-learning/2485/#respond</comments>
		
		<dc:creator><![CDATA[Florian Follonier]]></dc:creator>
		<pubDate>Sun, 13 Dec 2020 14:09:31 +0000</pubDate>
				<category><![CDATA[Classification (two-class)]]></category>
		<category><![CDATA[Convolutional Neural Network (CNN)]]></category>
		<category><![CDATA[Data Sources]]></category>
		<category><![CDATA[Image Recognition]]></category>
		<category><![CDATA[Keras]]></category>
		<category><![CDATA[Neural Networks]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Tensorflow]]></category>
		<category><![CDATA[Use Cases]]></category>
		<category><![CDATA[Beginner Tutorials]]></category>
		<category><![CDATA[Computer Vision]]></category>
		<category><![CDATA[Deep Learning]]></category>
		<category><![CDATA[Image Dataset]]></category>
		<category><![CDATA[Supervised Learning]]></category>
		<category><![CDATA[Two-Label Classification]]></category>
		<guid isPermaLink="false">https://www.relataly.com/?p=2485</guid>

					<description><![CDATA[<p>This tutorial shows how to use Convolutional Neural Networks (CNNs) with Python for image classification. CNNs belong to the field of deep learning, a subarea of machine learning, and have become a cornerstone of many exciting innovations. There are endless applications, from self-driving cars and biometric security to automated tagging in social media. And the ... <a title="Image Classification with Convolutional Neural Networks &#8211; Classifying Cats and Dogs in Python" class="read-more" href="https://www.relataly.com/image-classification-with-deep-learning/2485/" aria-label="Read more about Image Classification with Convolutional Neural Networks &#8211; Classifying Cats and Dogs in Python">Read more</a></p>
<p>The post <a href="https://www.relataly.com/image-classification-with-deep-learning/2485/">Image Classification with Convolutional Neural Networks &#8211; Classifying Cats and Dogs in Python</a> appeared first on <a href="https://www.relataly.com">relataly.com</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">This tutorial shows how to use Convolutional Neural Networks (CNNs) with Python for image classification. CNNs belong to the field of deep learning, a subarea of machine learning, and have become a cornerstone of many exciting innovations. There are endless applications, from self-driving cars and biometric security to automated tagging in social media. And the importance of CNNs grows steadily! So there are plenty of reasons to understand how this technology works and how we can implement it. </p>



<p class="wp-block-paragraph">This article proceeds as follows: The first part introduces the core concepts behind CNNs and explains their use in image classification. The second part is a hands-on tutorial in which you will build your own CNN to distinguish images of cats and dogs. This tutorial develops a model that achieves around 82% validation accuracy. We will work with TensorFlow and Python to integrate different layers, such as Convolution Layers, Dense layers, and MaxPooling. Furthermore, we will prevent the network from overfitting the training data by using Dropout between the layers. We will also load the model and make predictions on a fresh set of images. Finally, we analyze and illustrate the performance of our image classifier. </p>



<p class="wp-block-paragraph">Also: <a href="https://www.relataly.com/automated-prompt-generation-for-dall-e-using-chatgpt-in-python-a-step-by-step-api-tutorial/12143/" target="_blank" rel="noreferrer noopener">Generating Detailed Images with OpenAI DALL-E and ChatGPT in Python: A Step-By-Step API Tutorial</a></p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%"></div>
</div>



<h2 class="wp-block-heading" id="h-image-classification-with-convolutional-neural-networks">Image Classification with Convolutional Neural Networks</h2>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">The history of image recognition dates back to the mid-1960s when the first attempts were made to identify objects by coding their characteristic shapes and lines. However, this task turned out to be incredibly complex. Our human brain is trained so well to recognize things that one can easily forget how diverse the observation conditions can be. Here are some examples:</p>



<ul class="wp-block-list">
<li>Photos can be taken from various viewpoints</li>



<li>Living things can have multiple forms and poses</li>



<li>Objects come in different forms, colors, and sizes</li>



<li>Parts of an object may be hidden in the picture</li>



<li>The light conditions vary from image to image</li>



<li>There may be one or multiple objects in the same image</li>
</ul>



<p class="wp-block-paragraph">At the beginning of the 1990s, the focus of research shifted to statistical approaches and learning algorithms.</p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%">
<figure class="wp-block-image size-large"><img decoding="async" width="512" height="512" data-attachment-id="13345" data-permalink="https://www.relataly.com/image-classification-with-deep-learning/2485/machine_learning_computer_vision_dazzling_magic_neural_network-min/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2023/03/machine_learning_computer_vision_dazzling_magic_neural_network-min.png" data-orig-size="1024,1024" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="machine_learning_computer_vision_dazzling_magic_neural_network-min" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2023/03/machine_learning_computer_vision_dazzling_magic_neural_network-min.png" src="https://www.relataly.com/wp-content/uploads/2023/03/machine_learning_computer_vision_dazzling_magic_neural_network-min-512x512.png" alt="The idea of computer vision is inspired by the fact that the visual cortex has cells activated by specific shapes and their orientation in the visual field. 
" class="wp-image-13345" srcset="https://www.relataly.com/wp-content/uploads/2023/03/machine_learning_computer_vision_dazzling_magic_neural_network-min.png 512w, https://www.relataly.com/wp-content/uploads/2023/03/machine_learning_computer_vision_dazzling_magic_neural_network-min.png 300w, https://www.relataly.com/wp-content/uploads/2023/03/machine_learning_computer_vision_dazzling_magic_neural_network-min.png 140w, https://www.relataly.com/wp-content/uploads/2023/03/machine_learning_computer_vision_dazzling_magic_neural_network-min.png 768w, https://www.relataly.com/wp-content/uploads/2023/03/machine_learning_computer_vision_dazzling_magic_neural_network-min.png 1024w" sizes="(max-width: 512px) 100vw, 512px" /><figcaption class="wp-element-caption">The idea of computer vision is inspired by the fact that the visual cortex has cells activated by specific shapes and their orientation in the visual field. </figcaption></figure>
</div>
</div>



<p class="wp-block-paragraph"></p>



<h3 class="wp-block-heading" id="h-the-emergence-of-cnns">The Emergence of CNNs</h3>



<p class="wp-block-paragraph">The basic concept of a neural network in computer vision has existed since the 1980s. It goes back to research by Hubel and Wiesel on the cat&#8217;s visual system. They found that the visual cortex has cells that are activated by specific shapes and their orientation in the visual field. Some of their findings inspired the development of crucial computer vision concepts, such as hierarchical features with different levels of abstraction [1, 2]. However, it took another three decades of research and the availability of faster computers before modern CNNs emerged.</p>



<p class="wp-block-paragraph">The year 2012 was a defining moment for the use of CNNs in image recognition. That year, for the first time, a CNN won the <a href="http://www.image-net.org/challenges/LSVRC/" target="_blank" rel="noreferrer noopener">ILSVRC</a> competition for computer vision. The challenge was to classify more than a hundred thousand images into 1000 object categories. With a top-5 error rate of only 15.3%, the winning model was a CNN called &#8220;AlexNet.&#8221;</p>



<p class="wp-block-paragraph">AlexNet was the first model to achieve more than 75% accuracy. In the same year, CNNs succeeded in several other competitions. Three years later, in 2015, the CNN ResNet exceeded human performance in the ILSVRC competition. Only a decade earlier, this achievement was considered almost impossible. So how was this performance increase possible? To understand this surge in performance, let us first look at what a picture is.</p>



<figure class="wp-block-image size-large is-resized"><img decoding="async" data-attachment-id="2653" data-permalink="https://www.relataly.com/image-classification-with-deep-learning/2485/image-15-5/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2020/12/image-15.png" data-orig-size="1081,506" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-15" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2020/12/image-15.png" src="https://www.relataly.com/wp-content/uploads/2020/12/image-15-1024x479.png" alt="" class="wp-image-2653" width="848" height="395" srcset="https://www.relataly.com/wp-content/uploads/2020/12/image-15.png 300w, https://www.relataly.com/wp-content/uploads/2020/12/image-15.png 768w" sizes="(max-width: 848px) 100vw, 848px" /><figcaption class="wp-element-caption">Top-performing models in the ImageNet image classification challenge (Alyafeai &amp; Ghouti, 2019)</figcaption></figure>



<h3 class="wp-block-heading" id="h-what-is-an-image">What is an Image?</h3>



<p class="wp-block-paragraph">A digital image is a three-dimensional array of integer values. One dimension of this array represents the pixel width, and one dimension represents the height of the picture. The third dimension contains the color depth, defined by the image format. As shown below, we can thus represent the format of a digital image as &#8220;width x height x depth.&#8221; Next, let&#8217;s have a quick look at different image formats.</p>



<figure class="wp-block-image size-large is-resized"><img decoding="async" data-attachment-id="2649" data-permalink="https://www.relataly.com/image-classification-with-deep-learning/2485/image-11-6/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2020/12/image-11.png" data-orig-size="1152,437" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-11" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2020/12/image-11.png" src="https://www.relataly.com/wp-content/uploads/2020/12/image-11-1024x388.png" alt="an image is a multidimensional integer array" class="wp-image-2649" width="861" height="326" srcset="https://www.relataly.com/wp-content/uploads/2020/12/image-11.png 1024w, https://www.relataly.com/wp-content/uploads/2020/12/image-11.png 300w, https://www.relataly.com/wp-content/uploads/2020/12/image-11.png 768w, https://www.relataly.com/wp-content/uploads/2020/12/image-11.png 1152w" sizes="(max-width: 861px) 100vw, 861px" /><figcaption class="wp-element-caption">A digital image is a multidimensional integer array.</figcaption></figure>
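
<p class="wp-block-paragraph">To make the array structure tangible, here is a minimal sketch that builds a tiny synthetic RGB image with NumPy. The pixel values are made up for illustration. Note that NumPy orders the axes as height x width x depth.</p>

```python
import numpy as np

# a tiny synthetic RGB image: 2 pixels high, 3 pixels wide, 3 color channels
image = np.zeros((2, 3, 3), dtype=np.uint8)
image[0, 0] = [255, 0, 0]      # top-left pixel: pure red
image[1, 2] = [255, 255, 255]  # bottom-right pixel: white

height, width, depth = image.shape
print(f"width x height x depth: {width} x {height} x {depth}")
print("values per image:", image.size)
print("red channel:")
print(image[:, :, 0])
```

<p class="wp-block-paragraph">Each slice along the last axis, such as image[:, :, 0], is a two-dimensional layer that holds the intensities of one color channel.</p>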



<h3 class="wp-block-heading" id="h-overview-of-different-image-formats">Overview of Different Image Formats</h3>



<p class="wp-block-paragraph">We can train CNNs with different image formats, but the input data are always multidimensional arrays of integer values. One of the most commonly used color formats in deep learning is &#8220;RGB.&#8221; RGB stands for the three color channels: &#8220;Red,&#8221; &#8220;Green,&#8221; and &#8220;Blue.&#8221; RGB images are divided into three layers of integer values, one layer for each color channel. In a 24-bit RGB image, each layer uses 8 bits per pixel, so its integer values range from 0 to 255. Together, the three layers can reproduce 256 x 256 x 256 = 16,777,216 different colors. </p>



<p class="wp-block-paragraph">In contrast to RGB images, grey-scale images only have a single color layer. This layer represents the brightness of each pixel in the image. Consequently, the format of a grey-scale image is width x height x 1. Using grey-scale images instead of RGB images can speed up the training process because less data needs to be processed. However, image data with multiple color channels provide the model with more information, which can lead to better predictions. The RGB format is often a good trade-off between prediction quality and performance. Next, let&#8217;s look at how CNNs handle digital images in the learning process.</p>
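
<p class="wp-block-paragraph">The following sketch illustrates the difference in data volume between the two formats. It converts a random RGB array to a single brightness layer using the ITU-R BT.601 luminance weights, which is one common choice among several. The function name <code>rgb_to_grayscale</code> and the random test image are assumptions made for this example.</p>

```python
import numpy as np


def rgb_to_grayscale(image):
    """Convert a (height, width, 3) RGB array into a (height, width, 1) grayscale array."""
    # luminance weights from ITU-R BT.601; other weightings are possible
    weights = np.array([0.299, 0.587, 0.114])
    gray = image.astype(np.float64) @ weights
    return np.round(gray).astype(np.uint8)[..., np.newaxis]


# random pixel values stand in for a real photo
rng = np.random.default_rng(seed=0)
rgb = rng.integers(0, 256, size=(4, 5, 3), dtype=np.uint8)
gray = rgb_to_grayscale(rgb)

print(rgb.shape, "->", gray.shape)
print("values to process:", rgb.size, "->", gray.size)
```

<p class="wp-block-paragraph">The grey-scale version contains only a third of the values, which is where the speed-up during training comes from.</p>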



<h3 class="wp-block-heading" id="h-convolutional-neural-networks">Convolutional Neural Networks</h3>



<p class="wp-block-paragraph">As mentioned before, a CNN is a specific form of an artificial neural network. The main difference between a CNN and a standard multi-layer perceptron is the convolutional layers. CNNs can have other layers as well, but it is the convolutions that make a CNN so good at detecting objects. They allow the network to identify patterns based on features that work regardless of where in the image they occur. Let&#8217;s see how this works in more detail.</p>



<h4 class="wp-block-heading" id="h-convolutional-layers">Convolutional Layers</h4>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow">
<p class="wp-block-paragraph">Convolutional layers scan an image in small groups of pixels. For this purpose, they use filters (also called kernels): small matrices of weights that the network learns during training. Filters act as feature detectors on the original image. Their primary purpose is to extract meaningful features from the input images.</p>



<p class="wp-block-paragraph">During training, the CNN slides each filter over the image and, at every location, calculates the dot product between the filter and the underlying pixels. The results of these calculations are stored in a so-called feature map (sometimes called an activation map). A feature map represents where in the image a particular feature was identified. Subsequently, the values from the feature map are transformed with an activation function (usually ReLU), and the algorithm uses them as input to the next layer.</p>


<div class="wp-block-image">
<figure class="alignleft size-large is-resized"><img decoding="async" data-attachment-id="6596" data-permalink="https://www.relataly.com/image-classification-with-deep-learning/2485/image-1/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2022/04/image-1.png" data-orig-size="1237,502" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-1" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2022/04/image-1.png" src="https://www.relataly.com/wp-content/uploads/2022/04/image-1-1024x416.png" alt="Illustration of operations in the convolutional layers" class="wp-image-6596" width="811" height="332"/><figcaption class="wp-element-caption">Illustration of operations in the convolutional layers</figcaption></figure>
</div></div>
</div>
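<p class="wp-block-paragraph">The sliding-filter calculation described above can be sketched in a few lines of plain Python. This is only an illustration, not how TensorFlow implements convolutions: the toy image, the 2x2 edge filter, and the helper names convolve2d and relu are assumptions made for this example.</p>

```python
# Minimal sketch in plain Python; image and filter values are made up.

def convolve2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and return the feature map."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    feature_map = []
    for row in range(out_h):
        out_row = []
        for col in range(out_w):
            # dot product between the filter and the image patch underneath it
            value = sum(
                image[row + i][col + j] * kernel[i][j]
                for i in range(k_h)
                for j in range(k_w)
            )
            out_row.append(value)
        feature_map.append(out_row)
    return feature_map


def relu(feature_map):
    """Replace negative values with zero."""
    return [[max(0, v) for v in row] for row in feature_map]


# toy 4x4 grayscale image: bright column, dark column, two bright columns
image = [
    [1, 0, 1, 1],
    [1, 0, 1, 1],
    [1, 0, 1, 1],
    [1, 0, 1, 1],
]

# hypothetical 2x2 filter that responds to a dark-to-bright vertical edge
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, vertical_edge)
activated = relu(feature_map)
print(feature_map)  # every row is [-2, 2, 0]
print(activated)    # every row is [0, 2, 0]
```

<p class="wp-block-paragraph">The feature map is largest where the filter pattern matches the image (the dark-to-bright edge), and ReLU removes the negative response produced by the opposite edge.</p>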



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow">
<p class="wp-block-paragraph">Features become more complex with the increasing depth of the network. In the first layer of the network, convolutions will detect generic geometric forms and low-level features based on edges, corners, squares, or circles. The subsequent layers of the network will look at more sophisticated shapes and may, for example, include features that resemble the form of an eye of a cat or the nose of a dog. In this way, convolutions provide the network with features at different levels of detail that enable powerful detection patterns.</p>


<div class="wp-block-image">
<figure class="alignleft size-large is-resized"><img decoding="async" data-attachment-id="2661" data-permalink="https://www.relataly.com/image-classification-with-deep-learning/2485/image-16-4/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2020/12/image-16.png" data-orig-size="1908,819" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-16" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2020/12/image-16.png" src="https://www.relataly.com/wp-content/uploads/2020/12/image-16-1024x440.png" alt="Convolutions at the example of an image that contains the number &quot;3&quot;" class="wp-image-2661" width="874" height="374" srcset="https://www.relataly.com/wp-content/uploads/2020/12/image-16.png 300w, https://www.relataly.com/wp-content/uploads/2020/12/image-16.png 768w, https://www.relataly.com/wp-content/uploads/2020/12/image-16.png 1536w, https://www.relataly.com/wp-content/uploads/2020/12/image-16.png 1908w" sizes="(max-width: 874px) 100vw, 874px" /><figcaption class="wp-element-caption">Exemplary convolutions of an image that contains the number &#8220;3.&#8221;</figcaption></figure>
</div></div>
</div>



<h4 class="wp-block-heading" id="h-pooling-downsampling">Pooling / Downsampling</h4>



<p class="wp-block-paragraph">A convolutional layer is usually followed by a pooling operation, which reduces the amount of data by discarding less relevant information. This process is also called downsampling or subsampling. There are various forms of pooling. In the most common variant &#8211; max-pooling &#8211; only the highest value in a predefined grid (e.g., 2&#215;2) is passed on, and the remaining values are discarded. For example, imagine a 2&#215;2 grid with values 0.1, 0.5, 0.4, and 0.8. The algorithm would only keep the 0.8 for this grid and use it as part of the input to the next layer. The advantages of pooling are reduced data and faster training times. Because pooling reduces the complexity of the network, it allows for the construction of deeper architectures with more layers. In addition, pooling offers a certain protection against overfitting during training.</p>
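<p class="wp-block-paragraph">To make the max-pooling example concrete, here is a small sketch in plain Python. The helper name max_pool and the 4x4 feature map are made up for illustration; the 2x2 grid is the one from the paragraph above.</p>

```python
def max_pool(feature_map, size=2):
    """Keep only the largest value of each non-overlapping size x size window."""
    pooled = []
    for row in range(0, len(feature_map) - size + 1, size):
        pooled_row = []
        for col in range(0, len(feature_map[0]) - size + 1, size):
            window = [
                feature_map[row + i][col + j]
                for i in range(size)
                for j in range(size)
            ]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# the 2x2 grid from the text
print(max_pool([[0.1, 0.5], [0.4, 0.8]]))  # [[0.8]]

# a 4x4 feature map shrinks to 2x2 (values are made up)
feature_map = [
    [0.1, 0.5, 0.2, 0.0],
    [0.4, 0.8, 0.3, 0.6],
    [0.9, 0.1, 0.0, 0.2],
    [0.3, 0.2, 0.7, 0.4],
]
print(max_pool(feature_map))  # [[0.8, 0.6], [0.9, 0.7]]
```

<p class="wp-block-paragraph">With a 2x2 window, each pooling step keeps one of four values, so the amount of data passed to the next layer drops to a quarter.</p>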



<h4 class="wp-block-heading" id="h-dropout">Dropout</h4>



<p class="wp-block-paragraph">Dropout is another technique that helps prevent the network from overfitting the training data. When we activate Dropout for a layer, the algorithm switches off a random subset of the layer&#8217;s neurons in each training step. As a result, the network needs to learn patterns that give less weight to individual neurons and thus generalize better. The dropout rate controls the percentage of switched-off neurons in each training iteration. We can configure Dropout for each layer separately. </p>



<p class="wp-block-paragraph">CNNs with many layers and training epochs tend to overfit the training data. Especially here, Dropout is crucial to avoid overfitting and to achieve good prediction results with data that the network has not seen yet. A typical value for the rate lies between 10% and 30%.</p>
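<p class="wp-block-paragraph">The effect of the dropout rate can be illustrated with a short sketch in plain Python. It uses the so-called inverted dropout variant, in which the surviving activations are scaled by 1 / (1 - rate); the Keras Dropout layer scales its outputs in the same way. The function name dropout, the seed, and the layer output are assumptions made for this example.</p>

```python
import random


def dropout(activations, rate, rng):
    """Inverted dropout: zero each activation with probability `rate` and
    scale the survivors by 1 / (1 - rate) so the expected sum stays the same."""
    keep_scale = 1.0 / (1.0 - rate)
    return [a * keep_scale if rng.random() >= rate else 0.0 for a in activations]


rng = random.Random(42)         # fixed seed so the run is repeatable
activations = [1.0] * 10_000    # made-up layer output
dropped = dropout(activations, rate=0.2, rng=rng)

share_off = dropped.count(0.0) / len(dropped)
print(f"share of switched-off neurons: {share_off:.2f}")
```

<p class="wp-block-paragraph">With a rate of 0.2, roughly one in five neurons is switched off in each step. Dropout is only applied during training; at prediction time, all neurons stay active.</p>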



<h4 class="wp-block-heading" id="h-multi-layer-perceptron-mlp">Multi-Layer Perceptron (MLP)</h4>



<p class="wp-block-paragraph">The CNN architecture ends with multiple dense layers that are fully connected. The layers are part of a Multilayer Perceptron (MLP), which has the task of condensing the results from the previous convolutions and outputting one of the multiple classes. Consequently, the number of neurons in the final dense layer usually corresponds to the number of different classes to be predicted. It is also possible to use a single neuron in the final layer for two-class prediction problems. In this case, the last neuron typically outputs a value between 0 and 1, which is then mapped to a binary label of 0 or 1.</p>
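<p class="wp-block-paragraph">The single output neuron for a two-class problem can be sketched as follows. The example assumes a sigmoid activation, which is the usual choice for such a neuron, and a decision threshold of 0.5; the feature values, weights, and bias are made up.</p>

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def output_neuron(features, weights, bias):
    """Weighted sum of the incoming features followed by a sigmoid."""
    z = sum(f * w for f, w in zip(features, weights)) + bias
    return sigmoid(z)


# hypothetical outputs of the previous dense layer and hypothetical weights
features = [0.2, 0.9, 0.4]
weights = [1.5, -2.0, 0.5]
bias = 0.1

probability = output_neuron(features, weights, bias)
label = 1 if probability >= 0.5 else 0  # threshold the probability at 0.5
print(round(probability, 3), label)
```

<p class="wp-block-paragraph">The neuron itself produces a probability; the binary label only appears once this probability is compared against the threshold.</p>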



<h2 class="wp-block-heading" id="h-building-a-cnn-with-tensorflow-that-classifies-cats-and-dogs">Building a CNN with TensorFlow that Classifies Cats and Dogs</h2>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">Now that you are familiar with the basic concepts behind convolutional neural networks, we can commence with the practical part and build an image classifier. In the following, we will train a CNN to distinguish images of cats and dogs. We first define a CNN model and then feed it a few thousand photos from a public dataset with labeled images of cats and dogs.</p>



<p class="wp-block-paragraph">Distinguishing cats and dogs may not sound difficult, but many challenges exist. Imagine the almost infinite circumstances in which animals can be photographed, not to mention the many forms a cat can take. Because of these variations, even humans sometimes confuse a cat with a dog or vice versa. So don&#8217;t expect our model to be perfect right from the start. Our model will score around 82% accuracy on the validation dataset.</p>



<p class="wp-block-paragraph">The code is available in the GitHub repository.</p>



<div class="wp-block-kadence-advancedbtn kb-buttons-wrap kb-btns_d70aa4-6e"><a class="kb-button kt-button button kb-btn_b926ba-d4 kt-btn-size-standard kt-btn-width-type-full kb-btn-global-inherit kt-btn-has-text-true kt-btn-has-svg-true wp-block-button__link wp-block-kadence-singlebtn" href="https://github.com/flo7up/relataly-public-python-tutorials/blob/master/06%20Computer%20Vision/200%20Classifying%20Cats%20%26%20Dogs%20Binary.ipynb" target="_blank" rel="noreferrer noopener"><span class="kb-svg-icon-wrap kb-svg-icon-fe_eye kt-btn-icon-side-left"><svg viewBox="0 0 24 24"  fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"  aria-hidden="true"><path d="M1 12s4-8 11-8 11 8 11 8-4 8-11 8-11-8-11-8z"/><circle cx="12" cy="12" r="3"/></svg></span><span class="kt-btn-inner-text">View on GitHub </span></a>

<a class="kb-button kt-button button kb-btn_80b142-4f kt-btn-size-standard kt-btn-width-type-full kb-btn-global-inherit kt-btn-has-text-true kt-btn-has-svg-true wp-block-button__link wp-block-kadence-singlebtn" href="https://github.com/flo7up/relataly-public-python-API-tutorials" target="_blank" rel="noreferrer noopener"><span class="kb-svg-icon-wrap kb-svg-icon-fa_github kt-btn-icon-side-left"><svg viewBox="0 0 496 512"  fill="currentColor" xmlns="http://www.w3.org/2000/svg"  aria-hidden="true"><path d="M165.9 397.4c0 2-2.3 3.6-5.2 3.6-3.3.3-5.6-1.3-5.6-3.6 0-2 2.3-3.6 5.2-3.6 3-.3 5.6 1.3 5.6 3.6zm-31.1-4.5c-.7 2 1.3 4.3 4.3 4.9 2.6 1 5.6 0 6.2-2s-1.3-4.3-4.3-5.2c-2.6-.7-5.5.3-6.2 2.3zm44.2-1.7c-2.9.7-4.9 2.6-4.6 4.9.3 2 2.9 3.3 5.9 2.6 2.9-.7 4.9-2.6 4.6-4.6-.3-1.9-3-3.2-5.9-2.9zM244.8 8C106.1 8 0 113.3 0 252c0 110.9 69.8 205.8 169.5 239.2 12.8 2.3 17.3-5.6 17.3-12.1 0-6.2-.3-40.4-.3-61.4 0 0-70 15-84.7-29.8 0 0-11.4-29.1-27.8-36.6 0 0-22.9-15.7 1.6-15.4 0 0 24.9 2 38.6 25.8 21.9 38.6 58.6 27.5 72.9 20.9 2.3-16 8.8-27.1 16-33.7-55.9-6.2-112.3-14.3-112.3-110.5 0-27.5 7.6-41.3 23.6-58.9-2.6-6.5-11.1-33.3 2.6-67.9 20.9-6.5 69 27 69 27 20-5.6 41.5-8.5 62.8-8.5s42.8 2.9 62.8 8.5c0 0 48.1-33.6 69-27 13.7 34.7 5.2 61.4 2.6 67.9 16 17.7 25.8 31.5 25.8 58.9 0 96.5-58.9 104.2-114.8 110.5 9.2 7.9 17 22.9 17 46.4 0 33.7-.3 75.4-.3 83.6 0 6.5 4.6 14.4 17.3 12.1C428.2 457.8 496 362.9 496 252 496 113.3 383.5 8 244.8 8zM97.2 352.9c-1.3 1-1 3.3.7 5.2 1.6 1.6 3.9 2.3 5.2 1 1.3-1 1-3.3-.7-5.2-1.6-1.6-3.9-2.3-5.2-1zm-10.8-8.1c-.7 1.3.3 2.9 2.3 3.9 1.6 1 3.6.7 4.3-.7.7-1.3-.3-2.9-2.3-3.9-2-.6-3.6-.3-4.3.7zm32.4 35.6c-1.6 1.3-1 4.3 1.3 6.2 2.3 2.3 5.2 2.6 6.5 1 1.3-1.3.7-4.3-1.3-6.2-2.2-2.3-5.2-2.6-6.5-1zm-11.4-14.7c-1.6 1-1.6 3.6 0 5.9 1.6 2.3 4.3 3.3 5.6 2.3 1.6-1.3 1.6-3.9 0-6.2-1.4-2.3-4-3.3-5.6-2z"/></svg></span><span class="kt-btn-inner-text">Relataly GitHub Repo </span></a></div>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%">
<figure class="wp-block-image is-resized"><img decoding="async" src="https://www.relataly.com/wp-content/uploads/2022/04/Image-Recognition-Convolutional-Neural-Networks.png" alt="Image Recognition Convolutional Neural Networks - classifying cats and dogs python " width="382" height="125"/><figcaption class="wp-element-caption">Cat or Dog? That&#8217;s what our CNN will predict.</figcaption></figure>
</div>
</div>



<div style="height:29px" aria-hidden="true" class="wp-block-spacer"></div>



<h3 class="wp-block-heading" id="h-prerequisites">Prerequisites</h3>



<p class="wp-block-paragraph">Before starting the coding part, make sure that you have set up your <a href="https://www.python.org/downloads/" target="_blank" rel="noreferrer noopener">Python 3</a> environment and required packages. If you don&#8217;t have an environment, you can follow&nbsp;<a href="https://www.relataly.com/anaconda-python-environment-machine-learning/1663/" target="_blank" rel="noreferrer noopener">this tutorial</a>&nbsp;to set up the&nbsp;<a href="https://www.anaconda.com/products/individual" target="_blank" rel="noreferrer noopener">Anaconda environment</a>.</p>



<p class="wp-block-paragraph">Also, make sure you install all required packages. In this tutorial, we will be working with the following standard packages:&nbsp;</p>



<ul class="wp-block-list">
<li><em><a href="https://pandas.pydata.org/" target="_blank" rel="noreferrer noopener">pandas</a></em></li>



<li><em><a href="https://numpy.org/" target="_blank" rel="noreferrer noopener">NumPy</a></em></li>



<li><a href="https://docs.python.org/3/library/math.html" target="_blank" rel="noreferrer noopener">math</a></li>



<li><em><a href="https://matplotlib.org/" target="_blank" rel="noreferrer noopener">matplotlib</a></em></li>
</ul>



<p class="wp-block-paragraph">In addition, we will be using <em><a href="https://keras.io/" target="_blank" rel="noreferrer noopener">Keras&nbsp;</a></em>(2.0 or higher) with <a href="https://www.tensorflow.org/" target="_blank" rel="noreferrer noopener"><em>TensorFlow</em> </a>backend and the machine learning library <a href="https://scikit-learn.org/stable/" target="_blank" rel="noreferrer noopener">Scikit-learn</a>. The code also imports seaborn for plotting and Pillow (PIL) for image handling.</p>



<p class="wp-block-paragraph">You can install packages using console commands:</p>



<ul class="wp-block-list">
<li><em>pip install &lt;package name&gt;</em></li>



<li><em>conda install &lt;package name&gt;</em>&nbsp;(if you are using the Anaconda package manager)</li>
</ul>



<h4 class="wp-block-heading" id="h-download-the-dataset">Download the Dataset</h4>



<p class="wp-block-paragraph">We will train our image classification model with a public dataset from <a href="http://www.kaggle.com" target="_blank" rel="noreferrer noopener">Kaggle.com</a>. The dataset contains more than 25,000 JPG pictures of cats and dogs. The images are uniformly named and numbered, for example, dog.1.jpg, dog.2.jpg, dog.3.jpg, cat.1.jpg, cat.2.jpg, and so on. You can download the picture set directly from Kaggle: <a href="https://www.kaggle.com/c/dogs-vs-cats/overview" target="_blank" rel="noreferrer noopener">cats-vs-dogs</a>. </p>



<h4 class="wp-block-heading" id="h-setup-the-folder-structure">Set Up the Folder Structure</h4>



<p class="wp-block-paragraph">There are different ways data can be structured and loaded during model training. One approach (1) is to split the images into classes and create a separate folder for each class, class_a, class_b, etc. Another method (2) is to put all images into a single folder and define a DataFrame that splits the data into test and train. Because the cats and dogs dataset files already contain the classes in their name, I decided to go for the second approach. </p>



<p class="wp-block-paragraph">Before we begin with the coding part, we create a folder structure that looks as follows:</p>



<figure class="wp-block-image size-large is-resized"><img decoding="async" data-attachment-id="2676" data-permalink="https://www.relataly.com/image-classification-with-deep-learning/2485/image-17-4/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2020/12/image-17.png" data-orig-size="532,286" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-17" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2020/12/image-17.png" src="https://www.relataly.com/wp-content/uploads/2020/12/image-17.png" alt="structure of the data that we will use to train the convolutional neural network" class="wp-image-2676" width="409" height="220" srcset="https://www.relataly.com/wp-content/uploads/2020/12/image-17.png 532w, https://www.relataly.com/wp-content/uploads/2020/12/image-17.png 300w" sizes="(max-width: 409px) 100vw, 409px" /><figcaption class="wp-element-caption">The folder structure of our cats and dogs prediction project</figcaption></figure>



<p class="wp-block-paragraph">If you want to use the default paths from this Python tutorial, make sure that your notebook resides in the parent folder of the &#8220;data&#8221; folder.</p>



<p class="wp-block-paragraph">After you have created the folder structure, open the cats-vs-dogs ZIP file. The ZIP file contains the folders &#8220;train,&#8221; &#8220;test,&#8221; and &#8220;sample.&#8221; Extract the JPG files from the &#8220;train&#8221; folder (20,000 images) and the &#8220;test&#8221; folder (5,000 images) to the &#8220;train&#8221; folder of your project. Afterward, the train folder should contain 25,000 images. The sample folder is intended for your own sample images, for example, of your pet. We will later use the images from the sample folder to test the model on new real-world data. </p>



<p class="wp-block-paragraph">We have fulfilled all requirements and can start with the coding part.</p>



<h3 class="wp-block-heading" id="h-step-1-make-imports-and-check-training-device">Step #1 Make Imports and Check Training Device</h3>



<p class="wp-block-paragraph">We begin by setting up the imports for this project. I have put the package imports at the beginning to give you a quick overview of the packages you need to install.</p>



<p class="wp-block-paragraph">Using the GPU instead of the CPU allows for faster training times. However, setting up TensorFlow to work with a GPU can cause problems. Not everyone has a GPU; in this case, TensorFlow should usually run all code on the CPU automatically. However, should you for any reason prefer to switch to CPU training manually, uncomment the line os.environ[&#8220;CUDA_VISIBLE_DEVICES&#8221;]=&#8220;-1&#8221; at the top of the code below. The variable must be set before TensorFlow is imported. As a result, TensorFlow will run all code on the CPU and ignore all available GPUs. </p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}">import os
#os.environ[&quot;CUDA_VISIBLE_DEVICES&quot;]=&quot;-1&quot; 

import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Convolution2D, MaxPooling2D, ZeroPadding2D
from tensorflow.keras.layers import Conv2D, Activation, Dropout, Flatten, Dense, BatchNormalization
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
from tensorflow.keras.metrics import Accuracy
from tensorflow.keras import regularizers
from tensorflow.keras.optimizers import SGD, Adam
from tensorflow.python.client import device_lib
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, accuracy_score

# allocate GPU memory on demand instead of reserving all of it at startup
for gpu in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)

from random import randint
import seaborn as sns
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors
from PIL import Image
import random as rdn</pre></div>



<p class="wp-block-paragraph">Running the command below checks the TensorFlow version and the number of available GPUs in our system. </p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># check the tensorflow version
print('Tensorflow Version: ' + tf.__version__)

# check the number of available GPUs
physical_devices = tf.config.list_physical_devices('GPU')
print(&quot;Num GPUs:&quot;, len(physical_devices))</pre></div>



<pre class="wp-block-preformatted">Tensorflow Version: 2.4.0-rc3
Num GPUs: 1</pre>



<p class="wp-block-paragraph">My GPU is an RTX 3080. When I wrote this article, the GPU was not yet supported by the standard TensorFlow release. I have therefore used the pre-release version of TensorFlow (2.4.0-rc3). I expect the following standard release (2.4) to work fine. </p>



<p class="wp-block-paragraph">In my case, the GPU check returns 1 because I have a single GPU in my computer. If TensorFlow doesn&#8217;t recognize any GPU, this command will return 0. TensorFlow will then run on the CPU.</p>



<h3 class="wp-block-heading" id="h-step-2-define-the-prediction-classes">Step #2 Define the Prediction Classes</h3>



<p class="wp-block-paragraph">Next, we will define the path to the folder that contains our train and validation images. In addition, we will define a DataFrame &#8220;image_df,&#8221; which lists all the pictures from the &#8220;train&#8221; folder. With the help of this DataFrame, we can later split the data simply by defining which images from the train folder belong to the training dataset and which belong to the validation dataset. Important note: the DataFrame &#8220;image_df&#8221; only includes the names of the images and the classes, but not the photos themselves.</p>



<p class="wp-block-paragraph">It&#8217;s good to check the distribution of classes in the training data set. For this purpose, we create a bar plot, which illustrates the number of images per class. And yes, I admit, I chose some custom colors to make it look fancy.</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># set the directory for train and validation images
train_path = 'data/images/cats-and-dogs/train/'
#test_path = 'data/cats-and-dogs/test/'

# function to create a list of image labels 
def createImageDf(path):
    filenames = os.listdir(path)
    categories = []

    for fname in filenames:
        category = fname.split('.')[0]
        if category == 'dog':
            categories.append(1)
        else:
            categories.append(0)
    df = pd.DataFrame({
        'filename':filenames,
        'category':categories
    })
    return df

# create the DataFrame and display its first rows
image_df = createImageDf(train_path)
image_df.head(5)

sns.countplot(y='category', data=image_df, palette=['#2FE5C7',&quot;#2F8AE5&quot;], orient=&quot;h&quot;)</pre></div>






<figure class="wp-block-image size-full"><img decoding="async" width="376" height="262" data-attachment-id="11572" data-permalink="https://www.relataly.com/image-classification-with-deep-learning/2485/image-9-2/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2022/12/image-9.png" data-orig-size="376,262" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-9" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2022/12/image-9.png" src="https://www.relataly.com/wp-content/uploads/2022/12/image-9.png" alt="" class="wp-image-11572" srcset="https://www.relataly.com/wp-content/uploads/2022/12/image-9.png 376w, https://www.relataly.com/wp-content/uploads/2022/12/image-9.png 300w" sizes="(max-width: 376px) 100vw, 376px" /></figure>



<p class="wp-block-paragraph">The number of images in the two classes is balanced, so we don&#8217;t need to rebalance the data. That&#8217;s nice!</p>



<h3 class="wp-block-heading" id="h-step-3-plot-sample-images">Step #3 Plot Sample Images</h3>



<p class="wp-block-paragraph">Before jumping into preprocessing, I prefer to check that the data has been loaded correctly. We will do this by plotting some random images from the train folder. This step is not necessary, but it&#8217;s a best practice.</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}">n_pictures = 16 # number of pictures to be shown
columns = n_pictures // 2 # eight pictures per row
rows = 2
plt.figure(figsize=(40, 12))
for i in range(n_pictures):
    plt.subplot(rows, columns, i + 1)
    # the first row shows random cats, the second row random dogs
    if i &lt; columns:
        image_name = 'cat.' + str(rdn.randint(1, 1000)) + '.jpg'
    else: 
        image_name = 'dog.' + str(rdn.randint(1, 1000)) + '.jpg'
    plt.xlabel(image_name)    
    plt.imshow(load_img(train_path + image_name)) 

#if you get a deprecated warning, you can ignore it</pre></div>



<figure class="wp-block-image size-full"><img decoding="async" width="1024" height="315" data-attachment-id="7123" data-permalink="https://www.relataly.com/image-classification-with-deep-learning/2485/cats-and-dogs-neural-networks-classification/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2022/04/cats-and-dogs-neural-networks-classification.png" data-orig-size="1024,315" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="cats-and-dogs-neural-networks-classification" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2022/04/cats-and-dogs-neural-networks-classification.png" src="https://www.relataly.com/wp-content/uploads/2022/04/cats-and-dogs-neural-networks-classification.png" alt="classifying cats and dogs convolutional neural networks" class="wp-image-7123" srcset="https://www.relataly.com/wp-content/uploads/2022/04/cats-and-dogs-neural-networks-classification.png 1024w, https://www.relataly.com/wp-content/uploads/2022/04/cats-and-dogs-neural-networks-classification.png 300w, https://www.relataly.com/wp-content/uploads/2022/04/cats-and-dogs-neural-networks-classification.png 768w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<p class="wp-block-paragraph">I never expected to have so many pictures of cats and dogs one day, but I guess neither did you 🙂 Neural networks require a fixed input shape in which each input corresponds to one pixel value. </p>



<p class="wp-block-paragraph">As we can see from the sample images, the images in our dataset have different sizes and aspect ratios. For the images to fit into the input shape of our neural network, we need to put the images into a standard format. But before that, we split the data into two datasets for train and test.</p>



<h3 class="wp-block-heading" id="h-step-4-split-the-data">Step #4 Split the Data</h3>



<p class="wp-block-paragraph">Image classification requires splitting the data into a train and a validation set. We define a split ratio of 1/5 so that 80% of the data goes into the training dataset and 20% goes into the validation dataset. We shuffle the data to create two DataFrames with a random mix of cat and dog pictures. In addition, we transform the classes of the images into categorical values 0-&gt;&#8220;cat&#8221; and 1-&gt;&#8220;dog&#8221;. The result is two new DataFrames: train_df (20,000 images) and validate_df (5,000 images).</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}">image_df[&quot;category&quot;] = image_df[&quot;category&quot;].replace({0:'cat',1:'dog'})

train_df, validate_df = train_test_split(image_df, test_size=0.20, random_state=42)
train_df = train_df.reset_index(drop=True)
total_train = train_df.shape[0]

validate_df = validate_df.reset_index(drop=True)
total_validate = validate_df.shape[0]
train_df.head()

print(len(train_df), len(validate_df))</pre></div>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:false,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;null&quot;,&quot;mime&quot;:&quot;text/plain&quot;,&quot;theme&quot;:&quot;3024-day&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Plain Text&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;text&quot;}">Output: 20000 5000</pre></div>
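<p class="wp-block-paragraph">Conceptually, the split shuffles the file names and cuts the list at the 80% mark. The following plain-Python sketch illustrates the idea; it does not reproduce the exact shuffling order of scikit-learn&#8217;s train_test_split, and the helper name split_filenames and the file names are made up.</p>

```python
import random


def split_filenames(filenames, test_size=0.2, seed=42):
    """Shuffle a copy of the list and cut it into a training and a validation part."""
    shuffled = list(filenames)
    random.Random(seed).shuffle(shuffled)
    n_validate = round(len(shuffled) * test_size)
    return shuffled[n_validate:], shuffled[:n_validate]


# made-up file names that follow the naming scheme of the dataset
filenames = [f"cat.{i}.jpg" for i in range(50)] + [f"dog.{i}.jpg" for i in range(50)]
train_files, validate_files = split_filenames(filenames)
print(len(train_files), len(validate_files))  # 80 20
```

<p class="wp-block-paragraph">Because the list is shuffled before it is cut, both parts contain a mix of cat and dog pictures, and fixing the seed makes the split repeatable.</p>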



<h3 class="wp-block-heading" id="h-step-5-preprocess-the-images">Step #5 Preprocess the Images</h3>



<p class="wp-block-paragraph">The next step is to define two data generators for these DataFrames, which use the names given in the train and validation DataFrames to feed the images from the &#8220;train&#8221; path into our neural network. The data generator has various configuration options. We will perform the following operations:</p>



<ul class="wp-block-list">
<li>Rescale the images by dividing their RGB color values (0-255) by 255</li>



<li>Shuffle the images (again)</li>



<li>Bring the images into a uniform shape of 128 x 128 pixels</li>



<li>Use a batch size of 32 so that the network processes 32 images per training step.</li>



<li>The class mode is &#8220;binary&#8221; so our two prediction labels are encoded as&nbsp;float32&nbsp;scalars with values 0 or 1. As a result, we will only have a single end neuron in our network.</li>



<li>We perform some data augmentation techniques on the training data (incl. horizontal flip, shearing, and zoom). In this way, the model sees a slightly different variant of each image in every epoch, which helps to prevent overfitting.</li>
</ul>
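<p class="wp-block-paragraph">Two of the settings above come down to simple arithmetic: the rescaling factor maps pixel values from the 0-255 range into the 0-1 range, and the batch size determines how many batches make up one pass over the data. The sketch below works this out in plain Python; the helper names rescale_pixel and batches_per_epoch are made up, and the image counts are those of the two DataFrames from the split.</p>

```python
import math


def rescale_pixel(rgb, factor=1.0 / 255):
    """Scale one RGB pixel from the 0-255 range into the 0-1 range."""
    return tuple(channel * factor for channel in rgb)


def batches_per_epoch(n_images, batch_size):
    """Number of batches needed to show every image once; the last batch may be smaller."""
    return math.ceil(n_images / batch_size)


print(rescale_pixel((0, 128, 255)))
print(batches_per_epoch(20000, 32))  # 625 training batches
print(batches_per_epoch(5000, 32))   # 157 validation batches, the last one holds 8 images
```

<p class="wp-block-paragraph">Keeping all inputs in the same small range generally makes training more stable than feeding raw values of up to 255.</p>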



<figure class="wp-block-image size-large is-resized"><img decoding="async" data-attachment-id="2700" data-permalink="https://www.relataly.com/image-classification-with-deep-learning/2485/image-19-4/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2020/12/image-19.png" data-orig-size="833,262" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-19" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2020/12/image-19.png" src="https://www.relataly.com/wp-content/uploads/2020/12/image-19.png" alt="" class="wp-image-2700" width="753" height="236" srcset="https://www.relataly.com/wp-content/uploads/2020/12/image-19.png 833w, https://www.relataly.com/wp-content/uploads/2020/12/image-19.png 300w, https://www.relataly.com/wp-content/uploads/2020/12/image-19.png 768w" sizes="(max-width: 753px) 100vw, 753px" /><figcaption class="wp-element-caption">Some augmentation techniques</figcaption></figure>



<p class="wp-block-paragraph">It is essential to mention that the input shape of the first layer of the neural network must correspond to the image shape of 128 x 128. The reason is that each pixel becomes an input to a neuron.</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># set the dimensions to which we will convert the images
img_width, img_height = 128, 128
target_size = (img_width, img_height)
batch_size = 32
rescale=1.0/255

# configure the train data generator
# augmentation settings belong to ImageDataGenerator, not to flow_from_dataframe
print('Train data:')
train_datagen = ImageDataGenerator(
    rescale=rescale,
    shear_range=0.2, # random shearing
    zoom_range=0.2, # random zoom
    horizontal_flip=True) # random horizontal flip
train_generator = train_datagen.flow_from_dataframe(
    train_df, 
    train_path,
    shuffle=True, # shuffle the image data
    x_col='filename', y_col='category',
    classes=['dog', 'cat'],
    target_size=target_size,
    batch_size=batch_size,
    color_mode=&quot;rgb&quot;,
    class_mode='binary')

# configure the validation data generator
# only rescaling, no augmentation
print('Test data:')
validation_datagen = ImageDataGenerator(rescale=rescale)
validation_generator = validation_datagen.flow_from_dataframe(
    validate_df, 
    train_path,    
    shuffle=True,
    x_col='filename', y_col='category',
    classes=['dog', 'cat'],
    target_size=target_size,
    batch_size=batch_size,
    color_mode=&quot;rgb&quot;,
    class_mode='binary')</pre></div>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:false,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;null&quot;,&quot;mime&quot;:&quot;text/plain&quot;,&quot;theme&quot;:&quot;3024-day&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Plain Text&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;text&quot;}">Train data:
Found 20000 validated image filenames belonging to 2 classes.
Test data:
Found 5000 validated image filenames belonging to 2 classes.</pre></div>



<p class="wp-block-paragraph">At this point, we have already completed the data preprocessing part. The next step is to define and compile the convolutional neural network.</p>



<h3 class="wp-block-heading" id="h-step-6-define-and-compile-the-convolutional-neural-network">Step #6 Define and Compile the Convolutional Neural Network</h3>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">The architecture of our image classification CNN is inspired by the famous VGGNet. However, to lower the amount of time needed to train the network, I reduced the number of layers. In this section, we define the model by stacking multiple layers on top of each other and then compile it.</p>



<p class="wp-block-paragraph">The first layer of our network is the input layer, which receives the preprocessed images. As already noted, the shape of the input layer needs to match the shape of our images. Considering how we have defined the format of the images in our data generators, the input shape is defined as 128 x 128 x 3. </p>



<p class="wp-block-paragraph">The network contains four convolutional layers. Each of these layers is followed by a pooling layer. In addition, we define a dropout rate of 20% after each pooling layer. </p>



<p class="wp-block-paragraph">Finally, a fully connected layer with 128 neurons, followed by a dropout of 50%, and a single sigmoid output neuron complete the structure of the CNN.</p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%">
<figure class="wp-block-image size-large is-resized"><img decoding="async" data-attachment-id="2608" data-permalink="https://www.relataly.com/image-classification-with-deep-learning/2485/image-8-7/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2020/12/image-8.png" data-orig-size="539,493" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-8" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2020/12/image-8.png" src="https://www.relataly.com/wp-content/uploads/2020/12/image-8.png" alt="3-dimensional Input Shape of our Neural Network " class="wp-image-2608" width="423" height="386" srcset="https://www.relataly.com/wp-content/uploads/2020/12/image-8.png 539w, https://www.relataly.com/wp-content/uploads/2020/12/image-8.png 300w" sizes="(max-width: 423px) 100vw, 423px" /><figcaption class="wp-element-caption">3-dimensional Input Shape of our Neural Network </figcaption></figure>



<p class="wp-block-paragraph"></p>
</div>
</div>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow">
<div class="wp-block-kadence-infobox kt-info-box_4a47ba-1f"><span class="kt-blocks-info-box-link-wrap info-box-link kt-blocks-info-box-media-align-top kt-info-halign-left"><div class="kt-blocks-info-box-media-container"><div class="kt-blocks-info-box-media kt-info-media-animate-drawborder"><div class="kadence-info-box-icon-container kt-info-icon-animate-drawborder"><div class="kadence-info-box-icon-inner-container"><span class="kb-svg-icon-wrap kb-svg-icon-fe_cpu kt-info-svg-icon"><svg viewBox="0 0 24 24"  fill="none" stroke="currentColor" stroke-width="1" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"  aria-hidden="true"><rect x="4" y="4" width="16" height="16" rx="2" ry="2"/><rect x="9" y="9" width="6" height="6"/><line x1="9" y1="1" x2="9" y2="4"/><line x1="15" y1="1" x2="15" y2="4"/><line x1="9" y1="20" x2="9" y2="23"/><line x1="15" y1="20" x2="15" y2="23"/><line x1="20" y1="9" x2="23" y2="9"/><line x1="20" y1="14" x2="23" y2="14"/><line x1="1" y1="9" x2="4" y2="9"/><line x1="1" y1="14" x2="4" y2="14"/></svg></span></div></div></div></div><div class="kt-infobox-textcontent"><h4 class="kt-blocks-info-box-title">Additional Info</h4><p class="kt-blocks-info-box-text"><strong><em>Loss function</em>:</strong> measures how far the predictions deviate from the true labels during training. We try to minimize this function to &#8220;steer&#8221; the model in the right direction. We use binary_crossentropy.<br/><strong><em>Optimizer</em>:</strong> defines how the model weights are updated based on the data it sees and its loss function.<br/><strong><em>Metrics</em></strong> are<strong> </strong>used to monitor the steps during training and testing. The following example uses <em>accuracy</em>, which is the fraction of the correctly classified images.</p></div></span></div>
</div>
</div>
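
<p class="wp-block-paragraph">The loss and the metric described in the info box can be computed by hand for a small batch. The following is a minimal sketch in plain Python and not the Keras implementation. The labels and predicted probabilities are made up for illustration, and the clipping constant <code>eps</code> is an assumption to avoid taking the logarithm of zero.</p>

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over a batch of labels and probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of predictions on the correct side of the threshold."""
    hits = sum(1 for y, p in zip(y_true, y_pred) if (p > threshold) == (y == 1))
    return hits / len(y_true)

labels = [0, 1, 1, 0]           # hypothetical labels (0 = dog, 1 = cat)
probs = [0.1, 0.8, 0.4, 0.3]    # hypothetical sigmoid outputs

print(binary_crossentropy(labels, probs))
print(accuracy(labels, probs))
# a model that always outputs 0.5 has a loss of ln(2), roughly 0.693
print(binary_crossentropy(labels, [0.5] * 4))
```

<p class="wp-block-paragraph">The last line is a useful reference point: a loss close to 0.69 means that the model is not yet better than guessing.</p>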



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># define the input format of the model
input_shape = (img_width, img_height, 3)
print(input_shape)

# define  model
model = Sequential()
model.add(Conv2D(32, (3, 3), strides=(1, 1), activation='relu', kernel_initializer='he_uniform', padding='same', input_shape=input_shape))
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.20))
model.add(Conv2D(64, (3, 3), strides=(1, 1), activation='relu', kernel_initializer='he_uniform', padding='same'))
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.20))
model.add(Conv2D(64, (3, 3), strides=(1, 1), activation='relu', kernel_initializer='he_uniform', padding='same'))
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.20))
model.add(Conv2D(128, (3, 3),  strides=(1, 1),activation='relu', kernel_initializer='he_uniform', padding='same'))
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.20))
model.add(Flatten())
model.add(Dense(128, activation='relu', kernel_initializer='he_uniform'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))

# compile the model and print its architecture
opt = SGD(learning_rate=0.001, momentum=0.9)
model.compile(optimizer=opt, loss='binary_crossentropy', metrics=['accuracy'])
print(model.summary())</pre></div>
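
<p class="wp-block-paragraph">To illustrate what the optimizer does with a learning rate of 0.001 and a momentum of 0.9, here is a minimal sketch of a single weight update with classical momentum. It follows the update rule velocity = momentum * velocity - learning_rate * gradient, then weight = weight + velocity. The toy function f(w) = w ** 2 and the helper name <code>sgd_momentum_step</code> are assumptions for illustration only.</p>

```python
def sgd_momentum_step(weight, velocity, gradient, learning_rate=0.001, momentum=0.9):
    """One SGD update with classical momentum."""
    velocity = momentum * velocity - learning_rate * gradient
    return weight + velocity, velocity

# Hypothetical toy problem: minimize f(w) = w ** 2, whose gradient is 2 * w.
weight, velocity = 1.0, 0.0
for step in range(3):
    weight, velocity = sgd_momentum_step(weight, velocity, gradient=2 * weight)
    print(step, weight, velocity)
```

<p class="wp-block-paragraph">Because the velocity accumulates past gradients, the steps grow as long as the gradient keeps pointing in the same direction.</p>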



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:false,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;null&quot;,&quot;mime&quot;:&quot;text/plain&quot;,&quot;theme&quot;:&quot;3024-day&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Plain Text&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;text&quot;}">input_shape: (100, 100, 3)
Model: &quot;sequential&quot;
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 100, 100, 32)      896       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 50, 50, 32)        0         
_________________________________________________________________
dropout (Dropout)            (None, 50, 50, 32)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 50, 50, 64)        18496     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 25, 25, 64)        0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 25, 25, 64)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 25, 25, 64)        36928     
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 12, 12, 64)        0         
_________________________________________________________________
dropout_2 (Dropout)          (None, 12, 12, 64)        0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 12, 12, 128)       73856     
_________________________________________________________________
...
Trainable params: 720,257
Non-trainable params: 0
_________________________________________________________________
None</pre></div>
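
<p class="wp-block-paragraph">The parameter counts in the summary can be reproduced by hand. The output shapes above start at 100 x 100, so the summary was apparently recorded with 100 x 100 images, while the data generators in this tutorial use 128 x 128. The following sketch, with a hypothetical helper <code>count_parameters</code>, computes the number of trainable parameters for both input sizes under the layer settings from the model definition (3 x 3 kernels, same padding, 2 x 2 max pooling).</p>

```python
def count_parameters(image_size, channels=3):
    """Return the total trainable parameters of the CNN defined above."""
    size, in_channels, total = image_size, channels, 0
    for filters in (32, 64, 64, 128):
        # 3 x 3 kernel weights per input channel plus one bias per filter
        total += (3 * 3 * in_channels + 1) * filters
        size //= 2              # 2 x 2 max pooling halves width and height
        in_channels = filters   # padding='same' keeps the size before pooling
    flat = size * size * in_channels    # Flatten
    total += (flat + 1) * 128           # Dense(128)
    total += (128 + 1) * 1              # Dense(1)
    return total

print(count_parameters(100))    # input size of the summary shown above
print(count_parameters(128))    # input size set in the data generators
```

<p class="wp-block-paragraph">For 100 x 100 images, this yields the 720,257 parameters listed in the summary. Most of them belong to the dense layer after the flatten step, which is why the count grows to 1,179,009 for 128 x 128 images.</p>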



<p class="wp-block-paragraph">At this point, we have defined and assembled our convolutional neural network. Next, it is time to train the model.</p>



<h3 class="wp-block-heading" id="h-step-7-train-the-model">Step #7 Train the Model</h3>



<p class="wp-block-paragraph">Before we train the image classifier, we still have to choose the number of epochs. More epochs can improve the model performance, but they also lead to longer training times and increase the risk that the model overfits. Finding the optimal number of epochs is difficult and often requires a trial-and-error approach. I typically start with a small number of epochs, such as 5, and then increase this number until further increases no longer lead to significant improvements.</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># train the model
epochs = 40
early_stop = EarlyStopping(monitor='loss', patience=6, verbose=1)

history = model.fit(
    train_generator,
    epochs=epochs,
    callbacks=[early_stop],
    steps_per_epoch=len(train_generator),
    verbose=1,
    validation_data=validation_generator,
    validation_steps=len(validation_generator))</pre></div>
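
<p class="wp-block-paragraph">The early stopping callback in the code above ends the training once the monitored loss has not improved for six epochs in a row. The following is a simplified sketch of this patience logic in plain Python. It ignores further options of the Keras callback, and the helper name <code>epochs_until_stop</code> as well as the loss values are made up for illustration.</p>

```python
def epochs_until_stop(losses, patience=6):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float('inf')
    wait = 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best:
            best, wait = loss, 0    # improvement: reset the counter
        else:
            wait += 1               # no improvement in this epoch
            if wait >= patience:
                return epoch
    return None

# Hypothetical loss curve: improves for three epochs, then stalls.
losses = [0.70, 0.65, 0.60, 0.61, 0.62, 0.60, 0.63, 0.61, 0.64, 0.62]
print(epochs_until_stop(losses))
print(epochs_until_stop([0.70, 0.65, 0.60]))
```

<p class="wp-block-paragraph">Note that the callback in this tutorial monitors the training loss. Monitoring the validation loss instead is a common alternative if the goal is to stop as soon as the model starts to overfit.</p>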



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:false,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;null&quot;,&quot;mime&quot;:&quot;text/plain&quot;,&quot;theme&quot;:&quot;3024-day&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Plain Text&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;text&quot;}">Epoch 1/35
625/625 [==============================] - 121s 194ms/step - loss: 0.7050 - accuracy: 0.5282 - val_loss: 0.6902 - val_accuracy: 0.5824
Epoch 2/35
625/625 [==============================] - 115s 183ms/step - loss: 0.6853 - accuracy: 0.5469 - val_loss: 0.6856 - val_accuracy: 0.5806
Epoch 3/35
625/625 [==============================] - 115s 184ms/step - loss: 0.6744 - accuracy: 0.5752 - val_loss: 0.6746 - val_accuracy: 0.5806
Epoch 4/35
625/625 [==============================] - 112s 180ms/step - loss: 0.6569 - accuracy: 0.5987 - val_loss: 0.6593 - val_accuracy: 0.6110
Epoch 5/35
625/625 [==============================] - 115s 185ms/step - loss: 0.6423 - accuracy: 0.6194 - val_loss: 0.6474 - val_accuracy: 0.6134
Epoch 6/35
625/625 [==============================] - 116s 185ms/step - loss: 0.6309 - accuracy: 0.6370 - val_loss: 0.6386 - val_accuracy: 0.6260
Epoch 7/35
625/625 [==============================] - 115s 183ms/step - loss: 0.6139 - accuracy: 0.6539 - val_loss: 0.6082 - val_accuracy: 0.6682</pre></div>
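
<p class="wp-block-paragraph">The 625 steps per epoch in the log follow directly from the number of training images and the batch size. A short sketch of the calculation, with a hypothetical helper <code>steps_per_epoch</code>:</p>

```python
import math

def steps_per_epoch(num_images, batch_size=32):
    """Number of batches needed to see every image once."""
    return math.ceil(num_images / batch_size)

print(steps_per_epoch(20000))   # training images
print(steps_per_epoch(5000))    # validation images
```

<p class="wp-block-paragraph">The 20,000 training images fit exactly into 625 batches of 32 images. The 5,000 validation images need 157 batches, of which the last one is only partially filled.</p>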



<p class="wp-block-paragraph">A quick comment on the time required to train the model: although the model is not overly complex and the size of the data is still moderate, training can take some time. I made two training runs &#8211; the first on my GPU (Nvidia GeForce RTX 3080) and the second on my CPU (AMD Ryzen 7 3700X). On the GPU, training took approximately 10 minutes. Training on the CPU was much slower and took about 30 minutes, three times as long as on the GPU.</p>



<p class="wp-block-paragraph">After training, you may want to save the classification model and load it at a later time. You can do this with the code below. Note that the code stores only the weights, so before loading them, we need to define the model with exactly the same architecture as during training.</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Save the weights
model.save_weights('cats-and-dogs-weights-v1.h5')

# Define the model exactly as during training
# (rebuild the same architecture here before loading)

# Load the weights
model.load_weights('cats-and-dogs-weights-v1.h5')</pre></div>



<h3 class="wp-block-heading" id="h-step-8-visualize-model-performance">Step #8 Visualize Model Performance</h3>



<p class="wp-block-paragraph">After training the model, we want to check the performance of our image classification model. For this purpose, we can apply the same performance measures as in traditional classification projects. The code below plots the loss and the accuracy of our image classifier on the training and validation datasets over the epochs. </p>



<p class="wp-block-paragraph">To learn more about measuring model performance, check out my <a href="https://www.relataly.com/measuring-classification-performance-with-python-and-scikit-learn/846/" target="_blank" rel="noreferrer noopener">previous post on Measuring Model Performance</a>. </p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}">def plot_loss(history, value1, value2, title):
    fig, ax = plt.subplots(figsize=(15, 5), sharex=True)
    plt.plot(history.history[value1], 'b')
    plt.plot(history.history[value2], 'r')
    plt.title(title)
    plt.ylabel(value1.capitalize())
    plt.xlabel(&quot;Epoch&quot;)
    ax.xaxis.set_major_locator(plt.MaxNLocator(epochs))
    plt.legend([&quot;Train&quot;, &quot;Validation&quot;], loc=&quot;upper left&quot;)
    plt.grid()
    plt.show()

# plot training &amp; validation loss values
plot_loss(history, &quot;loss&quot;, &quot;val_loss&quot;, &quot;Model loss&quot;)
# plot training &amp; validation accuracy values
plot_loss(history, &quot;accuracy&quot;, &quot;val_accuracy&quot;, &quot;Model accuracy&quot;)
</pre></div>



<figure class="wp-block-image size-large is-resized"><img decoding="async" data-attachment-id="2725" data-permalink="https://www.relataly.com/image-classification-with-deep-learning/2485/image-25-4/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2020/12/image-25.png" data-orig-size="894,333" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-25" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2020/12/image-25.png" src="https://www.relataly.com/wp-content/uploads/2020/12/image-25.png" alt="" class="wp-image-2725" width="865" height="322" srcset="https://www.relataly.com/wp-content/uploads/2020/12/image-25.png 894w, https://www.relataly.com/wp-content/uploads/2020/12/image-25.png 300w, https://www.relataly.com/wp-content/uploads/2020/12/image-25.png 768w" sizes="(max-width: 865px) 100vw, 865px" /><figcaption class="wp-element-caption"><img decoding="async" 
src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAA34AAAFNCAYAAABfWL0+AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAABxYUlEQVR4nO3deZyN5f/H8ddl7Ft2kbWItFhTiGiRSlQIbbRY2jctolX7vpNKRUmifFu0R+uvkKyhJNmyC4Nhluv3x+cMY8yMWc5yz8z7+Xicx1nu5bzPPefMnM9c131dznuPiIiIiIiIFFxFYh1AREREREREIkuFn4iIiIiISAGnwk9ERERERKSAU+EnIiIiIiJSwKnwExERERERKeBU+ImIiIiIiBRwKvxERKRQcM7Vc85551zRbKzb3zn3QzRyiYiIRIMKPxERCRzn3HLn3B7nXJV0j88JFW/1YhRNREQkX1LhJyIiQfU30Df1jnPuWKBU7OIEQ3ZaLEVERNJT4SciIkE1Drg0zf1+wNi0KzjnDnHOjXXObXDO/eOcG+6cKxJaFuece8I5t9E5tww4O4NtX3PO/eucW+2ce8A5F5edYM6595xza51zW51z3znnjk6zrJRz7slQnq3OuR+cc6VCy05yzv3knPvPObfSOdc/9Ph059yVafaxX1fTUCvnNc65P4E/Q489G9rHNufcr8659mnWj3PO3emc+8s5tz20vLZz7kXn3JPpXstHzrkbs/O6RUQk/1LhJyIiQfUzUN45d1SoIOsNvJVuneeBQ4DDgZOxQvGy0LIBQFegOdAK6Jlu2zeBJKBBaJ3OwJVkz6dAQ6AaMBt4O82yJ4CWQFugEnAbkOKcqxPa7nmgKtAMmJPN5wM4FzgBaBK6PzO0j0rAeOA951zJ0LKbsdbSs4DywOXATuw1901THFcBTgXeyUEOERHJh1T4iYhIkKW2+p0OLAZWpy5IUwwO9d5v994vB54ELgmtcgHwjPd+pfd+M/Bwmm2rA2cCN3rvd3jv1wNPA32yE8p7Pyb0nLuBe4GmoRbEIliRdYP3frX3Ptl7/1NovYuAr7z373jvE733m7z3c3JwLB723m/23u8KZXgrtI8k7/2TQAmgUWjdK4Hh3vsl3swNrTsD2IoVe4Re73Tv/boc5BARkXxI5wmIiEiQjQO+A+qTrpsnUAUoDvyT5rF/gMNCt2sCK9MtS1UXKAb865xLfaxIuvUzFCo4HwR6YS13KWnylABKAn9lsGntTB7Prv2yOeduwQq8moDHWvZSB8PJ6rneBC4GvgxdP5uHTCIikk+oxU9ERALLe/8PNsjLWcD76RZvBBKxIi5VHfa1Cv6LFUBpl6VaCewGqnjvK4Qu5b33R3NwFwLdgdOwbqb1Qo+7UKYE4IgMtluZyeMAO4DSae4fmsE6PvVG6Hy+27FWzYre+wpYS15qFZvVc70FdHfONQWOAqZksp6IiBQgKvxERCTorgBO8d7vSPug9z4ZmAg86Jwr55yri53blnoe4ETgeudcLedcReCONNv+C3wBPOmcK++cK+KcO8I5d3I28pTDisZNWLH2UJr9pgBjgKecczVDg6y0cc6VwM4DPM05d4FzrqhzrrJzrllo0znA+c650s65BqHXfLAMScAGoKhz7m6sxS/Vq8AI51xDZ45zzlUOZVyFnR84Dpic2nVUREQKNhV+IiISaN77v7z3szJZfB3WWrYM+AEb5GRMaNkrwOfAXGwAlvQthpdiXUV/B7YAk4Aa2Yg0Fus2ujq07c/plg8B5mPF1WbgUaCI934F1nJ5S+jxOUDT0DZPA3uAdVhXzLfJ2ufYQDF/hLIksH9X0KewwvcLYBvwGvtPhfEmcCxW/ImISCHgvPcHX0tEREQKDOdcB6xltF6olVJERAo4tfiJiIgUIs65YsANwKsq+kRECg8VfiIiIoWEc+4o4D+sS+szMQ0jIiJRpa6eIiIiIiIiB
Zxa/ERERERERAo4FX4iIiIiIiIFXNFYBwinKlWq+Hr16u29v2PHDsqUKRO7QAHKEYQMQckRhAxByRGEDMoRvAxByRGEDEHJEYQMyhG8DEHJEYQMQckRhAzKEbwM0c7x66+/bvTeVz1ggfe+wFxatmzp05o2bZoPgiDkCEIG74ORIwgZvA9GjiBk8F45gpbB+2DkCEIG74ORIwgZvFeOoGXwPhg5gpDB+2DkCEIG75UjaBm8j24OYJbPoFZSV08REREREZECToWfiIiIiIhIAafCT0REREREpIArUIO7ZCQxMZFVq1aRkJAQswyHHHIIixYtitnzhzNDyZIlqVWrFsWKFQtDKhERERERiYYCX/itWrWKcuXKUa9ePZxzMcmwfft2ypUrF5PnDmcG7z2bNm1i1apV1K9fP0zJREREREQk0gp8V8+EhAQqV64cs6KvIHHOUbly5Zi2noqIiIiISM4V+MIPUNEXRjqWIiIiIiL5T6Eo/GJl06ZNNGvWjHbt2nHooYdy2GGH0axZM5o1a8aePXuy3HbWrFlcf/31UUoqIiIiIiIFWYE/xy+WKleuzJw5c9i+fTtPPvkkZcuWZciQIXuXJyUlUbRoxj+CVq1a0apVq2hFFRERERGRAiyiLX7OuS7OuSXOuaXOuTsyWH6Ic+4j59xc59xC59xlaZYtd87Nd87Ncc7NimTOaOrfvz8333wznTp14vbbb2fGjBm0bduW5s2b07ZtW5YsWQLA9OnT6dq1KwD33nsvl19+OR07duTwww/nueeei+VLEBEREREpVBIS4J9/4Jdf4MMPYfRoWL061qlyJmItfs65OOBF4HRgFTDTOfeh9/73NKtdA/zuvT/HOVcVWOKce9t7n9oPspP3fmOkMsbKH3/8wVdffUVcXBzbtm3ju+++o2jRonz11VfceeedTJ48+YBtFi9ezLRp09i+fTuNGjXiqquu0pQKIiIiIiK5lJgIGzbA2rV2Wbdu3+3097duPXD7jz+Gww6Lfu7cimRXz9bAUu/9MgDn3ASgO5C28PNAOWcjhpQFNgNJkQp0440wZ05499msGTzzTM626dWrF3FxcQBs3bqVfv368eeff+KcIzExMcNtzj77bEqUKEGJEiWoVq0a69ato1atWnkLLyIiIiJSwCQlwapVsHw5fPFFdWbO3L+IS729MZPmpfLl4dBDoXp1OO446Nx53/1DD913u3r1qL6sPItk4XcYsDLN/VXACenWeQH4EFgDlAN6e+9TQss88IVzzgMve+9HRzBrVJUpU2bv7bvuuotOnTrxwQcfsHz5cjp27JjhNiVKlNh7Oy4ujqSkiNXHIiIiIiL78x5WroQ6dWKdhJQUK9z+/tsuy5fvf3vFCkhOTl37KABKldpXtDVoACeddGAxl3q/VKkYvbAIi2Thl9G4/z7d/TOAOcApwBHAl865773324B23vs1zrlqoccXe++/O+BJnBsIDASoXr0606dP37ssPj6eQw45hO3btwMwYkSeX1OGQrvPVHJyMrt376ZYsWIkJiaya9euvZk2bdpEpUqV2L59Oy+//DLee7Zv387OnTtJSkpi+/bte7dN3SYlJYX4+Pi997MjOTk5R+tnJSEhYb/jnBPx8fG53jZcgpAhKDmCkEE5gpchKDmCkCEoOYKQQTmClyEoOYKQISg5gpAhIjm858gnn6TmJ5/wz8UX8/dll0GRgw8Vktsc3sPWrcVYu7Yk//5bkrVrS4Zul9p7OzFx/+evVGk3NWokUL9+Am3bJlCjRgKHHppA2bKbqVWrKKVLJ5PVrGQJCVY0Ll+e47jZEoT3RiQLv1VA7TT3a2Ete2ldBjzivffAUufc30BjYIb3fg2A9369c+4DrOvoAYVfqCVwNECrVq182haz6dOnU7JkScqVKxe2F5Ub27dv39tNs1ixYpQqVWpvpjvvvJN+/foxcuRITjnlFJxzlCtXjtKlS1O0aFHKlSu3d9vUbYoUKULZsmVz9Lq2b
98etuNQsmRJmjdvnqttp0+fnmmrZrQEIUNQcgQhg3IEL0NQcgQhQ1ByBCGDcgQvQ1ByBCFDUHIEIUNEctx5J3zyCTRvTt233qJuQgK8+SaULp3rHNu372ulW7bswNa7HTv2X79yZahfH044wa7r14d69ey6bl0oVaoEUAI4JIMM7XP7ysMmCO+NSBZ+M4GGzrn6wGqgD3BhunVWAKcC3zvnqgONgGXOuTJAEe/99tDtzsD9Ecwacffee2+Gj7dp04Y//vhj7/0RoWbJjh077n1zpN92wYIFkYgoIiIiIrK/J5+Ehx+GQYNg5Eh4+mkYMsQqtP/9D2rWzHCzxERYs6YkX321f3GXep3+/Lpy5ayIO+IIOO20fUVdaoEX43acAiFihZ/3Psk5dy3wORAHjPHeL3TODQ4tHwWMAN5wzs3Huobe7r3f6Jw7HPjAxnyhKDDee/9ZpLKKiIiIiEg6b7xhRV6vXvDii+Ac3HwzNGyI79uXlFatWfjwRywo1vyAlrsVKyAl5cS9uypadF8x16OHXR9++L7irlIlsuyKKXkX0QncvfdTganpHhuV5vYarDUv/XbLgKaRzCYiIiIiIvskJMCWLfDff8CH/6PRnVey9pjTmdJmHJseimPDBmvoW7bsHMom/8jEf8/hiP4ncTdv8z/O5dBDrYhr1w4uvhj27FnMmWc25vDDbdqD0KD2EiMRLfxERERERCQ6vIf4eFi/vgTz5lkBt2XL/pf0j6W9n5Bg++nAt3xOb2bSklMXvM+Om210+XLlrNWuQQOof3pTvqwyg/PHdueDP88n8b6HKT78tv2a7aZPX0vHjo2jfRgkEyr8RERERERiLDERtm2zicK3bt3/dvr7md3evt2mOoA2GT6Hc3DIIVChAlSsaJejjtr/fsP43zj3mXPYXeVwir46ldn1ylKhgq1TvHj6PR4KN0+Hyy6j+N13wLLF8PLLGa0oAaDCT0REREQiY88emD8fZs2CmTNhzRq4/35o1SrWyaJm1y473y11xMrUyz//wObN+wq3XbsOvq/ixa1wS72UL2+DoaTeTn187dolnHhio73FXMWKVriVL3+Q7pZ//AEnnQFVK1Lsxy9oWavywUOVKgXvvAONG8N999mJfpMnQ5Uq2To+Ej0q/EREREQk75KS4PffrchLLfTmzbPiD2w8/iJF4OST4d13oWvX2OYNk9279xV2n39egy++2DclwfLlNtF4WsWK2fQDdevaXOjpi7a0t9PfL1Eie5mmT/+Xjh0b5eyFrF4NnUNDb3z5JdSqlf1tnYN774Ujj4TLL7c5Fz7+OGfPLxGnwi/COnbsyA033MB5552397FnnnmGP/74g5deeinD9Z944glatWrFWWedxfjx46lQocJ+69x7772ULVuWIUOGZPq8U6ZM4cgjj6RJkyYAPPDAA5x++umcdtpp4XlhIiIiUnilpFjr0MyZNJgyBYYNg99+29dsVb68terdeKNdH3+8VTrr1lnB1707vPACXHVVLF9FtuzZAytX7l/Mpb29Zr9ZqhtRtKgVdPXqwVln7RvJsl49u9SoEcBBTjZvtqJv82aYNs0KuNy48EJ7seeeC23aUHH4cIjl3HWLFnHI3LnQoUO2Jpwv6FT4RVjfvn2ZPHnyfoXfhAkTePzxxw+67dSpUw+6TmamTJlC165d9xZ+w4cPj/lE9iIiIpIPeW/d91Jb8WbNgtmz7YQyoEbJklbcDR5s161a2egfGX3RPvRQ+PZb6NMHrr7aKqeHH475l/KdO+Gvv+yydOn+l5UrU8+bM3FxULu2FXGdO+9f2P377//Ro0cbiuanb9jx8Vah/vUXfPYZtGyZt/21aQMzZkDXrhx3++1Qtqy9N6IlIQEmTbJzDX/4geZg/2S4+mq47DLr81pI5ae3Zb7Us2dPhg0bxu7duylRogTLly9nzZo1jB8/nptuuoldu3bRs2dP7rvvvgO2rVevHrNmzaJKlSo8+OCDjB07ltq1a1O1a
lVahj6Ur7zyCqNHj2bPnj00aNCAcePGMWfOHD788EO+/fZbHnjgASZPnszdd9/NeeedR8+ePfn6668ZMmQISUlJHH/88YwcOZISJUpQr149+vXrx0cffURiYiLvvfcejRtrJCYREZFCZdMmK85Su2zOmmVDPoL1NWzWDC69dG9L3vdr19Lx1FMz3d2//8JPP8GPP9rpfuXLl+HQmh/Qr9X1tH7sMVb8sILl975B1VolqF7dvpdHog7cti3jwm7p0vStdtYrtUEDOOmk/eeaq1fPpiUoVizj55g+fXf+Kvr27LFJ9WbOtPPywtU6V7cu/PQTmzt3pvJVV8GiRTYRfCQPzuLFMHo0vPmmtVw2bAiPP87vW7bQ5Ntvbf7B4cNtnolrroHjjotcloDKT2/NfKly5cq0bNmSzz77jO7duzNhwgR69+7N0KFDqVSpEsnJyZx66qnMmzeP4zJ5A/76669MmDCB3377jaSkJFq0aLG38Dv//PMZMGAAYK16r732Gtdddx3dunWja9eu9OzZc799JSQk0L9/f77++muOPPJILr30UkaOHMmNN94IQJUqVZg9ezYvvfQSTzzxBK+++mrkDo6IiIgEw5498Omn9qX5449tiMmiReHYY23y7tSWvGOOObDq2bBh783kZFiwwIq8n36yy99/27KSJW3zNWvgu/VFGbnxRW6hPo//dBvLO6+mHVPYQiWKFoWqVaFatexdSpfeF2Xz5gOLutRib/36/WMfeqgVd6efbteplyOOsMFQCrzkZCvgv/gCxoyx7pnhVK4c8x94gI6ffAJPPw1//gkTJlg34HDZvdsK1pdfhu++s/fmeefBoEHQqRM4x/rp02ny4IPWFfnFF2HsWCsQO3SwAvC88zKv5AuYwlX43XgjzJkT3n02awbPPJPlKj179mTChAl7C78xY8YwceJERo8eTVJSEv/++y+///57poXf999/z3nnnUfp0G+2bt267V22YMEChg8fzn///Ud8fDxnnHFGllmWLFlC/fr1OTLUd7tfv368+OKLewu/888/H4CWLVvy/vvvZ+MAiIiISL7kvXXZfPNNG5Vx40arpK691oq95s2tWsvCtm0wc2ZFpk+3Yu/nn63nIFhh1a4dXHcdtG1ru0s7yn9SkmPTpltZMbYOJw27lOVV2jL5yk9Zmlyf9evZe/nrL7tO3W96ZcpYobhpU7vU3qd71aplxVy3bvsXd4cfbnPSFVre28/53Xfh8cetC2QkxMXBU09Bo0ZWZLVtCx99ZM2nefHHH1a8vfGGtVAffjg88oi9jmrVMt6meXN49VV47DF4/XUrAnv3hpo1rVAcONDetAVY4Sr8YqRr164MGzaM2bNns2vXLipWrMgTTzzBzJkzqVixIv379ychdcbMTLg0k2Gm1b9/f6ZMmULTpk154403mD59epb78d5nubxEaLiouLg4kpKSslxXRERE8qE1a+Ctt6zlY+FCq8a6d4d+/eCMMzLtjue9td6ldtv86Sfruul9U4oUscbBSy+17/bt2llvv0y+vgD2NNWrA7f2hhNrUr57dy57+URrcTz++APW37nTGhfXrWO/wjD1sn37ejp0OGy/4q5UqTAds4Lmnntg1Ci4/XbIYrDAsBk0yJpSe/WyET+nTLE3Sk7s2QMffGCte9Om2Ruoe3fb96mnZr9/cKVKcMst1iD02Wd2/t8998ADD0DPnlYQt2mT9Zs3nypchd9BWuYipWzZsnTs2JHLL7+cvn37sm3bNsqUKcMhhxzCunXr+PTTT+mYRZ/qDh060L9/f+644w6SkpL46KOPGDRoEADbt2+nRo0aJCYm8vbbb3PYYYcBUK5cOban/7cX0LhxY5YvX87SpUv3nhN48sknR+R1i4iISEDs3An/+5+17n35pY1W0qaNffm/4IIM+zbu3m2949J220ydmqBcOdv8/POhbNm5DBjQNG89+Nq3tyc46yw7z+ydd6yZLo3SpfdNg5CR6dP/pGPHw
/IQopB49lkYMQKuuMIG1omW006zJuGuXa0b5pgxcNFFB99u6VJ45RVrpduwwU60fOgha93LSwtdXBycfbZd/vwTRo60TO+8Y62D11wDffvu35c4nytchV8M9e3bl/PPP58JEybQuHFjmjdvztFHH83hhx9Ou3btsty2RYsW9O7dm2bNmlG3bl3at2+/d9mIESM44YQTqFu3Lscee+zeYq9Pnz4MGDCA5557jkmTJu1dv2TJkrz++uv06tVr7+Aug6M50pKIiIhEh/fwww+kvP4mbtJ7uO3b2FOjDusvHco/HS5l3SFHsn07bH/bBuhMe1m61MZ02b3bdnX44fa9vV07a6g5+uh9UxJMn74lPKdtNW4M//d/cM45dt7Vc8/Zl28Jn7fespau88+3oj/arVqNGlnx16OHDbKyeLFN+p6+tW7PHvtHxejR8NVX9mbr1s1a904/Pfyj/zRsaF1SR4yAt9+2VsArr4Rbb7UC+aqr7EOQz6nwi5Lzzjtvv26Wb7zxRobrpe2quXz58r23hw0bxrBhww5Y/6qrruKqDObAadeuHb///vve+6NGjdo7ncOpp57Kb7/9dsA2aZ+vVatWB+02KiIiItG1e7c1TixebKc5zZlzBG+nK9zKb1xG57Vj6b51LHVT/mYnZZhET96kH9/+ezL+jSLwxoH7jouzlrzy5W3kymuvtSKvbdsonvpUvbp14+vb1wL884+du6U52PLuk0+gf3845RQrbmI1/GjlyjagzFVXWffKxYutJbp0aZs2JLV1b906mxBxxAibFL5mzchnK1PGzvUbMAC+/94KwKefthFJzzrL3pOdO+fb96MKPxEREZGA2bjRvg+nv/z99/5zypUsWZMKFaBG6a2cl/Qe3baOpenW70nBseSwU3jnuPtY3uJ8SlYuw0XlYHA5K+4yupQsGZDTmsqUsXO5brjBBh755x8rDA4y0Ixk4fvv7fy15s3t/LpYH8vixW2glaOOgttuszd2akFYpIh1Bx00yM45jcVs987ZqJ8dOsDq1dby+PLLcOaZdgLpNddYEZ3P5gRU4SciIiISA8nJ9n03owJv06Z965UoYT3kWra0U6IaN7bLkUck89eoJzluzhwrlBIS4Mgj4bYHKXLxxRxVpw5HxezV5VFcHDz/vI3+OGSIDUgzZYoVB5Izc+da99m6dWHq1OAMZ+qc/WyPPBIuvNDOM733XutaWatWrNPtc9hh1h112DCbOuKFF+Cmm+z+pElWDOYTKvxEREREIig+HpYsObC4++MPO5UpVbVqVtD16LGvuGvc2Hq7HdDosWwZtDyD45YutS/Ml11mo3K2bh2QZrswcM5GX6xTBy65xPqcfvppgTjXKmr++stazcqVs9a0qlVjnehA3bpZYV+mTGxa97KreHHrgty3r4169NJLNrdlPlIoCj/vfabTIUjOHGw6CBERkXwnKckGEkktoHJpzx4r5ubP3//yzz/71omLs1HtGze2U4ZSi7tGjWyU+WxZvtxGRYyPZ+Hdd3P0nXdas2BB1asX1KhhQ/efGJruoXXrWKcKvn//tYFQkpLsvMk6dWKdKHPhnNQ9Gpo3t3MR85kCX/iVLFmSTZs2UblyZRV/eeS9Z9OmTZSMdb9wERGRcFm2zFqTfvrJ7q9ZY124suA9rF1bgo8/3r/AW7IEEhNtnaJFrZhr08bGiTjqKLscccT+k5jn2D//WNG3fTt8/TUbtm4t2EVfqpNOsp/RmWfum+6he/dYpwqsotu3W0vf+vVW9B2Vbzv9ShgV+MKvVq1arFq1ig0bNsQsQ0JCQsyLpXBlKFmyJLWC1O9aREQkN7yHceNslD7n7PYXX8Dw4bBjBzz4IDjHli0HtuAtWADbtrXZu6s6dWzy8q5d7frYY63oy1OBl5GVK21Exi1b4OuvrdWhMI3AnToVQNrpHq69NtapgmfnTo4dOtSGf/3kEzj++FgnkoAo8IVfsWLFqF+/fkwzTJ8+nebNmxf6DCIiI
kHgN2/BDx5Mkfcmkty2PfEjx5FYsy4rj7qQ0qtK0+jhh5ny9g6uSXyGNf/u6y1UoYIVdRdfDCVK/EGPHkdyzDFwyCFRCL16tRV9GzfaBOwtW0bhSQOoWjVrwbrwQrjuOuv2+thj+XZ4/RzbuRPWrrVunJldr1hB+S1bYOJEm3xRJKTAF34iIiKSv3kPixbZuB7ffgurVjWlbFk7dSkx0a6ze2m3Zxqvp1zKoaxlOA/y6E+3k9I0dUCJIsBInilSmhtWPE2FBjuZ9cgojmkaxzHH2OB+qWeNTJ++hnbtjozOAfj3Xyv61q2zVsnCfn5b6dI2uuJNN9n8aitWwNixsU6VeykpNozrwQq6tWth27YDty9SxOY/rFHDLi1asKB+fY7t0SP6r0UCTYWfiIiIBE58vPVm/PRTu6xYYY83bgzFizuKFYNSpexcuswuxYrtu13C7eHM/7uLDr88zpZKDRh7wf9Rvm4rHk2zfrVqcOyxjoYNnoT7y9DxgQfoOG8n3PJm7Ca7XrvWir41a+Dzz21wE7FRcp59FurVs5E/V6+m2JAhsU6VPQkJMH48vPGGnWO6bp39VyK9smXh0EOtmGvaFLp02Xc/7XWVKgeMhrmpMHUBlmxT4SciIiIx5z38/vu+Qu/77601r2xZ6602bJh9761TB6ZPn0PHjh2zv/PFi20CvNmzYeBAKj/1FFeUKZPFBg5GjLDh5YcOhV27bDCRaA+isn49nHqqVb2ffWbTGcg+zsHNN9ub4uKLObFPHxtq/6qrgjmtxZo1NgXAyy9bl90mTWzUzfTFXOrtsmVjnVgKGBV+IiIiEhOZteodcwzceKMN4NiuXR4GSfEeRo2yFqHSpW0C8JyMBHnHHbbdDTfAuefC++9bM2M0bNhgRd/ff9uk2+3bR+d586OePeHoo1l7xx0cNnkyvPkmNGtmBeCFF8a+gJoxw1onJ06E5GQbnOaGG2x01qAVp1KgFZIzYUVERCTWvIeFC+GJJ6ymqVTJ6qm334YWLawhZMUKGznzscfse3Gui771621i6Kuvhg4dbKe5Gf7/+uvh1Vetm+VZZ9k0CpG2aZM1cy5danPW5aR1s7A66ij+vOkma1UbNcrebIMGQc2acM019vOPpsREmDDB5vM44QT46CPL8eef8L//WfddFX0SZWrxExERKYz++QdGjrTWkJ497eS5CAhNN8enn1pvxbC36mVk6lSbiH3rVmtpufbavI36eMUV1tJ36aXQubO9mEjZvNmKviVLrFg45ZTIPVdBVK6cFXwDB9rUD6NGwWuvWRfLdu1g8GB7v0domq1iW7fCQw/Z861eDQ0a2Huwf//8N0m5FDgq/ERERAqTpUvh4Yf3jYKYlAR33WXnG/XsCT162JwFuWyN2LMH5s610TczOldv+HA7V6927TC+plS7dsGtt8KLL9pr+Ooruw6HCy+04q93bzjlFIrdc0949pvWli1WWP7+u7UKnX56+J+jsHDOWtvatIGnnrLun6NGwSWX2H8cLr/cisMGDcLzfPPnw7PPcuK4cfYhOO00e76zzio8U01I4EX0neic6+KcW+KcW+qcuyOD5Yc45z5yzs11zi10zl2W3W1FREQkB37/3Saga9TIRhS86iobUXDVKnj+eRvS8oEHbPTAI4+089tmzbIuc5nw3nquvfWW9Yg88URrcGnd2uqv9evtO/Y331jvxQ8+gAEDIlT0zZkDrVpZ0XfTTXZeVbiKvlTnnQcffgiLFtHshhtsiP1w2boVzjgD5s2zA9WlS/j2XdhVrmyDwCxebP8M6NjRisGGDe2Yf/BBxqNqHkxy8r5um8cdB2+/zbrOnWHBAptrsWtXFX0SKBFr8XPOxQEvAqcDq4CZzrkPvfe/p1ntGuB37/05zrmqwBLn3NtAcja2FRERkYP57Td48EGb96xMGRvo5OabbdTAVNdea5f1620AlMmT7US8Rx+FunXh/POhZ0+2bIrj44+tpkq9bNliuyhd2uqu6
6+3U5pOPBFq1YrC60tJsS/xw4bZF/zPP7dWs0jp0gU+/ZSSZ55p5w5+/bWNKpkX27bZfufMsWN/1llhiSrpFCliJ5eeeqqdC/jaazB6tL2/a9a0/0oMGGATNmZl61Z4/XX7h8myZfZGf/hhGDCAP+bPp+bRR0fn9YjkUCT/DdEaWOq9X+a93wNMANKfVe2Bcs45B5QFNgNJ2dxWREREMvPzz9bi0KKFtXLcdZed1/fYY/sXfWlVq2bd3z7/nJ3L1/PH0NdZVvoYEp99Edq1o3XPPvx9znX88MB01v+bTM+e8Mor1ki1dat173z8cesxGpWib/VqK/JuvRXOPtuCRLLoS9WxI3OffNJG3mzf3rrP5tb27Xai46xZNurjOeeEL6dkrmZN+0z8/be12h13HNx/v/2j47zz4Isv7J8Kaf35p/1no1Yta1U+9FB4910r/u64w/7xIBJgkTzH7zBgZZr7q4AT0q3zAvAhsAYoB/T23qc457KzrYiIiKTlPXz3nc1B9/XX9kX0gQdsNMEKFTLdLDkZFi2CX37Z15I3f34lkpP7A/05pvZWrqz5CSevf4Or17zGdbtfgH+rgjsP6vSAxp2gaLFovUozaZIVqbt3W/V5xRVRHSVxW5MmMG2aFZodOlhx3aRJznYSH2+te7/8YgXEuedGJKtkoWhRG/21Wzcr4EaPhjFjrOX7iCNsoJgmTWwgpKlTbf3evW06hlatYp1eJEecz6Lvfp527Fwv4Azv/ZWh+5cArb3316VZpyfQDrgZOAL4EmgKnHGwbdPsYyAwEKB69eotJ0yYsHdZfHw8ZWM9d0tAcgQhQ1ByBCFDUHIEIYNyBC9DUHIEIUNQchw0g/dUnDmTum+9RYX589lTsSIr+vTh33POITk071xyMmzeXJwNG0qyfn2J0KUkf/1VhiVLyrFrl/0vuGzZRBo33s5RR22jcePtNG68jUqVEvfmKB8XR+UZM6j67bdU+vlniu7aRWK5cmxs146NHTqwuWVLfFiH6Nxf3M6d1H36aep89RXbGjdm0bBh7IpK8+L+Un8mpZcvp+ktt+CSk5n3+OPEN2yYre2L7NrFcUOHcsj8+fw+fDgbOnXKU45YCkKGcOZwe/ZQ9fvvqfnhh1SYNw+APRUrsuacc1jTrRt7smjZK2jHoiDkCEKGaOfo1KnTr977A/4zEcnCrw1wr/f+jND9oQDe+4fTrPMJ8Ij3/vvQ/W+AO4C4g22bkVatWvlZs2btvT99+nQ6BmDumyDkCEKGoOQIQoag5AhCBuUIXoag5AhChqDkyDRDSgp89BH+gQdws2axp3ptFnW7nR8bXc7ydaVYuZK9lzVrDhy/okwZOPpoG4zlhBPsukGDzMejOCBHQoJ1iZs0yQY92brVRnc55xw4+WRrgUxMtEtS0r7bWV0Ott6KFfj163FDh8I990CxKLc0ZnQsli6188a2bbM5K044SCelnTvtGE2fbiPj9O0bnhwxEoQMEcuxcKH9fM84I1tTQBToY5FPcwQhQ7RzOOcyLPwi2dVzJtDQOVcfWA30AS5Mt84K4FTge+dcdaARsAz4LxvbioiIFArx8XHMm7eviFv1TzKH/jiZM2c/yBE75vE3h/MQrzB23aUkvmKtbSVK2KlItWtbDVa79oGXChXy2DuyZMl93eT27LHhOydNsm5y48dnvl1cnBVsqZeiRfe/n9GlZEkrKmvVYs7JJ9P8+uvzEDzMGjSwLrannmrD+H/8sR30jOzaZRPJT5tmU2rkoeiTKDj6aLuIFAARK/y890nOuWuBz7EWvDHe+4XOucGh5aOAEcAbzrn5gANu995vBMho20hlFRERCZrERGtEGzUKvvqqPQBFSaQv73AnD9GYJSwv1ZgXTxzHynZ9aFq3KBPTFHVVq0b1lDebgb1LF7uMGmVNjBkVdEWL5nmI+63Tp4cnczjVrWuTFp52mh2DKVOslSithAQbOOTrr21UyIsvjklUESmcIjqBu/d+KjA13
WOj0txeA2Q4/FZG24qIiBR0//xjY5W89hqsXWtF3GUXLuHqst9yzMePUHLN3/imTWH4e9Q77zyuiYuLdeQDFS2a9ykO8qMaNaz7ZufO1gqadsCW3buhRw+bbuK116Bfv1gmFZFCKKKFn4iIiBxccrINGPjyy3YNNjvB4MHQpc7vJJ7akZIbNthJeKOexXXtGuXmPMm2qlWtG+eZZ9q8FuPGWcHXq9e+H/Lll8c6pYgUQir8REREYiR1DulXXrFz9w491OYhv/JK6zlISgp0GEjK7t02iMppp6ngyw8qVLCfV7ducNFFNsH8rFnw0ks2BYWISAyo8BMREYmilBSb8m3UKDuHLzkZTj8dnnnGBnrcb5DKsWPhxx/569ZbaXz66bGKLLlRrhx88om19n32GTz/PFx1VaxTiUghpsJPREQkCtavt/E8Ro+2eaKrVIFbboEBA2xQyANs3gy33gpt27K2SxcaRz2x5Fnp0lbdL1sGjRrFOo2IFHIq/ERERCLEexvlf9QomDzZRurs0AEeeADOP9+mXMjUnXfCli0wcqQVgZI/FSumok9EAkGFn4iISJht3my9NF9+GRYvtlO+rr7aTu9q0iQbO/jlF2savPFGOO44GylSREQkD1T4iYiIhIH38PPP1ro3caJN2Xbiida984ILrNdftiQn27lgNWrAvfdGMrKIiBQiKvxERETSSUy0VrvNm2H+/PJs22a3N23a93jq7bTXO3ZA2bLQvz8MGgTNmuXiyV96CX77zeaAK18+zK9MREQKKxV+IiJS4HlvhdnSpfDXXzbQSkbFW+rt7dvTbt1iv33FxUGlSnapXBlq1bLemJUrWzfO3r1tQMdc+fdfGD7chvns1Su3L1dEROQAKvxERKRA8B42bLDibulS+PPPfbeXLoX//tt//SJFoGJFK9gqVbKelUcfva+gS71euXIunTo13ftY+fIRnEpvyBDrI/rii5qvT0REwkqFn4iI5BveW2td2qIu7e1t2/atW6QI1KtnUyVceCE0bGi3jzjCirzy5W2dg5k+fQutWkXsJe3zzTcwfjzcfbeFFRERCSMVfiIiEjgJCTB//iEsW3ZggRcfv2+9uDgr7ho2hLZtrbBLLfDq1YPixWP1CnJozx4b9vPww+GOO2KdRkRECiAVfiIiEgjx8TB1Krz/PnzyCcTHNwegaFGoX9+KuQ4d7Dq1wKtb16ZJy/eefBKWLLEDUKpUrNOIiEgBpMJPRERiZssW+Ogjm9z8889h926oVs26ZtatO5/evY+lbl0r/gqs5cthxAib0f3MM2OdRkRECqiC/KdUREQCaN06mDLFWva++QaSkqB2bRg82Gqfdu2sC+f06Zs44ohYp42CG26wkw2feSbWSUREpABT4SciIhG3cqUVepMnww8/2CAtDRrALbdAjx7QqlW6QSw/+YTDx42DNm2gRImY5Y64Dz+0y2OPWfUrIiISISr8REQkIv780wq999+HmTPtsWOPhXvusZa9Y47JYMaCtWutBWziROqAzafw4otRTh4lO3fC9dfb5H833hjrNCIiUsCp8BMRkbDwHubP39eyt2CBPX788fDII1bsZTpLQUoKvPoq3HabDek5YgQr586l9ksvWd/PCy+M2uuImgcfhH/+gW+/LSAj1IiISJCp8BMRkVzz3lrzUlv2li61Vrz27eHZZ+Hcc6FOnYPsZNEiGDjQ+oB27AgvvwxHHsmyr76i9tq1MGAANGtmLWMFxeLF8PjjcOmlNlSpiIhIhKnwExGRbEtKgrlz4fvvrU774QcbrKVoUTjlFLj1VujeHapXz8bOEhLg4YftUq4cvP469Ou3t/+nL1oU3n0Xmje3EwFnzLD18jvvbc6+MmWs+BMREYkCFX4iIpKpHTvgl1/2FXn/93/7JlCvXx86d4bTToNzzoGKFXOw42+/hUGDbO66iy6Cp56yeRzSq1kTJkywJxkwAN55J4MTA/OZd96BadNg5MiMX7OIiEgEqPATE
ZG9Nm60Au+dd47g9tth9mxr5XMOjjvOGuTat7fT7mrVysUTbNli5/G9+qpVjp99BmeckfU2nTrBAw/AnXfCSSfBtdfm6rUFwtatcPPNduLjgAGxTiMiIoWICj8RkULKe/j7732ted9/b6eeARQrdhgnnmg12kkn2awKFSrk8cnefddG7Ny0yXZ8zz1QunT2tr/9dvjpJyuaWrWCE0/MQ5gYuusuWL8ePvnEJisUERGJEhV+IiJBkJgI27fj9uyJ2FMkJ9uom6lF3g8/wJo1tqxCBWvF69/fCr0dO76nc+eTw/PEy5fDVVdZ616rVvD55zZYS04UKQJjx0KLFnDBBdYUWaVKePJFy+zZNjXF1VdDy5axTiMiIoWMCj8Rkdzw3k6A274988u2bVkvT3tJSACgTYUKVhi1bh2WmLt3wwcfwLhxVuht22aP16oFJ59s3TZPOgmOPtpqq1TTp/u8P3lSkg3teffd1lf0mWesm2ZuW7oqVoRJk6BtWzsvcOrU/NNqlpJixW+VKtZtVUREJMpU+ImI5MSePTbQyA8/WPGXHWXL2miU5cvbdblyULfuvtuplzJlSH7iCRse84MP4PTTcx1z8WJ45RV4803rWVmvnk2Fd9JJVuwddIqFvPr1VzuH7bffoGtXa+kKx5O2bAnPP28DwzzwgHUXzQ9efdVGJR03Lo99ZkVERHInooWfc64L8CwQB7zqvX8k3fJbgYvSZDkKqOq93+ycWw5sB5KBJO99q0hmFRHJltdft36S11yTcfGWQTG3X1PaQfxWty5t778fzj7bujb26ZPtbXftsvn0Ro+2iEWL2jx6AwfCqafmKEbuxcdbC9+zz9qIle+9Z1MxhHMkzgEDrPC+7z471+9gg8PE2oYNcMcd1sR60UUHX19ERCQCIlb4OefigBeB04FVwEzn3Ife+99T1/HePw48Hlr/HOAm7/3mNLvp5L3fGKmMIiI5sns3PPigjXTy/PMRmVZgT+XKNtVBt27WRLdx40FHsVywwFr3xo2zQTMbNIBHH7UROLM1n164fPKJnb+2YgUMHmzz80Widcs5GDXKWhMvusiua9cO//OEy+23W3fel17K/1NRiIhIvhXJ//+2BpZ675d57/cAE4DuWazfF3gngnlERPLmtddg5Uq4//7IfoFPPc/vnHPguuusBS1dt9KdO+GNN+x0t2OPtTrojDPgm29sarzbboti0bd2LfTubV06y5Sx5saRIyPbpbF0aWve3LMHevWy6yD64QdrJb7lFmjSJNZpRESkEItk4XcYsDLN/VWhxw7gnCsNdAEmp3nYA1845351zg2MWEoRkexISICHHrKT5E49NfLPV6qUFTaXXw4jRtjAIMnJzJ1rDYA1a8Jll8HmzfDEE7Bqlc0L3qlTlLp0ghWjr74KRx0FU6ZYQfzbb3aMouHII2HMGJthfsiQ6DxnTiQm2s+tTh2bxkFERCSGnM/u4AQ53bFzvYAzvPdXhu5fArT23l+Xwbq9gYu99+ekeaym936Nc64a8CVwnff+uwy2HQgMBKhevXrLCRMm7F0WHx9P2bJlw/zKci4IOYKQISg5gpAhKDmCkCG/5Djs/fdp+PzzzHnySf5r0SJ6Gbyn9sjXOOK9t/m8XDe6b3+XlGLFOfnkDXTtuobjjtsakcbHg/1MSqxdS6MnnqDSr7/yX9OmLLn5ZnaFecSY7L4vjnjxRWpPmsTCu+5iwymnhDVDTnKkV2viRBqMHMn8ESPYlMdiOD98RgpbjiBkCEqOIGQISo4gZFCO4GWIdo5OnTr9muH4KN77iFyANsDnae4PBYZmsu4HwIVZ7OteYMjBnrNly5Y+rWnTpvkgCEKOIGTwPhg5gpDB+2DkCEIG7/NBjp07va9Rw/uTT/Y+JSVqGX791ftBg7wvW9b7G3nKe/ArGnbym/7eGtEM6XPsJyXF+1GjLFSZMt6/9JL3ycnRzZDenj3et21rmRYtil2OtFautDxdu4blPRP4z0iUB
SFHEDJ4H4wcQcjgfTByBCGD98oRtAzeRzcHMMtnUCtFskPQTKChc66+c6440Af4MP1KzrlDgJOB/6V5rIxzrlzqbaAzsCCCWUVEMjd6NPz7r40iGeHBOXbsiOPll23WgpYtbTqGHj2g14834ceOo/bf31Pp/I6wbl1Ec2To779tKovBg22ewQULrCtj1PqWZqJYMZg40brH9uhhI4vG2k032TyGzz2nAV1ERCQQIjaqp/c+yTl3LfA5Np3DGO/9Qufc4NDyUaFVzwO+8N7vSLN5deADZ38siwLjvfefRSqriEimdu600SlPOcWG4w+j3bth0SKYP3/fZfr0tiQk2IAtzz9vg1ZWrBjaoO3FUKWyFTft2sEXX8Dhh4c1U4ZSUmz0mNtusyLm5ZdtSoUgFTSHHQbjx0PnzlaYjhsXu3yffWYTzT/wANSvH5sMIiIi6UR0Hj/v/VRgarrHRqW7/wbwRrrHlgFNI5lNRCRbRo2y1rVJk3K9i5QUWL58/wJv/nz44w9ITrZ1ihe3MVJOO20dw4fXpHXrTOqWM8+Er7+2ef7atbMio2kEf10uWwZXXAHTp9uE8q+8YvMXBtFpp9kAM3fdZcfmqquinyEhwUbfOfLIYA44IyIihVZECz8RkXxtxw545BErKLI5OMeGDQcWeAsX2q5S1a9vLXrnn2/Xxx4LDRtaj8Xp0//ghBNqZv0kbdrYNAFnnAEdOsBHH9l1OKWkwAsv2Bx0cXFW8F1xRbBa+TJy553w009w443QqhUcf3x0n//RR+Gvv+DLL6FEieg+t4iISBZU+ImIZOall6ySu+++Axbt3Am//35gkZf21LsqVayou/zyfQXe0UdDuXJhyNakCfz4oxV/nTvDu+9C96ymSs2Bv/6i2U03wbx5tv/Ro21KgvygSBHr5tmiBfTsCbNnQ+XK0XnupUutW3CfPvbPAhERkQBR4ScikpH4eHjsMSt82rYFrLvm+PFWY82fv29O9ZIlraA788x9Bd6xx9oE6hFtIKtTxyZLP/tsaz585RWrMnMrtZVv6FDKOmcT1l92WfBb+dKrXNm65p50ElxyCXz8ceQGoElMhG++scFl3n/f+uw++WRknktERCQPVPiJiGTkhRdg40a23nwf40fC229bAxtYPXH33fsKvCOOsN6QMVGlip3z17OndcVcv966Z+a0WPvzT9v+++/hzDOZedlltOnVKzKZo+H44+GZZ+Dqq+Ghh2D48PDtOykJvv3W/gPw/vuwaZM14557rp3fV/MgXXVFRERiQIWfiEg6O9duI+7Bx5lX7Szann0CSUnWs/Khh6BvX6hXL9YJ0ylbFj78EPr3h6FDrfh74onstXIlJ9uUA8OGWWvV669Dv37s/vbbiMeOuMGD7VzIu++GE0/MW/fL5GQriidOhMmT7RiXKQPdukHv3tYyXLJk+LKLiIiEmQo/ERGsEWfGjIqMGQNHvPs89+zZzL2l7uXGG21KhaZNA97jsXhxeOstqFoVnn7azk0cM8ZGjMnMH39YV86ffrLuoi+/bNMiFBTO2fmJc+bAhRfCb7/l7PWlpMBPP9Hguees4l+71uYK7NrVir2zzrL7IiIi+YAKPxEptLyHmTOtG+e778K6dU2pXX4rI3mSjW3O4cPvj49dF87cKFLEujdWr24teJs2wXvvWctUWsnJtt7w4dZKNXYsXHxxwCvbXCpTxlroWrWCCy6waSmyKoa9h19+sTfEe+/B6tXUKF7cir0LLrDr9MdTREQkH1DhJyKFzp9/WrE3frzdTv1e37TpAoYmvU+xEVso88K9kJ+KvlTO2ZQGVataV8fTTrPBTVJHtly82Fr5fv4ZzjnHWvlq1Iht5khr3NgGqunTxyahf/rp/Zd7D7NmWTfOiRNhxQp7U3TpAo89xk8VKtD+rLNik11ERCRMVPiJSKGwbp014rz1lrXyOQcdO9o4KD16QIUK8MPHyyl28VM2SEeLFjFOnEcDBtjAL337Qvv2MHWqt
WDddReULm0H4sILC2YrX0Z697bReZ55xiZ379HDuoCmFnvLlkHRojY1xogRdu5ehQoAJE+fHsPgIiIi4aHCT0QKrO3bYcoUa9376ivr4di0qc3S0Lcv1Kq1//q13nsPtm6Fe++NRdzwO+88+PxzK2IaNrQTGc89F0aOhEMPjXW66HviCZgxw1o877zTmnvj4qxVdNgwOzaVKsU6pYiISESo8BORAsF7WLnSvtf/8otdz5gBCQlQt6718LvoIptvL0ObN1Nr8mRrCWraNKrZI+rkk23qgSFDbLqGPn0KTytfesWLW6vnqafaHIi33mrFcZUqsU4mIiIScSr8RCRf+u8/67KZWuD98ot15wT7ft+8OQwaZNPbtW2bjZkNnnqKojt2wD33RDp69DVrZk2eArVr22imIiIihYwKPxEJvN27Yd68/VvylizZt7xRI5tGrXVruzRtasVftm3aBM8+y/qTT6basceGPb+IiIhIrKnwE5GDS0mxwUGKF7cqq3bt7E0Ongve26lXaVvy5syBPXtsefXqcMIJcMkldt2q1d4xOHLvySdhxw6W9+tHtTzuSkRERCSIVPiJSNY2bIB+/eDTT/c9VqqUDRbSuLEVgmkv5crlaPfbtsH//V9lvv7aCr2ZM2HLFltWurQVdjfcsK81r3btMJ+itmEDPPcc9O7Nzvr1w7hjERERkeBQ4Scimfv+exv+cuNGeP55OO4462O5eLFdz54NkyZZi2CqGjUyLgjr1iXtbOjx8fDsszbQ4n//HUuRInDMMXZOXmqR16SJjbAfUU88Abt22bl9a9dG+MlEREREYkOFn4gcKCUFHnkE7r4bDj/cJvtu1syWdeiw/7q7d8Nff1khmPby7rv7mu4ASpSAhg1JbtCIWdsbMfaXRsyKb0SXLo1o3Xk5Awc2o0yZqL1Cs349vPCCFbeNG6vwExERkQJLhZ+I7G/9ejuB7osvrCB6+eWsu2+WKGFNc02a7P+499ZSGCoEkxcuZuVXS0j+eD4tk6ZwAsm23mfw3+pjKdPjYyhTJ3KvKyOPPWbzPdx9d3SfV0RERCTKVPiJyD7Tp8OFF1pL3ejRcOWVuT+hzjmoWpXkSlV566+TuG8K/P23Ta3w0H2JnFx7mXUZnT+fsg8/DC1awPjx0LlzOF9R5tauhZdegosvhiOPjM5zioiIiMRIZIblE5H8JTkZ7r/fJrYuX96G0hwwIE+jqKSk2FzZxxwD/ftDxYo2MOgPP8DJpxWz8/66d4fhw/l11Cg7N7BLF8uR9pzBSHn0URsq9K67Iv9cIiIiIjGmwk+ksFu71ibBu+cea+2bNcsGcckl7+Hjj6FlS7jgApv1YdIk2+2ZZ2ZcS+6qXdvOI7z4YsvRtavNrRcpa9bAqFFw6aXQoEHknkdEREQkIFT4iRRiFX791QZt+ekneO01GDsWypbN9f6+/tq6cp5zjk3TMG6cTbzeo0c2Gg/LlIE337SC7OuvrXKcNSvXWbL0yCOQlATDh0dm/yIiIiIBo8JPpDBKToZ77qHprbdCpUo2ed7ll+e6a+f//R+ccgqcdhqsWmXjwSxebA14aWZwODjnYNAg6w/qPbRrZzvzPle5MrRqlZ2/2K+fjVgqIiIiUgio8BMpbNassQrt/vtZe8YZVvQdfXSudvXbb9Yrs21bWLgQnnkG/vwTBg6EYsXykPH4422OwFNOgcGD7STBnTvzsMM0Hn7YCl+19omIiEghosJPpDD54gvr2jljBrzxBktuv53cTJ63aBH06mUDcf74Izz0kE3ld8MNULJkmLJWrgyffAL33Wd9Rk880arKvFixAl591Vo369ULS0wRERGR/ECFn0hhkJQEw4bZqJnVqlkrX79+Od7NsmW22THHwGef2YCYf/8NQ4fm6dTAzBUpYnPsffoprF4NrVrBBx/kfn8PPWTdRocNC19GERERkXxAhZ9IQbdqlXWZfOgha+maMePAydYz4b0Vdu+8Y5s2agQTJ8LNN1sRe
P/9UKFCZOMDNurob79B48Zw/vlw661WzObE8uUwZozNTVgnyhPFi4iIiMRYRCdwd851AZ4F4oBXvfePpFt+K3BRmixHAVW995sPtq2IZMOnn8Ill0BCArz1Flx0UZarb9tmjYG//GKzK/z8M2zYYMtKl7Zz94YNg5o1o5A9vTp14LvvrOp84gkrYCdMsPn/suPBB23wmDvvjGxOERERkQCKWOHnnIsDXgROB1YBM51zH3rvf09dx3v/OPB4aP1zgJtCRd9BtxWRLCQm2uAljz1mc/JNnGjNdWkkJ8OyZWX48899hd7vv+8bQLNxYzjrLDu17oQTrHtnngZsCYcSJeDFF200mYED7STDd9+FDh2y3u7vv+GNN2ygmFq1ohJVREREJEgi2eLXGljqvV8G4JybAHQHMive+gLv5HJbEUm1YgX07Wtz8w0aBE8/DaVKsXatFXipRd7MmRAffzxgMzqccIJNuH7iiTaoZsWKMX4dWbnoImja1CYIPOUUm5fvllsyn47igQdsXomhQ6ObU0RERCQgIln4HQasTHN/FXBCRis650oDXYBrc7qtiISkpFh3zptuwu/Zw5/3vcMn5frwy2VW6P3zj61WtKgN7NmvHxxyyCL69z+KBg1yPYVf7BxzzL75B2+91Qrd11+HQw7Zf72lS21i+GuvjVEfVREREZHYcz6cEyOn3bFzvYAzvPdXhu5fArT23l+Xwbq9gYu99+fkYtuBwECA6tWrt5wwYcLeZfHx8ZSNyFCDOROEHEHIEJQcQcgQ7hwVZ82izgujqfjPn8wt2ZI+iW+zONm6dlavnsBRR23jqKO20aTJNho2jKdEiZSwZ8iLPOXwnlqTJnHEqFHsqlGDhffdx44jjti7uPEjj1B12jR+GT+ePZUrRy5HmAQhQ1ByBCFDUHIEIYNyBC9DUHIEIUNQcgQhg3IEL0O0c3Tq1OlX732rAxZ47yNyAdoAn6e5PxQYmsm6HwAX5mbbtJeWLVv6tKZNm+aDIAg5gpDB+2DkCEIG78OTY/uPc/3KY87wHvwy6vnevONPapvsb7/d+w8+8H7NmshnCIew5PjuO+9r1PC+VCnvx461x5Ys8b5IEe9vvjl6OfIoCBm8D0aOIGTwPhg5gpDBe+UIWgbvg5EjCBm8D0aOIGTwXjmClsH76OYAZvkMaqVIdvWcCTR0ztUHVgN9gAvTr+ScOwQ4Gbg4p9uKFEbJyfD9O6tw99xF+2VvsocKPFzlSdy11/Bo/xLUrRvrhDHSvj3Mnm3nN156qc0s/99/NiDMbbfFOp2IiIhITEWs8PPeJznnrgU+x6ZkGOO9X+icGxxaPiq06nnAF977HQfbNlJZRfKDBQtg4itbqfLaowzY8TQOz9fNhlDh0aHccXrF/HeOXiQceih8+aWNaProo/bYkCFQvXpsc4mIiIjEWETn8fPeTwWmpntsVLr7bwBvZGdbkcJmwwYYPx7Gv7GH1nNe5m7upyobWdH+Iqq/+iCnH1lYm/eyULSojfLZpo1N2K7WPhEREZHsFX7OuTLALu99inPuSKAx8Kn3PjGi6UQKod274aOPYOxY+HSqp3vyZN4rMZQ6LGVP+1Pgmcep06JFrGMGX/fudhERERERimRzve+Aks65w4CvgcvIoJVORHLHe5ty4aqroEYN6NUL3E8/srR6WybRizoNS8LUqRT/9iubtFxEREREJAey29XTee93OueuAJ733j/mnPstksFECoN//oFx46x1788/oVQpuOa0JQzZPJTqP35g88699ppNuhcXF+u4IiIiIpJPZbvwc861AS4CrsjhtiKSRmIivPsuPPlkU+bMscc6doT7rl5HjwX3UfyN0VC6NDz4INx4o90WEREREcmD7BZvN2Jz6X0QGpnzcGBaxFKJFEC7dlnj3eOPw4oVUKtWCUaMgEt77KDOpKfgrscgIQEGD4a774Zq1WIdWUREREQKiGwVft77b4FvAZxzRYCN3vvrIxlMpKDYuhVee
gmeftpG6WzXzu6XLvF/dFr+N5x6N/z7L/ToAQ89BEceGevIIiIiIlLAZGtwF+fceOdc+dDonr8DS5xzt0Y2mkj+tm4dDB0KderAnXdCy5bw3Xfwww9wdqlvaD3gShgwAOrXt8nGJ01S0SciIiIiEZHdUT2beO+3Aedic+vVAS6JVCiR/Gz5crj2WqhXz+YQP+MMmD0bPv0U2rcHfvkFzjwTl5gIkydbJdi2bYxTi4iIiEhBlt3Cr5hzrhhW+P0vNH+fj1gqkXzo99/h0kuhQQMYPRouuggWL4aJE6F589BK69ZZl87DDmP2Sy/B+eeDczHNLSIiIiIFX3YHd3kZWA7MBb5zztUFtkUqlEh+MmMGPPwwTJliA3Bedx3ccgvUqpVuxcRE6N0bNm+Gn34i6b//YpBWRERERAqjbLX4ee+f894f5r0/y5t/gE4RziYSWN7D11/DaafBCSfA9Ok2EOc//9ggLgcUfQC33QbffguvvALNmkU5sYiIiIgUZtlq8XPOHQLcA3QIPfQtcD+wNUK5RAIpJQU+/NBa+GbMgBo1bHqGQYOgXLksNhw/Hp55Bm64wfqAioiIiIhEUXbP8RsDbAcuCF22Aa9HKpRI0CQmwtixcOyxcN55sHEjjBoFy5bBkCEHKfrmzoUrr4QOHaxKFBERERGJsuye43eE975Hmvv3OefmRCCPSKDs2gVjxli99s8/VviNHw+9ekHR7Hx6Nm+2SrFiRRvlpVixiGcWEREREUkvu4XfLufcSd77HwCcc+2AXZGLJRJ7U6fC5ZfbQJxt28ILL8DZZ+dgEM7kZOvWuWqVTeBXvXpE84qIiIiIZCa7hd9gYGzoXD+ALUC/yEQSia2UFLj/frscd5w11LVvn4tZF+65Bz77DF5+GU48MSJZRURERESyI1uFn/d+LtDUOVc+dH+bc+5GYF4Es4lE3ZYtcPHF1tp3ySV2Hl/p0rnY0ZQp8OCDcMUVMGBAuGOKiIiIiORIdgd3Aazg896nzt93cwTyiMTMnDnQqhV8+SW89BK8+WYui77Fi20m9+OPt/6hmqBdRERERGIsR4VfOvo2K8GQkmKjsOTBuHHQpg0kJNhUe1ddlct6bft2G8ylZEmYPNmuRURERERiLC+Fnw9bCpHcSkiAzp3hsMOs0MqhPXvgmmusge7EE2H2bCsAc8V76N8f/vzTTgysXTuXOxIRERERCa8sCz/n3Hbn3LYMLtuBmlHKKJKxxES44AL4+msbMbNnTzufbseObG2+ejWcfLJ167zlFuvimaeBNx95BN5/3+Z+6NgxDzsSEREREQmvLAs/73057335DC7lvPfZHRFUJPxSUqx17aOP4MUXYd48GDoUXnsNWrSAX3/NcvNvv7XV5s+3xrknnsjmvHyZ+eILGDYM+vSBG2/Mw45ERERERMIvL109RWLDe7j2WptJ/aGH4OqrbWL0hx6Cb76BnTutv+Zjj1mBmG7TiRNrceqpNqf6jBk2GXue/P23FXzHHguvvqrBXEREREQkcFT4Sf4zbBiMHAm33QZ33LH/so4dYe5c6N4dbr8dTjvNJlAH4uOtPhs5sgHdu1vR16RJHrPs3Annn28V5fvvQ5kyedyhiIiIiEj4qfCT/OXRR+Hhh2HQIDunLqPWtUqVrP/ma69ZdXfccax+/n1at4ZJk2DgwL+YNAnKl89jFu8tx9y51vp4xBF53KGIiIiISGSo8JP8Y9Qoa+Hr29fO68uqS6VzcPnl8NtvbKl0BIdd34Ohywbw1f920LfvyvD0xnz+eXjrLbj/fjjzzDDsUEREREQkMlT4Sf4wfrydy3f22TazelzcQTdJSoKhYxpS/a8feePQO7h4z2t0uqUFZZcsyXue776zoUC7dYM778z7/kREREREIiiihZ9zrotzbolzbqlz7o5M1unonJvjnFvonPs2zePLnXPzQ8tmRTKnBNxHH9lEex06wHvv2UAuB7FhA3TpYr1BLx9UnL7LH8Z9/TXs2EGLa
6/NcOCXbFu92kaEOfxwGDsWiuj/JyIiIiISbBH7xuqciwNeBM4EmgB9nXNN0q1TAXgJ6Oa9PxpIP75iJ+99M+99q0jllICbNs2KrBYt4MMPoVSpg24yYwa0bAk//ABjxlgP0RIlgE6dYN48NrVtawO/nH66FXE5sXu3zRe4cyd88AEcckjuXpeIiIiISBRFsqmiNbDUe7/Me78HmAB0T7fOhcD73vsVAN779RHMI/nNjBnWlfKII+DTTw86Gov3MHo0tG9vjXA//QSXXZZupUqVWHjvvTbtws8/w3HHWQGXXTfcYNu98UYYhgQVEREREYmOSBZ+hwEr09xfFXosrSOBis656c65X51zl6ZZ5oEvQo8PjGBOCaIFC2zAlKpV4csvoXLlLFdPSIArr7RBNjt1svnbW7TIZGXn4Ior4LffoH59m45h4EDYsSPrTK+9Bi+/bAPM9OiRu9clIiIiIhIDznsfmR071ws4w3t/Zej+JUBr7/11adZ5AWgFnAqUAv4PONt7/4dzrqb3fo1zrhrwJXCd9/67DJ5nIDAQoHr16i0nTJiwd1l8fDxly5aNyOvLiSDkCEKG7OYouXo1zW+4AZzjt2efJaFmzSzXX7u2BPfccwx//FGOSy5ZTr9+y7Mc+yVtBpeYSP3XX6f2hAnsqlWL34cNI75RowO2KbdoEc1vuIH/mjZl3iOPZGtwmYMJws8kCBmUI3gZgpIjCBmCkiMIGZQjeBmCkiMIGYKSIwgZlCN4GaKdo1OnTr9meKqc9z4iF6AN8Hma+0OBoenWuQO4N83914BeGezrXmDIwZ6zZcuWPq1p06b5IAhCjiBk8D4bOVat8r5ePe8rV/Z+4cKD7m/RIu8PO8z78uW9/9//8pDhm29sR8WKef/oo94nJ+9btm6d97VqWa6NG7P3JLnNEWVByOC9cgQtg/fByBGEDN4HI0cQMnivHEHL4H0wcgQhg/fByBGEDN4rR9AyeB/dHMAsn0GtFMmunjOBhs65+s654kAf4MN06/wPaO+cK+qcKw2cACxyzpVxzpUDcM6VAToDCyKYVYJg40YbcGXTJvjss4OeQzdvHpx8MiQm2kAu3brl4bk7dbKJ2M85xwZ+6dzZBn5JSoLevS3b++8ftMupiIiIiEgQFY3Ujr33Sc65a4HPgThgjPd+oXNucGj5KO/9IufcZ8A8IAV41Xu/wDl3OPCBs1m2iwLjvfefRSqrBMC2bTb/wt9/W9HXKuuBXGfOhDPOgNKl4euvIYPemTlXuTJMmmTn8t1wgw380r49TJ9u0zY0bx6GJxERERERib6IFX4A3vupwNR0j41Kd/9x4PF0jy0DmkYymwTIrl3W0jZ3LkyZYs14WfjhBzjrLKhSxYq++vXDmMU5GyWmfXu46CL43//guuvgkkvC+CQiIiIiItEV0cJP5KD27LF58b7/HsaPh7PPznL1r76yLp116tjtWrUilKtRI5sPYto0OOWUCD2JiIiIiEh0RPIcP5GsJSfDpZfC1Kk2y3qfPlmu/tFH0LUrNGgA334bwaIvVfHi1p+0WLEIP5GIiIiISGSp8JPY8B6uugrefRcee8zm0cvCxIk23d5xx9kpd9WrRyemiIiIiEhBoMJPos97uO02eOUVGDYMbr01y9XffBP69oUTT7TunZUqRSmniIiIiEgBocJPou/hh+GJJ+Caa2DEiCxXHTkS+ve30+w++wzKl49ORBERERGRgkSFn0TVYR98YK18F18Mzz1no2hm4skn4eqrbcDPjz6CMmWiGFREREREpABR4SfRM24cDZ97Drp3h9dfhyIZv/28h/vvhyFD4IILYPJkKFkyyllFRERERAoQFX4SHT/9BJdfzpbmzWHCBCia8Uwi3sMdd8A990C/fjbDgwbVFBERERHJGxV+EnkbN0Lv3lCnDgvvvz/T5ruUFJsr/bHHbMDPMWMgLi7KWUVERERECiBN4C6RlZICl1wC69fD//0fSdu2ZbhacjIMGGA9Q
IcMseIvi9P/REREREQkB9TiJ5H16KM2HOczz0CLFhmukpgIF11kRd8996joExEREREJN7X4SeR8+y0MHw59+sDgwRmukpBgvUA//NAKvoNM6SciIiIiIrmgwk8iY906K/gaNIDRozNswtu5E849F778El54wab1ExERERGR8FPhJ+GXnAwXXgj//QdffAHlyh2wyvbt0LUr/PCDDeJy2WXRjykiIiIiUlio8JPwu/9++OYbq+iOPfaAxVu2QJcuMHu2TdfQu3cMMoqIiIiIFCIq/CS8vvwSRoywSfgyaMbbsqUYnTrBokU2MXu3bjHIKCIiIiJSyKjwk/BZvdqG52zSBF58McPFN97YjA0b4KOPoHPnGGQUERERESmENJ2DhEdSkg3msnMnvPcelCmzd5H3Nmpn27awYUMJPvtMRZ+IiIiISDSp8JPwGD7cRmoZPRqOOmrvw4sXw5lnQvfuULYsPP30XDp0iGFOEREREZFCSIWf5N0nn9hE7YMG2WiewNatcMstNrbLzz/b/O1z5kCjRttjGlVEREREpDDSOX6SN//8A5dcAs2awTPPkJICb74Jd9wBGzbAlVfCgw9C1aqxDioiIiIiUnip8JPc27PH5mJISoL33uOXuSW5/nqYMQPatIGpU6Fly1iHFBERERERdfWU3Lv9dvjlF7Y8NYbLHmzAiSfCypUwbhz8+KOKPhERERGRoFCLn+TO++/DM88w+6Tr6XhzTxISrA4cNgzKlYt1OBERERERSUuFn+TcX3+ReOnl/F6yNSf+8Didz4ann4aGDWMdTEREREREMqLCT3Lkr4UJ+HYXUHmH48Z67/LBC8U5++xYpxIRERERkazoHD/Jlvh4uPNO+Oq4m2mwdTbT+r3J50vqqegTEREREckHIlr4Oee6OOeWOOeWOufuyGSdjs65Oc65hc65b3OyrUSe9zB+PDRqBH8//A6DUkYSf9WtnP9GN4oXj3U6ERERERHJjogVfs65OOBF4EygCdDXOdck3ToVgJeAbt77o4Fe2d1WIu+336B9e7joIjix4hLeKj0Q2rWj7LMPxjqaiIiIiIjkQCRb/FoDS733y7z3e4AJQPd061wIvO+9XwHgvV+fg20lQjZsgEGDbDqGP/6A11/cyaQivYgrXRImTIBixWIdUUREREREcsB57yOzY+d6Al2891eG7l8CnOC9vzbNOs8AxYCjgXLAs977sdnZNs0+BgIDAapXr95ywoQJe5fFx8dTtmzZiLy+nAhCjuxmmDPnEO666xh27izK+eevol+/f2j50kMc+tlnzHvkEba0bh2VHJEUhAxByRGEDMoRvAxByRGEDEHJEYQMyhG8DEHJEYQMQckRhAzKEbwM0c7RqVOnX733rQ5Y4L2PyAXrtvlqmvuXAM+nW+cF4GegDFAF+BM4MjvbZnRp2bKlT2vatGk+CIKQIzsZVqzwvkoV7xs18n7hwtCDr7/uPXg/fHjUckRaEDJ4H4wcQcjgvXIELYP3wcgRhAzeByNHEDJ4rxxBy+B9MHIEIYP3wcgRhAzeK0fQMngf3RzALJ9BrRTJ6RxWAbXT3K8FrMlgnY3e+x3ADufcd0DTbG4rYZSQAD16wO7dMGUKNG4MLFgAV18NnTrBvffGOKGIiIiIiORWJM/xmwk0dM7Vd84VB/oAH6Zb539Ae+dcUedcaeAEYFE2t5Uwuv56mDkT3nwzVPTFx0OvXlC+vA3rGRcX64giIiIiIpJLEWvx894nOeeuBT4H4oAx3vuFzrnBoeWjvPeLnHOfAfOAFKx75wKAjLaNVNbC7pVX7DJ0KJx3HjaHw6BBNrLLV1/BoYfGOqKIiIiIiORBJLt64r2fCkxN99iodPcfBx7PzrYSfjNmwLXXwumnw4gRoQdfecVa+UaMsG6eIiIiIiKSr0V0AncJtvXr7by+GjXgnXdCvTkXLLB+n2ecAXfeGeuIIiIiIiISBhFt8ZPgSkqC3r1h40b48UeoXDm04J57oGRJG
DcOiuj/AiIiIiIiBYG+2RdSd9wB06fDqFHQokXowd9/h/ffh+uug6pVYxlPRERERETCSIVfITRxIjz5JFxzDfTrl2bBww9D6dJwww0xyyYiIiIiIuGnwq+QWbAALr8c2raFp55Ks2DZMjvRb/BgqFIlZvlERERERCT8VPgVIv/9B+efD+XKwXvvQfHiaRY+9piN7nLLLbGKJyIiIiIiEaLBXQqJlBS49FL4+2/45huoWTPNwtWr4fXX4bLL0i0QEREREZGCQIVfIfHWW3X56CN47jlo3z7dwqeeguRkuP32mGQTEREREZHIUlfPQuDTT+GNN+px8cU2Wft+Nm60oT0vvBDq149JPhERERERiSwVfgXcX39ZTXf44Tt4+WVwLt0Kzz4LO3fa/A4iIiIiIlIgqfArwHbutMFcnIP7719A6dLpVti2DZ5/3lZq0iQmGUVEREREJPJ0jl8B5T0MGADz58PUqVCyZMKBK730EmzdCnfeGf2AIiIiIiISNWrxK6Cefx7Gj4f774cuXTJYYedOG9TljDOgZcuo5xMRERERkehR4VcAffedTcfXrVsWjXmvvQYbNsCwYVHNJiIiIiIi0afCr4BZvRouuMAG6Bw7Fopk9BPes8cmbG/fPoO5HUREREREpKDROX4FyJ490LMnxMfD11/DIYdksuK4cbBqFbzySlTziYiIiIhIbKjwK0Buugl+/hkmToSjj85kpeRkeOQRO6/vjDOimk9ERERERGJDhV8B8cYbNkjnrbdCr15ZrPjee7B0KUyenMGkfiIiIiIiUhDpHL8CYPZsGDwYTjkFHnooixVTUmyFo46Cc8+NVjwREREREYkxtfjlcxs32vzr1arBhAlQNKuf6Mcf28R+mY76IiIiIiIiBZEKv3wsORn69oV//4UffoCqVbNY2Xt48EEb7rNv36hlFBERERGR2FPhl48NHw5ffQWvvgrHH5/1uhVmz4YZM2DUqIM0C4qIiIiISEGj/n751Pvv2+CcAwfCFVccfP26b78NNWpAv36RDyciIiIiIoGipp98aOFCq99at4bnnsvGBv/3f1T87Td48kkoWTLi+UREREREJFjU4pfPbN4M3btD2bLW6leiRDY2eughEsuXt+ZBEREREREpdFT45SNJSTYuy4oVNg3fYYdlY6O5c+Hjj1nVo4dViyIiIiIiUuhEtPBzznVxzi1xzi11zt2RwfKOzrmtzrk5ocvdaZYtd87NDz0+K5I584uhQ+GLL2yi9rZts7nRww9DuXKsPu+8iGYTEREREZHgitg5fs65OOBF4HRgFTDTOfeh9/73dKt+773vmsluOnnvN0YqY37y9tvwxBNw9dVw5ZXZ3OiPP2DiRLj9dpLKlYtoPhERERERCa5Itvi1BpZ675d57/cAE4DuEXy+AuvXX63Y69ABnnkmBxs+8oidBHjjjRFKJiIiIiIi+UEkC7/DgJVp7q8KPZZeG+fcXOfcp865o9M87oEvnHO/OucK7agk69fDeedBtWrw3ntQrFg2N1yxAsaNgwEDoHr1iGYUEREREZFgc977yOzYuV7AGd77K0P3LwFae++vS7NOeSDFex/vnDsLeNZ73zC0rKb3fo1zrhrwJXCd9/67DJ5nIDAQoHr16i0nTJiwd1l8fDxlAzCgSW5zJCY6hgxpypIl5Xj++d9o2DA+29s2eO45an70Eb+8/Ta7q1XL98eioGUISo4gZFCO4GUISo4gZAhKjiBkUI7gZQhKjiBkCEqOIGRQjuBliHaOTp06/eq9b3XAAu99RC5AG+DzNPeHAkMPss1yoEoGj98LDDnYc7Zs2dKnNW3aNB8Euc1x1VXeg/fjx+dww7VrvS9Z0vsrrshzhnALQo4gZPA+GDmCkMF75QhaBu+DkSMIGbwPRo4gZPBeOYKWwftg5AhCBu+DkSMIGbxXjqBl8D66OYBZPoNaKZJdPWcCDZ1z9Z1zxYE+wIdpV3DOHeqcc6HbrbGup5ucc2Wcc+VCj
5cBOgMLIpg1cF55BUaOhNtusykccuTpp2HPHrj99ohkExERERGR/CVio3p675Occ9cCnwNxwBjv/ULn3ODQ8lFAT+Aq51wSsAvo4733zrnqwAehmrAoMN57/1mksgbNjz/CNddAly7w0EM53HjLFpvv4YILoGHDiOQTEREREZH8JWKFH4D3fiowNd1jo9LcfgF4IYPtlgFNI5ktqFatgh49oG5dGD8e4uJyuIPnn4ft223SPxERERERESJc+EnO7NplI3ju2AHffAMVK+ZwB/Hx8OyzcM45cNxxEckoIiIiIiL5jwq/gPAeBg2CWbNgyhRo0iQXO3n5Zdi8GYYNC3c8ERERERHJxyI5uIvkwDPP2LR7990H3XMzzX1CAjzxBJx6KpxwQrjjiYiIiIhIPqYWvwD46isYMsS6eQ4fnsudvP46rF0Lb78d1mwiIiIiIpL/qcUvxpYtg9694aij4M03oUhufiKJifDYY3DiidCpU9gzioiIiIhI/qYWvxiKj7dund7D//4H5crlckfvvAPLl9uInjYFhoiIiIiIyF4q/GLEe+jfH37/HT77DI44Ipc7SkmBhx+2UTzPPjucEUVEREREpIBQ4RcjDz4IkyfDk0/C6afnYUcffACLF8OECWrtExERERGRDOkcvxj46CO46y64+GK46aY87Mh7qyCPPBJ69gxbPhERERERKVjU4hdlixbBRRdBq1YwenQeG+k++wx++w3GjIG4uLBlFBERERGRgkUtflH03382mEupUvD++3adJw89BLVrWyUpIiIiIiKSCbX4RUlyMlx4oQ2++c03Vq/lyXffwQ8/2EiexYuHI6KIiIiIiBRQavGLkuHD4dNPrU476aQ87uzzz+GKK6BaNbsWERERERHJggq/KPjmm6o88ggMGmSXXFu0CM46C7p0sYFdJkwIQ39REREREREp6FT4RdicOfDYY4056SR47rlc7mTTJrjuOjj2WPjpJ3jiCVi4EDp1CmdUEREREREpoHSOXwRt2ADnngvlyycyaVJczk/F27MHXnwR7r8ftm2DwYPh3nuhatUIpBURERERkYJKhV8E7doFNWpA//4LqV69ZfY39B4+/hhuuQX+/BM6d4annoKjj45cWBERERERKbDU1TOC6tSxnpmNGm3P/kbz5sHpp0O3bjY33yef2Hx9KvpERERERCSXVPhFWLYnaF+3DgYOhObNbVL255+3IvCss/I4y7uIiIiIiBR26uoZawkJ8Oyz8OCD1jf0+uvhrrugUqVYJxMRERERkQJChV+seA+TJ8Ntt8Hff8M558Djj0OjRrFOJiIiIiIiBYy6esbCr7/CySdDr15Qpgx8+SV8+KGKPhERERERiQgVftG0Zg307w/HHw+LF8OoUXY+32mnxTqZiIiIiIgUYOrqGQVFEhJgxAh45BFISoJbb4U774RDDol1NBERERERKQRU+EVSSgq88w6tb7rJZnPv0QMeewwOPzzWyUREREREpBBR4RdJ8+bBxReT2LAhJSdNgg4dYp1IREREREQKIRV+kdSsGUybxq8pKXRU0SciIiIiIjES0cFdnHNdnHNLnHNLnXN3ZLC8o3Nuq3NuTuhyd3a3zTc6doQiGkNHRERERERiJ2Itfs65OOBF4HRgFTDTOfeh9/73dKt+773vmsttRURERERE5CAi2RTVGljqvV/mvd8DTAC6R2FbERERERERSSOShd9hwMo091eFHkuvjXNurnPuU+fc0TncVkRERERERA7Cee8js2PnegFneO+vDN2/BGjtvb8uzTrlgRTvfbxz7izgWe99w+xsm2YfA4GBANWrV285YcKEvcvi4+MpW7ZsRF5fTgQhRxAyBCVHEDIEJUcQMihH8DIEJUcQMgQlRxAyKEfwMgQlRxAyBCVHEDIoR/AyRDtHp06dfvXetzpggfc+IhegDfB5mvtDgaEH2WY5UCU323rvadmypU9r2rRpPgiCkCMIGbwPRo4gZPA+GDmCkMF75
QhaBu+DkSMIGbwPRo4gZPBeOYKWwftg5AhCBu+DkSMIGbxXjqBl8D66OYBZPoNaKZJdPWcCDZ1z9Z1zxYE+wIdpV3DOHeqcc6HbrbGup5uys62IiIiIiIhkT8RG9fTeJznnrgU+B+KAMd77hc65waHlo4CewFXOuSRgF9AnVKVmuG2ksoqIiIiIiBRkEZ3A3Xs/FZia7rFRaW6/ALyQ3W1FREREREQk5zSzuIiIiIiISAGnwk9ERERERKSAi9h0DrHgnNsA/JPmoSrAxhjFSSsIOYKQAYKRIwgZIBg5gpABlCNoGSAYOYKQAYKRIwgZQDmClgGCkSMIGSAYOYKQAZQjaBkgujnqeu+rpn+wQBV+6TnnZvmM5rAohDmCkCEoOYKQISg5gpBBOYKXISg5gpAhKDmCkEE5gpchKDmCkCEoOYKQQTmClyEoOdTVU0REREREpIBT4SciIiIiIlLAFfTCb3SsA4QEIUcQMkAwcgQhAwQjRxAygHKkFYQMEIwcQcgAwcgRhAygHGkFIQMEI0cQMkAwcgQhAyhHWkHIAAHIUaDP8RMREREREZGC3+InIiIiIiJS6BWows8518s5t9A5l+Kcy3TUHOfccufcfOfcHOfcrDA9dxfn3BLn3FLn3B0ZLHfOuedCy+c551qE43nTPccY59x659yCTJZ3dM5tDb3uOc65uyOQoaRzboZzbm7oZ3FfButE/Fikea4459xvzrmPM1gW8eMRep4KzrlJzrnFzrlFzrk26ZZH9Hg45xqleY1znHPbnHM3plsnWsfiBufcgtB748YMlkfkWGT02XDOVXLOfemc+zN0XTGTbcPy+yKTDI+H3hfznHMfOOcqZLJtlr9fwpBjRCjDHOfcF865mplsG8ljca9zbnWa9+BZmWwb0WMRevy60HMsdM49lsm2kTwWzZxzP6fu2znXOpNtI/2+aOqc+7/Q6/zIOVc+k23DdSxqO+emhX5PLnTO3RB6PLt/28NyPLLIkd3Pa56PR2YZ0iwf4pzzzrkqmWwf6WOR3c9rxI6Fc+7dNM+/3Dk3J5PtI30ssvt5DdfnJMPvWS77f9PyfDyyyBDV7+FZ5Mju37SIHYs0yw/2WQ17TZIl732BuQBHAY2A6UCrLNZbDlQJ4/PGAX8BhwPFgblAk3TrnAV8CjjgROCXCLz+DkALYEEmyzsCH0f4Z+CAsqHbxYBfgBOjfSzSPNfNwPiMXnc0jkfoed4ErgzdLg5UiOHxiAPWYvO7RPu9cQywACgNFAW+AhpG41hk9NkAHgPuCN2+A3g0k23D8vsikwydgaKh249mlCE7v1/CkKN8mtvXA6NicCzuBYZk4/0b6WPRKfTeLBG6Xy0Gx+IL4MzQ7bOA6TE6FjOBk0O3LwdGRPhY1ABahG6XA/4AmpCNv+3hPB5Z5Djo5zVcxyOzDKH7tYHPsXmLD3ieKB2Lg35eo3Es0qzzJHB3jI7FQT+vYf6cZPg9i2z8TQvX8cgiQ1S/h2eR46B/0yJ9LEL3s/yshvNYZPdSoFr8vPeLvPdLYvDUrYGl3vtl3vs9wASge7p1ugNjvfkZqOCcqxHOEN7774DN4dxnLjJ473186G6x0CX9iaQRPxYAzrlawNnAq+Hedw4ylMe+UL0G4L3f473/L91qUTkeIacCf3nv/4nQ/rNyFPCz936n9z4J+BY4L906ETkWmXw2umNFOaHrc/P6PDnN4L3/InQsAH4GamWwaXZ+v+Q1x7Y0d8tw4Gc2rPLwuyrixwK4CnjEe787tM763O4/Dxk8kNq6dgiwJoNNo3EsGgHfhW5/CfTI7f6zmeFf7/3s0O3twCLgsGz+bQ/b8cgiR3Y+r2GRWYbQ4qeB28j8cxrxY5GbfeXWwTI45xxwAfBOBptH41hk5/MaNll8z8rO37SwHI/MMkT7e3gWObLzNy2ixyJ0/2Cf1agrUIVfDnjgC+fcr865gWHY32HAyjT3V3HgL8bsrBMNbULN0Z86546OxBM46145B
1gPfOm9/yXdKtE6Fs9gH7iULNaJ9PE4HNgAvO6sy+mrzrky6daJ5nujDxn/cYTIH4sFQAfnXGXnXGnsP6O1060TzWNR3Xv/L9gfdKBaJuuF+/dFZi7HWjvTi8oxcc496JxbCVwEZNbVN9LH4tpQ95wxmXRTisaxOBJo75z7xTn3rXPu+EzWi+SxuBF4PPTzeAIYmsE60TgWC4Buodu9OPDzmirsx8I5Vw9ojv33PDsicjyyyJHZ5xXCfDzSZnDOdQNWe+/nZrFJtI7FwT6vEMFjkebh9sA67/2fGWwSjWNxIwf/vEIYj0Um37Oy8zctbMcjG9/1shLpY5Gdv2kRPRbZ/KxC9L5jAPmw8HPOfeXsHKH0l5xU6e289y2AM4FrnHMd8horg8fSV/fZWSfSZmNd/JoCzwNTIvEk3vtk730z7L+hrZ1zx6RbJeLHwjnXFVjvvf81i9WicTyKYt2nRnrvmwM7sC4Y+8XNYLuwvzecc8WxL3HvZbA44sfCe78I6x71JfAZ1q0iKd1qQficpBfu3xcHcM4Nw47F2xktzuCxsB8T7/0w733tUIZrM1ktksdiJHAE0Az4F+u6lV40jkVRoCLWXehWYGKoRSG9SB6Lq4CbQj+Pmwj1GEgnGsficuy1/Yp1bduTyXphPRbOubLAZODGdP+5z3KzDB7L0/HILMdBPq8QxuORNkPoOYeR+T9m9m6WwWPhPhbZ+bxChI5FuvdFXzL/h2Y0jkV2Pq8QxmORje9ZmcbPaHdRzgBROBbZ+JsWyWNxHNn7rEIUvmOkle8KP+/9ad77YzK4/C8H+1gTul4PfIA19+bFKvb/T2gtDmzqz846EeW935baHO29nwoUy+xk0zA9339YP+8u6RZF41i0A7o555ZjzfenOOfeSpcvGsdjFbAqzX/CJmGFYPp1ovHeOBOY7b1fl35BtN4b3vvXvPctvPcdsG5l6f9DG83PybrUbqSh6wy79EXg98V+nHP9gK7ARd77jP7oRPt3x3gy6dIXyWPhvV8X+uOZArySyb6jcSxWAe+Huu/MwHoMHPBZiPD7oh/wfuj2e5nsO+LHwnu/2Hvf2XvfEvti/Vcm64XtWDjnimFfqt/23r9/sPXTCOvxyCxHNj6vYTseGWQ4AqgPzA39basFzHbOHZpu04gfi2x+XiN5LFIfLwqcD7ybyabReF9k5/Makd8Z6b5nZedvWth/b2TxXS+rbSJ9LNLK7G9aJI9Fd7L3WY34d4z08l3hl1fOuTLOuXKpt7GTtTMcBTMHZgINnXP1Q60qfYAP063zIXCpMycCW1Ob5KPFOXdo6n+unY06VQTYFObnqOpCI50550oBpwGL060W8WPhvR/qva/lva+H/Ty+8d5fnC5rxI+H934tsNI51yj00KnA7+lWi9Z7I9P/ikbjWIT2XS10XQf7Y50+TzQ/Jx9if7AJXR/wz6MI/b5Iu/8uwO1AN+/9zkxWy87vl7zmaJjmbjcO/MxG41ikPZfzvEz2HfFjgbV2nxLKdCR20v/GdFkjeiywLx8nh26fwoH/IIHovC9SP69FgOHAqAzWCduxCP0Oeg1Y5L1/Koebh+14ZJYjO5/XcB2PjDJ47+d776t57+uF/ratwgYbWZtu82gci4N+XiN5LNI4DVjsvV+VyeYRPxZk4/Ma5s9JZt+zDvo3jTAdj2x+18ts24gfi+z8TSOyx+K37HxWo/C35EA+SqPIROOC/fJZBewG1gGfhx6vCUwN3T4c62I2F1gIDAvTc5+FjfD0V+o+gcHAYL9v1J8XQ8vnk8VoR3nI8A7W5SIxdByuSJfh2tBrnoudmN42AhmOA34D5mFv3rtjcSzSZepIaMTKaB+P0PM0A2aFjskUrBtZtN8bpbFC7pA0j8XiWHyPFb5zgVOj9d7I5LNRGfga+yP9NVAptG5Efl9kkmEpdo7BnNBlVPoMofsH/H4Jc47Joc/rPOAjbCCLaB+LcaGf+Tzsj2+NGB2L4sBboeMxGzglB
sfiJODX0P5/AVrG6FjcENr/H8AjgIvwsTgJ62o1L81n4iyy8bc9nMcjixwH/byG63hkliHdOssJjQYYg2Nx0M9rNI4F8Aahvx9p1o/2sTjo5zVcxyK0r8y+Zx30b1q4jkcWGaL6PTyLHAf9mxbpY5Gdz2o4j0V2L6m/xEVERERERKSAKnRdPUVERERERAobFX4iIiIiIiIFnAo/ERERERGRAk6Fn4iIiIiISAGnwk9ERERERKSAU+EnIiKSjnMu2Tk3J83ljjDuu55zLrJzNYmIiKRTNNYBREREAmiX975ZrEOIiIiEi1r8REREssk5t9w596hzbkbo0iD0eF3n3NfOuXmh6zqhx6s75z5wzs0NXdqGdhXnnHvFObfQOfeFc65UzF6UiIgUCir8REREDlQqXVfP3mmWbfPetwZeAJ4JPfYCMNZ7fxzwNvBc6PHngG+9902BFsDC0OMNgRe990cD/wE9IvpqRESk0HPe+1hnEBERCRTnXLz3vmwGjy8HTvHeL3POFQPWeu8rO+c2AjW894mhx//13ldxzm0Aannvd6fZRz3gS+99w9D924Fi3vsHovDSRESkkFKLn4iISM74TG5ntk5Gdqe5nYzOuRcRkQhT4SciIpIzvdNc/1/o9k9An9Dti4AfQre/Bq4CcM7FOefKRyukiIhIWvoPo4iIyIFKOefmpLn/mfc+dUqHEs65X7B/nvYNPXY9MMY5dyuwAbgs9PgNwGjn3BVYy95VwL+RDi8iIpKezvETERHJptA5fq289xtjnUVERCQn1NVTRERERESkgFOLn4iIiIiISAGnFj8REREREZECToWfiIiIiIhIAafCT0REREREpIBT4SciIiIiIlLAqfATEREREREp4FT4iYiIiIiIFHD/D9ULPX7n9fTTAAAAAElFTkSuQmCC"></figcaption></figure>



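<p class="wp-block-paragraph">Next, let&#8217;s compute the accuracy and plot a confusion matrix for the predictions on the validation dataset. Predicted probabilities above 0.5 are labeled as &#8220;dog&#8221;, all others as &#8220;cat&#8221;.</p>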



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># function that returns the label for a given probability
def getLabel(prob):
    if prob &gt; .5:
        return 'dog'
    else:
        return 'cat'

# get the predictions for the validation data
val_df = validate_df.copy()
val_df['pred'] = &quot;&quot;
val_pred_prob = model.predict(validation_generator)

# assign all labels at once; item-wise chained assignment (df['pred'][i] = ...) is unreliable in pandas
val_df['pred'] = [getLabel(prob) for prob in val_pred_prob]
          
# create a confusion matrix
y_val = val_df['category']
y_pred = val_df['pred']

print('Accuracy: {:.2f}'.format(accuracy_score(y_val, y_pred)))
cnf_matrix = confusion_matrix(y_val, y_pred)

# plot the confusion matrix in form of a heatmap

%matplotlib inline
class_names = ['cat', 'dog'] # confusion_matrix sorts the labels alphabetically
fig, ax = plt.subplots(figsize=(8, 8))
sns.heatmap(pd.DataFrame(cnf_matrix, index=class_names, columns=class_names), annot=True, cmap=&quot;YlGnBu&quot;, fmt='g', ax=ax)
plt.title('Confusion matrix')
plt.ylabel('Actual label')
plt.xlabel('Predicted label')</pre></div>



<pre class="wp-block-preformatted">Accuracy: 0.82</pre>



<figure class="wp-block-image size-large"><img decoding="async" width="479" height="496" data-attachment-id="2716" data-permalink="https://www.relataly.com/image-classification-with-deep-learning/2485/image-24-4/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2020/12/image-24.png" data-orig-size="479,496" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-24" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2020/12/image-24.png" src="https://www.relataly.com/wp-content/uploads/2020/12/image-24.png" alt="confusion matrix for an image classification model" class="wp-image-2716" srcset="https://www.relataly.com/wp-content/uploads/2020/12/image-24.png 479w, https://www.relataly.com/wp-content/uploads/2020/12/image-24.png 290w" sizes="(max-width: 479px) 100vw, 479px" /></figure>
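


<p class="wp-block-paragraph">To make the numbers in the heatmap easier to read, here is a minimal sketch that builds the same kind of matrix by hand in plain Python. The labels in <code>y_true</code> and <code>y_pred</code> are made up for illustration and are not taken from the validation set, and <code>confusion_counts</code> is a hypothetical helper, not part of scikit-learn. As in scikit-learn, rows stand for the actual labels and columns for the predicted labels.</p>

```python
from collections import Counter

def confusion_counts(y_true, y_pred, labels):
    """Return a nested list: rows are actual labels, columns are predicted labels."""
    pairs = Counter(zip(y_true, y_pred))
    return [[pairs[(actual, predicted)] for predicted in labels] for actual in labels]

# hypothetical labels for illustration only
y_true = ['cat', 'cat', 'cat', 'dog', 'dog', 'dog', 'dog', 'cat']
y_pred = ['cat', 'dog', 'cat', 'dog', 'dog', 'cat', 'dog', 'cat']
labels = ['cat', 'dog']  # alphabetical order, the order scikit-learn would use

matrix = confusion_counts(y_true, y_pred, labels)

# correct predictions sit on the diagonal of the matrix
correct = sum(matrix[i][i] for i in range(len(labels)))
accuracy = correct / len(y_true)

print(matrix)                               # [[3, 1], [1, 3]]
print('Accuracy: {:.2f}'.format(accuracy))  # Accuracy: 0.75
```

<p class="wp-block-paragraph">In this toy example, three cats and three dogs are classified correctly (the diagonal), one cat is mistaken for a dog, and one dog is mistaken for a cat. The accuracy is the sum of the diagonal divided by the number of images.</p>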



<h3 class="wp-block-heading" id="h-step-9-image-classification-on-sample-images">Step #9 Image Classification on Sample Images</h3>



<p class="wp-block-paragraph">Now that we have trained the model, I bet you can&#8217;t wait to test the image classifier on some sample data. For this purpose, make sure that you have some sample images in the &#8220;sample&#8221; folder. The code below loads these images with a data generator that only rescales them, lets the model predict a label for each image, and prints the accuracy on the sample set. Finally, it plots the images in a grid together with their predicted labels.</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># set the path to the sample images
sample_path = &quot;data/images/cats-and-dogs/sample/&quot;
sample_df = createImageDf(sample_path)
sample_df['category'] = sample_df['category'].replace({0:'cat',1:'dog'})
sample_df['pred'] = &quot;&quot;

# create an image data generator for the sample images - we will only rescale the images
test_datagen = ImageDataGenerator(rescale=1./255)
test_generator = test_datagen.flow_from_dataframe(
    sample_df, 
    sample_path,    
    shuffle=False,
    x_col='filename', y_col='category',
    target_size=target_size)

# make the predictions 
pred_prob = model.predict(test_generator)
image_number = pred_prob.shape[0]

# write the predicted label for each image into the dataframe
for i in range(image_number):
    sample_df.at[i, 'pred'] = getLabel(pred_prob[i])
    
print('Accuracy: {:.2f}'.format(accuracy_score(sample_df['category'], sample_df['pred'])))

nrows = 6
ncols = (image_number + nrows - 1) // nrows  # round up so that every image gets an axis
fig, axs = plt.subplots(nrows, ncols, figsize=(15, 15))
for i, ax in enumerate(fig.axes):
    if i &lt; sample_df.shape[0]:
        filepath = sample_path + sample_df.at[i, 'filename']
        img = Image.open(filepath).resize(target_size)
        ax.imshow(img)
        ax.set_title(sample_df.at[i, 'filename'] + '\n' + ' predicted: ' + str(sample_df.at[i, 'pred']))
        # label the x-axis with whether the prediction matches the true category
        result = [sample_df.at[i, 'pred'] == sample_df.at[i, 'category']]
        ax.set_xlabel(str(result))
        ax.set_xticks([]); ax.set_yticks([])</pre></div>
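


<p class="wp-block-paragraph">The number of columns in the image grid depends on how many sample images there are. As a minimal sketch (the helper <em>grid_shape</em> is hypothetical and not part of the tutorial code), ceiling division gives a grid that always has enough cells:</p>

```python
def grid_shape(n_images, nrows):
    """Return (nrows, ncols) with at least n_images cells (hypothetical helper)."""
    ncols = -(-n_images // nrows)  # ceiling division without importing math
    return nrows, max(ncols, 1)    # keep at least one column for an empty folder


rows, cols = grid_shape(20, 6)
print(rows, cols, rows * cols >= 20)
```

<p class="wp-block-paragraph">Rounding to the nearest integer instead would give 6 x 3 = 18 cells for 20 images, so two images would not be shown.</p>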



<figure class="wp-block-image size-full"><img decoding="async" width="862" height="864" data-attachment-id="6516" data-permalink="https://www.relataly.com/image-classification-with-deep-learning/2485/output-5/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2022/04/output.png" data-orig-size="862,864" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="output" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2022/04/output.png" src="https://www.relataly.com/wp-content/uploads/2022/04/output.png" alt="image classification - the image shows several dogs and cats" class="wp-image-6516" srcset="https://www.relataly.com/wp-content/uploads/2022/04/output.png 862w, https://www.relataly.com/wp-content/uploads/2022/04/output.png 300w, https://www.relataly.com/wp-content/uploads/2022/04/output.png 150w, https://www.relataly.com/wp-content/uploads/2022/04/output.png 768w" sizes="(max-width: 862px) 100vw, 862px" /></figure>



<p class="wp-block-paragraph">Our image classifier achieves an accuracy of around 83% on the validation set. The model is not perfect, but it should have labeled most of the sample images correctly. With deeper architectures, more data, and more training runs, you can create classification models that achieve accuracies above 95%.</p>



<h2 class="wp-block-heading" id="h-summary">Summary</h2>



<p class="wp-block-paragraph">In this tutorial, you learned how to train an image classification model. We prepared a dataset and performed several transformations to bring the data into shape for training. Finally, we trained a convolutional neural network to distinguish between dogs and cats. You can now use this knowledge to train image classification models that recognize other objects.</p>



<p class="wp-block-paragraph">There are many other cool things that you can do with CNNs, for example, object localization in images and videos and even stock market prediction. But these are topics for further articles.</p>



<p class="wp-block-paragraph">I am always happy to receive feedback. I hope you enjoyed the article and would be happy if you left a comment. Cheers</p>



<h2 class="wp-block-heading" id="h-sources-and-further-reading">Sources and Further Reading</h2>



<ol class="wp-block-list"><li><a href="https://amzn.to/3MAy8j5" target="_blank" rel="noreferrer noopener">Andriy Burkov Machine Learning Engineering</a></li><li><a href="https://amzn.to/3D0gB0e" target="_blank" rel="noreferrer noopener">Oliver Theobald (2020) Machine Learning For Absolute Beginners: A Plain English Introduction</a></li><li><a href="https://amzn.to/3MyU6Tj" target="_blank" rel="noreferrer noopener">Charu C. Aggarwal (2018) Neural Networks and Deep Learning</a></li><li><a href="https://amzn.to/3S9Nfkl" target="_blank" rel="noreferrer noopener">Aurélien Géron (2019) Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems </a></li><li><a href="https://amzn.to/3EKidwE" target="_blank" rel="noreferrer noopener">David Forsyth (2019) Applied Machine Learning Springer</a></li><li>[1] D. H. Hubel and T. N. Wiesel &#8211; Receptive Fields of Neurons in the Cat&#8217;s Striate Cortex, The Journal of physiology (1959)</li></ol>



<p class="has-contrast-2-color has-base-3-background-color has-text-color has-background wp-block-paragraph"><em>The links above to Amazon are affiliate links. By buying through these links, you support the Relataly.com blog and help to cover the hosting costs. Using the links does not affect the price.</em></p>
<p>The post <a href="https://www.relataly.com/image-classification-with-deep-learning/2485/">Image Classification with Convolutional Neural Networks &#8211; Classifying Cats and Dogs in Python</a> appeared first on <a href="https://www.relataly.com">relataly.com</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.relataly.com/image-classification-with-deep-learning/2485/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">2485</post-id>	</item>
		<item>
		<title>Customer Churn Prediction &#8211; Understanding Models with Feature Permutation Importance using Python</title>
		<link>https://www.relataly.com/predicting-the-customer-churn-of-a-telecommunications-provider/2378/</link>
					<comments>https://www.relataly.com/predicting-the-customer-churn-of-a-telecommunications-provider/2378/#respond</comments>
		
		<dc:creator><![CDATA[Florian Follonier]]></dc:creator>
		<pubDate>Sun, 02 Aug 2020 13:24:28 +0000</pubDate>
				<category><![CDATA[Churn Prediction]]></category>
		<category><![CDATA[Classification (two-class)]]></category>
		<category><![CDATA[Data Science]]></category>
		<category><![CDATA[Data Sources]]></category>
		<category><![CDATA[Feature Permutation Importance]]></category>
		<category><![CDATA[Hyperparameter Tuning]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Random Decision Forests]]></category>
		<category><![CDATA[Retail]]></category>
		<category><![CDATA[Scikit-Learn]]></category>
		<category><![CDATA[Seaborn]]></category>
		<category><![CDATA[Use Cases]]></category>
		<category><![CDATA[AI in E-Commerce]]></category>
		<category><![CDATA[AI in Marketing]]></category>
		<category><![CDATA[Digital Transformation]]></category>
		<category><![CDATA[Intermediate Tutorials]]></category>
		<category><![CDATA[Model Interpretation]]></category>
		<category><![CDATA[Multivariate Models]]></category>
		<category><![CDATA[Permutation Feature Importance]]></category>
		<category><![CDATA[Supervised Learning]]></category>
		<category><![CDATA[Two-Label Classification]]></category>
		<guid isPermaLink="false">https://www.relataly.com/?p=2378</guid>

					<description><![CDATA[<p>Customer retention is a prime objective for service companies, and understanding the patterns that lead to customer churn can be the key to maintaining long-lasting client relationships. Businesses incur significant costs when customers discontinue their services, hence it&#8217;s vital to identify potential churn risks and take preemptive actions to retain these customers. Machine Learning models ... <a title="Customer Churn Prediction &#8211; Understanding Models with Feature Permutation Importance using Python" class="read-more" href="https://www.relataly.com/predicting-the-customer-churn-of-a-telecommunications-provider/2378/" aria-label="Read more about Customer Churn Prediction &#8211; Understanding Models with Feature Permutation Importance using Python">Read more</a></p>
<p>The post <a href="https://www.relataly.com/predicting-the-customer-churn-of-a-telecommunications-provider/2378/">Customer Churn Prediction &#8211; Understanding Models with Feature Permutation Importance using Python</a> appeared first on <a href="https://www.relataly.com">relataly.com</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">Customer retention is a prime objective for service companies, and understanding the patterns that lead to customer churn can be the key to maintaining long-lasting client relationships. Businesses incur significant costs when customers discontinue their services, hence it&#8217;s vital to identify potential churn risks and take preemptive actions to retain these customers. Machine Learning models can be instrumental in identifying these patterns and providing valuable insights into customer behavior.</p>



<p class="wp-block-paragraph">An intriguing technique, Permutation Feature Importance, allows us to discern the significance of different features of our machine learning model, thereby shedding light on their influence on customer churn. This tutorial guides you through the intricacies of this technique and its implementation.</p>



<p class="wp-block-paragraph">The structure of this tutorial is as follows:</p>



<ul class="wp-block-list">
<li>We begin by discussing the business problem of customer churn and its implications.</li>



<li>We introduce the concept of Permutation Feature Importance, a powerful tool to identify essential features in our machine learning model.</li>



<li>We transition into the hands-on coding segment, where we build a churn prediction model using Python.</li>



<li>Our model undergoes a classification process and hyperparameter tuning to select the most effective parameters.</li>



<li>Utilizing the trained model, we predict the churn probabilities for a test set of customers.</li>



<li>Finally, we create a feature ranking based on their impact on the model&#8217;s performance.</li>
</ul>



<p class="wp-block-paragraph">By employing permutation feature importance, this tutorial offers a deep-dive into the correlation between input variables and model predictions, providing actionable insights for effective customer churn management.</p>



<p class="wp-block-paragraph">Also: </p>



<ul class="wp-block-list">
<li><a href="https://www.relataly.com/building-fair-machine-machine-learning-models-with-fairlearn/12804/" target="_blank" rel="noreferrer noopener">Using Fairlearn to Build Fair Machine Machine Learning Models with Python: Step-by-Step Towards More Responsible AI</a></li>



<li><a href="https://www.relataly.com/customer-segmentation-using-hierarchical-clustering-in-python/11335/" target="_blank" rel="noreferrer noopener">How to Use Hierarchical Clustering For Customer Segmentation in Python</a></li>
</ul>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%"><div class="wp-block-image">
<figure class="aligncenter size-large is-resized"><img decoding="async" data-attachment-id="2402" data-permalink="https://www.relataly.com/predicting-the-customer-churn-of-a-telecommunications-provider/2378/image-47/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2020/08/image.png" data-orig-size="448,173" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2020/08/image.png" src="https://www.relataly.com/wp-content/uploads/2020/08/image.png" alt="machine learning. It is particularly effective when combined with feature permutation importance" class="wp-image-2402" width="324" height="127"/><figcaption class="wp-element-caption">Customer churn prediction is a compelling use case for machine learning. It is particularly effective when combined with feature permutation importance.</figcaption></figure>
</div></div>
</div>



<div style="height:26px" aria-hidden="true" class="wp-block-spacer"></div>



<h2 class="wp-block-heading" id="h-what-is-churn-prediction">What is Churn Prediction?</h2>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">Persuading a new customer to sign a contract usually costs a company many times more than retaining an existing customer. According to industry experts, winning a new customer is four times more expensive than keeping an existing one. Providers that can identify churn candidates and manage to retain them can significantly reduce costs. </p>



<p class="wp-block-paragraph">A crucial point is whether the provider succeeds in getting the churn candidates to stay. Sometimes it may be enough to contact the churn candidate and inquire about customer satisfaction. In other cases, this may not be enough, and the provider needs to increase the service value, for example, by offering free services or a discount. However, such actions should be well thought out, as they can also backfire. For instance, if a customer hardly ever uses their contract, a call from the provider may even increase the desire to cancel it. Machine learning can help assess cases individually and identify the optimal anti-churn action. </p>



<h2 class="wp-block-heading" id="h-about-permutation-feature-importance">About Permutation Feature Importance</h2>



<p class="wp-block-paragraph">Feature importance is a helpful technique for understanding the contribution of input variables (features) to a predictive model. The results from this technique can be as valuable as the predictions themselves, as they can help us understand the business context better. For example, let&#8217;s say we have trained a model that predicts which of our customers will likely churn. Wouldn&#8217;t it be interesting to know why specific customers are more likely to churn than others? Permutation feature importance can help us answer this question by providing us with a ranking of the input variables in our model by their usefulness. The technique shuffles the values of one feature at a time and measures how much the model&#8217;s performance drops as a result. The resulting order can validate assumptions about the business context and point to relationships in the data that merit a closer look, although it does not by itself prove that a feature causes churn.</p>
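


<p class="wp-block-paragraph">To make the idea concrete, here is a minimal sketch of permutation importance written from scratch with NumPy. It is not the Scikit-learn implementation used later in this tutorial. The function name, the toy data, and the toy &#8220;model&#8221; (which only looks at the first column) are assumptions made for illustration:</p>

```python
import numpy as np


def permutation_importance_manual(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when a single column of X is shuffled."""
    rng = np.random.default_rng(seed)
    baseline = np.mean(predict(X) == y)
    importances = np.zeros(X.shape[1])
    for col in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # shuffling breaks the link between this feature and the label
            X_perm[:, col] = rng.permutation(X_perm[:, col])
            drops.append(baseline - np.mean(predict(X_perm) == y))
        importances[col] = np.mean(drops)
    return importances


# Toy data: the label depends only on the first of three features
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 3))
y = (X[:, 0] > 0).astype(int)


def predict(data):
    # stand-in for a trained model that has learned the rule above
    return (data[:, 0] > 0).astype(int)


importances = permutation_importance_manual(predict, X, y)
print(importances)
```

<p class="wp-block-paragraph">Shuffling the first column destroys the model&#8217;s accuracy, so it receives a high importance, while shuffling the two unused columns changes nothing and yields an importance of zero.</p>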



<p class="wp-block-paragraph">Compared to neural networks, one of the most significant advantages of traditional prediction models, such as a decision tree, is their interpretability. Neural networks are black boxes because it is tough to understand the relationship between input and model predictions. In traditional models, on the other hand, we can calculate the importance of the features and use it to interpret the model and optimize its performance, for example, by removing features from the model that are not important. We, therefore, start with a simple model first and move on to more complex models once we understand the data.</p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%"></div>
</div>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<h2 class="wp-block-heading" id="h-implementing-a-customer-churn-prediction-model-in-python">Implementing a Customer Churn Prediction Model in Python</h2>



<p class="wp-block-paragraph">In the following, we will implement a customer churn prediction model. We will train a decision forest model on a data set from Kaggle and optimize it using <a aria-label="undefined (opens in a new tab)" href="https://www.relataly.com/hyperparameter-tuning-with-grid-search/2261/" target="_blank" rel="noreferrer noopener">grid search</a>. The data contains customer-level information for a telecom provider and a binary prediction label that indicates which customers canceled their contracts and which did not. Finally, we will calculate the feature importance to understand how the model works. </p>



<p class="wp-block-paragraph">The code is available on the GitHub repository.</p>



<div class="wp-block-kadence-advancedbtn kb-buttons-wrap kb-btns_bddeda-14"><a class="kb-button kt-button button kb-btn_b5bf96-e2 kt-btn-size-standard kt-btn-width-type-full kb-btn-global-inherit kt-btn-has-text-true kt-btn-has-svg-true wp-block-button__link wp-block-kadence-singlebtn" href="https://github.com/flo7up/relataly-public-python-tutorials/blob/master/02%20Classification/017%20Permutation%20Feature%20Importance%20-%20Customer%20Churn%20Prediction%20using%20Random%20Decision%20Forest.ipynb" target="_blank" rel="noreferrer noopener"><span class="kb-svg-icon-wrap kb-svg-icon-fe_eye kt-btn-icon-side-left"><svg viewBox="0 0 24 24"  fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"  aria-hidden="true"><path d="M1 12s4-8 11-8 11 8 11 8-4 8-11 8-11-8-11-8z"/><circle cx="12" cy="12" r="3"/></svg></span><span class="kt-btn-inner-text">View on GitHub </span></a>

<a class="kb-button kt-button button kb-btn_8e2f54-ca kt-btn-size-standard kt-btn-width-type-full kb-btn-global-inherit kt-btn-has-text-true kt-btn-has-svg-true wp-block-button__link wp-block-kadence-singlebtn" href="https://github.com/flo7up/relataly-public-python-API-tutorials" target="_blank" rel="noreferrer noopener"><span class="kb-svg-icon-wrap kb-svg-icon-fa_github kt-btn-icon-side-left"><svg viewBox="0 0 496 512"  fill="currentColor" xmlns="http://www.w3.org/2000/svg"  aria-hidden="true"><path d="M165.9 397.4c0 2-2.3 3.6-5.2 3.6-3.3.3-5.6-1.3-5.6-3.6 0-2 2.3-3.6 5.2-3.6 3-.3 5.6 1.3 5.6 3.6zm-31.1-4.5c-.7 2 1.3 4.3 4.3 4.9 2.6 1 5.6 0 6.2-2s-1.3-4.3-4.3-5.2c-2.6-.7-5.5.3-6.2 2.3zm44.2-1.7c-2.9.7-4.9 2.6-4.6 4.9.3 2 2.9 3.3 5.9 2.6 2.9-.7 4.9-2.6 4.6-4.6-.3-1.9-3-3.2-5.9-2.9zM244.8 8C106.1 8 0 113.3 0 252c0 110.9 69.8 205.8 169.5 239.2 12.8 2.3 17.3-5.6 17.3-12.1 0-6.2-.3-40.4-.3-61.4 0 0-70 15-84.7-29.8 0 0-11.4-29.1-27.8-36.6 0 0-22.9-15.7 1.6-15.4 0 0 24.9 2 38.6 25.8 21.9 38.6 58.6 27.5 72.9 20.9 2.3-16 8.8-27.1 16-33.7-55.9-6.2-112.3-14.3-112.3-110.5 0-27.5 7.6-41.3 23.6-58.9-2.6-6.5-11.1-33.3 2.6-67.9 20.9-6.5 69 27 69 27 20-5.6 41.5-8.5 62.8-8.5s42.8 2.9 62.8 8.5c0 0 48.1-33.6 69-27 13.7 34.7 5.2 61.4 2.6 67.9 16 17.7 25.8 31.5 25.8 58.9 0 96.5-58.9 104.2-114.8 110.5 9.2 7.9 17 22.9 17 46.4 0 33.7-.3 75.4-.3 83.6 0 6.5 4.6 14.4 17.3 12.1C428.2 457.8 496 362.9 496 252 496 113.3 383.5 8 244.8 8zM97.2 352.9c-1.3 1-1 3.3.7 5.2 1.6 1.6 3.9 2.3 5.2 1 1.3-1 1-3.3-.7-5.2-1.6-1.6-3.9-2.3-5.2-1zm-10.8-8.1c-.7 1.3.3 2.9 2.3 3.9 1.6 1 3.6.7 4.3-.7.7-1.3-.3-2.9-2.3-3.9-2-.6-3.6-.3-4.3.7zm32.4 35.6c-1.6 1.3-1 4.3 1.3 6.2 2.3 2.3 5.2 2.6 6.5 1 1.3-1.3.7-4.3-1.3-6.2-2.2-2.3-5.2-2.6-6.5-1zm-11.4-14.7c-1.6 1-1.6 3.6 0 5.9 1.6 2.3 4.3 3.3 5.6 2.3 1.6-1.3 1.6-3.9 0-6.2-1.4-2.3-4-3.3-5.6-2z"/></svg></span><span class="kt-btn-inner-text">Relataly GitHub Repo </span></a></div>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%"></div>
</div>



<h3 class="wp-block-heading">Prerequisites</h3>



<p class="wp-block-paragraph">Before starting the coding part, make sure that you have set up your <a href="https://www.python.org/downloads/" target="_blank" rel="noreferrer noopener">Python 3</a> environment and required packages. If you don&#8217;t have an environment, you can follow&nbsp;<a href="https://www.relataly.com/anaconda-python-environment-machine-learning/1663/" target="_blank" rel="noreferrer noopener">this tutorial</a>&nbsp;to set up the&nbsp;<a href="https://www.anaconda.com/products/individual" target="_blank" rel="noreferrer noopener">Anaconda environment</a>.</p>



<p class="wp-block-paragraph">Make sure you install all required packages. In this tutorial, we will be working with the following packages:&nbsp;</p>



<ul class="wp-block-list">
<li>Pandas</li>



<li>NumPy</li>



<li>Matplotlib</li>



<li>Seaborn</li>
</ul>



<p class="wp-block-paragraph">In addition, we will be using the machine learning library <strong><em>Scikit-learn</em></strong>.</p>



<p class="wp-block-paragraph">You can install packages using console commands:</p>



<ul class="wp-block-list">
<li><em>pip install &lt;package name&gt;</em></li>



<li><em>conda install &lt;package name&gt;</em>&nbsp;(if you are using the Anaconda package manager)</li>
</ul>
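


<p class="wp-block-paragraph">If you are unsure whether everything is in place, a short check like the following sketch lists the packages that are still missing. The list contains the import names used by the code in this tutorial (note that Scikit-learn is imported as <em>sklearn</em>):</p>

```python
import importlib.util

# import names of the packages used in this tutorial
required = ["pandas", "numpy", "matplotlib", "seaborn", "sklearn"]

# find_spec returns None if a top-level package cannot be found
missing = [name for name in required if importlib.util.find_spec(name) is None]
print("Missing packages:", missing or "none")
```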



<h3 class="wp-block-heading" id="h-step-1-loading-the-customer-churn-data">Step #1 Loading the Customer Churn Data</h3>



<p class="wp-block-paragraph">We begin by loading a customer churn <a href="https://www.kaggle.com/barun2104/telecom-churn" target="_blank" rel="noreferrer noopener">dataset from Kaggle</a>. If you work with the Kaggle Python environment, you can directly save the dataset into your Kaggle project. After completing the download, put the dataset under the file path of your choice, but don&#8217;t forget to adjust the file path variable in the code. </p>



<p class="wp-block-paragraph">The dataset contains 3333 records and the following attributes.</p>



<ul class="wp-block-list">
<li><strong>Churn</strong>: The prediction label: 1 if the customer canceled service, 0 if not.</li>



<li><strong>AccountWeeks</strong>: number of weeks the customer has had an active account</li>



<li><strong>ContractRenewal</strong>: 1 if customer recently renewed contract, 0 if not</li>



<li><strong>DataPlan</strong>: 1 if the customer has a data plan, 0 if not</li>



<li><strong>DataUsage</strong>: gigabytes of monthly data usage</li>



<li><strong>CustServCalls</strong>: number of calls into customer service</li>



<li><strong>DayMins</strong>: average daytime minutes per month</li>



<li><strong>DayCalls</strong>: average number of daytime calls</li>



<li><strong>MonthlyCharge</strong>: average monthly bill</li>



<li><strong>OverageFee</strong>: largest overage fee in the last 12 months</li>



<li><strong>RoamMins</strong>: average number of roaming minutes</li>
</ul>



<p class="wp-block-paragraph">The following code will load the data from your local folder into your anaconda Python project:</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}">import numpy as np 
import pandas as pd 
import math
from pandas.plotting import register_matplotlib_converters
import matplotlib.pyplot as plt 
import matplotlib.colors as mcolors
import matplotlib.dates as mdates 

from sklearn.metrics import confusion_matrix, classification_report
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.inspection import permutation_importance
import seaborn as sns


# set file path
filepath = &quot;data/Churn-prediction/&quot;

# Load the dataset (it is split into train and test sets in Step #3)
train_df = pd.read_csv(filepath + 'telecom_churn.csv')
train_df.head()</pre></div>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:false,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;null&quot;,&quot;mime&quot;:&quot;text/plain&quot;,&quot;theme&quot;:&quot;3024-day&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Plain Text&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;text&quot;}">	Churn	AccountWeeks	ContractRenewal	DataPlan	DataUsage	CustServCalls	DayMins	DayCalls	MonthlyCharge	OverageFee	RoamMins
0	0		128				1				1			2.7			1				265.1		110		89.0			9.87		10.0
1	0		107				1				1			3.7			1				161.6		123		82.0			9.78		13.7
2	0		137				1				0			0.0			0				243.4		114		52.0			6.06		12.2
3	0		84				0				0			0.0			2				299.4		71		57.0			3.10		6.6
4	0		75				0				0			0.0			3				166.7		113		41.0			7.42		10.1</pre></div>
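


<p class="wp-block-paragraph">After loading, it is worth checking that all expected columns are present and that there are no missing values. The following sketch runs this check on the five rows shown above, which are embedded as text so that the example works without the CSV file. The variable names are assumptions for illustration; with the real data, you would run the same checks on <em>train_df</em>:</p>

```python
import io
import pandas as pd

# the five preview rows from above, embedded so the sketch is self-contained
csv_text = """Churn,AccountWeeks,ContractRenewal,DataPlan,DataUsage,CustServCalls,DayMins,DayCalls,MonthlyCharge,OverageFee,RoamMins
0,128,1,1,2.7,1,265.1,110,89.0,9.87,10.0
0,107,1,1,3.7,1,161.6,123,82.0,9.78,13.7
0,137,1,0,0.0,0,243.4,114,52.0,6.06,12.2
0,84,0,0,0.0,2,299.4,71,57.0,3.10,6.6
0,75,0,0,0.0,3,166.7,113,41.0,7.42,10.1
"""
preview_df = pd.read_csv(io.StringIO(csv_text))

expected_columns = ["Churn", "AccountWeeks", "ContractRenewal", "DataPlan",
                    "DataUsage", "CustServCalls", "DayMins", "DayCalls",
                    "MonthlyCharge", "OverageFee", "RoamMins"]

# columns that are expected but not found, and the total number of empty cells
missing_columns = [c for c in expected_columns if c not in preview_df.columns]
n_missing_values = int(preview_df.isna().sum().sum())
print(preview_df.shape, missing_columns, n_missing_values)
```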



<h3 class="wp-block-heading" id="h-step-2-exploring-the-data">Step #2 Exploring the Data</h3>



<p class="wp-block-paragraph">Before we begin with the preprocessing, we will quickly explore the data. For this purpose, we will create a pair plot that shows the distribution of each attribute and the pairwise relationships between attributes, separated by prediction label.</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Create pair plots of the feature columns, colored by the prediction label
df_plot = train_df.copy()

sns.pairplot(df_plot, hue=&quot;Churn&quot;, height=2.5, palette='muted')</pre></div>



<figure class="wp-block-image size-large"><img decoding="async" width="1024" height="990" data-attachment-id="6808" data-permalink="https://www.relataly.com/predicting-the-customer-churn-of-a-telecommunications-provider/2378/pairplots-churn-prediction/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2022/04/pairplots-churn-prediction.png" data-orig-size="1828,1768" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="pairplots-churn-prediction" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2022/04/pairplots-churn-prediction.png" src="https://www.relataly.com/wp-content/uploads/2022/04/pairplots-churn-prediction-1024x990.png" alt="" class="wp-image-6808" srcset="https://www.relataly.com/wp-content/uploads/2022/04/pairplots-churn-prediction.png 1024w, https://www.relataly.com/wp-content/uploads/2022/04/pairplots-churn-prediction.png 300w, https://www.relataly.com/wp-content/uploads/2022/04/pairplots-churn-prediction.png 768w, https://www.relataly.com/wp-content/uploads/2022/04/pairplots-churn-prediction.png 1536w, https://www.relataly.com/wp-content/uploads/2022/04/pairplots-churn-prediction.png 1828w" sizes="(max-width: 1024px) 100vw, 1024px" /><figcaption class="wp-element-caption">Histograms of the churn prediction dataset separated by prediction label (red=churn, blue= no churn)</figcaption></figure>



<p class="wp-block-paragraph">We can see that the data distribution for several attributes looks quite good and resembles a normal distribution, for example, for OverageFee, DayMins, and DayCalls. However, the distribution of the prediction label is unbalanced: more customers keep their contract (prediction label class = 0) than cancel it (prediction label class = 1). </p>



<h3 class="wp-block-heading" id="h-step-3-data-preprocessing">Step #3 Data Preprocessing</h3>



<p class="wp-block-paragraph">The next step is to preprocess the data. I have reduced this part to a minimum to keep this tutorial simple. For example, I do not resample the unbalanced label classes; the only countermeasure is the class_weight='balanced' setting of the classifier in Step #4. A more thorough treatment would be appropriate to improve the model performance in a real business context. The imbalanced data is also why I chose a decision forest as a model type. Decision forests can handle unbalanced data relatively well compared to traditional models such as logistic regression. </p>
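


<p class="wp-block-paragraph">The grid search code in Step #4 passes class_weight='balanced' to the classifier. According to the Scikit-learn documentation, this mode weights each class by n_samples / (n_classes * np.bincount(y)). The sketch below reproduces this formula with NumPy on a made-up label vector with an 85/15 split (an assumption for illustration, not the ratio of the churn dataset):</p>

```python
import numpy as np

# made-up labels: 85 customers who stay (0) and 15 who churn (1)
y = np.array([0] * 85 + [1] * 15)

counts = np.bincount(y)                      # samples per class
weights = len(y) / (len(counts) * counts)    # 'balanced' class weights
print(dict(enumerate(np.round(weights, 3))))
```

<p class="wp-block-paragraph">The rare class receives the larger weight, so that both classes contribute equally to the training objective in total.</p>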



<p class="wp-block-paragraph">The following code splits the data into training data (x_train) and test data (x_test) and creates the corresponding label series (y_train, y_test). With a train size of 0.7, the training dataset contains 2333 records and the test dataset 1000.</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Create Training Dataset
x_df = train_df[train_df.columns[train_df.columns.isin(['AccountWeeks', 'ContractRenewal', 'DataPlan','DataUsage', 'CustServCalls', 'DayCalls', 'MonthlyCharge', 'OverageFee', 'RoamMins'])]].copy()
y_df = train_df['Churn'].copy()

# Split the features and labels into training and test sets
x_train, x_test, y_train, y_test = train_test_split(x_df, y_df, train_size=0.7, random_state=0)
x_train</pre></div>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:false,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;null&quot;,&quot;mime&quot;:&quot;text/plain&quot;,&quot;theme&quot;:&quot;3024-day&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Plain Text&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;text&quot;}">		AccountWeeks	ContractRenewal	DataPlan	DataUsage	CustServCalls	DayCalls	MonthlyCharge	OverageFee	RoamMins
2918	58				1				0			0.00		4				112			53.0			13.29		0.0
1884	51				0				1			3.32		2				60			74.2			10.03		12.3
2823	87				1				0			0.00		2				80			50.0			9.35		16.6
2319	83				1				1			2.35		3				105			91.5			12.65		8.7
2980	84				1				0			0.00		3				86			62.0			13.78		14.3
...		...				...				...			...			...				...			...				...			...
835		27				1				0			0.00		1				75			31.0			10.43		9.9
3264	89				1				1			1.59		0				98			50.9			10.36		5.9
1653	93				0				0			0.00		1				78			42.0			10.99		11.1
2607	91				1				0			0.00		3				100			53.0			11.97		9.9
2732	130				0				0			0.00		5				106			68.0			18.19		16.9</pre></div>
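


<p class="wp-block-paragraph">Because the label classes are unbalanced, you may want the training and test sets to contain the same share of churners. The tutorial code does not do this, but train_test_split offers the <em>stratify</em> parameter for this purpose. The following sketch uses made-up labels (3333 records, 500 of them positive, chosen for illustration only) to show the effect:</p>

```python
import numpy as np
from sklearn.model_selection import train_test_split

# made-up data: 3333 records, 500 of them in the positive class
y_demo = np.array([0] * 2833 + [1] * 500)
X_demo = np.arange(len(y_demo)).reshape(-1, 1)

# stratify keeps the class proportions equal in both parts of the split
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, train_size=0.7, random_state=0, stratify=y_demo)

print(len(y_tr), len(y_te))
print(round(y_tr.mean(), 3), round(y_te.mean(), 3))
```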



<h3 class="wp-block-heading" id="h-step-4-fit-an-optimized-decision-forest-model-for-churn-prediction-using-grid-search">Step #4 Fit an Optimized Decision Forest Model for Churn Prediction using Grid Search</h3>



<p class="wp-block-paragraph">Now comes the exciting part. We will train 36 candidate decision forests (random forests), one for every combination of the hyperparameter values in our grid, and then choose the best-performing model. This technique is called hyperparameter tuning (more specifically, grid search), and I have published <a aria-label="undefined (opens in a new tab)" href="https://www.relataly.com/hyperparameter-tuning-with-grid-search/2261/" target="_blank" rel="noreferrer noopener">a separate article on this topic</a>.</p>



<p class="wp-block-paragraph">The following code defines the parameters the grid search will test (max_depth, n_estimators, and min_samples_split). Then the code runs the grid search and trains the decision forests. Finally, we print out the model ranking along with model parameters. </p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Define parameters
max_depth=[2, 4, 8, 16]
n_estimators = [64, 128, 256]
min_samples_split = [5, 20, 30]

param_grid = dict(max_depth=max_depth, n_estimators=n_estimators, min_samples_split=min_samples_split)

# Build the gridsearch
dfrst = RandomForestClassifier(class_weight='balanced')  # the grid search sets the tuned parameters for each candidate
grid = GridSearchCV(estimator=dfrst, param_grid=param_grid, cv = 5)
grid_results = grid.fit(x_train, y_train)

# Summarize the results in a readable format
results_df = pd.DataFrame(grid_results.cv_results_)
results_df.sort_values(by=['rank_test_score'], ascending=True, inplace=True)

# Reduce the results to selected columns
results_filtered = results_df[results_df.columns[results_df.columns.isin(['param_max_depth', 'param_min_samples_split', 'param_n_estimators','std_fit_time', 'rank_test_score', 'std_test_score', 'mean_test_score'])]].copy()
results_filtered</pre></div>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:false,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;null&quot;,&quot;mime&quot;:&quot;text/plain&quot;,&quot;theme&quot;:&quot;3024-day&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Plain Text&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;text&quot;}">std_fit_time	param_max_depth	param_min_samples_split	param_n_estimators	mean_test_score	std_test_score	rank_test_score
28				0.004742		16						5					128	0.931415	0.006950		1
27				0.002620		16						5					64	0.925848	0.008177		2
29				0.015711		16						5					256	0.925846	0.006156		3
20				0.006258		8						5					256	0.923704	0.007961		4
19				0.001816		8						5					128	0.921988	0.006458		5
18				0.002161		8						5					64	0.919847	0.007716		6
31				0.003728		16						20					128	0.902690	0.011642		7
30				0.002057		16						20					64	0.901836	0.009789		8
32				0.004940		16						20					256	0.899691	0.009813		9
21				0.001994		8						20					64	0.898408	0.008710		10
22				0.003761		8						20					128	0.897121	0.007529		11
23				0.003828		8						20					256	0.895833	0.009159		12
33				0.003798		16						30					64	0.885546	0.010394		13
26				0.005560		8						30					256	0.885541	0.014937		14
...</pre></div>
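


<p class="wp-block-paragraph">To see where the 36 candidates come from, the grid can be enumerated by hand. The following is a minimal sketch that uses only the Python standard library and the same parameter values as above; the variable names <code>candidates</code> and <code>total_fits</code> are my own and not part of scikit-learn.</p>

```python
from itertools import product

# Same candidate values as in the grid search above
param_grid = {
    "max_depth": [2, 4, 8, 16],
    "n_estimators": [64, 128, 256],
    "min_samples_split": [5, 20, 30],
}

# Every combination of one value per parameter is one candidate model
names = list(param_grid)
candidates = [dict(zip(names, values)) for values in product(*param_grid.values())]

cv_folds = 5  # matches cv=5 in the GridSearchCV call
total_fits = len(candidates) * cv_folds

print(len(candidates))  # 36 candidate configurations
print(total_fits)       # 180 model fits during cross-validation
print(candidates[0])    # first combination in the grid
```

<p class="wp-block-paragraph">With five-fold cross-validation, each candidate is trained five times, so the search fits 180 forests, plus one final refit of the best configuration on the full training set (the default behavior of GridSearchCV). This is worth keeping in mind before adding more values to the grid.</p>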



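<p class="wp-block-paragraph">The best-performing model is model number 28, which achieves a mean cross-validation score of about 93.1%. Because the forests are trained without a fixed random seed, the exact ranking may differ slightly between runs. The hyperparameters of the best model are as follows:</p>



<ul class="wp-block-list">
<li>max_depth = 16</li>



<li>min_samples_split = 5</li>



<li>n_estimators = 128</li>
</ul>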



<p class="wp-block-paragraph">We will proceed with this model. So what does this model tell us?</p>



<p class="wp-block-paragraph">We can get an overview of how our customers are distributed by predicted churn probability. Just use the following code:</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Predicting Probabilities
best_clf = grid_results.best_estimator_  # best model found by the grid search
y_pred_prob = best_clf.predict_proba(x_test)
churnproba = y_pred_prob[:,1]

# Plot a histogram of the predicted churn probabilities
sns.histplot(data=churnproba)</pre></div>



<figure class="wp-block-image size-full"><img decoding="async" width="517" height="324" data-attachment-id="6810" data-permalink="https://www.relataly.com/predicting-the-customer-churn-of-a-telecommunications-provider/2378/image-12-12/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2022/04/image-12.png" data-orig-size="517,324" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-12" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2022/04/image-12.png" src="https://www.relataly.com/wp-content/uploads/2022/04/image-12.png" alt="Customer Base According to their Churn Rate" class="wp-image-6810" srcset="https://www.relataly.com/wp-content/uploads/2022/04/image-12.png 517w, https://www.relataly.com/wp-content/uploads/2022/04/image-12.png 300w" sizes="(max-width: 517px) 100vw, 517px" /><figcaption class="wp-element-caption">Customer Base According to their Churn Rate</figcaption></figure>



<p class="wp-block-paragraph">The model classifies customers with a predicted churn probability greater than 0.5 as likely to churn. They appear further to the right in the diagram. Customers on the left (&lt; 0.5) are predicted to stay, so we don&#8217;t have to worry about them as much.</p>
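


<p class="wp-block-paragraph">The thresholding step can be sketched in a few lines of plain Python. The probabilities below are made-up values that stand in for the second column of <code>y_pred_prob</code>, and the helper <code>flag_churn_candidates</code> is a hypothetical name used for illustration only.</p>

```python
# Hypothetical churn probabilities, standing in for y_pred_prob[:, 1]
churn_probabilities = [0.05, 0.12, 0.48, 0.51, 0.77, 0.93]
threshold = 0.5  # default decision threshold; an assumption that can be tuned


def flag_churn_candidates(probabilities, threshold):
    """Return the positions of customers whose churn probability exceeds the threshold."""
    return [i for i, p in enumerate(probabilities) if p > threshold]


at_risk = flag_churn_candidates(churn_probabilities, threshold)
print(at_risk)  # [3, 4, 5]

# A lower threshold flags more customers, at the cost of more false alarms
print(flag_churn_candidates(churn_probabilities, 0.4))  # [2, 3, 4, 5]
```

<p class="wp-block-paragraph">Whether 0.5 is the right cut-off depends on the business case. If a retention offer is cheap compared to losing a customer, a lower threshold may be the better choice.</p>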



<h3 class="wp-block-heading" id="h-step-5-best-model-performance-insights">Step #5 Best Model Performance Insights</h3>



<p class="wp-block-paragraph">Let&#8217;s take a more detailed look at the performance of the best model. We do this by calculating the confusion matrix. </p>



<p class="wp-block-paragraph">If you want to learn more about measuring the performance of classification models, check out <a href="https://www.relataly.com/measuring-classification-performance-with-python-and-scikit-learn/846/" target="_blank" rel="noreferrer noopener">this tutorial</a>.</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Extract the best decision forest 
best_clf = grid_results.best_estimator_
y_pred = best_clf.predict(x_test)

# Create a confusion matrix
cnf_matrix = confusion_matrix(y_test, y_pred)

# Create heatmap from the confusion matrix
class_names=[False, True] 
tick_marks = [0.5, 1.5]
fig, ax = plt.subplots(figsize=(7, 6))
sns.heatmap(pd.DataFrame(cnf_matrix), annot=True, cmap=&quot;Blues&quot;, fmt='g')
ax.xaxis.set_label_position(&quot;top&quot;)
plt.tight_layout()
plt.title('Confusion matrix')
plt.ylabel('Actual label'); plt.xlabel('Predicted label')
plt.yticks(tick_marks, class_names); plt.xticks(tick_marks, class_names)</pre></div>



<figure class="wp-block-image size-large"><img decoding="async" width="486" height="452" data-attachment-id="2387" data-permalink="https://www.relataly.com/predicting-the-customer-churn-of-a-telecommunications-provider/2378/image-14-4/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2020/07/image-14.png" data-orig-size="486,452" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-14" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2020/07/image-14.png" src="https://www.relataly.com/wp-content/uploads/2020/07/image-14.png" alt="Confusion matrix on churn probabilities calculated with feature permutation importance" class="wp-image-2387" srcset="https://www.relataly.com/wp-content/uploads/2020/07/image-14.png 486w, https://www.relataly.com/wp-content/uploads/2020/07/image-14.png 300w" sizes="(max-width: 486px) 100vw, 486px" /></figure>



<p class="wp-block-paragraph">Of the 1,000 customers in the test dataset, our model correctly classified 100 as churn candidates. For 832 customers, the model correctly predicted that they are unlikely to churn. In 30 cases, the model falsely classified customers as churn candidates, and it missed 38 churners by falsely classifying them as non-churn candidates. The result is a model accuracy of 93.2% (based on a 0.5 threshold).</p>
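


<p class="wp-block-paragraph">Accuracy is only one way to read a confusion matrix. As a small worked example, the four counts quoted above are enough to derive precision, recall, and the F1 score by hand. The variable names are my own.</p>

```python
# Counts taken from the confusion matrix described above
tp, tn, fp, fn = 100, 832, 30, 38

total = tp + tn + fp + fn
accuracy = (tp + tn) / total
precision = tp / (tp + fp)   # share of predicted churners who actually churn
recall = tp / (tp + fn)      # share of actual churners the model catches
f1 = 2 * precision * recall / (precision + recall)

# Baseline: always predicting "no churn" is right for every actual non-churner
baseline_accuracy = (tn + fp) / total

print(f"accuracy:  {accuracy:.3f}")           # 0.932
print(f"precision: {precision:.3f}")          # 0.769
print(f"recall:    {recall:.3f}")             # 0.725
print(f"f1:        {f1:.3f}")                 # 0.746
print(f"baseline:  {baseline_accuracy:.3f}")  # 0.862
```

<p class="wp-block-paragraph">Since most customers do not churn, a model that never predicts churn would already reach 86.2% accuracy. Precision and recall therefore say more about how useful the model is for finding churn candidates than accuracy alone.</p>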



<h3 class="wp-block-heading" id="h-step-6-permutation-feature-importance">Step #6 Permutation Feature Importance</h3>



<p class="wp-block-paragraph">Now that we have trained a model that gives good results, we want to understand the importance of the model&#8217;s features. With the following code, we calculate the permutation feature importance: each feature is shuffled in turn, and the resulting drop in model performance shows how much the model relies on that feature. Then we visualize the results in a barplot.</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Compute the permutation feature importance on the test set
r = permutation_importance(best_clf, x_test, y_test, n_repeats=30, random_state=0)

# Collect the mean importance scores and sort them in descending order
data_im = pd.DataFrame(r.importances_mean, columns=['feature_permutation_score'])
data_im['feature_names'] = x_test.columns
data_im = data_im.sort_values('feature_permutation_score', ascending=False)

# Plot the barchart
fig, ax = plt.subplots(figsize=(16, 5))
sns.barplot(y=data_im['feature_names'], x=&quot;feature_permutation_score&quot;, data=data_im, palette='nipy_spectral')
ax.set_title(&quot;Random Forest Feature Importances&quot;)</pre></div>



<figure class="wp-block-image size-full"><img decoding="async" width="1013" height="334" data-attachment-id="6801" data-permalink="https://www.relataly.com/predicting-the-customer-churn-of-a-telecommunications-provider/2378/output-2-2/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2022/04/output-2.png" data-orig-size="1013,334" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="output-2" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2022/04/output-2.png" src="https://www.relataly.com/wp-content/uploads/2022/04/output-2.png" alt="" class="wp-image-6801" srcset="https://www.relataly.com/wp-content/uploads/2022/04/output-2.png 1013w, https://www.relataly.com/wp-content/uploads/2022/04/output-2.png 300w, https://www.relataly.com/wp-content/uploads/2022/04/output-2.png 768w" sizes="(max-width: 1013px) 100vw, 1013px" /></figure>



<p class="wp-block-paragraph">The feature ranking can provide a starting point for deeper analysis. As we can see, the most important features are the monthly charge (MonthlyCharge), data usage (DataUsage), and customer service calls (CustServCalls). The importance of customer service calls is of particular interest, as it could indicate that customers who contact customer service have had negative experiences.</p>
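


<p class="wp-block-paragraph">To make the idea behind permutation importance more tangible, here is a from-scratch sketch in plain Python. It is not the scikit-learn implementation: the function <code>permutation_importance_sketch</code>, the toy data, and the rule-based <code>toy_model</code> are all assumptions made for illustration. The principle is the same, though: shuffle one feature, measure how much the accuracy drops, and repeat.</p>

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model predicts the correct label."""
    hits = sum(1 for row, label in zip(rows, labels) if model(row) == label)
    return hits / len(rows)


def permutation_importance_sketch(model, rows, labels, n_repeats=20, seed=0):
    """Mean drop in accuracy after shuffling one feature column at a time."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = {}
    for feature in rows[0]:
        drops = []
        for _ in range(n_repeats):
            shuffled_values = [row[feature] for row in rows]
            rng.shuffle(shuffled_values)
            shuffled_rows = [{**row, feature: value} for row, value in zip(rows, shuffled_values)]
            drops.append(baseline - accuracy(model, shuffled_rows, labels))
        importances[feature] = sum(drops) / n_repeats
    return importances


# Toy data (hypothetical): churn depends only on the number of service calls
service_calls = [0, 5, 1, 6, 0, 7, 2, 5, 1, 6]
rows = [{"CustServCalls": calls, "Noise": i % 3} for i, calls in enumerate(service_calls)]
labels = [1 if row["CustServCalls"] >= 4 else 0 for row in rows]


def toy_model(row):
    """Stand-in for the trained forest: predicts churn for four or more service calls."""
    return 1 if row["CustServCalls"] >= 4 else 0


scores = permutation_importance_sketch(toy_model, rows, labels)
print(scores)
```

<p class="wp-block-paragraph">Shuffling the Noise column leaves the accuracy unchanged, so its importance is zero, while shuffling CustServCalls breaks the link between the feature and the label and lowers the accuracy. Keep in mind that a high score only shows that the model relies on a feature. It does not prove that the feature causes churn.</p>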



<h2 class="wp-block-heading" id="h-summary">Summary</h2>



<p class="wp-block-paragraph">This article has shown how to implement a churn prediction model using Python and scikit-learn. We calculated the permutation feature importance to analyze which features contribute to the performance of our model. Permutation feature importance can give data scientists new insights into what drives the predictions of a model, which makes the technique a good starting point for further investigation.</p>



<p class="wp-block-paragraph">I am always interested in improving my articles and learning from my audience. If you liked this article, show your appreciation by leaving a comment. And if you didn&#8217;t, let me know too. Cheers </p>






<p class="wp-block-paragraph">And if you are interested in text mining and customer satisfaction, consider taking a look at my recent article on sentiment analysis:</p>



<figure class="wp-block-embed is-type-wp-embed is-provider-relataly-com wp-block-embed-relataly-com"><div class="wp-block-embed__wrapper">
<blockquote class="wp-embedded-content" data-secret="HQ0lUMzbZR"><a href="https://www.relataly.com/simple-sentiment-analysis-using-naive-bayes-and-logistic-regression/2007/">Sentiment Analysis with Naive Bayes and Logistic Regression in Python</a></blockquote><iframe loading="lazy" class="wp-embedded-content" sandbox="allow-scripts" security="restricted"  title="&#8220;Sentiment Analysis with Naive Bayes and Logistic Regression in Python&#8221; &#8212; relataly.com" src="https://www.relataly.com/simple-sentiment-analysis-using-naive-bayes-and-logistic-regression/2007/embed/#?secret=WMWtohaT3c#?secret=HQ0lUMzbZR" data-secret="HQ0lUMzbZR" width="600" height="338" frameborder="0" marginwidth="0" marginheight="0" scrolling="no"></iframe>
</div></figure>
<p>The post <a href="https://www.relataly.com/predicting-the-customer-churn-of-a-telecommunications-provider/2378/">Customer Churn Prediction &#8211; Understanding Models with Feature Permutation Importance using Python</a> appeared first on <a href="https://www.relataly.com">relataly.com</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.relataly.com/predicting-the-customer-churn-of-a-telecommunications-provider/2378/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">2378</post-id>	</item>
		<item>
		<title>Classifying Purchase Intention of Online Shoppers with Python</title>
		<link>https://www.relataly.com/predicting-the-purchase-intention-of-online-shoppers/982/</link>
					<comments>https://www.relataly.com/predicting-the-purchase-intention-of-online-shoppers/982/#respond</comments>
		
		<dc:creator><![CDATA[Florian Follonier]]></dc:creator>
		<pubDate>Mon, 11 May 2020 21:42:35 +0000</pubDate>
				<category><![CDATA[Algorithms]]></category>
		<category><![CDATA[Classification (two-class)]]></category>
		<category><![CDATA[Data Science]]></category>
		<category><![CDATA[Data Sources]]></category>
		<category><![CDATA[Feature Permutation Importance]]></category>
		<category><![CDATA[Insurance]]></category>
		<category><![CDATA[Kaggle Competitions]]></category>
		<category><![CDATA[Logistic Regression]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Marketing Automation]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Retail]]></category>
		<category><![CDATA[Sales Forecasting]]></category>
		<category><![CDATA[Scikit-Learn]]></category>
		<category><![CDATA[Seaborn]]></category>
		<category><![CDATA[AI in E-Commerce]]></category>
		<category><![CDATA[AI in Marketing]]></category>
		<category><![CDATA[Beginner Tutorials]]></category>
		<category><![CDATA[Classic Machine Learning]]></category>
		<category><![CDATA[Classification Error Metrics]]></category>
		<category><![CDATA[Confusion Matrix]]></category>
		<category><![CDATA[Supervised Learning]]></category>
		<category><![CDATA[Whisker Plots]]></category>
		<guid isPermaLink="false">https://www.relataly.com/?p=982</guid>

					<description><![CDATA[<p>Online shopping has become a part of our daily lives, and online stores are continually seeking to improve their sales. One way to achieve this is by using machine learning to predict customers&#8217; purchase intentions. This innovative process can help businesses understand their customers&#8217; behavior and tailor their marketing strategies accordingly. In this article, we ... <a title="Classifying Purchase Intention of Online Shoppers with Python" class="read-more" href="https://www.relataly.com/predicting-the-purchase-intention-of-online-shoppers/982/" aria-label="Read more about Classifying Purchase Intention of Online Shoppers with Python">Read more</a></p>
<p>The post <a href="https://www.relataly.com/predicting-the-purchase-intention-of-online-shoppers/982/">Classifying Purchase Intention of Online Shoppers with Python</a> appeared first on <a href="https://www.relataly.com">relataly.com</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">Online shopping has become a part of our daily lives, and online stores are continually seeking to improve their sales. One way to achieve this is by using machine learning to predict customers&#8217; purchase intentions. This innovative process can help businesses understand their customers&#8217; behavior and tailor their marketing strategies accordingly.</p>



<p class="wp-block-paragraph">In this article, we will explore the practical side of purchase intention prediction. Our focus is on developing a classification model that predicts whether a visitor will make a purchase or not. We&#8217;ll use the scikit-learn machine learning library to train a logistic regression model and evaluate its performance. Our ultimate goal is to provide insights into the circumstances under which customers make purchase decisions.</p>



<p class="wp-block-paragraph">Predicting purchase intentions can offer significant benefits to online stores, such as identifying potential customers who are most likely to buy and targeting their marketing efforts accordingly. By understanding the practical application of machine learning for purchase intention prediction, online businesses can gain a competitive edge and increase their revenue.</p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%"></div>
</div>



<p class="wp-block-paragraph">Also: <a href="https://www.relataly.com/simple-sentiment-analysis-using-naive-bayes-and-logistic-regression/2007/" target="_blank" rel="noreferrer noopener">Sentiment Analysis with Naive Bayes and Logistic Regression in Python</a></p>



<h2 class="wp-block-heading">About Modeling Customer Purchase Intentions</h2>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">Customer purchase intention prediction is the process of using machine learning algorithms to predict the likelihood that a particular customer will make a purchase. This can be useful for various applications, such as identifying the potential customers who are most likely to be interested in a particular product or service and targeting marketing and sales efforts accordingly.</p>



<p class="wp-block-paragraph">To make accurate predictions about customer purchase intentions, it is important to have access to high-quality data about the customer, such as their demographic information, purchasing history, and other relevant factors. By analyzing this data and applying appropriate machine learning algorithms, it is possible to identify patterns and trends that can predict the likelihood that a particular customer will make a purchase.</p>



<p class="wp-block-paragraph">There are many different approaches to customer purchase intention prediction, and the specific methods used can vary depending on the application and the data available. Some common techniques for predicting customer purchase intentions include using regression analysis to model the relationship between purchase intentions and other variables and using classification algorithms to classify customers as likely or unlikely to make a purchase. By using these techniques, it is possible to make more accurate and useful predictions about customer purchase intentions.</p>



<p class="wp-block-paragraph">Also: <a href="https://www.relataly.com/predicting-the-customer-churn-of-a-telecommunications-provider/2378/" target="_blank" rel="noreferrer noopener">Customer Churn Prediction &#8211; Understanding Models with Feature Permutation Importance</a></p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%">
<figure class="wp-block-image size-full"><img decoding="async" width="478" height="500" data-attachment-id="12685" data-permalink="https://www.relataly.com/men-and-woman-doing-groceries-machine-learning-customer-purchase-intention-prediction-relataly-midjourney-min/" data-orig-file="https://www.relataly.com/wp-content/uploads/2023/03/men-and-woman-doing-groceries-machine-learning-customer-purchase-intention-prediction-relataly-midjourney-min.png" data-orig-size="478,500" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="men and woman doing groceries machine learning customer purchase intention prediction relataly midjourney-min" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2023/03/men-and-woman-doing-groceries-machine-learning-customer-purchase-intention-prediction-relataly-midjourney-min.png" src="https://www.relataly.com/wp-content/uploads/2023/03/men-and-woman-doing-groceries-machine-learning-customer-purchase-intention-prediction-relataly-midjourney-min.png" alt="Customer purchase intentions sometimes follow patterns that can be used for predictive purposes. Image created with Midjourney." 
class="wp-image-12685" srcset="https://www.relataly.com/wp-content/uploads/2023/03/men-and-woman-doing-groceries-machine-learning-customer-purchase-intention-prediction-relataly-midjourney-min.png 478w, https://www.relataly.com/wp-content/uploads/2023/03/men-and-woman-doing-groceries-machine-learning-customer-purchase-intention-prediction-relataly-midjourney-min.png 287w" sizes="(max-width: 478px) 100vw, 478px" /><figcaption class="wp-element-caption">Customer purchase intentions sometimes follow patterns that can be used for predictive purposes. Image created with <a href="http://www.midjourney.com" target="_blank" rel="noreferrer noopener">Midjourney</a>.</figcaption></figure>
</div>
</div>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<h2 class="wp-block-heading">How Modeling Purchase Intentions can Lead to a Better Customer Understanding</h2>



<p class="wp-block-paragraph">Predicting the purchase intentions of online shoppers can help online stores understand their customers better. Creating predictive models makes it possible to draw conclusions about the factors influencing customers&#8217; buying behavior. At what time of day are our customers most inclined to buy? For which products do customers often abandon the purchase process? Such questions are fascinating for marketing departments. The answers can enable marketers to optimize their customers&#8217; buying experience and achieve a higher conversion rate. In this way, intention prediction can help online stores target customers with the right products at the right time and thus take a step toward marketing automation.</p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%"></div>
</div>



<figure class="wp-block-image size-large is-resized"><img decoding="async" data-attachment-id="6828" data-permalink="https://www.relataly.com/predicting-the-purchase-intention-of-online-shoppers/982/image-13-12/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2022/04/image-13.png" data-orig-size="1846,861" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="Classifying Purchase Intentions of Online Shoppers with Python" data-image-description="&lt;p&gt;Classifying Purchase Intentions of Online Shoppers with Python&lt;/p&gt;
" data-image-caption="&lt;p&gt;Classifying Purchase Intentions of Online Shoppers with Python&lt;/p&gt;
" data-large-file="https://www.relataly.com/wp-content/uploads/2022/04/image-13.png" src="https://www.relataly.com/wp-content/uploads/2022/04/image-13-1024x478.png" alt="A classification model that predicts the buying intention of online shoppers" class="wp-image-6828" width="760" height="355" srcset="https://www.relataly.com/wp-content/uploads/2022/04/image-13.png 1024w, https://www.relataly.com/wp-content/uploads/2022/04/image-13.png 300w, https://www.relataly.com/wp-content/uploads/2022/04/image-13.png 768w, https://www.relataly.com/wp-content/uploads/2022/04/image-13.png 1536w, https://www.relataly.com/wp-content/uploads/2022/04/image-13.png 1846w" sizes="(max-width: 760px) 100vw, 760px" /></figure>



<h2 class="wp-block-heading" id="h-implementing-a-prediction-model-for-purchase-intentions-with-python">Implementing a Prediction Model for Purchase Intentions with Python</h2>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">Logistic regression is a widely-used algorithm in machine learning that is particularly useful for solving two-class classification problems. One of the primary benefits of using logistic regression models is that they can help us understand the factors that influence the predictions made by the model. This interpretability is a key advantage of logistic regression, making it a popular choice in many real-world applications.</p>
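


<p class="wp-block-paragraph">To make the interpretability point concrete, here is a minimal sketch of how a logistic regression model turns feature values into a probability. The intercept, the coefficients, and the feature names <code>pages_viewed</code> and <code>returning_visitor</code> are hypothetical values chosen for illustration. They are not fitted on the dataset used in this article.</p>

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients for illustration only, not fitted values
intercept = -2.0
coefficients = {"pages_viewed": 0.15, "returning_visitor": 0.8}


def purchase_probability(visitor):
    """Linear combination of the features, squashed through the logistic function."""
    z = intercept + sum(coefficients[name] * visitor[name] for name in coefficients)
    return sigmoid(z)


visitor = {"pages_viewed": 10, "returning_visitor": 1}
print(round(purchase_probability(visitor), 3))  # 0.574

# exp(coefficient) is the factor by which the odds of a purchase change
# when the feature increases by one unit and all other features stay fixed
odds_ratios = {name: math.exp(value) for name, value in coefficients.items()}
print(odds_ratios)
```

<p class="wp-block-paragraph">Reading the coefficients this way is what makes logistic regression easy to interpret: a positive coefficient raises the odds of a purchase, a negative one lowers them, and the size of the effect can be read directly from the odds ratio.</p>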



<p class="wp-block-paragraph">In the next steps of our analysis, we will develop a two-class classification model that utilizes the logistic regression algorithm to predict the purchase intentions of online shoppers. By analyzing a set of features that are likely to influence a shopper&#8217;s decision to purchase, such as product price, customer reviews, and shipping time, we can build a model that accurately predicts the likelihood of a shopper completing a purchase. The logistic regression algorithm will be particularly useful in this case, as it allows us to identify which features are the most significant predictors of purchase intention.</p>



<p class="wp-block-paragraph">The code is available in the relataly GitHub repository.</p>



<div class="wp-block-kadence-advancedbtn kb-buttons-wrap kb-btns_d5d832-9e"><a class="kb-button kt-button button kb-btn_7d1c88-9e kt-btn-size-standard kt-btn-width-type-full kb-btn-global-inherit kt-btn-has-text-true kt-btn-has-svg-true wp-block-button__link wp-block-kadence-singlebtn" href="https://github.com/flo7up/relataly-public-python-tutorials/blob/master/02%20Classification/019%20%20Classifying%20Shopper%20Buying%20Intention%20using%20Logistic%20Regression.ipynb" target="_blank" rel="noreferrer noopener"><span class="kb-svg-icon-wrap kb-svg-icon-fe_eye kt-btn-icon-side-left"><svg viewBox="0 0 24 24"  fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"  aria-hidden="true"><path d="M1 12s4-8 11-8 11 8 11 8-4 8-11 8-11-8-11-8z"/><circle cx="12" cy="12" r="3"/></svg></span><span class="kt-btn-inner-text">View on GitHub </span></a>

<a class="kb-button kt-button button kb-btn_040040-16 kt-btn-size-standard kt-btn-width-type-full kb-btn-global-inherit kt-btn-has-text-true kt-btn-has-svg-true wp-block-button__link wp-block-kadence-singlebtn" href="https://github.com/flo7up/relataly-public-python-API-tutorials" target="_blank" rel="noreferrer noopener"><span class="kb-svg-icon-wrap kb-svg-icon-fa_github kt-btn-icon-side-left"><svg viewBox="0 0 496 512"  fill="currentColor" xmlns="http://www.w3.org/2000/svg"  aria-hidden="true"><path d="M165.9 397.4c0 2-2.3 3.6-5.2 3.6-3.3.3-5.6-1.3-5.6-3.6 0-2 2.3-3.6 5.2-3.6 3-.3 5.6 1.3 5.6 3.6zm-31.1-4.5c-.7 2 1.3 4.3 4.3 4.9 2.6 1 5.6 0 6.2-2s-1.3-4.3-4.3-5.2c-2.6-.7-5.5.3-6.2 2.3zm44.2-1.7c-2.9.7-4.9 2.6-4.6 4.9.3 2 2.9 3.3 5.9 2.6 2.9-.7 4.9-2.6 4.6-4.6-.3-1.9-3-3.2-5.9-2.9zM244.8 8C106.1 8 0 113.3 0 252c0 110.9 69.8 205.8 169.5 239.2 12.8 2.3 17.3-5.6 17.3-12.1 0-6.2-.3-40.4-.3-61.4 0 0-70 15-84.7-29.8 0 0-11.4-29.1-27.8-36.6 0 0-22.9-15.7 1.6-15.4 0 0 24.9 2 38.6 25.8 21.9 38.6 58.6 27.5 72.9 20.9 2.3-16 8.8-27.1 16-33.7-55.9-6.2-112.3-14.3-112.3-110.5 0-27.5 7.6-41.3 23.6-58.9-2.6-6.5-11.1-33.3 2.6-67.9 20.9-6.5 69 27 69 27 20-5.6 41.5-8.5 62.8-8.5s42.8 2.9 62.8 8.5c0 0 48.1-33.6 69-27 13.7 34.7 5.2 61.4 2.6 67.9 16 17.7 25.8 31.5 25.8 58.9 0 96.5-58.9 104.2-114.8 110.5 9.2 7.9 17 22.9 17 46.4 0 33.7-.3 75.4-.3 83.6 0 6.5 4.6 14.4 17.3 12.1C428.2 457.8 496 362.9 496 252 496 113.3 383.5 8 244.8 8zM97.2 352.9c-1.3 1-1 3.3.7 5.2 1.6 1.6 3.9 2.3 5.2 1 1.3-1 1-3.3-.7-5.2-1.6-1.6-3.9-2.3-5.2-1zm-10.8-8.1c-.7 1.3.3 2.9 2.3 3.9 1.6 1 3.6.7 4.3-.7.7-1.3-.3-2.9-2.3-3.9-2-.6-3.6-.3-4.3.7zm32.4 35.6c-1.6 1.3-1 4.3 1.3 6.2 2.3 2.3 5.2 2.6 6.5 1 1.3-1.3.7-4.3-1.3-6.2-2.2-2.3-5.2-2.6-6.5-1zm-11.4-14.7c-1.6 1-1.6 3.6 0 5.9 1.6 2.3 4.3 3.3 5.6 2.3 1.6-1.3 1.6-3.9 0-6.2-1.4-2.3-4-3.3-5.6-2z"/></svg></span><span class="kt-btn-inner-text">Relataly GitHub Repo </span></a></div>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%"></div>
</div>



<h3 class="wp-block-heading" id="h-prerequisites">Prerequisites</h3>



<p class="wp-block-paragraph">Before starting the coding part, make sure that you have set up your <a href="https://www.python.org/downloads/" target="_blank" rel="noreferrer noopener">Python 3</a> environment and required packages. If you don&#8217;t have an environment, consider the&nbsp;<a href="https://www.anaconda.com/products/individual" target="_blank" rel="noreferrer noopener">Anaconda Python environment</a>. To set it up, you can follow the steps in&nbsp;<a href="https://www.relataly.com/category/data-science/setup-anaconda-environment/" target="_blank" rel="noreferrer noopener">this tutorial</a>. Please make sure to install all required packages:</p>



<ul class="wp-block-list">
<li><em><a href="https://pandas.pydata.org/" target="_blank" rel="noreferrer noopener">pandas</a></em></li>



<li><em><a href="https://numpy.org/" target="_blank" rel="noreferrer noopener">NumPy</a></em></li>



<li><em><a href="https://matplotlib.org/" target="_blank" rel="noreferrer noopener">matplotlib</a></em></li>
</ul>



<p class="wp-block-paragraph">In addition, we will be using the machine learning library <a href="https://scikit-learn.org/stable/" target="_blank" rel="noreferrer noopener">Scikit-learn</a> and <a data-type="URL" data-id="https://seaborn.pydata.org/" href="https://seaborn.pydata.org/" target="_blank" rel="noreferrer noopener">Seaborn</a> for visualization. You can install packages using console commands:</p>



<ul class="wp-block-list">
<li><em>pip install &lt;package name&gt;</em></li>



<li><em>conda install &lt;package name&gt;</em>&nbsp;(if you are using the Anaconda package manager)</li>
</ul>
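<p class="wp-block-paragraph">If you are unsure which of these packages are already installed, a quick check from Python can help. The following is a minimal sketch that uses only the standard library. The helper name <em>find_missing</em> is our own and not part of any package. Note that scikit-learn is imported under the name <em>sklearn</em>.</p>

```python
import importlib.util

# Import names of the packages used in this tutorial.
# scikit-learn is installed as 'scikit-learn' but imported as 'sklearn'.
REQUIRED = ['pandas', 'numpy', 'matplotlib', 'sklearn', 'seaborn']

def find_missing(import_names):
    """Return the import names that cannot be found in the current environment."""
    return [name for name in import_names if importlib.util.find_spec(name) is None]

missing = find_missing(REQUIRED)
if missing:
    print('Missing packages:', ', '.join(missing))
else:
    print('All required packages are installed.')
```

<p class="wp-block-paragraph">Any name reported as missing can then be installed with one of the commands above.</p>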



<h3 class="wp-block-heading" id="h-about-the-dataset">About the Dataset</h3>



<p class="wp-block-paragraph">In this tutorial, we will be working with a public dataset from <a href="https://www.kaggle.com/roshansharma/online-shoppers-intention" target="_blank" rel="noreferrer noopener">Kaggle.com</a>. The data consists of 12,330 shopping sessions, each described by 18 columns: 17 features and the class label. You can download the data via the link below:</p>



<div class="wp-block-file"><a id="wp-block-file--media-3f304c01-ab35-4462-bda0-88dce356d27e" href="https://www.relataly.com/wp-content/uploads/2020/05/online_shoppers_intention.csv">online_shoppers_intention.csv</a><a href="https://www.relataly.com/wp-content/uploads/2020/05/online_shoppers_intention.csv" class="wp-block-file__button wp-element-button" download aria-describedby="wp-block-file--media-3f304c01-ab35-4462-bda0-88dce356d27e">Download</a></div>



<p class="wp-block-paragraph">The data stems from a large shopping website that recorded sessions over one year. Each record belongs to a separate shopping session and user. This setup is meant to avoid a bias toward a specific period, user, or day.</p>



<p class="wp-block-paragraph">Below you will find an overview of the features contained in the data (Source: Kaggle.com): </p>



<ul class="wp-block-list">
<li>&#8220;Administrative,&#8221; &#8220;Administrative Duration,&#8221; &#8220;Informational,&#8221; &#8220;Informational Duration,&#8221; &#8220;Product Related,&#8221; and &#8220;Product-Related Duration&#8221; represent the number of different types of pages visited by the visitor in that session and the total time spent in each of these page categories.&nbsp;</li>



<li>The &#8220;Bounce Rate,&#8221; &#8220;Exit Rate,&#8221; and &#8220;Page Value&#8221; features represent the metrics measured by &#8220;Google Analytics&#8221; for each page on the e-commerce site. </li>



<li>The &#8220;Special Day&#8221; feature indicates how close the time of the visit is to a special day (e.g., Mother&#8217;s Day, Valentine&#8217;s Day).</li>



<li>The dataset also includes an operating system, browser, region, traffic type, visitor type as returning or new visitor, a Boolean value indicating whether the date of the visit is a weekend, and the month of the year.</li>
</ul>



<p class="wp-block-paragraph">The &#8216;Revenue&#8217; attribute is the class label, also called the prediction label. It indicates whether a session ended with a purchase.</p>



<h3 class="wp-block-heading" id="h-step-1-load-the-data">Step #1 Load the Data</h3>



<p class="wp-block-paragraph">We begin by loading the shopping dataset into a Pandas DataFrame. Afterward, we will print a brief overview of the data.</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}">import calendar
import math 
import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt 
from matplotlib import cm
import seaborn as sns

from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import confusion_matrix, roc_curve, auc, roc_auc_score
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Load the shopping dataset and show the first rows
filepath = &quot;data/classification-online-shopping/&quot;
df_shopping_base = pd.read_csv(filepath + 'online_shoppers_intention.csv') 
df_shopping_base.head()</pre></div>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:false,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;null&quot;,&quot;mime&quot;:&quot;text/plain&quot;,&quot;theme&quot;:&quot;3024-day&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Plain Text&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;text&quot;}">	Administrative	Administrative_Duration	Informational	Informational_Duration	ProductRelated	ProductRelated_Duration	BounceRates	ExitRates	PageValues	SpecialDay	Month	OperatingSystems	Browser	Region	TrafficType	VisitorType			Weekend	Revenue
0	0.0				0.0						0.0				0.0						1.0				0.000000				0.20		0.20		0.0			0.0			Feb		1					1		1		1			Returning_Visitor	False	False
1	0.0				0.0						0.0				0.0						2.0				64.000000				0.00		0.10		0.0			0.0			Feb		2					2		1		2			Returning_Visitor	False	False
2	0.0				-1.0					0.0				-1.0					1.0				-1.000000				0.20		0.20		0.0			0.0			Feb		4					1		9		3			Returning_Visitor	False	False
3	0.0				0.0						0.0				0.0						2.0				2.666667				0.05		0.14		0.0			0.0			Feb		3					2		2		4			Returning_Visitor	False	False
4	0.0				0.0						0.0				0.0						10.0			627.500000				0.02		0.05		0.0			0.0			Feb		3					3		1		4			Returning_Visitor	True	False</pre></div>



<h3 class="wp-block-heading" id="h-step-2-cleaning-the-data">Step #2 Cleaning the Data</h3>



<p class="wp-block-paragraph">Before we can start training our prediction model, we&#8217;ll do some cleanups (handling missing data, data type conversions, treating outliers, and so on).</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Replacing visitor_type to int
print(df_shopping_base['VisitorType'].unique())
df_shop = df_shopping_base.replace({'VisitorType' : { 'New_Visitor' : 0, 'Returning_Visitor' : 1, 'Other' : 2 }})

# Convert the month column to numeric values (Jan = 1, ..., Dec = 12).
# The dataset spells June as 'June', while calendar.month_abbr uses 'Jun'.
month_abbr = list(calendar.month_abbr)
df_shop['Month'] = df_shop['Month'].replace('June', 'Jun').map(month_abbr.index)

# Delete records with NAs
df_shop.dropna(inplace=True)

df_shop.head()</pre></div>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:false,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;null&quot;,&quot;mime&quot;:&quot;text/plain&quot;,&quot;theme&quot;:&quot;3024-day&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Plain Text&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;text&quot;}">['Returning_Visitor' 'New_Visitor' 'Other']
	Administrative	Administrative_Duration	Informational	Informational_Duration	ProductRelated	ProductRelated_Duration	BounceRates	ExitRates	PageValues	SpecialDay	Month	OperatingSystems	Browser	Region	TrafficType	VisitorType	Weekend	Revenue
  0	0.0				0.0						0.0				0.0						1.0				0.000000				0.20		0.20		0.0			0.0			2		1					1		1		1			1			False	False
1	0.0				0.0						0.0				0.0						2.0				64.000000				0.00		0.10		0.0			0.0			2		2					2		1		2			1			False	False
2	0.0				-1.0					0.0				-1.0					1.0				-1.000000				0.20		0.20		0.0			0.0			2		4					1		9		3			1			False	False
3	0.0				0.0						0.0				0.0						2.0				2.666667				0.05		0.14		0.0			0.0			2		3					2		2		4			1			False	False
4	0.0				0.0						0.0				0.0						10.0			627.50</pre></div>
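<p class="wp-block-paragraph">The month conversion in the cleaning step relies on <em>calendar.month_abbr</em> from the Python standard library. The following sketch shows the same idea in isolation. The helper <em>month_to_number</em> is a hypothetical name used for illustration, and the sketch assumes the default English locale, in which the abbreviations run from &#8216;Jan&#8217; to &#8216;Dec&#8217;.</p>

```python
import calendar

# calendar.month_abbr[0] is an empty string; entries 1..12 are 'Jan'..'Dec'
# (assuming the default English locale).
MONTH_NUMBERS = {abbr: number for number, abbr in enumerate(calendar.month_abbr) if abbr}

def month_to_number(label):
    """Map a month label such as 'Feb' or 'June' to its number (1-12)."""
    # The dataset spells June in full, so shorten it first.
    abbr = 'Jun' if label == 'June' else label
    return MONTH_NUMBERS[abbr]

print([month_to_number(m) for m in ['Feb', 'June', 'Dec']])  # [2, 6, 12]
```

<p class="wp-block-paragraph">An unknown label raises a KeyError, which makes unexpected spellings in the month column easy to spot.</p>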



<h3 class="wp-block-heading" id="h-step-3-exploring-the-data">Step #3 Exploring the Data</h3>



<p class="wp-block-paragraph">Next, we will familiarize ourselves with the data. </p>



<h4 class="wp-block-heading" id="h-3-1-class-labels">3.1 Class Labels</h4>



<p class="wp-block-paragraph">First, we take a look at the class labels to see how balanced they are. If class labels are balanced, each class has an approximately equal number of examples in the training data. This matters because a model trained on unbalanced labels tends to be biased towards the more common class, which can lead to poor performance on the less common class. Unbalanced class labels also make it harder to evaluate a machine learning model, because a model that mostly predicts the majority class can reach a high accuracy without being useful.</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Checking the balance of prediction labels
plt.figure(figsize=(16,2))
fig = sns.countplot(y=&quot;Revenue&quot;, data=df_shop, palette=&quot;muted&quot;)
plt.show()</pre></div>



<figure class="wp-block-image size-full is-resized"><img decoding="async" data-attachment-id="6830" data-permalink="https://www.relataly.com/predicting-the-purchase-intention-of-online-shoppers/982/output-3-2/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2022/04/output-3.png" data-orig-size="953,154" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="output-3" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2022/04/output-3.png" src="https://www.relataly.com/wp-content/uploads/2022/04/output-3.png" alt="" class="wp-image-6830" width="946" height="153" srcset="https://www.relataly.com/wp-content/uploads/2022/04/output-3.png 953w, https://www.relataly.com/wp-content/uploads/2022/04/output-3.png 300w, https://www.relataly.com/wp-content/uploads/2022/04/output-3.png 768w" sizes="(max-width: 946px) 100vw, 946px" /></figure>



<p class="wp-block-paragraph">Our class labels are imbalanced, as there are many more cases in the data with the label &#8220;false.&#8221; The reason is that most visitors do not buy anything. Imbalanced data can affect the performance of classification models. But now that we are aware of the imbalance in our data, we can choose appropriate evaluation metrics later.</p>
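<p class="wp-block-paragraph">To see why accuracy alone can be misleading on imbalanced data, it helps to compute the accuracy of a trivial baseline that always predicts the most common label. The sketch below uses made-up labels, not the actual class distribution of the shopping dataset, and the helper name <em>majority_baseline</em> is our own.</p>

```python
from collections import Counter

def majority_baseline(labels):
    """Return (majority_label, accuracy) of always predicting the most common label."""
    counts = Counter(labels)
    label, count = counts.most_common(1)[0]
    return label, count / len(labels)

# Hypothetical labels: 85 sessions without a purchase, 15 with a purchase
example_labels = [False] * 85 + [True] * 15
label, accuracy = majority_baseline(example_labels)
print(label, accuracy)  # False 0.85
```

<p class="wp-block-paragraph">In this made-up example, a model has to beat 85% accuracy before it is better than always guessing &#8220;no purchase.&#8221; This is why metrics such as precision and recall are useful additions.</p>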



<h4 class="wp-block-heading" id="h-3-2-feature-correlation">3.2 Feature Correlation</h4>



<p class="wp-block-paragraph">When developing classification models, not all features are equally useful. Highly correlated features provide largely redundant information to a machine learning model. For logistic regression, this redundancy can make the estimated coefficients unstable and may reduce the quality of the predictions. Correlated features also make the model harder to interpret, because it is not clear which of them actually contributes to a prediction.</p>



<p class="wp-block-paragraph">Let&#8217;s check which of our features are correlated. First, we will create a series of whisker plots (box plots) for the features in our dataset. They help us identify potential outliers and give us a better idea of how the data is distributed.</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Whiskerplots
df_shop.drop('Revenue', axis=1).plot(kind='box', 
                                subplots=True, layout=(4,4), 
                                sharex=False, sharey=False, 
                                figsize=(14,14), 
                                title='Whisker plots for input variables')
plt.show()</pre></div>



<figure class="wp-block-image size-large is-resized"><img decoding="async" data-attachment-id="986" data-permalink="https://www.relataly.com/predicting-the-purchase-intention-of-online-shoppers/982/image-35-2/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2020/05/image-35.png" data-orig-size="821,893" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-35" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2020/05/image-35.png" src="https://www.relataly.com/wp-content/uploads/2020/05/image-35.png" alt="Purchase Intention Prediction, Feature Permutation Importance, Feature Correlation plot" class="wp-image-986" width="664" height="721" srcset="https://www.relataly.com/wp-content/uploads/2020/05/image-35.png 821w, https://www.relataly.com/wp-content/uploads/2020/05/image-35.png 276w, https://www.relataly.com/wp-content/uploads/2020/05/image-35.png 768w" sizes="(max-width: 664px) 100vw, 664px" /><figcaption class="wp-element-caption">Feature Whiskerplots</figcaption></figure>



<p class="wp-block-paragraph">The whisker plots show that there are a couple of outliers in the data. However, the outliers are not significant enough to worry about.</p>
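<p class="wp-block-paragraph">By default, a box plot draws its whiskers up to 1.5 times the interquartile range (IQR) beyond the first and third quartiles and marks everything outside as an outlier. The following sketch applies the same rule to a small list of made-up values. The helper name <em>iqr_outliers</em> is our own, and the quartiles are computed with the standard library, so they may differ slightly from what a plotting library computes.</p>

```python
import statistics

def iqr_outliers(values, whisker=1.5):
    """Return the values outside [Q1 - whisker * IQR, Q3 + whisker * IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method='inclusive')
    iqr = q3 - q1
    lower = q1 - whisker * iqr
    upper = q3 + whisker * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical page counts for ten sessions
page_counts = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(iqr_outliers(page_counts))  # [100]
```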



<p class="wp-block-paragraph">Pair plots are another way of visualizing the data. The diagonal panels show the distribution of each feature, separated by the prediction label, and the remaining panels show scatter plots for each pair of features. To create the pair plots, run the code below.</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Create pair plots for the feature columns, colored by the prediction label
df_plot = df_shop.copy()

sns.pairplot(df_plot, hue=&quot;Revenue&quot;, height=2.5)
plt.show()</pre></div>



<figure class="wp-block-image size-large is-resized"><img decoding="async" data-attachment-id="6829" data-permalink="https://www.relataly.com/predicting-the-purchase-intention-of-online-shoppers/982/shopper-buying-intention/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2022/04/Shopper-Buying-Intention.png" data-orig-size="2560,2485" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="Shopper-Buying-Intention pair plots with seaborn" data-image-description="&lt;p&gt;Shopper-Buying-Intention pair plots with seaborn&lt;/p&gt;
" data-image-caption="&lt;p&gt;Shopper-Buying-Intention pair plots with seaborn&lt;/p&gt;
" data-large-file="https://www.relataly.com/wp-content/uploads/2022/04/Shopper-Buying-Intention.png" src="https://www.relataly.com/wp-content/uploads/2022/04/Shopper-Buying-Intention-1024x994.png" alt="Purchase Intention Prediction, Feature Permutation Importance, Feature Correlation plot" class="wp-image-6829" width="1117" height="1085" srcset="https://www.relataly.com/wp-content/uploads/2022/04/Shopper-Buying-Intention.png 1024w, https://www.relataly.com/wp-content/uploads/2022/04/Shopper-Buying-Intention.png 300w, https://www.relataly.com/wp-content/uploads/2022/04/Shopper-Buying-Intention.png 768w, https://www.relataly.com/wp-content/uploads/2022/04/Shopper-Buying-Intention.png 1536w, https://www.relataly.com/wp-content/uploads/2022/04/Shopper-Buying-Intention.png 2048w, https://www.relataly.com/wp-content/uploads/2022/04/Shopper-Buying-Intention.png 2475w" sizes="(max-width: 1117px) 100vw, 1117px" /></figure>



<p class="wp-block-paragraph">Finally, we create a correlation matrix and visualize it as a heat map. The matrix provides a quick overview of which features are correlated and which are not.</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Feature correlation
plt.figure(figsize=(15,4))
f_cor = df_shop.corr()
sns.heatmap(f_cor, cmap=&quot;Blues_r&quot;)</pre></div>



<figure class="wp-block-image size-large is-resized"><img decoding="async" data-attachment-id="4662" data-permalink="https://www.relataly.com/predicting-the-purchase-intention-of-online-shoppers/982/image-50-3/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2021/06/image-50.png" data-orig-size="899,367" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-50" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2021/06/image-50.png" src="https://www.relataly.com/wp-content/uploads/2021/06/image-50.png" alt="Purchase Intention Prediction, Feature Permutation Importance" class="wp-image-4662" width="674" height="275" srcset="https://www.relataly.com/wp-content/uploads/2021/06/image-50.png 899w, https://www.relataly.com/wp-content/uploads/2021/06/image-50.png 300w, https://www.relataly.com/wp-content/uploads/2021/06/image-50.png 768w" sizes="(max-width: 674px) 100vw, 674px" /></figure>



<p class="wp-block-paragraph">The correlation plot shows that the following pairs of features are highly correlated:</p>



<ul class="wp-block-list">
<li>ProductRelated and ProductRelated_Duration</li>



<li>BounceRates and ExitRates</li>
</ul>
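<p class="wp-block-paragraph">The values in the correlation matrix are Pearson correlation coefficients, which is the default method of the pandas <em>corr</em> function. As a minimal sketch of what this coefficient measures, the following code computes it by hand for two short lists. The numbers are made up for illustration and are not taken from the shopping dataset.</p>

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values: pages viewed and seconds spent on them in five sessions
pages = [1, 2, 4, 8, 10]
seconds = [30, 55, 130, 260, 300]
print(round(pearson(pages, seconds), 3))
```

<p class="wp-block-paragraph">A value close to 1 or -1 means that one feature can largely be predicted from the other, so keeping both adds little new information.</p>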



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}">plt.figure(figsize=(8,5))
sns.scatterplot(x= 'BounceRates',y='ExitRates',data=df_shop,hue='Revenue')
plt.title('Bounce Rate vs. Exit Rate', fontweight='bold', fontsize=15)
plt.show()</pre></div>



<figure class="wp-block-image size-large is-resized"><img decoding="async" data-attachment-id="4674" data-permalink="https://www.relataly.com/predicting-the-purchase-intention-of-online-shoppers/982/image-51-3/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2021/06/image-51.png" data-orig-size="510,335" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-51" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2021/06/image-51.png" src="https://www.relataly.com/wp-content/uploads/2021/06/image-51.png" alt="Purchase Intention Prediction, Feature Permutation Importance" class="wp-image-4674" width="537" height="352" srcset="https://www.relataly.com/wp-content/uploads/2021/06/image-51.png 510w, https://www.relataly.com/wp-content/uploads/2021/06/image-51.png 300w" sizes="(max-width: 537px) 100vw, 537px" /></figure>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}">plt.figure(figsize=(8,5))
sns.scatterplot(x= 'ProductRelated',y='ProductRelated_Duration',data=df_shop,hue='Revenue')
plt.title('Product Related vs. Product Related Duration', fontweight='bold', fontsize=15)
plt.show()</pre></div>



<figure class="wp-block-image size-large is-resized"><img decoding="async" data-attachment-id="4675" data-permalink="https://www.relataly.com/predicting-the-purchase-intention-of-online-shoppers/982/image-52-3/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2021/06/image-52.png" data-orig-size="514,335" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-52" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2021/06/image-52.png" src="https://www.relataly.com/wp-content/uploads/2021/06/image-52.png" alt="Purchase Intention Prediction, Feature Permutation Importance" class="wp-image-4675" width="528" height="343" srcset="https://www.relataly.com/wp-content/uploads/2021/06/image-52.png 514w, https://www.relataly.com/wp-content/uploads/2021/06/image-52.png 300w" sizes="(max-width: 528px) 100vw, 528px" /></figure>



<p class="wp-block-paragraph">When we start to train our model, we will use only one feature from each of the two pairs.</p>



<h3 class="wp-block-heading" id="h-step-4-data-preprocessing">Step #4 Data Preprocessing </h3>



<p class="wp-block-paragraph">Now that we are familiar with the data, we can prepare it to train the purchase intention classification model. First, we select a subset of features from the shopping dataset. Second, we split the data into a train and a test dataset, using 70% of the records for training and the remaining 30% for testing. X_train and X_test contain the features, while y_train and y_test contain the respective prediction labels. Third, we use the MinMaxScaler to scale the features to the range between 0 and 1. The scaler is fitted on the training data only and then applied to the test data. Scaling puts all features on a comparable scale, which helps the solver converge and can improve classification performance.</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Separate labels from training data
features = ['Administrative', 'Administrative_Duration', 'Informational', 
            'Informational_Duration', 'ProductRelated', 'BounceRates', 'PageValues', 
            'Month', 'Region', 'TrafficType', 'VisitorType']
X = df_shop[features] #Training data
y = df_shop['Revenue'] #Prediction label

# Split the data into train and test datasets (70% / 30%)
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7, random_state=0)

# Scale the numeric values
scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)</pre></div>
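<p class="wp-block-paragraph">Min-max scaling maps each value x to (x - min) / (max - min), where min and max are learned from the training data only. The following sketch illustrates this for a single feature with made-up numbers. The helper functions are our own simplification and not the scikit-learn implementation. For example, they do not handle a feature whose values are all identical.</p>

```python
def fit_min_max(train_values):
    """Learn the minimum and maximum from the training values only."""
    return min(train_values), max(train_values)

def min_max_scale(values, minimum, maximum):
    """Apply x_scaled = (x - min) / (max - min) with previously learned bounds."""
    return [(v - minimum) / (maximum - minimum) for v in values]

# Hypothetical page counts
train = [0, 5, 10, 20]
test = [10, 30]

low, high = fit_min_max(train)
print(min_max_scale(train, low, high))  # [0.0, 0.25, 0.5, 1.0]
print(min_max_scale(test, low, high))   # [0.5, 1.5]
```

<p class="wp-block-paragraph">Because the bounds come from the training data, test values can fall outside the range between 0 and 1. This is expected and avoids leaking information from the test dataset into the training process.</p>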



<h3 class="wp-block-heading" id="h-step-5-train-a-purchase-intention-classifier">Step #5 Train a Purchase Intention Classifier</h3>



<p class="wp-block-paragraph">Next, it is time to train our prediction model. Various classification algorithms could be used to solve this problem, for example, decision trees, random forests, neural networks, or support-vector machines. We will use the logistic regression algorithm, a common choice for simple two-class prediction problems. </p>



<p class="wp-block-paragraph">We start the training process using the &#8220;fit&#8221; method of the logistic regression algorithm. </p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Training a classification model using logistic regression 
logreg = LogisticRegression(solver='lbfgs')
score = logreg.fit(X_train, y_train).decision_function(X_test)</pre></div>



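<p class="wp-block-paragraph">Note that the &#8220;decision_function&#8221; call does not return a single performance figure. It returns one confidence score for each session in the test dataset. We measure how well the model performs on the test dataset in the next step. </p>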
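
<p class="wp-block-paragraph">To make the difference between the per-sample output of &#8220;decision_function&#8221; and an overall test score more tangible, here is a small sketch. It uses synthetic data from scikit-learn as a stand-in for the shopping sessions, and all variable names are illustrative assumptions.</p>

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the shopping data (assumption for illustration only)
X_demo, y_demo = make_classification(n_samples=500, n_features=5, random_state=0)
Xd_train, Xd_test, yd_train, yd_test = train_test_split(
    X_demo, y_demo, train_size=0.7, random_state=0)

demo_model = LogisticRegression(solver='lbfgs').fit(Xd_train, yd_train)

# One signed confidence value per test sample
scores = demo_model.decision_function(Xd_test)

# A positive value corresponds to the positive class returned by predict()
labels_from_scores = (scores > 0).astype(int)
same_as_predict = np.array_equal(labels_from_scores, demo_model.predict(Xd_test))

# score() returns a single number: the mean accuracy on the given data
test_accuracy = demo_model.score(Xd_test, yd_test)
print(scores.shape, same_as_predict, round(test_accuracy, 2))
```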



<h3 class="wp-block-heading" id="h-step-6-evaluate-model-performance">Step #6 Evaluate Model Performance</h3>



<p class="wp-block-paragraph">Finally, we will evaluate the performance of our classification model. For this purpose, we first create a confusion matrix. Then we calculate and compare different error metrics.</p>



<h4 class="wp-block-heading" id="h-6-1-confusion-matrix">6.1 Confusion Matrix</h4>



<p class="wp-block-paragraph">The confusion matrix is a clean and compact way to illustrate the results of a classification model. It contrasts predicted labels with actual labels. For a binary classification model, it is a 2&#215;2 matrix whose four cells show the number of cases for each combination of predicted and actual label. </p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># create a confusion matrix
y_pred = logreg.predict(X_test)
cnf_matrix = confusion_matrix(y_test, y_pred)

# create heatmap
%matplotlib inline
class_names=[False, True] # name  of classes
fig, ax = plt.subplots(figsize=(7, 6))
tick_marks = np.arange(len(class_names))
plt.xticks(tick_marks, class_names)
plt.yticks(tick_marks, class_names)
sns.heatmap(pd.DataFrame(cnf_matrix), annot=True, cmap=&quot;YlGnBu&quot;, fmt='g')
ax.xaxis.set_label_position(&quot;top&quot;)
plt.tight_layout()
plt.title('Confusion matrix')
plt.ylabel('Actual label')
plt.xlabel('Predicted label')</pre></div>



<figure class="wp-block-image size-large is-resized"><img decoding="async" data-attachment-id="990" data-permalink="https://www.relataly.com/predicting-the-purchase-intention-of-online-shoppers/982/image-39-2/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2020/05/image-39.png" data-orig-size="492,452" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-39" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2020/05/image-39.png" src="https://www.relataly.com/wp-content/uploads/2020/05/image-39.png" alt="confusion matrix on the results of our classification model that predicts purchase intentions, purchase intention prediction model" class="wp-image-990" width="374" height="344" srcset="https://www.relataly.com/wp-content/uploads/2020/05/image-39.png 492w, https://www.relataly.com/wp-content/uploads/2020/05/image-39.png 300w" sizes="(max-width: 374px) 100vw, 374px" /></figure>



<p class="wp-block-paragraph">In the upper left cell (0,0), we see that the model correctly predicted for 3102 online shopping sessions that they would not lead to a purchase (True negatives). In 30 cases, the model expected a purchase that did not happen (False positives). For 412 sessions, the model predicted no purchase even though the visitor did buy something (False negatives). In the lower right corner, we see that only 151 buyers were correctly identified as such (True positives). </p>



<h4 class="wp-block-heading" id="h-6-2-performance-metrics-for-classification-models">6.2 Performance Metrics for Classification Models</h4>



<p class="wp-block-paragraph">Next, let&#8217;s take a brief look at the performance metrics. Four standard metrics that measure the performance of classification models are Accuracy, Precision, Recall, and F1-Score. </p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}">print('Accuracy: {:.2f}'.format(accuracy_score(y_test, y_pred)))
print('Precision: {:.2f}'.format(precision_score(y_test, y_pred)))
print('Recall: {:.2f}'.format(recall_score(y_test, y_pred)))
print('f1_score: {:.2f}'.format(f1_score(y_test, y_pred)))</pre></div>



<h5 class="wp-block-heading" id="h-accuracy"><strong>Accuracy</strong></h5>



<p class="wp-block-paragraph">The accuracy on the test set shows that 88% of the online shopper sessions were correctly classified. However, our data is imbalanced. That is to say, most labels have the value &#8220;False,&#8221; and only a few target labels are &#8220;True.&#8221; Consequently, we must ensure that our model does not simply classify all online shoppers as &#8220;non-buyers&#8221; (label: False) but also correctly predicts the buyers (label: True). </p>



<h5 class="wp-block-heading" id="h-precision"><strong>Precision</strong></h5>



<p class="wp-block-paragraph">We calculate the precision as the number of True Positives divided by the sum of True Positives and False Positives. Precision ignores the False negatives, so it does not reveal how many buyers the model missed. On its own, it therefore does not say much about our model. The precision score for our model is 83%, just a little lower than the accuracy.</p>



<h5 class="wp-block-heading" id="h-recall"><strong>Recall</strong></h5>



<p class="wp-block-paragraph">We calculate the Recall&nbsp;by dividing the number of True Positives by the sum of the True Positives and the False Negatives. The Recall of our model is 27%, which is significantly below accuracy and precision. In our case, the Recall is more meaningful than accuracy and precision because it penalizes the low number of True positives.</p>



<h5 class="wp-block-heading" id="h-f1-score"><strong>F1-Score</strong></h5>



<p class="wp-block-paragraph">The formula for the F1-Score is 2*((precision*recall)/(precision+recall)). Because the formula includes the Recall, the F1-Score of our model is only 41%. If we want to optimize our classification model further, we should keep an eye on both F1-Score and Recall.</p>
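
<p class="wp-block-paragraph">As a cross-check, the four metrics can be recomputed by hand from the counts in the confusion matrix above. The short sketch below only uses the four numbers reported there; the variable names are chosen for illustration.</p>

```python
# Counts read from the confusion matrix above
tn, fp, fn, tp = 3102, 30, 412, 151

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * (precision * recall) / (precision + recall)

print('Accuracy: {:.2f}'.format(accuracy))    # 0.88
print('Precision: {:.2f}'.format(precision))  # 0.83
print('Recall: {:.2f}'.format(recall))        # 0.27
print('f1_score: {:.2f}'.format(f1))          # 0.41
```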



<h4 class="wp-block-heading" id="h-6-3-interpretation">6.3 Interpretation</h4>



<p class="wp-block-paragraph">Metrics for classification models can be misleading. We should thus choose them carefully. Depending on which use case we are dealing with, False-negative and False-positive predictions can have different costs. Therefore, model evaluation is not always about exactness (precision and accuracy). Instead, the choice of performance metrics depends on what we want to achieve.</p>



<p class="wp-block-paragraph">The challenge for our model is to correctly classify the smaller group of buyers (True positives). So, optimizing our model is about striking a balance: keeping the accuracy high without sacrificing F1-Score and Recall.</p>
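
<p class="wp-block-paragraph">One common way to trade precision against Recall, which this tutorial does not pursue further, is to lower the probability threshold above which a session counts as a purchase. The sketch below is an illustration under assumptions: it uses synthetic imbalanced data instead of the shopping dataset, and the threshold of 0.3 is an arbitrary example value.</p>

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data (about 85% negatives), a stand-in for the real sessions
X_imb, y_imb = make_classification(n_samples=2000, n_features=8, weights=[0.85],
                                   random_state=0)
Xi_train, Xi_test, yi_train, yi_test = train_test_split(
    X_imb, y_imb, train_size=0.7, random_state=0, stratify=y_imb)

clf = LogisticRegression(solver='lbfgs', max_iter=1000).fit(Xi_train, yi_train)
proba_buy = clf.predict_proba(Xi_test)[:, 1]  # estimated probability of the positive class

results = {}
for threshold in (0.5, 0.3):  # 0.5 mirrors the default cut-off; 0.3 is an arbitrary example
    y_hat = (proba_buy >= threshold).astype(int)
    results[threshold] = (precision_score(yi_test, y_hat, zero_division=0),
                          recall_score(yi_test, y_hat))
    print(threshold, results[threshold])
```

<p class="wp-block-paragraph">Lowering the threshold can only add predicted buyers, so the Recall never decreases, while the precision usually drops. Which trade-off is acceptable depends on the cost of False negatives and False positives in the use case.</p>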



<h3 class="wp-block-heading" id="h-step-7-insights-on-customer-purchase-intentions">Step #7 Insights on Customer Purchase Intentions</h3>



<p class="wp-block-paragraph">Finally, we will use permutation feature importance to gain additional insights into our prediction model&#8217;s features. Permutation feature importance is a technique that measures the influence of each feature on the predictions of our model: it randomly shuffles the values of one feature at a time and records how much the model&#8217;s score drops. Features with a high score substantially impact the prediction of the label. In contrast, features with scores close to zero play a lesser role in the predictions.</p>
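
<p class="wp-block-paragraph">The mechanics behind the technique can be sketched by hand. The following example is a simplified illustration, not the scikit-learn implementation: it uses synthetic data in which only the first two columns carry signal, and for brevity it evaluates the model on the same data it was trained on.</p>

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic data: with shuffle=False the informative features come first,
# so columns 0 and 1 carry signal and columns 2 to 4 are pure noise
X_syn, y_syn = make_classification(n_samples=600, n_features=5, n_informative=2,
                                   n_redundant=0, n_clusters_per_class=1,
                                   shuffle=False, random_state=0)
model_syn = LogisticRegression(solver='lbfgs').fit(X_syn, y_syn)
baseline = model_syn.score(X_syn, y_syn)  # accuracy before shuffling anything

rng = np.random.default_rng(0)
n_repeats = 10
importances = []
for col in range(X_syn.shape[1]):
    drops = []
    for _ in range(n_repeats):
        X_perm = X_syn.copy()
        X_perm[:, col] = rng.permutation(X_perm[:, col])  # break the link to the label
        drops.append(baseline - model_syn.score(X_perm, y_syn))
    importances.append(np.mean(drops))  # mean drop in accuracy for this feature

print(np.round(importances, 3))
```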



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Compute the permutation feature importance of the trained model on the test set
r = permutation_importance(logreg, X_test, y_test, n_repeats=30, random_state=0)

# Plot the barchart
data_im = pd.DataFrame(r.importances_mean, columns=['feature_permuation_score'])
data_im['feature_names'] = X.columns
data_im = data_im.sort_values('feature_permuation_score', ascending=False)

fig, ax = plt.subplots(figsize=(16, 5))
sns.barplot(y=data_im['feature_names'], x=&quot;feature_permuation_score&quot;, data=data_im, palette='nipy_spectral')
ax.set_title(&quot;Logistic Regression Feature Importances&quot;)</pre></div>



<figure class="wp-block-image size-large"><img decoding="async" width="1024" height="326" data-attachment-id="4684" data-permalink="https://www.relataly.com/predicting-the-purchase-intention-of-online-shoppers/982/image-56-3/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2021/06/image-56.png" data-orig-size="1050,334" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-56" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2021/06/image-56.png" src="https://www.relataly.com/wp-content/uploads/2021/06/image-56-1024x326.png" alt="online purchase intention prediction - results of the feature permutation importance technique" class="wp-image-4684" srcset="https://www.relataly.com/wp-content/uploads/2021/06/image-56.png 1024w, https://www.relataly.com/wp-content/uploads/2021/06/image-56.png 300w, https://www.relataly.com/wp-content/uploads/2021/06/image-56.png 768w, https://www.relataly.com/wp-content/uploads/2021/06/image-56.png 1050w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<p class="wp-block-paragraph">We can see that the three features with the highest impact are PageValues, BounceRates, and Administrative_Duration. </p>



<ul class="wp-block-list">
<li>The higher the page&#8217;s value, the higher the customer&#8217;s chance to make a purchase. </li>



<li>The higher the average bounce rate of the pages that the customer visits, the higher the chance the customer makes a purchase.</li>



<li>In contrast, the more time a customer spends on administrative settings, the lower the chance the customer completes the purchase.</li>
</ul>
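
<p class="wp-block-paragraph">Permutation scores express how strongly a feature influences the predictions, but not in which direction. For a logistic regression model, one way to check the direction is to look at the sign of the fitted coefficients. The sketch below demonstrates this on made-up data; the column names &#8220;page_value&#8221; and &#8220;admin_time&#8221; are hypothetical and not part of the shopping dataset.</p>

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Made-up sessions: purchases become more likely with a higher 'page_value'
# and less likely with a higher 'admin_time' (both columns are hypothetical)
rng = np.random.default_rng(0)
demo = pd.DataFrame({'page_value': rng.uniform(0, 1, 1000),
                     'admin_time': rng.uniform(0, 1, 1000)})
logit = 4 * demo['page_value'] - 4 * demo['admin_time']
prob = 1 / (1 + np.exp(-logit))
bought = (prob > rng.uniform(0, 1, 1000)).astype(int)

coef_model = LogisticRegression(solver='lbfgs').fit(demo, bought)
coefs = pd.Series(coef_model.coef_[0], index=demo.columns)

# Positive sign: raises the purchase odds; negative sign: lowers them
print(coefs)
```

<p class="wp-block-paragraph">Applied to the model from this tutorial, the same idea would mean pairing the coefficients of the trained classifier with the feature names, keeping in mind that the features were scaled before training.</p>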



<p class="wp-block-paragraph">These were just a few sample findings. There is much more to explore in the data, and deeper analysis can uncover much more about the customers&#8217; buying decisions.</p>



<h2 class="wp-block-heading" id="h-summary">Summary</h2>



<p class="wp-block-paragraph">This article has presented customer purchase prediction as an interesting use case for machine learning in e-commerce. After discussing the use case, we have developed a classification model that predicts the purchase intentions of online shoppers. You have learned to preprocess the data, train a logistic regression model and evaluate the model&#8217;s performance. Classifying purchase intentions can help online shops understand their customers better and automate certain online marketing activities. The previous section showed how marketers could use this to gain further insights into their customers&#8217; behavior.</p>



<p class="wp-block-paragraph">Thanks for reading and if you have any questions, let me know in the comments. </p>



<h2 class="wp-block-heading">Sources and Further Reading</h2>



<p class="wp-block-paragraph">I hope this article was helpful. If you have any remarks or questions, please write them in the comments. </p>



<p>The post <a href="https://www.relataly.com/predicting-the-purchase-intention-of-online-shoppers/982/">Classifying Purchase Intention of Online Shoppers with Python</a> appeared first on <a href="https://www.relataly.com">relataly.com</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.relataly.com/predicting-the-purchase-intention-of-online-shoppers/982/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">982</post-id>	</item>
		<item>
		<title>Geographic Heat Maps with GeoPandas: Visualizing COVID-19 Data in Python</title>
		<link>https://www.relataly.com/visualize-covid-19-data-on-a-geographic-heat-maps/291/</link>
					<comments>https://www.relataly.com/visualize-covid-19-data-on-a-geographic-heat-maps/291/#comments</comments>
		
		<dc:creator><![CDATA[Florian Follonier]]></dc:creator>
		<pubDate>Wed, 08 Apr 2020 22:03:00 +0000</pubDate>
				<category><![CDATA[Data Science]]></category>
		<category><![CDATA[Data Sources]]></category>
		<category><![CDATA[Data Visualization]]></category>
		<category><![CDATA[Geo Heat Maps]]></category>
		<category><![CDATA[GeoPandas]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Covid-19 Analytics]]></category>
		<category><![CDATA[Geographic Maps]]></category>
		<category><![CDATA[Heat Map]]></category>
		<guid isPermaLink="false">https://www.relataly.com/?p=291</guid>

					<description><![CDATA[<p>The spreading of COVID-19 has led to an increased interest in displaying region and country-specific information on geographic heat maps. Geographic heat maps use color shadings to visualize data that includes a spatial component and refers, for example, to countries, cities, towns, mountains, etc. The color shades are defined in a color palette and determined ... <a title="Geographic Heat Maps with GeoPandas: Visualizing COVID-19 Data in Python" class="read-more" href="https://www.relataly.com/visualize-covid-19-data-on-a-geographic-heat-maps/291/" aria-label="Read more about Geographic Heat Maps with GeoPandas: Visualizing COVID-19 Data in Python">Read more</a></p>
<p>The post <a href="https://www.relataly.com/visualize-covid-19-data-on-a-geographic-heat-maps/291/">Geographic Heat Maps with GeoPandas: Visualizing COVID-19 Data in Python</a> appeared first on <a href="https://www.relataly.com">relataly.com</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">The spread of COVID-19 has led to an increased interest in displaying region and country-specific information on geographic heat maps. Geographic heat maps use color shadings to visualize data that includes a spatial component and refers, for example, to countries, cities, towns, mountains, etc. The color shades are defined in a color palette and determined by numerical values on a scale. In this way, geographic heat maps give the viewer a quick overview of what is happening in different regions. This tutorial shows how to create geographic heat maps in Python using the GeoPandas library. We will work with COVID-19 data and visualize it using various color-coded maps.</p>



<p class="wp-block-paragraph">The rest of this article proceeds as follows: We begin by going through the steps to visualize COVID-19 data on a geographic heat map. We will be using the GeoPandas library to plot the maps. GeoPandas is an open-source project for working with geospatial data in Python. Our heat map will use color shades to visualize growth rates and total cases of COVID-19 in different countries. In addition, we will zoom in on specific map regions. </p>



<p class="wp-block-paragraph">Also: <a href="https://www.relataly.com/predicting-crimes-in-san-francisco-creatingsf-crime-map-using-xgboost/2960/" target="_blank" rel="noreferrer noopener">Predictive Policing: Preventing Crime in San Francisco using XGBoost</a></p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%"></div>
</div>



<h2 class="wp-block-heading">What are Geographic Heat Maps?</h2>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">Geographic heat maps are visual representations of data that use color or other visual encodings to show the density or intensity of data points in a geographic region. They are commonly used to represent data that is associated with a geographic location, such as population data, economic data, or weather data.</p>



<p class="wp-block-paragraph">Geographic heat maps are typically created by overlaying a grid or mesh on a map and then assigning a color or other visual encoding to each grid cell based on the density or intensity of data points in that cell. The resulting heat map shows the distribution or pattern of data points across the geographic region and can provide valuable insights and information about the data.</p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%">
<figure class="wp-block-image"><img decoding="async" src="https://www.relataly.com/wp-content/uploads/2020/04/image-41-1024x443.png" alt="Geographic heat map showing COVID-19 growth rates in different countries of the world. In this Python tutorial we will create similar maps."/><figcaption class="wp-element-caption">Geographic heat map showing COVID-19 growth rates in different countries of the world. In this Python tutorial we will create similar maps.</figcaption></figure>
</div>
</div>



<p class="wp-block-paragraph">Also: <a href="https://www.relataly.com/cryptocurrency-price-charts-with-color-overlay-python/2820/" target="_blank" rel="noreferrer noopener">Color-Coded Cryptocurrency Price Charts in Python</a></p>



<h2 class="wp-block-heading">What are Geographic Heat Maps used for?</h2>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">Geographic heat maps are used for a variety of purposes, such as:</p>



<ul class="wp-block-list">
<li><strong>Visualizing data:</strong> geographic heat maps can provide a clear and intuitive way to visualize data that is associated with a geographic location, allowing analysts and users to quickly and easily understand the data and identify patterns, trends, and relationships.</li>



<li><strong>Identifying spatial patterns: </strong>geographic heat maps can help to identify spatial patterns or trends in the data, such as clusters, outliers, or trends over time. This can provide valuable insights and information about the data and can help to inform decision-making and analysis.</li>



<li><strong>Analyzing and comparing data:</strong> geographic heat maps can be used to compare and contrast different datasets or to analyze the relationship between different variables or data sources. This can help to identify correlations, trends, or patterns that may not be immediately apparent from the raw data.</li>
</ul>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%"></div>
</div>



<h2 class="wp-block-heading">What are the Potential Pitfalls of Using Geographic Heat Maps?</h2>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">While geographic heat maps are useful for all kinds of purposes, there are a few potential limitations and pitfalls to consider when using heat maps:</p>



<ol class="wp-block-list">
<li>Choosing an appropriate color scale: It&#8217;s essential to choose a color scale that accurately reflects the data being represented and is easy for viewers to interpret. If the color scale is not well-suited to the data, it can be difficult for viewers to understand the patterns being shown accurately.</li>



<li>Overloading the map with too much data: It&#8217;s possible to add too much data to a heat map, which can make it difficult to interpret and potentially obscure important patterns. It&#8217;s important to balance the need for detail with the need for clarity when creating a heat map.</li>



<li>Visual distortion: When working with large or irregularly shaped regions, it can be challenging to depict the data using a heat map accurately. This can lead to visual distortion, where the map does not accurately reflect the actual distribution of the data.</li>



<li>Misinterpretation of the data: Heat maps are a visual representation of data, which can be subject to misinterpretation. It&#8217;s important to carefully consider how the data is represented and provide clear context and explanations for the presented patterns.</li>
</ol>
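
<p class="wp-block-paragraph">The first pitfall can be made concrete with a small sketch. It compares where values land on a linear and on a logarithmic color scale, using matplotlib&#8217;s normalization classes. The case counts are hypothetical and only chosen to span several orders of magnitude.</p>

```python
import numpy as np
from matplotlib.colors import LogNorm, Normalize

# Hypothetical case counts spanning several orders of magnitude
cases = np.array([10, 100, 1_000, 10_000, 1_000_000])

linear = Normalize(vmin=cases.min(), vmax=cases.max())
logarithmic = LogNorm(vmin=cases.min(), vmax=cases.max())

# Position of each value on the color scale (0 = low end, 1 = high end of the colormap)
linear_pos = np.asarray(linear(cases), dtype=float)
log_pos = np.asarray(logarithmic(cases), dtype=float)

print(np.round(linear_pos, 3))  # all but the largest value are squeezed near 0
print(np.round(log_pos, 3))     # values are spread across the scale
```

<p class="wp-block-paragraph">On the linear scale, a single very large value pushes almost all other regions into the same shade. A logarithmic scale is one possible remedy for such skewed data, as long as the legend makes the scale clear to the viewer.</p>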



<p class="wp-block-paragraph">Let&#8217;s keep these potential pitfalls in mind during the following tutorial.</p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%"></div>
</div>



<h2 class="wp-block-heading">What is GeoPandas?</h2>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">GeoPandas is a Python package that provides tools for working with geospatial data. It extends the popular pandas package, which provides data manipulation and analysis tools, to include support for geographic data. GeoPandas allows users to manipulate and analyze geospatial data in a familiar pandas DataFrame structure and includes functions for reading and writing spatial data in various formats, as well as tools for visualizing and mapping data. </p>



<p class="wp-block-paragraph">GeoPandas is built on top of other popular packages, such as Shapely and Fiona, and is a popular choice for working with geospatial data in Python.</p>



<p class="wp-block-paragraph">With GeoPandas, users can:</p>



<ol class="wp-block-list">
<li>Read and write spatial data in various formats, such as Shapefile, GeoJSON, and GeoPackage.</li>



<li>Perform geometric operations on spatial data, such as buffering, intersection, and union.</li>



<li>Create maps and visualize spatial data using matplotlib, a popular Python plotting library.</li>



<li>Analyze and manipulate spatial data in a pandas DataFrame structure, allowing users to use the powerful data manipulation and analysis tools provided by pandas.</li>
</ol>
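
<p class="wp-block-paragraph">The capabilities listed above can be sketched in a few lines. The example below is a minimal illustration on toy data: the three square &#8220;regions&#8221; and their case counts are made up, and no real map data is loaded.</p>

```python
import geopandas as gpd
import matplotlib
matplotlib.use('Agg')  # render off-screen so the sketch also runs without a display
from shapely.geometry import Polygon

# Three made-up square "regions" with a hypothetical case count each
regions = gpd.GeoDataFrame(
    {'name': ['A', 'B', 'C'], 'cases': [120, 45, 300]},
    geometry=[Polygon([(x, 0), (x + 1, 0), (x + 1, 1), (x, 1)]) for x in (0, 1, 2)],
)

# Geometric operations work on the whole geometry column at once
regions['area'] = regions.geometry.area  # each unit square has an area of 1.0
buffered = regions.geometry.buffer(0.5)  # enlarge every region by 0.5 units

# pandas-style filtering works as usual
hotspots = regions[regions['cases'] > 100]

# Color each region by its 'cases' value, which yields a simple heat map
ax = regions.plot(column='cases', cmap='OrRd', legend=True, edgecolor='black')
ax.set_title('Cases per region (toy data)')
```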
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%"></div>
</div>



<h2 class="wp-block-heading">Creating Geographic Heat Maps with Python and GeoPandas</h2>



<p class="wp-block-paragraph">In this tutorial, we will learn how to create geographic heat maps using Python and the GeoPandas package. Geographic heat maps are visualizations that show the intensity of data at different locations on a map. They are commonly used to represent the distribution of a variable across a geographic area and can be useful for identifying patterns, trends, and anomalies in the data. In the following, we will walk through the steps of loading, manipulating, and visualizing spatial data with GeoPandas and demonstrate how to create geographic heat maps for COVID-19 data.</p>



<p class="wp-block-paragraph">The code is available on the GitHub repository.</p>



<div class="wp-block-kadence-advancedbtn kb-buttons-wrap kb-btns_63bafd-55"><a class="kb-button kt-button button kb-btn_753aa7-f4 kt-btn-size-standard kt-btn-width-type-full kb-btn-global-inherit kt-btn-has-text-true kt-btn-has-svg-true wp-block-button__link wp-block-kadence-singlebtn" href="https://github.com/flo7up/relataly-public-python-tutorials/blob/master/00%20Data%20Visualization/070%20Geographic%20Heatmaps%20using%20Python.ipynb" target="_blank" rel="noreferrer noopener"><span class="kb-svg-icon-wrap kb-svg-icon-fe_eye kt-btn-icon-side-left"><svg viewBox="0 0 24 24"  fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"  aria-hidden="true"><path d="M1 12s4-8 11-8 11 8 11 8-4 8-11 8-11-8-11-8z"/><circle cx="12" cy="12" r="3"/></svg></span><span class="kt-btn-inner-text">View on GitHub </span></a>

<a class="kb-button kt-button button kb-btn_5b88f4-b4 kt-btn-size-standard kt-btn-width-type-full kb-btn-global-inherit kt-btn-has-text-true kt-btn-has-svg-true wp-block-button__link wp-block-kadence-singlebtn" href="https://github.com/flo7up/relataly-public-python-API-tutorials" target="_blank" rel="noreferrer noopener"><span class="kb-svg-icon-wrap kb-svg-icon-fa_github kt-btn-icon-side-left"><svg viewBox="0 0 496 512"  fill="currentColor" xmlns="http://www.w3.org/2000/svg"  aria-hidden="true"><path d="M165.9 397.4c0 2-2.3 3.6-5.2 3.6-3.3.3-5.6-1.3-5.6-3.6 0-2 2.3-3.6 5.2-3.6 3-.3 5.6 1.3 5.6 3.6zm-31.1-4.5c-.7 2 1.3 4.3 4.3 4.9 2.6 1 5.6 0 6.2-2s-1.3-4.3-4.3-5.2c-2.6-.7-5.5.3-6.2 2.3zm44.2-1.7c-2.9.7-4.9 2.6-4.6 4.9.3 2 2.9 3.3 5.9 2.6 2.9-.7 4.9-2.6 4.6-4.6-.3-1.9-3-3.2-5.9-2.9zM244.8 8C106.1 8 0 113.3 0 252c0 110.9 69.8 205.8 169.5 239.2 12.8 2.3 17.3-5.6 17.3-12.1 0-6.2-.3-40.4-.3-61.4 0 0-70 15-84.7-29.8 0 0-11.4-29.1-27.8-36.6 0 0-22.9-15.7 1.6-15.4 0 0 24.9 2 38.6 25.8 21.9 38.6 58.6 27.5 72.9 20.9 2.3-16 8.8-27.1 16-33.7-55.9-6.2-112.3-14.3-112.3-110.5 0-27.5 7.6-41.3 23.6-58.9-2.6-6.5-11.1-33.3 2.6-67.9 20.9-6.5 69 27 69 27 20-5.6 41.5-8.5 62.8-8.5s42.8 2.9 62.8 8.5c0 0 48.1-33.6 69-27 13.7 34.7 5.2 61.4 2.6 67.9 16 17.7 25.8 31.5 25.8 58.9 0 96.5-58.9 104.2-114.8 110.5 9.2 7.9 17 22.9 17 46.4 0 33.7-.3 75.4-.3 83.6 0 6.5 4.6 14.4 17.3 12.1C428.2 457.8 496 362.9 496 252 496 113.3 383.5 8 244.8 8zM97.2 352.9c-1.3 1-1 3.3.7 5.2 1.6 1.6 3.9 2.3 5.2 1 1.3-1 1-3.3-.7-5.2-1.6-1.6-3.9-2.3-5.2-1zm-10.8-8.1c-.7 1.3.3 2.9 2.3 3.9 1.6 1 3.6.7 4.3-.7.7-1.3-.3-2.9-2.3-3.9-2-.6-3.6-.3-4.3.7zm32.4 35.6c-1.6 1.3-1 4.3 1.3 6.2 2.3 2.3 5.2 2.6 6.5 1 1.3-1.3.7-4.3-1.3-6.2-2.2-2.3-5.2-2.6-6.5-1zm-11.4-14.7c-1.6 1-1.6 3.6 0 5.9 1.6 2.3 4.3 3.3 5.6 2.3 1.6-1.3 1.6-3.9 0-6.2-1.4-2.3-4-3.3-5.6-2z"/></svg></span><span class="kt-btn-inner-text">Relataly GitHub Repo </span></a></div>



<h3 class="wp-block-heading" id="h-prerequisites">Prerequisites</h3>



<p class="wp-block-paragraph">Before starting the coding part, make sure that you have set up your <a href="https://www.python.org/downloads/" target="_blank" rel="noreferrer noopener">Python 3</a> environment and required packages. If you don&#8217;t have an environment set up yet, you can follow&nbsp;the steps in <a href="https://www.relataly.com/anaconda-python-environment-machine-learning/1663/" target="_blank" rel="noreferrer noopener">this tutorial</a>&nbsp;to set up the&nbsp;<a href="https://www.anaconda.com/products/individual" target="_blank" rel="noreferrer noopener">Anaconda environment</a>.</p>



<p class="wp-block-paragraph">Also, make sure you install all required packages. We will be working with the following standard packages:&nbsp;</p>



<ul class="wp-block-list">
<li><em><a href="https://pandas.pydata.org/" target="_blank" rel="noreferrer noopener">pandas</a></em></li>



<li><em><a href="https://numpy.org/" target="_blank" rel="noreferrer noopener">NumPy</a></em></li>



<li><a href="https://docs.python.org/3/library/math.html" target="_blank" rel="noreferrer noopener">math</a></li>



<li><em><a href="https://matplotlib.org/" target="_blank" rel="noreferrer noopener">matplotlib</a></em></li>
</ul>



<p class="wp-block-paragraph">You can install packages using console commands:</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:false,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;null&quot;,&quot;mime&quot;:&quot;text/plain&quot;,&quot;theme&quot;:&quot;3024-day&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Plain Text&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;text&quot;}">pip install &lt;package name&gt;
conda install &lt;package name&gt; (if you are using the Anaconda package manager)</pre></div>



<p class="wp-block-paragraph">We will create geographic heat maps with the GeoPandas Python library. You can install GeoPandas from the console with one of the following commands:</p>



<ul class="wp-block-list">
<li><code>conda install --channel conda-forge geopandas</code></li>



<li><code>pip install geopandas</code></li>
</ul>



<p class="wp-block-paragraph">Update (2020-09-23): With the release of Python 3.8, there is a new <a href="https://geopandas.org/install.html" target="_blank" rel="noreferrer noopener">install procedure</a>:</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:false,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;null&quot;,&quot;mime&quot;:&quot;text/plain&quot;,&quot;theme&quot;:&quot;3024-day&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Plain Text&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;text&quot;}">conda create -n geo_env
conda activate geo_env
conda config --env --add channels conda-forge
conda config --env --set channel_priority strict
conda install python=3 geopandas</pre></div>



<h3 class="wp-block-heading" id="h-download-the-geographic-map-data-from-naturalearthdata">Download the Geographic Map Data From Naturalearthdata</h3>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">First, we will get the map with the geospatial data. Rendering maps with GeoPandas requires geometry data, which we load from a shapefile. A shapefile is a vector data format that stores geometries (points, lines, or polygons) together with a table of attributes; GeoPandas reads it into a GeoDataFrame. Shapefiles are available for cities, countries, continents, or the entire world. In our case, the shapefile contains a list of countries, each with its outline stored as polygons. The example presented in this tutorial will use a world map.</p>



<p class="wp-block-paragraph">Various sources on the web provide shapefiles for different geographical regions and in varying detail. For example, <a href="https://www.naturalearthdata.com/downloads/10m-cultural-vectors/" target="_blank" rel="noreferrer noopener">naturalearthdata.com</a> provides a map of the world. To download the map, go to the naturalearthdata webpage and click the green download button (version 4.1.0 at the time of writing).</p>



<p class="wp-block-paragraph">Once the download is complete, unpack the files into the folder of your Python notebook or a subfolder in the folder of your Python notebook (e.g., data/shapefiles/worldmap/).</p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%">
<figure class="wp-block-image size-large is-resized"><img decoding="async" data-attachment-id="295" data-permalink="https://www.relataly.com/visualize-covid-19-data-on-a-geographic-heat-maps/291/image-14/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2020/04/image-14.png" data-orig-size="616,212" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-14" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2020/04/image-14.png" src="https://www.relataly.com/wp-content/uploads/2020/04/image-14.png" alt="naturalearthdata.com geographic shapefiles" class="wp-image-295" width="295" height="102" srcset="https://www.relataly.com/wp-content/uploads/2020/04/image-14.png 616w, https://www.relataly.com/wp-content/uploads/2020/04/image-14.png 300w" sizes="(max-width: 295px) 100vw, 295px" /><figcaption class="wp-element-caption"><a href="https://www.naturalearthdata.com/downloads/10m-cultural-vectors/" target="_blank" rel="noreferrer noopener">naturalearthdata.com</a></figcaption></figure>
</div>
</div>
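<p class="wp-block-paragraph">A shapefile is not a single file but a set of sibling files that share one base name. Before reading the map, it can help to confirm that the unpacked folder is complete. The following is a minimal sketch using only the standard library; the helper name <code>missing_shapefile_parts</code> is our own, and the path is the one used later in this tutorial, so adjust it to your folder layout.</p>

```python
from pathlib import Path

# A shapefile consists of several sibling files; these three are mandatory
REQUIRED_SUFFIXES = ('.shp', '.shx', '.dbf')


def missing_shapefile_parts(shp_path):
    """Return the names of required sibling files that do not exist."""
    base = Path(shp_path)
    return [base.with_suffix(suffix).name
            for suffix in REQUIRED_SUFFIXES
            if not base.with_suffix(suffix).exists()]


# Path assumed from the tutorial; change it if you unpacked the files elsewhere
SHAPEFILE = 'data/shapefiles/worldmap/ne_10m_admin_0_countries.shp'
missing = missing_shapefile_parts(SHAPEFILE)
print('Missing files:', missing if missing else 'none')
```

<p class="wp-block-paragraph">If the list is not empty, unpack the downloaded archive again and make sure that all extracted files end up in the same folder.</p>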



<h3 class="wp-block-heading" id="h-step-1-loading-the-covid-19-data">Step #1 Loading the COVID-19 Data </h3>



<p class="wp-block-paragraph">Next, we retrieve the COVID-19 data for all countries via the statworx API. If you want to learn more about using REST APIs, check out this tutorial <a href="https://www.relataly.com/access-data-sources-using-apis-in-python/278/" target="_blank" rel="noreferrer noopener">on accessing data sources via REST APIs</a>.</p>



<p class="wp-block-paragraph">Also: <a href="https://www.relataly.com/access-remote-data-sources-using-rest-apis-in-python/278/" target="_blank" rel="noreferrer noopener">Accessing Remote Data Sources via REST APIs in Python</a></p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Setting up Packages
import json
import country_converter as coco
from datetime import datetime, timedelta
import requests
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt
# Getting the data
PAYLOAD = {'code': 'ALL'}
URL = 'https://api.statworx.com/covid'
RESPONSE = requests.post(url=URL, data=json.dumps(PAYLOAD))
# Convert the response to a data frame
covid_df = pd.DataFrame.from_dict(json.loads(RESPONSE.text))
covid_df.head(3)</pre></div>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:false,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;null&quot;,&quot;mime&quot;:&quot;text/plain&quot;,&quot;theme&quot;:&quot;3024-day&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Plain Text&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;text&quot;}">      date		day	month	year	cases	deaths	country		code	population	continent	cases_cum	deaths_cum
0	2019-12-31	31	12		2019	0		0		Afghanistan	AF		38041757.0	Asia		0			0
1	2020-01-01	1	1		2020	0		0		Afghanistan	AF		38041757.0	Asia		0			0
2	2020-01-02	2	1		2020	0		0		Afghanistan	AF		38041757.0	Asia		0			0</pre></div>



<p class="wp-block-paragraph">We continue by preparing the COVID-19 data for visualization on a heat map.</p>
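<p class="wp-block-paragraph">Because the later steps depend on a handful of columns, it can be useful to validate the response right after loading it. The sketch below assumes that the payload is column-oriented (one list of values per column), which is what the output above suggests. The helper <code>to_covid_frame</code> and the sample values are hypothetical and only illustrate the idea.</p>

```python
import pandas as pd

# Columns that the rest of the tutorial relies on (taken from the output above)
REQUIRED_COLUMNS = ['date', 'cases', 'cases_cum', 'code', 'country']


def to_covid_frame(payload):
    """Build a DataFrame from a column-oriented payload and check its columns."""
    frame = pd.DataFrame.from_dict(payload)
    missing = [col for col in REQUIRED_COLUMNS if col not in frame.columns]
    if missing:
        raise ValueError(f'Payload is missing columns: {missing}')
    return frame


# Hypothetical two-row sample in the assumed shape of the API response
sample = {
    'date': ['2020-06-27', '2020-06-28'],
    'cases': [1198, 1385],
    'cases_cum': [51427, 52812],
    'code': ['ID', 'ID'],
    'country': ['Indonesia', 'Indonesia'],
}
sample_df = to_covid_frame(sample)
print(sample_df.shape)
```

<p class="wp-block-paragraph">With the real data, you would pass <code>json.loads(RESPONSE.text)</code> to the helper instead of the sample dictionary.</p>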



<h3 class="wp-block-heading" id="h-step-2-specifying-a-shapefile">Step #2 Specifying a Shapefile</h3>



<p class="wp-block-paragraph">Next, we use the GeoPandas library to read in the shapefile at &#8220;data/shapefiles/worldmap/ne_10m_admin_0_countries.shp&#8221;. We select the columns &#8220;ADMIN&#8221;, &#8220;ADM0_A3&#8221;, and &#8220;geometry&#8221; from the shapefile, store them in a GeoDataFrame called &#8220;geo_df&#8221;, and rename them to &#8220;country&#8221;, &#8220;country_code&#8221;, and &#8220;geometry&#8221;. Finally, we display the first three rows of the GeoDataFrame.</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Setting the path to the shapefile
SHAPEFILE = 'data/shapefiles/worldmap/ne_10m_admin_0_countries.shp'
# Read shapefile using Geopandas
geo_df = gpd.read_file(SHAPEFILE)[['ADMIN', 'ADM0_A3', 'geometry']]
# Rename columns.
geo_df.columns = ['country', 'country_code', 'geometry']
geo_df.head(3)</pre></div>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:false,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;null&quot;,&quot;mime&quot;:&quot;text/plain&quot;,&quot;theme&quot;:&quot;3024-day&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Plain Text&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;text&quot;}">	country		country_code	geometry
0	Indonesia	IDN				MULTIPOLYGON (((117.70361 4.16341, 117.70361 4...
1	Malaysia	MYS				MULTIPOLYGON (((117.70361 4.16341, 117.69711 4...
2	Chile		CHL				MULTIPOLYGON (((-69.51009 -17.50659, -69.50611</pre></div>



<p class="wp-block-paragraph">We have created a dataframe with three columns, as you can see above. The column geometry contains the graphical representation of countries. Now that we have prepared the data, we can plot our first geographic map. We create the map by using the GeoPandas plot function.</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Drop row for 'Antarctica'. It takes a lot of space in the map and is not of much use
geo_df = geo_df.drop(geo_df.loc[geo_df['country'] == 'Antarctica'].index)
# Print the map
geo_df.plot(figsize=(20, 20), edgecolor='white', linewidth=1, color='lightblue')</pre></div>



<figure class="wp-block-image size-large"><img decoding="async" width="1024" height="420" data-attachment-id="8034" data-permalink="https://www.relataly.com/visualize-covid-19-data-on-a-geographic-heat-maps/291/map-of-the-world-python/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2022/04/map-of-the-world-python.png" data-orig-size="1158,475" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="map-of-the-world-python" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2022/04/map-of-the-world-python.png" src="https://www.relataly.com/wp-content/uploads/2022/04/map-of-the-world-python-1024x420.png" alt="Geographic map of the world. This is an empty schema we will use as the basis for color-coded geo heat maps." class="wp-image-8034" srcset="https://www.relataly.com/wp-content/uploads/2022/04/map-of-the-world-python.png 1024w, https://www.relataly.com/wp-content/uploads/2022/04/map-of-the-world-python.png 300w, https://www.relataly.com/wp-content/uploads/2022/04/map-of-the-world-python.png 768w, https://www.relataly.com/wp-content/uploads/2022/04/map-of-the-world-python.png 1158w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<div class="wp-block-kadence-infobox kt-info-box_729bf0-5d"><span class="kt-blocks-info-box-link-wrap info-box-link kt-blocks-info-box-media-align-left kt-info-halign-left"><div class="kt-blocks-info-box-media-container"><div class="kt-blocks-info-box-media kt-info-media-animate-none"><div class="kadence-info-box-icon-container kt-info-icon-animate-none"><div class="kadence-info-box-icon-inner-container"><span class="kb-svg-icon-wrap kb-svg-icon-fe_alertCircle kt-info-svg-icon"><svg viewBox="0 0 24 24"  fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"  aria-hidden="true"><circle cx="12" cy="12" r="10"/><line x1="12" y1="8" x2="12" y2="12"/><line x1="12" y1="16" x2="12" y2="16"/></svg></span></div></div></div></div><div class="kt-infobox-textcontent"><p class="kt-blocks-info-box-text">If you get an error: &#8220;ImportError: The Descartes package is required for plotting polygons in GeoPandas.&#8221; you first have to install the Descartes package. You can do this by typing in your console: <code>conda install descartes</code></p></div></span></div>



<h3 class="wp-block-heading" id="h-step-3-bringing-it-all-together">Step #3 Bringing It All Together</h3>



<p class="wp-block-paragraph">Next, we need to ensure that the country codes in both datasets match. The dataframe with the geospatial data of the world map contains three-letter country codes (ISO3). However, our COVID-19 data uses two-letter codes (ISO2). Luckily, the country_converter package does this job for us. We pass it the country names from the map data and let it return the matching ISO2 codes:</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Next, we need to ensure that our data matches with the country codes. 
country_names = geo_df['country'].to_list()
# Convert the country names to ISO2 codes
iso2_codes_list = coco.convert(names=country_names, to='ISO2', not_found='NULL')
# Add the list with ISO2 codes to the dataframe
geo_df['iso2_code'] = iso2_codes_list
# There are some countries for which the converter could not find a country code. 
# We will drop these countries.
geo_df = geo_df.drop(geo_df.loc[geo_df['iso2_code'] == 'NULL'].index)</pre></div>



<p class="wp-block-paragraph">We now have a GeoDataFrame with the name of each nation (country), its three-letter code (country_code), and its two-letter code (iso2_code). An additional column, geometry, holds the geographical representation of each country.</p>
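<p class="wp-block-paragraph">Before dropping rows, it is worth looking at which names the converter could not resolve, since these countries will be missing from the final map. The sketch below uses a small made-up dataframe as a stand-in for <code>geo_df</code>; the name &#8220;Atlantis&#8221; is a placeholder for any entry that ends up with the value &#8220;NULL&#8221;.</p>

```python
import pandas as pd

# Toy stand-in for geo_df after the conversion step; the rows are made up
toy_geo_df = pd.DataFrame({
    'country': ['Chile', 'Atlantis', 'Malaysia'],
    'iso2_code': ['CL', 'NULL', 'MY'],
})

# Names for which the converter returned the placeholder value
unmatched = toy_geo_df.loc[toy_geo_df['iso2_code'] == 'NULL', 'country'].to_list()
print('Unmatched:', unmatched)

# Keep only the rows that have a usable two-letter code
toy_geo_df = toy_geo_df[toy_geo_df['iso2_code'] != 'NULL'].reset_index(drop=True)
print(toy_geo_df)
```

<p class="wp-block-paragraph">If an important country shows up in the list of unmatched names, you can assign its code manually instead of dropping the row.</p>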



<h3 class="wp-block-heading" id="h-step-4-preprocessing">Step #4 Preprocessing</h3>



<p class="wp-block-paragraph">So far, our COVID-19 data contains the full history of COVID-19 cases. We want to drop the history and keep only the data from the last day. Then we merge the two data frames.</p>



<p class="wp-block-paragraph">Before we plot the heat map, we have to specify a variable that determines the color of the countries on the map. Our goal is to color the countries depending on the growth rate of COVID-19 cases per day. We calculate the growth rate as the new cases of the day (cases) divided by the cumulative cases (cases_cum).</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># We want to drop the history and only get the data from the last day
d = datetime.today()-timedelta(days=1)
date_yesterday = d.strftime(&quot;%Y-%m-%d&quot;)
# Preparing the data
covid_df = covid_df[covid_df['date'] == date_yesterday]
# Merge the two dataframes
merged_df = pd.merge(left=geo_df, right=covid_df, how='left', left_on='iso2_code', right_on='code')
# Delete some columns that we won't use
df = merged_df.drop(['day', 'month', 'year', 'country_y', 'code'], axis=1)
#Create the indicator values
df['case_growth_rate'] = round(df['cases']/df['cases_cum'], 2)
df['case_growth_rate'] = df['case_growth_rate'].fillna(0)
df.head(3)</pre></div>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:false,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;null&quot;,&quot;mime&quot;:&quot;text/plain&quot;,&quot;theme&quot;:&quot;3024-day&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Plain Text&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;text&quot;}">	country_x	country_code		geometry												iso2_code	date		cases	deaths	population	continent	cases_cum	deaths_cum	case_growth_rate
0	Indonesia	IDN					MULTIPOLYGON 	(((117.70361 4.16341, 117.70361 4...	ID			2020-06-28	1385.0	37.0	270625567.0	Asia		52812.0		2720.0		0.03
1	Malaysia	MYS					MULTIPOLYGON 	(((117.70361 4.16341, 117.69711 4...	MY			2020-06-28	10.0	0.0		31949789.0	Asia		8616.0		121.0		0.00
2	Chile		CHL					MULTIPOLYGON 	(((-69.51009 -17.50659, -69.50611...	CL			2020-06-28	4406.0	279.0	18952035.0	America		267766.0	5347.0		0.02</pre></div>
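<p class="wp-block-paragraph">To make the growth rate formula concrete, the following sketch recomputes it for the three rows shown above. The fourth row is a hypothetical country without any reported cases; it illustrates why the missing values are filled with zero, because dividing zero by zero yields NaN in pandas.</p>

```python
import pandas as pd

# First three rows use the figures from the output above; the last row is made up
toy = pd.DataFrame({
    'country': ['Indonesia', 'Malaysia', 'Chile', 'No cases yet'],
    'cases': [1385.0, 10.0, 4406.0, 0.0],
    'cases_cum': [52812.0, 8616.0, 267766.0, 0.0],
})

# Growth rate = new cases / cumulative cases, rounded to two decimals
toy['case_growth_rate'] = (toy['cases'] / toy['cases_cum']).round(2)

# 0 / 0 results in NaN, which we replace with 0
toy['case_growth_rate'] = toy['case_growth_rate'].fillna(0)
print(toy)
```

<p class="wp-block-paragraph">Note that rounding to two decimals maps small growth rates such as Malaysia&#8217;s to 0.00, so countries with very slow growth share the same color as countries without new cases.</p>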



<h3 class="wp-block-heading" id="h-step-5-creating-a-geographic-heat-map">Step #5 Creating a Geographic Heat Map</h3>



<p class="wp-block-paragraph">In the previous step, we set up the data for our map. Next, we create the geographical heat map for the world. </p>



<p class="wp-block-paragraph">We first choose the column that determines the colors (&#8216;case_growth_rate&#8217;), the colormap, and the value range for the choropleth. Then we create a figure and axes with Matplotlib, remove the axis, and plot the choropleth from the &#8216;df&#8217; dataframe, setting the edgecolor, linewidth, and cmap. We also add a title to the map and an annotation for the data source. Finally, we create a colorbar as a legend, using the ScalarMappable class, and add it to the figure at a specified position.</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Print the map
# Set the range for the choropleth
title = 'Daily COVID-19 Growth Rates'
col = 'case_growth_rate'
source = 'Source: relataly.com \nGrowth Rate = New cases / All previous cases'
vmin = df[col].min()
vmax = df[col].max()
cmap = 'viridis'
# Create figure and axes for Matplotlib
fig, ax = plt.subplots(1, figsize=(20, 8))
# Remove the axis
ax.axis('off')
df.plot(column=col, ax=ax, edgecolor='0.8', linewidth=1, cmap=cmap)
# Add a title
ax.set_title(title, fontdict={'fontsize': '25', 'fontweight': '3'})
# Create an annotation for the data source
ax.annotate(source, xy=(0.1, .08), xycoords='figure fraction', horizontalalignment='left', 
            verticalalignment='bottom', fontsize=10)
            
# Create colorbar as a legend
sm = plt.cm.ScalarMappable(norm=plt.Normalize(vmin=vmin, vmax=vmax), cmap=cmap)
# Empty array for the data range
sm.set_array([])
# Add the colorbar to the figure
cbaxes = fig.add_axes([0.15, 0.25, 0.01, 0.4])
cbar = fig.colorbar(sm, cax=cbaxes)</pre></div>



<figure class="wp-block-image size-large is-resized"><img decoding="async" data-attachment-id="363" data-permalink="https://www.relataly.com/visualize-covid-19-data-on-a-geographic-heat-maps/291/image-41/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2020/04/image-41.png" data-orig-size="1155,500" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-41" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2020/04/image-41.png" src="https://www.relataly.com/wp-content/uploads/2020/04/image-41-1024x443.png" alt="Geographic heat map showing COVID-19 growth rates in different countries of the world" class="wp-image-363" width="1097" height="474" srcset="https://www.relataly.com/wp-content/uploads/2020/04/image-41.png 1024w, https://www.relataly.com/wp-content/uploads/2020/04/image-41.png 300w, https://www.relataly.com/wp-content/uploads/2020/04/image-41.png 768w, https://www.relataly.com/wp-content/uploads/2020/04/image-41.png 1155w" sizes="(max-width: 1097px) 100vw, 1097px" /><figcaption class="wp-element-caption">Geographic heat map showing COVID-19 growth rates in different countries of the world</figcaption></figure>
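<p class="wp-block-paragraph">The colorbar works because a normalizer maps each growth rate to a position between 0 and 1, and the colormap turns this position into a color. The short sketch below shows this mapping in isolation. The value range of 0 to 0.10 is a hypothetical example, not the range of the actual data.</p>

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical range of growth rates
vmin, vmax = 0.0, 0.10
norm = plt.Normalize(vmin=vmin, vmax=vmax)
cmap = plt.get_cmap('viridis')

for rate in (0.0, 0.05, 0.10):
    position = float(norm(rate))  # 0.0 is the low end of the colormap, 1.0 the high end
    rgba = cmap(position)         # tuple of red, green, blue, and alpha values
    print(rate, round(position, 2), tuple(round(channel, 2) for channel in rgba))
```

<p class="wp-block-paragraph">The same range should be used for the map and for the colorbar. Otherwise, the colors in the legend no longer correspond to the colors of the countries.</p>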



<p class="wp-block-paragraph">As shown in the map above, countries in Central Asia and Africa currently report the highest COVID-19 growth rates. </p>



<p class="wp-block-paragraph">There are different color palettes. You can use them by altering the cmap variable. Below is a sample of ready-to-use color scales. You can find more color scales on the <a href="https://matplotlib.org/tutorials/colors/colormaps.html" target="_blank" rel="noreferrer noopener">Matplotlib page</a>.</p>



<figure class="wp-block-image size-large is-resized"><img decoding="async" data-attachment-id="311" data-permalink="https://www.relataly.com/visualize-covid-19-data-on-a-geographic-heat-maps/291/image-25/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2020/04/image-25.png" data-orig-size="627,454" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-25" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2020/04/image-25.png" src="https://www.relataly.com/wp-content/uploads/2020/04/image-25.png" alt="Color scales, useful for creating geographic heat maps" class="wp-image-311" width="314" height="227" srcset="https://www.relataly.com/wp-content/uploads/2020/04/image-25.png 627w, https://www.relataly.com/wp-content/uploads/2020/04/image-25.png 300w" sizes="(max-width: 314px) 100vw, 314px" /><figcaption class="wp-element-caption">Colormaps</figcaption></figure>
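<p class="wp-block-paragraph">If you prefer to explore the available color scales from code, Matplotlib can list the names of all registered colormaps. The sketch below prints a few of them. Every built-in colormap also has a reversed variant with the suffix <code>_r</code>, which is handy when high values should appear dark instead of bright.</p>

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Names of all registered colormaps
names = plt.colormaps()
print(len(names), 'colormaps available, for example:', names[:5])

# Reversed variants carry the suffix _r
reversed_names = [name for name in names if name.endswith('_r')]
print(len(reversed_names), 'of them are reversed variants')

# Any of these names can be assigned to the cmap variable used in this tutorial
cmap = 'viridis_r'
```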



<h3 class="wp-block-heading" id="h-step-6-zooming-in-on-specific-regions">Step #6 Zooming in on Specific Regions</h3>



<p class="wp-block-paragraph">We have observed that many African countries are currently reporting rising case numbers, so we create a new dataframe based on a filter for African countries using the list of country codes.</p>



<p class="wp-block-paragraph">We can zoom in on a continent or a country by filtering our dataframe. The code below filters the geospatial data to African countries and plots the heat map for this new dataframe. We set the title of the map to &#8216;COVID-19 Growth Rate per Day in Africa&#8217; and add a source annotation to the bottom left corner of the map.</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># The map shows that many African countries are currently reporting increasing case numbers
# Next, we create a new df based on a filter for African countries
africa_country_list = ['ZM', 'BF', 'TZ', 'EG', 'UG', 'TN', 'TG', 'SZ', 'SD', 
                       'EH', 'SS', 'ZW', 'ZA', 'SO', 'SL', 'SC', 'SN', 'ST', 
                       'SH', 'RW', 'RE', 'GW', 'NG', 'NE', 'NA', 'MZ', 'MA', 
                       'MU', 'MR', 'ML', 'MW', 'MG', 'LY', 'LR', 'LS', 'KE', 
                       'CI', 'GN', 'GH', 'GM', 'GA', 'DJ', 'ER', 'ET', 'GQ', 
                       'BJ', 'CD', 'CG', 'YT', 'KM', 'TD', 'CF', 'CV', 'CM', 
                       'BI', 'BW', 'AO', 'DZ']
africa_map_df = df[df['iso2_code'].isin(africa_country_list)]
# Plot the map for Africa
title = 'COVID-19 Growth Rate per Day in Africa'
col = 'case_growth_rate'
source = 'Source: relataly.com \nGrowth Rate = New cases / All previous cases'
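# Use the value range of the African subset so that the colorbar matches the map
vmin = africa_map_df[col].min()
vmax = africa_map_df[col].max()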
fig, ax = plt.subplots(1, figsize=(20, 9))
ax.axis('off')
africa_map_df.plot(column=col, ax=ax, edgecolor='0.8', linewidth=1, cmap=cmap)
ax.set_title(title, fontdict={'fontsize': '25', 'fontweight': '3'})
ax.annotate(source, xy=(0.24, .08), xycoords='figure fraction',
            horizontalalignment='left',
            verticalalignment='bottom', fontsize=10)
sm = plt.cm.ScalarMappable(norm=plt.Normalize(vmin=vmin, vmax=vmax), cmap=cmap)
cbaxes = fig.add_axes([0.35, 0.25, 0.01, 0.5])
cbar = fig.colorbar(sm, cax=cbaxes)</pre></div>



<figure class="wp-block-image size-full"><img decoding="async" width="664" height="557" data-attachment-id="7660" data-permalink="https://www.relataly.com/visualize-covid-19-data-on-a-geographic-heat-maps/291/output-1/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2022/04/output-1.png" data-orig-size="664,557" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="output-1" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2022/04/output-1.png" src="https://www.relataly.com/wp-content/uploads/2022/04/output-1.png" alt="Geographic map of Africa, colored by COVID-19 growth rates, created with GeoPandas in Python" class="wp-image-7660" srcset="https://www.relataly.com/wp-content/uploads/2022/04/output-1.png 664w, https://www.relataly.com/wp-content/uploads/2022/04/output-1.png 300w" sizes="(max-width: 664px) 100vw, 664px" /><figcaption class="wp-element-caption">Geographic heat map of Africa showing COVID-19 growth rates in different countries</figcaption></figure>



<p class="has-kb-palette-1-color has-text-color wp-block-paragraph">If you encounter an error with the mapclassify package, you can try reinstalling it with the following command: <code>conda install -c conda-forge mapclassify</code></p>
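<p class="wp-block-paragraph">As an alternative to a hand-written list of country codes, you could filter on the continent column that comes with the COVID-19 data. The sketch below compares both approaches on a small made-up dataframe. It assumes that the API labels African countries with the value &#8216;Africa&#8217;, which we have not verified.</p>

```python
import pandas as pd

# Toy stand-in for the merged dataframe; the last row has no COVID-19 data
toy_df = pd.DataFrame({
    'country_x': ['Egypt', 'Chile', 'Kenya', 'Western Sahara'],
    'iso2_code': ['EG', 'CL', 'KE', 'EH'],
    'continent': ['Africa', 'America', 'Africa', None],
})

# Option 1: filter on the continent label (assumed value: 'Africa')
by_continent = toy_df[toy_df['continent'] == 'Africa']

# Option 2: filter on an explicit list of ISO2 codes, as in the tutorial
by_code_list = toy_df[toy_df['iso2_code'].isin(['EG', 'KE', 'EH'])]

print(by_continent['iso2_code'].to_list())
print(by_code_list['iso2_code'].to_list())
```

<p class="wp-block-paragraph">The continent filter is shorter, but it drops countries that have no match in the COVID-19 data, because the left merge leaves their continent empty. The explicit list keeps these countries, so their outlines still appear on the map.</p>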



<p class="wp-block-paragraph">Voilà, now we only see the African continent. In terms of total case numbers, which we plot next, the African countries that currently report the highest figures are South Africa, Algeria, Morocco, Cameroon, and Egypt.</p>



<p class="wp-block-paragraph">Let&#8217;s take a look at the total cases per country in Africa: </p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Insert cases per population
# Alternative: africa_map_df2['cases_population'] = round(africa_map_df['cases_cum'] / africa_map_df['population'] * 100)
africa_map_df2 = africa_map_df.copy()
# Remove NAs
africa_map_df2['cases_cum'] = africa_map_df2['cases_cum'].fillna(0)
# Show the data
africa_map_df2.head()
# Plot the map
title = 'Total COVID-19 Cases on the African Continent'
col = 'cases_cum'
source = 'Source: relataly.com '
vmin = africa_map_df2[col].min()
vmax = africa_map_df2[col].max()
fig, ax = plt.subplots(1, figsize=(20, 9))
ax.axis('off')
africa_map_df2.plot(column=col, ax=ax, edgecolor='1', linewidth=1, cmap=cmap)
ax.set_title(title, fontdict={'fontsize': '25', 'fontweight' : '3'})
ax.annotate(
    source, xy=(0.24, .08), xycoords='figure fraction', horizontalalignment='left', 
    verticalalignment='bottom', fontsize=10)
sm = plt.cm.ScalarMappable(norm=plt.Normalize(vmin=vmin, vmax=vmax), cmap=cmap)
cbaxes = fig.add_axes([0.35, 0.25, 0.01, 0.5])
cbar = fig.colorbar(sm, cax=cbaxes)</pre></div>



<figure class="wp-block-image size-large is-resized"><img decoding="async" data-attachment-id="494" data-permalink="https://www.relataly.com/visualize-covid-19-data-on-a-geographic-heat-maps/291/image-59/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2020/04/image-59.png" data-orig-size="701,557" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-59" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2020/04/image-59.png" src="https://www.relataly.com/wp-content/uploads/2020/04/image-59.png" alt="Geographic heat map of Africa showing COVID-19 total cases in different countries, created with GeoPandas in Python" class="wp-image-494" width="553" height="440" srcset="https://www.relataly.com/wp-content/uploads/2020/04/image-59.png 701w, https://www.relataly.com/wp-content/uploads/2020/04/image-59.png 300w" sizes="(max-width: 553px) 100vw, 553px" /></figure>



<p class="wp-block-paragraph">The highest growth rate was reported by South Sudan, followed by Botswana and Niger.</p>
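<p class="wp-block-paragraph">The commented-out alternative in the plotting code above hints at normalizing the case numbers by population. Below is a minimal sketch of that idea on a stand-in DataFrame with made-up numbers. The helper add_cases_per_100 and the rounding to two decimals are assumptions; the original comment rounds to whole numbers, which would turn small percentages into 0.</p>

```python
import pandas as pd

# Stand-in data with made-up numbers (NOT real case or population figures).
# Column names follow the tutorial: 'cases_cum' and 'population'.
sample_df = pd.DataFrame({
    'name': ['Country A', 'Country B', 'Country C'],
    'cases_cum': [5000.0, None, 250.0],
    'population': [1_000_000, 2_000_000, 50_000],
})

def add_cases_per_100(df):
    """Hypothetical helper: add total cases per 100 inhabitants as a new column."""
    out = df.copy()
    # Treat missing case counts as 0 before dividing
    out['cases_cum'] = out['cases_cum'].fillna(0)
    # Share of the population with a reported case, in percent
    out['cases_population'] = (out['cases_cum'] / out['population'] * 100).round(2)
    return out

result = add_cases_per_100(sample_df)
print(result[['name', 'cases_population']])
```

<p class="wp-block-paragraph">To plot the normalized values instead of the totals, you would set col = 'cases_population' in the plotting code, so that countries with large populations no longer dominate the color scale.</p>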



<h3 class="wp-block-heading" id="h-step-7-saving-a-geo-heat-maps-to-png">Step #7 Saving a Geo-Heat Map to PNG</h3>



<p class="wp-block-paragraph">If you want to save the map, you can do this with the following command.</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Save the map to a PNG file
fig.savefig('map_export.png', dpi=300)</pre></div>
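<p class="wp-block-paragraph">If the exported image has wide empty margins or a transparent background, a few optional savefig arguments may help. The sketch below uses a placeholder figure instead of the map, and the file name map_export_tight.png is an assumption; bbox_inches and facecolor are standard matplotlib savefig parameters.</p>

```python
import os
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Placeholder figure standing in for the map created in the previous steps
fig, ax = plt.subplots(1, figsize=(4, 3))
ax.plot([0, 1], [0, 1])
ax.axis('off')

out_path = 'map_export_tight.png'  # hypothetical file name
# bbox_inches='tight' trims surrounding whitespace; facecolor sets the background color
fig.savefig(out_path, dpi=300, bbox_inches='tight', facecolor='white')
plt.close(fig)

saved_ok = os.path.exists(out_path)
print(saved_ok)
```

<p class="wp-block-paragraph">A higher dpi value produces a sharper but larger file, so 300 is a reasonable starting point for print, while a lower value is usually enough for the web.</p>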



<h2 class="wp-block-heading" id="h-summary">Summary</h2>



<p class="wp-block-paragraph">This article showed how to create geographic heat maps with the GeoPandas library in Python. We read in a shapefile, prepared the spatial data, and built a choropleth map color-coded with COVID-19 data from a DataFrame. We then filtered the DataFrame to display a specific region, in this case Africa, zoomed in on specific areas, and altered the color style using different color maps.</p>



<p class="wp-block-paragraph">Geographic heat maps can provide valuable insights into the distribution of data and help to identify patterns and trends. Creating them with GeoPandas is a powerful visualization technique that can be applied to a wide range of geographical data.</p>



<p class="wp-block-paragraph">I hope this article was helpful. If you have any questions or remarks, please write them in the comments.</p>



<p class="wp-block-paragraph">Looking for more exciting map visualizations? Consider this relataly tutorial on <a href="https://www.relataly.com/predicting-crime-types-in-san-francisco-creatingsf-crime-map-using-xgboost/2960/" target="_blank" rel="noreferrer noopener">predicting and visualizing crimes on a map of San Francisco.</a></p>



<h2 class="wp-block-heading">Sources and Further Reading</h2>



<p class="wp-block-paragraph"><a href="https://geopandas.org/en/stable/getting_started.html">https://geopandas.org/en/stable/getting_started.html</a></p>
<p>The post <a href="https://www.relataly.com/visualize-covid-19-data-on-a-geographic-heat-maps/291/">Geographic Heat Maps with GeoPandas: Visualizing COVID-19 Data in Python</a> appeared first on <a href="https://www.relataly.com">relataly.com</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.relataly.com/visualize-covid-19-data-on-a-geographic-heat-maps/291/feed/</wfw:commentRss>
			<slash:comments>3</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">291</post-id>	</item>
	</channel>
</rss>
