<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Sentiment Analysis Archives - relataly.com</title>
	<atom:link href="https://www.relataly.com/category/use-case/sentiment-analysis-use-case/feed/" rel="self" type="application/rss+xml" />
	<link>https://www.relataly.com/category/use-case/sentiment-analysis-use-case/</link>
	<description>The Business AI Blog</description>
	<lastBuildDate>Thu, 01 Jun 2023 18:39:23 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=7.0</generator>

<image>
	<url>https://www.relataly.com/wp-content/uploads/2023/04/cropped-AI-cat-Icon-White.png</url>
	<title>Sentiment Analysis Archives - relataly.com</title>
	<link>https://www.relataly.com/category/use-case/sentiment-analysis-use-case/</link>
	<width>32</width>
	<height>32</height>
</image> 
<site xmlns="com-wordpress:feed-additions:1">175977316</site>	<item>
		<title>Building &#8220;Chat with your Data&#8221; Apps using Embeddings, ChatGPT, and Cosmos DB for Mongo DB vCore</title>
		<link>https://www.relataly.com/step-by-step-guide-to-building-your-own-chatgpt-on-a-custom-knowledge-base-in-python-leveraging-mongo-db-and-embeddings/13687/</link>
					<comments>https://www.relataly.com/step-by-step-guide-to-building-your-own-chatgpt-on-a-custom-knowledge-base-in-python-leveraging-mongo-db-and-embeddings/13687/#comments</comments>
		
		<dc:creator><![CDATA[Florian Follonier]]></dc:creator>
		<pubDate>Sat, 27 May 2023 13:25:08 +0000</pubDate>
				<category><![CDATA[ChatBots]]></category>
		<category><![CDATA[Clustering]]></category>
		<category><![CDATA[Data Science]]></category>
		<category><![CDATA[Finance]]></category>
		<category><![CDATA[Generative AI]]></category>
		<category><![CDATA[Healthcare]]></category>
		<category><![CDATA[Insurance]]></category>
		<category><![CDATA[Language Generation]]></category>
		<category><![CDATA[Logistics]]></category>
		<category><![CDATA[Marketing Automation]]></category>
		<category><![CDATA[Natural Language Processing]]></category>
		<category><![CDATA[OpenAI]]></category>
		<category><![CDATA[Prompt Engineering]]></category>
		<category><![CDATA[Retail]]></category>
		<category><![CDATA[Sentiment Analysis]]></category>
		<category><![CDATA[Use Cases]]></category>
		<category><![CDATA[Vector Databases]]></category>
		<guid isPermaLink="false">https://www.relataly.com/?p=13687</guid>

					<description><![CDATA[<p>Artificial Intelligence (AI), in particular, the advent of OpenAI&#8217;s ChatGPT, has revolutionized how we interact with technology. Chatbots powered by this advanced language model can engage users in intricate, natural language conversations, marking a significant shift in AI capabilities. However, one thing that ChatGPT isn&#8217;t designed for is integrating personalized or proprietary knowledge – it&#8217;s ... <a title="Building &#8220;Chat with your Data&#8221; Apps using Embeddings, ChatGPT, and Cosmos DB for Mongo DB vCore" class="read-more" href="https://www.relataly.com/step-by-step-guide-to-building-your-own-chatgpt-on-a-custom-knowledge-base-in-python-leveraging-mongo-db-and-embeddings/13687/" aria-label="Read more about Building &#8220;Chat with your Data&#8221; Apps using Embeddings, ChatGPT, and Cosmos DB for Mongo DB vCore">Read more</a></p>
<p>The post <a href="https://www.relataly.com/step-by-step-guide-to-building-your-own-chatgpt-on-a-custom-knowledge-base-in-python-leveraging-mongo-db-and-embeddings/13687/">Building &#8220;Chat with your Data&#8221; Apps using Embeddings, ChatGPT, and Cosmos DB for Mongo DB vCore</a> appeared first on <a href="https://www.relataly.com">relataly.com</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">Artificial Intelligence (AI), in particular, the advent of OpenAI&#8217;s ChatGPT, has revolutionized how we interact with technology. Chatbots powered by this advanced language model can engage users in intricate, natural language conversations, marking a significant shift in AI capabilities. However, one thing that ChatGPT isn&#8217;t designed for is integrating personalized or proprietary knowledge – it&#8217;s built to draw upon general knowledge, not specifics about you or your organization. That&#8217;s where the concept of Retrieval Augmented Generation (RAG) comes into play. This article explores the exciting prospect of building your own ChatGPT that lets users ask questions on a custom knowledge base.</p>



<p class="wp-block-paragraph">In this tutorial, we&#8217;ll unveil the mystery behind enterprise ChatGPT, guiding you through the process of creating your very own custom ChatGPT &#8211; an AI-powered chatbot based on OpenAI&#8217;s powerful Generative Pretrained Transformers (GPT) technology. We&#8217;ll use Python and delve into the world of <a href="https://www.relataly.com/vector-databases-the-rising-star-in-generative-ai-infrastructure/13599/" target="_blank" rel="noreferrer noopener">vector databases</a>, specifically, <a href="https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/introduction" target="_blank" rel="noreferrer noopener">Mongo API for Azure Cosmos DB</a>, to show you how you can make a large knowledgebase available to ChatGPT that can go way beyond the <a href="https://help.openai.com/en/articles/4936856-what-are-tokens-and-how-to-count-them" target="_blank" rel="noreferrer noopener">typical token limitation of GPT models</a>.</p>



<p class="wp-block-paragraph">For experts, AI fans, or tech newbies, this guide simplifies building your ChatGPT. With clear instructions, useful examples, and tips, we aim to make it informative and empowering.</p>



<p class="wp-block-paragraph">We&#8217;ll explore AI, showing you how to customize your chatbot. We&#8217;ll simplify complex concepts and show you how to start your AI adventure from home or office. Ready to start this exciting journey? Keep reading!</p>



<p class="wp-block-paragraph">Also: </p>



<ul class="wp-block-list">
<li><a href="https://www.relataly.com/how-to-build-a-twitter-news-bot-with-openai-and-newsapi/13581/" target="_blank" rel="noreferrer noopener">How to Build a Twitter News Bot with OpenAI ChatGPT and NewsAPI in Python</a></li>



<li><a href="https://www.relataly.com/from-pirates-to-nobleman-simulating-conversations-between-openais-chatgpt-and-itself-using-python/13525/" target="_blank" rel="noreferrer noopener">From Pirates to Nobleman: Simulating Conversations between Various Characters using OpenAI’s ChatGPT and Python</a></li>
</ul>



<h2 class="wp-block-heading">Note on the use of Vector DBs and Costs.</h2>



<p class="wp-block-paragraph">Please note that this tutorial describes a business use case that utilizes a Cosmos DB for Mongo DB vCore hosted on the Azure cloud. </p>



<p class="wp-block-paragraph">Alternatively, you can set up an open-source vector database on your local machine, such as Milvus. Be aware that certain code adjustments will be necessary to proceed with the open-source alternative. </p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%"></div>
</div>



<h2 class="wp-block-heading" id="h-why-custom-chatgpt-is-so-powerful-and-versatile">Why Custom ChatGPT is so Powerful and Versatile</h2>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">I believe we have all tested ChatGPT, and probably like me, you have been impressed by its remarkable capabilities. However, ChatGPT has a significant limitation: it can only answer questions and perform tasks based on the public knowledge base it was trained on. </p>



<p class="wp-block-paragraph">Imagine having a chatbot based on ChatGPT that communicates effectively and truly understands the nuances of your business, sector, or even a particular topic of interest. That&#8217;s the power of a custom ChatGPT. A tailor-made chatbot allows for specialized conversations, providing the needed information and drawing from a unique database you&#8217;ve developed. </p>



<p class="wp-block-paragraph">This becomes particularly beneficial in industries with specific terminologies or when you have a large database of knowledge that you want to make easily accessible and interactive. A custom ChatGPT, with its personalized and relevant responses, ensures a better user experience, effectively saving time and increasing productivity. </p>



<p class="wp-block-paragraph">Let&#8217;s delve into how to build such a solution. Spoiler it does not work by putting all the content into the prompt. But there is a great alternative. </p>



<h2 class="wp-block-heading">Understanding the Building Blocks of Custom ChatGPT with Retrieval Augmented Generation</h2>



<p class="wp-block-paragraph">The foundational technology behind ChatGPT is OpenAI&#8217;s Generative Pre-trained Transformer models (GPT). These models understand language by predicting the next word in a sentence and are trained on a diverse range of internet text. However, the GPT models, such as the GPT-3.5, have a limitation of processing 4096 tokens at a time. A token in this context is a chunk of text which can be as small as one character or as long as one word. For example, the phrase &#8220;ChatGPT is great&#8221; is four tokens long.</p>



<p class="wp-block-paragraph">Another challenge with Foundation Models such as ChatGPT is that they are trained on large-scale datasets that were available at the time of their training. This means they are not aware of any data created after their training period. Also, because they&#8217;re trained on broad, general-domain datasets, they may be less effective for tasks requiring domain-specific knowledge.</p>



<h3 class="wp-block-heading">How Retrieval Augmented Generation (RAG) Helps </h3>



<p class="wp-block-paragraph">Retrieval-Augmented Generation (RAG) is a method that combines the strength of transformer models with external knowledge to augment their understanding and applicability. Here&#8217;s a brief explanation:</p>



<p class="wp-block-paragraph">To address this, RAG retrieves relevant information from an external data source and uses this information to augment the input to the foundation model. This can make the model&#8217;s responses more informed and relevant.</p>



<h3 class="wp-block-heading">Data Sources</h3>



<p class="wp-block-paragraph">The external data can come from various sources like databases, document repositories, or APIs. To make this data compatible with the RAG approach, both the data and user queries are converted into numerical representations (embeddings) using language models.</p>



<h3 class="wp-block-heading">Data Preparation as Embeddings</h3>



<p class="wp-block-paragraph">The embeddings, which are essentially vectors, need to be stored in a database that&#8217;s efficient at storing and searching through these high-dimensional data. This is where Azure&#8217;s Cosmos Mongo DB comes into play. It&#8217;s a vector search database specifically designed for this task.</p>



<p class="wp-block-paragraph">To circumvent the token limitation and make your extensive data available to ChatGPT, we turn the data into embeddings. These are mathematical representations of your data, converting words, sentences, or documents into vectors. The advantage of using embeddings is that they capture the semantic meaning of the text, going beyond keywords to understand the context. In essence, similar information will have similar vectors, allowing us to cluster related information together and separate them from a semantically different text.</p>



<h3 class="wp-block-heading">Storing the Data in Vector Databases</h3>



<p class="wp-block-paragraph">The embeddings, which are essentially vectors, need to be stored in a database that&#8217;s efficient at storing and searching through these high-dimensional data. This is where Azure&#8217;s Cosmos Mongo DB comes into play. It&#8217;s a vector search database specifically designed for this task.</p>



<h3 class="wp-block-heading">Matching Queries to Knowledge</h3>



<p class="wp-block-paragraph">The RAG model compares the embeddings of user queries with those in the knowledge base to identify relevant information. The user&#8217;s original query is then augmented with context from similar documents in the knowledge base.</p>



<h3 class="wp-block-heading">Input to the Foundation Model</h3>



<p class="wp-block-paragraph">This augmented input is sent to the foundation model, enhancing its understanding and response quality.</p>



<h3 class="wp-block-heading">Updates</h3>



<p class="wp-block-paragraph">Importantly, the knowledge base and associated embeddings can be updated asynchronously, ensuring that the model remains up-to-date even as new information is added to the data sources.</p>



<p class="wp-block-paragraph">In sum, RAG extends the utility of foundation models by incorporating external, up-to-date, domain-specific knowledge into their understanding and output.</p>



<p class="wp-block-paragraph">By incorporating these components, you&#8217;ll be creating a robust custom ChatGPT that not only understands the user&#8217;s queries but also has access to your own information, giving it the ability to respond with precision and relevance. </p>



<p class="wp-block-paragraph">Ready to dive into the technicalities? Stay tuned!</p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%">
<figure class="wp-block-image size-large"><img fetchpriority="high" decoding="async" width="512" height="277" data-attachment-id="13775" data-permalink="https://www.relataly.com/step-by-step-guide-to-building-your-own-chatgpt-on-a-custom-knowledge-base-in-python-leveraging-mongo-db-and-embeddings/13687/flo7up_a_vector_database_colorful_popart_with_an_ai_robot_worki_46d21322-5bd9-49f0-b1a7-b7b1a17536d5-copy-min/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2023/05/Flo7up_a_vector_database_colorful_popart_with_an_AI_robot_worki_46d21322-5bd9-49f0-b1a7-b7b1a17536d5-Copy-min.png" data-orig-size="1432,776" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="Flo7up_a_vector_database_colorful_popart_with_an_AI_robot_worki_46d21322-5bd9-49f0-b1a7-b7b1a17536d5-Copy-min" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2023/05/Flo7up_a_vector_database_colorful_popart_with_an_AI_robot_worki_46d21322-5bd9-49f0-b1a7-b7b1a17536d5-Copy-min.png" src="https://www.relataly.com/wp-content/uploads/2023/05/Flo7up_a_vector_database_colorful_popart_with_an_AI_robot_worki_46d21322-5bd9-49f0-b1a7-b7b1a17536d5-Copy-min-512x277.png" alt="A tailor-made chatbot allows for specialized conversations, providing the exact information needed, drawing from a unique database that you've developed. " class="wp-image-13775" srcset="https://www.relataly.com/wp-content/uploads/2023/05/Flo7up_a_vector_database_colorful_popart_with_an_AI_robot_worki_46d21322-5bd9-49f0-b1a7-b7b1a17536d5-Copy-min.png 512w, https://www.relataly.com/wp-content/uploads/2023/05/Flo7up_a_vector_database_colorful_popart_with_an_AI_robot_worki_46d21322-5bd9-49f0-b1a7-b7b1a17536d5-Copy-min.png 300w, https://www.relataly.com/wp-content/uploads/2023/05/Flo7up_a_vector_database_colorful_popart_with_an_AI_robot_worki_46d21322-5bd9-49f0-b1a7-b7b1a17536d5-Copy-min.png 768w, https://www.relataly.com/wp-content/uploads/2023/05/Flo7up_a_vector_database_colorful_popart_with_an_AI_robot_worki_46d21322-5bd9-49f0-b1a7-b7b1a17536d5-Copy-min.png 1432w" sizes="(max-width: 512px) 100vw, 512px" /><figcaption class="wp-element-caption">A tailor-made chatbot allows for specialized conversations, providing the exact information needed, drawing from a unique database that you&#8217;ve developed. </figcaption></figure>
</div>
</div>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<h2 class="wp-block-heading">Building the Custom &#8220;Chat with Your Data&#8221; App in Python</h2>



<p class="wp-block-paragraph">Now that we&#8217;ve discussed the theory behind building a custom ChatGPT and seen some exciting real-world applications, it&#8217;s time to put our knowledge into action! In this practical segment of our guide, we&#8217;re going to demonstrate how you can build a custom ChatGPT solution using Python.</p>



<p class="wp-block-paragraph">Our project will involve storing a sample PDF document in Cosmos Mongo DB and developing a chatbot capable of answering questions based on the content of this document. This practical exercise will guide you through the entire process, including turning your PDF content into embeddings, storing these embeddings in the Cosmos Mongo DB, and finally integrating it all with ChatGPT to build an interactive chatbot.</p>



<p class="wp-block-paragraph">If you&#8217;re new to Python, don&#8217;t worry, we&#8217;ll be breaking down the code and explaining each step in a straightforward manner. Let&#8217;s roll up our sleeves, fire up our Python environments, and get coding! Stay tuned as we embark on this exciting hands-on journey into the world of custom chatbots.</p>



<p class="wp-block-paragraph">The code is available on the GitHub repository.</p>



<div class="wp-block-kadence-advancedbtn kb-buttons-wrap kb-btns_c8e02b-b1"><a class="kb-button kt-button button kb-btn_022d60-c9 kt-btn-size-standard kt-btn-width-type-full kb-btn-global-inherit kt-btn-has-text-true kt-btn-has-svg-true wp-block-button__link wp-block-kadence-singlebtn" href="https://github.com/flo7up/relataly-public-python-tutorials/blob/master/300%20Distributed%20Computing%20-%20Analyzing%20Zurich%20Weather%20Data%20using%20PySpark.ipynb" target="_blank" rel="noreferrer noopener"><span class="kb-svg-icon-wrap kb-svg-icon-fe_eye kt-btn-icon-side-left"><svg viewBox="0 0 24 24"  fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"  aria-hidden="true"><path d="M1 12s4-8 11-8 11 8 11 8-4 8-11 8-11-8-11-8z"/><circle cx="12" cy="12" r="3"/></svg></span><span class="kt-btn-inner-text">View on GitHub </span></a>

<a class="kb-button kt-button button kb-btn_8db802-ce kt-btn-size-standard kt-btn-width-type-full kb-btn-global-inherit kt-btn-has-text-true kt-btn-has-svg-true wp-block-button__link wp-block-kadence-singlebtn" href="https://github.com/flo7up/relataly-public-python-API-tutorials" target="_blank" rel="noreferrer noopener"><span class="kb-svg-icon-wrap kb-svg-icon-fa_github kt-btn-icon-side-left"><svg viewBox="0 0 496 512"  fill="currentColor" xmlns="http://www.w3.org/2000/svg"  aria-hidden="true"><path d="M165.9 397.4c0 2-2.3 3.6-5.2 3.6-3.3.3-5.6-1.3-5.6-3.6 0-2 2.3-3.6 5.2-3.6 3-.3 5.6 1.3 5.6 3.6zm-31.1-4.5c-.7 2 1.3 4.3 4.3 4.9 2.6 1 5.6 0 6.2-2s-1.3-4.3-4.3-5.2c-2.6-.7-5.5.3-6.2 2.3zm44.2-1.7c-2.9.7-4.9 2.6-4.6 4.9.3 2 2.9 3.3 5.9 2.6 2.9-.7 4.9-2.6 4.6-4.6-.3-1.9-3-3.2-5.9-2.9zM244.8 8C106.1 8 0 113.3 0 252c0 110.9 69.8 205.8 169.5 239.2 12.8 2.3 17.3-5.6 17.3-12.1 0-6.2-.3-40.4-.3-61.4 0 0-70 15-84.7-29.8 0 0-11.4-29.1-27.8-36.6 0 0-22.9-15.7 1.6-15.4 0 0 24.9 2 38.6 25.8 21.9 38.6 58.6 27.5 72.9 20.9 2.3-16 8.8-27.1 16-33.7-55.9-6.2-112.3-14.3-112.3-110.5 0-27.5 7.6-41.3 23.6-58.9-2.6-6.5-11.1-33.3 2.6-67.9 20.9-6.5 69 27 69 27 20-5.6 41.5-8.5 62.8-8.5s42.8 2.9 62.8 8.5c0 0 48.1-33.6 69-27 13.7 34.7 5.2 61.4 2.6 67.9 16 17.7 25.8 31.5 25.8 58.9 0 96.5-58.9 104.2-114.8 110.5 9.2 7.9 17 22.9 17 46.4 0 33.7-.3 75.4-.3 83.6 0 6.5 4.6 14.4 17.3 12.1C428.2 457.8 496 362.9 496 252 496 113.3 383.5 8 244.8 8zM97.2 352.9c-1.3 1-1 3.3.7 5.2 1.6 1.6 3.9 2.3 5.2 1 1.3-1 1-3.3-.7-5.2-1.6-1.6-3.9-2.3-5.2-1zm-10.8-8.1c-.7 1.3.3 2.9 2.3 3.9 1.6 1 3.6.7 4.3-.7.7-1.3-.3-2.9-2.3-3.9-2-.6-3.6-.3-4.3.7zm32.4 35.6c-1.6 1.3-1 4.3 1.3 6.2 2.3 2.3 5.2 2.6 6.5 1 1.3-1.3.7-4.3-1.3-6.2-2.2-2.3-5.2-2.6-6.5-1zm-11.4-14.7c-1.6 1-1.6 3.6 0 5.9 1.6 2.3 4.3 3.3 5.6 2.3 1.6-1.3 1.6-3.9 0-6.2-1.4-2.3-4-3.3-5.6-2z"/></svg></span><span class="kt-btn-inner-text">Relataly GitHub Repo </span></a></div>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%"></div>
</div>



<h2 class="wp-block-heading">How to Set Up Vector Search in Cosmos DB</h2>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">First, you must understand that you will need a database to store the embeddings. It does not necessarily have to be a vector database. Still, this type of database will make your solution more performant and robust, particularly when you want to store large amounts of data.</p>



<p class="wp-block-paragraph"><a href="https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/" target="_blank" rel="noreferrer noopener">Azure Cosmos DB for MongoDB vCore</a> is the first MongoDB-compatible offering to feature Vector Search. With this feature, you can store, index, and query high-dimensional vector data directly in Azure Cosmos DB for MongoDB vCore, eliminating the need for data transfer to alternative platforms for vector similarity search capabilities. Here are the steps to set it up:</p>



<ol class="wp-block-list">
<li><strong>Choose Your Azure Cosmos DB Architecture:</strong> Azure Cosmos DB for MongoDB provides two types of architectures, RU-based and vCore-based. Each has its strengths and is best suited for certain types of applications. Choose the one that best fits your needs. If you&#8217;re looking to lift and shift existing MongoDB apps and run them as-is on a fully supported managed service, the vCore-based option could be the perfect fit.</li>



<li><strong>Configure Your Vector Search:</strong> Once your database architecture is set up, you can integrate your AI-based applications, including those using OpenAI embeddings, with your data already stored in Cosmos DB.</li>



<li><strong>Build and Deploy Your AI Application:</strong> With the Vector Search set up, you can now build your AI application that takes advantage of this feature. You can create a Go app using Azure Cosmos DB for MongoDB or deploy Azure Cosmos DB for MongoDB vCore using a Bicep template as suggested next steps.</li>
</ol>



<p class="wp-block-paragraph">Azure Cosmos DB for MongoDB vCore&#8217;s Vector Search feature is a game-changer for AI application development. It enables you to unlock new insights from your data, leading to more accurate and powerful applications.</p>



<h2 class="wp-block-heading">Cosmos DB for Mongo DB Usage Models</h2>



<p class="wp-block-paragraph">Regarding <a href="https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/choose-model" target="_blank" rel="noreferrer noopener">Cosmos DB for Mongo DB</a>, there are two options to choose from: Request Unit (RU) Database Account and vCore Cluster. Each option follows a different pricing model to suit diverse needs.</p>



<p class="wp-block-paragraph">The Request Unit (RU) Database Account operates on a pay-per-use basis. With this model, you are billed based on the number of requests and the level of provisioned throughput consumed by your workload.</p>



<p class="wp-block-paragraph">As of 27th Mai 2023, <a href="https://devblogs.microsoft.com/cosmosdb/introducing-vector-search-in-azure-cosmos-db-for-mongodb-vcore/" target="_blank" rel="noreferrer noopener">the brand new vector search function is only available for the vCore Cluster option</a>, which is why we will use this setup for this tutorial. The vCore Cluster offers a reserved managed instance. Under this option, you are charged a fixed amount on a monthly basis, providing more predictable costs for your usage.</p>



<p class="wp-block-paragraph">Once you have created your vCore instance, you must collect your connection string and make it available to your Python script. You can do this either by storing it in <a href="https://azure.microsoft.com/en-us/products/key-vault/">Azure Key Vault</a> (which I would recommend) or by storing it locally on your computer or in the code (which I would not recommend for obvious security reasons).</p>



<figure class="wp-block-image size-full"><img decoding="async" width="1536" height="588" data-attachment-id="13774" data-permalink="https://www.relataly.com/step-by-step-guide-to-building-your-own-chatgpt-on-a-custom-knowledge-base-in-python-leveraging-mongo-db-and-embeddings/13687/image-7-6/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2023/05/image-7.png" data-orig-size="1536,588" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-7" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2023/05/image-7.png" src="https://www.relataly.com/wp-content/uploads/2023/05/image-7.png" alt="" class="wp-image-13774" srcset="https://www.relataly.com/wp-content/uploads/2023/05/image-7.png 1536w, https://www.relataly.com/wp-content/uploads/2023/05/image-7.png 300w, https://www.relataly.com/wp-content/uploads/2023/05/image-7.png 512w, https://www.relataly.com/wp-content/uploads/2023/05/image-7.png 768w" sizes="(max-width: 1237px) 100vw, 1237px" /><figcaption class="wp-element-caption">When it comes to Cosmos DB for Mongo DB, there are two options to choose from: Request Unit (RU) Database Account and vCore Cluster. </figcaption></figure>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%">
<figure class="wp-block-image size-large"><img decoding="async" width="334" height="512" data-attachment-id="13772" data-permalink="https://www.relataly.com/step-by-step-guide-to-building-your-own-chatgpt-on-a-custom-knowledge-base-in-python-leveraging-mongo-db-and-embeddings/13687/image-5-4/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2023/05/image-5.png" data-orig-size="606,929" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-5" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2023/05/image-5.png" src="https://www.relataly.com/wp-content/uploads/2023/05/image-5-334x512.png" alt="Azure Cosmos DB for Mongo DB is a new offering that is specifically designed for vector use cases (incl. embeddings)" class="wp-image-13772" srcset="https://www.relataly.com/wp-content/uploads/2023/05/image-5.png 334w, https://www.relataly.com/wp-content/uploads/2023/05/image-5.png 196w, https://www.relataly.com/wp-content/uploads/2023/05/image-5.png 606w" sizes="(max-width: 334px) 100vw, 334px" /><figcaption class="wp-element-caption">Azure Cosmos DB for Mongo DB is a new offering that is designed explicitly for vector use cases (incl. embeddings)</figcaption></figure>
</div>
</div>



<h2 class="wp-block-heading">Using other Vector Databases</h2>



<p class="wp-block-paragraph">While Cosmos DB is a popular choice for vector databases, I would like to note that other options are available in the market. You can still benefit from this tutorial if you decide to utilize a different vector database, such as Pinncecone or Chroma. However, it is necessary to make code adjustments tailored to the APIs and functionalities of the specific vector database you choose.</p>



<p class="wp-block-paragraph">Specifically, you will need to modify the &#8220;insert embedding functions&#8221; and &#8220;similarity search functions&#8221; to align with the requirements and capabilities of your chosen vector database. These functions typically have variations that are specific to each vector database.</p>



<p class="wp-block-paragraph">By customizing the code according to your selected vector database&#8217;s API, you can successfully adapt the tutorial to suit your specific database choice. This allows you to leverage the principles and concepts this tutorial covers, regardless of the vector database you opt for.</p>



<p class="wp-block-paragraph">Also: <a href="https://www.relataly.com/vector-databases-the-rising-star-in-generative-ai-infrastructure/13599/" target="_blank" rel="noreferrer noopener">Vector Databases: The Rising Star in Generative AI Infrastructure</a></p>



<h3 class="wp-block-heading">Prerequisites</h3>



<p class="wp-block-paragraph">Before diving into the code, it’s essential to ensure that you have the proper setup for your Python 3 environment and have installed all the necessary packages. If you do not have a Python environment, follow the instructions in&nbsp;<a href="https://www.relataly.com/anaconda-python-environment-machine-learning/1663/" target="_blank" rel="noreferrer noopener">this tutorial</a>&nbsp;to set up the&nbsp;<a href="https://www.anaconda.com/products/individual" target="_blank" rel="noreferrer noopener">Anaconda Python environment</a>. This will provide you with a robust and versatile environment well-suited for machine learning and data science tasks.</p>



<p class="wp-block-paragraph">In this tutorial, we will be working with several libraries:</p>



<ul class="wp-block-list">
<li>openai</li>



<li>pymongo</li>



<li>PyPDF2</li>



<li>dotenv</li>
</ul>



<p class="wp-block-paragraph">Should you decide to use <a href="https://azure.microsoft.com/en-us/products/key-vault/" target="_blank" rel="noreferrer noopener">Azure Key Vault</a>, then you also need the following Python libraries:</p>



<ul class="wp-block-list">
<li>azure-identity</li>



<li>azure-key-vault</li>
</ul>



<p class="wp-block-paragraph">You can install the OpenAI Python library using console commands:</p>



<ul class="wp-block-list">
<li><em>pip install&nbsp;</em>openai</li>



<li><em>conda install&nbsp;</em>openai&nbsp;(if you are using the Anaconda packet manager)</li>
</ul>



<h3 class="wp-block-heading">Step #1 Authentification and DB Setup</h3>



<p class="wp-block-paragraph">Let&#8217;s start with the authentification and setup of the API keys. After making necessary imports, the code gets things read to connect to essential services &#8211; OpenAI and Cosmos DB &#8211; and makes sure it can access these services properly.</p>



<ol class="wp-block-list">
<li><strong>Fetching Credentials:</strong> The script starts by setting up a connection to a service called Azure Key Vault to retrieve some crucial credentials securely. These are like &#8220;passwords&#8221; that the script needs to access various resources.</li>



<li><strong>Setting Up AI Services:</strong> Then, it prepares to connect to two different AI services. One is a version that&#8217;s hosted by Azure, and the other is the standard, public version.</li>



<li><strong>Establishing Database Connection:</strong> Lastly, the script sets up a connection to a database service, specifically to a certain collection within the Cosmos DB database. The script also checks if the connection to the database was successful by sending a &#8220;ping&#8221; &#8211; if it receives a response, it knows the connection is good.</li>
</ol>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}">from azure.identity import AzureCliCredential
from azure.keyvault.secrets import SecretClient
import openai
import logging
import tiktoken
import pandas as pd
import pymongo
from dotenv import load_dotenv
load_dotenv()
# Set up the Azure Key Vault client and retrieve the Blob Storage account credentials
keyvault_name = ''
openaiservicename = ''
client = SecretClient(f&quot;https://{keyvault_name}.vault.azure.net/&quot;, AzureCliCredential())
print('keyvault service ready')
# AzureOpenAI Service
def setup_azureopenai():
    openai.api_key = client.get_secret('openai-api-key').value
    openai.api_type = &quot;azure&quot;
    openai.api_base = f'https://{openaiservicename}.openai.azure.com'
    openai.api_version = '2023-05-15'
    print('azure openai service ready')
# public openai service
def setup_public_openai():
    openai.api_key = client.get_secret('openai-api-key-public').value
    print('public openai service ready')
DB_NAME = &quot;hephaestus&quot;
COLLECTION_NAME = 'isocodes'
def setup_cosmos_connection():
    COSMOS_CLUSTER_CONNECTION_STRING = client.get_secret('cosmos-cluster-string').value
    cosmosclient = pymongo.MongoClient(COSMOS_CLUSTER_CONNECTION_STRING)
    db = cosmosclient[DB_NAME]
    collection = cosmosclient[DB_NAME][COLLECTION_NAME]
    # Send a ping to confirm a successful connection
    try:
        cosmosclient.admin.command('ping')
        print(&quot;Pinged your deployment. You successfully connected to MongoDB!&quot;)
    except Exception as e:
        print(e)
    return collection, db
setup_public_openai()
collection, db = setup_cosmos_connection()</pre></div>



<p class="wp-block-paragraph">Now we have set things up to interact with our Cosmos DB Mong DB vCore instance.</p>



<h3 class="wp-block-heading">Step #2 Functions for Populating the Vector DB</h3>



<p class="wp-block-paragraph">Next, we prepare and insert data into the database as embeddings. First, we prepare the content. The preparation process involves turning the text content into embeddings. Each embedding is a list of flats representing the meaning of a specific part of the text in a way the AI system can understand.</p>



<p class="wp-block-paragraph">We create the embeddings by sending text (for example, a paragraph of a document) to an OpenAI embedding model that returns the embedding. There are two options for using OpenAI: You can use the Azure OpenAI engine and deploy your own Ada embedding model. Alternatively, you can use the public OpenAI Ada embedding model. </p>



<p class="wp-block-paragraph">We&#8217;ll use the public OpenAI&#8217;s <a href="https://platform.openai.com/docs/guides/embeddings" target="_blank" rel="noreferrer noopener">text-embedding-ada-002</a>. Remember that the model is designed to return embeddings, not text. Model inference may incur costs based on the data processed. Refer to <a href="https://openai.com/pricing" target="_blank" rel="noreferrer noopener">OpenAI </a>or Azure OpenAI service for pricing details. </p>



<p class="wp-block-paragraph">Finally, the code inserts the prepared requests (which now include both the original text and the corresponding embeddings) into the database. The function returns the unique IDs assigned to these newly inserted items in the database. In this way, the code processes and stores the necessary information in the database for later use.</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># prepare content for insertion into cosmos db
def prepare_content(text_content):
  embeddings = create_embeddings_with_openai(text_content)
  request = [
    {
    &quot;textContent&quot;: text_content, 
    &quot;vectorContent&quot;: embeddings}
  ]
  return request
# create embeddings
def create_embeddings_with_openai(input):
    #print('Generating response from OpenAI...')
    ###### uncomment for AzureOpenAI model usage and comment code below
    # embeddings = openai.Embedding.create( 
    #     engine='&lt;name of the embedding deployment &gt;', 
    #     input=input)[&quot;data&quot;][0][&quot;embedding&quot;]
    ###### public openai model usage and comment code above
    embeddings = openai.Embedding.create(
        model='text-embedding-ada-002', 
        input=input)[&quot;data&quot;][0][&quot;embedding&quot;]
    
    # Number of embeddings    
    # print(len(embeddings))
    return embeddings
# insert the requests
def insert_requests(text_input):
    request = prepare_content(text_input)
    return collection.insert_many(request).inserted_ids
# Creates a searchable index for the vector content
def create_index():
  
  # delete and recreate the index. This might only be necessary once.
  collection.drop_indexes()
  embedding_len = 1536
  print(f'creating index with embedding length: {embedding_len}')
  db.command({
    'createIndexes': COLLECTION_NAME,
    'indexes': [
      {
        'name': 'vectorSearchIndex',
        'key': {
          &quot;vectorContent&quot;: &quot;cosmosSearch&quot;
        },
        'cosmosSearchOptions': {
          'kind': 'vector-ivf',
          'numLists': 100,
          'similarity': 'COS',
          'dimensions': embedding_len
        }
      }
    ]
  })
# Resets the DB and deletes all values from the collection to avoid dublicates
#collection.delete_many({})</pre></div>



<h3 class="wp-block-heading">Step #3 Document Cracing and Populating the DB</h3>



<p class="wp-block-paragraph">The next step is to break down the PDF document into smaller chunks of text (in this case, &#8216;records&#8217;) and then process these records for future use. You can repeat this process for any document that you want to make available to OpenAI. </p>



<p class="wp-block-paragraph">You can use any PDF that you like as long as you it contains readable text (use OCR). For demo purposes, I will use a <a href="https://www.zh.ch/content/dam/zhweb/bilder-dokumente/themen/steuern-finanzen/steuern/quellensteuer/infobl%C3%A4tter/div_q_informationsblatt_qs_2021_EN.pdf" target="_blank" rel="noreferrer noopener">tax document from Zurich</a>. Put the document in the folder data/vector_db_data/ in your root folder and provide the name to the Python script. </p>



<p class="wp-block-paragraph">Want to read in many documents at once? If you want to insert many documents, read the pdf documents from the folder and use the names to populate a list. You can then surround the insert function with a for loop that iterates through the list of document names </p>



<h4 class="wp-block-heading">#3.1 Document Slicing Considerations </h4>



<p class="wp-block-paragraph">To convert a PDF into embeddings, the first step is to divide it into smaller content slices. The slicing process plays a crucial role as it affects the information provided to the OpenAI GPT model when answering user questions. If the slices are too large, the model may encounter token limitations. Conversely, if they are too small, the model may not receive sufficient content to answer the question effectively. It is important to strike a balance between the number of slices and their length to optimize the results, considering that the search process may yield multiple outcomes.</p>



<p class="wp-block-paragraph">There are several approaches to handle the slicing process. One option is to define the slices based on a specific number of sentences or paragraphs. Alternatively, you can iteratively slice the document, allowing for some overlap between the data in the vector database. This approach has the advantage of providing more precise information to answer questions, but it also increases the data volume in the vector database, which can impact speed and cost considerations.</p>



<h4 class="wp-block-heading">#3.2 Running the code below to crack a document and insert embeddings into the vector DB</h4>



<p class="wp-block-paragraph">Running the code below will first define a function that breaks text into separate paragraphs based on line breaks. Another function slices the PDF into records. Each record contains a certain number of sentences (the maximum is defined by the &#8216;max_sentences&#8217; value). We use a Python library called PyPDF2 to extract text from each page of the PDF and Python&#8217;s built-in regular expressions to split the text into sentences and paragraphs. Note that if you want to achieve better results, you could also use a professional document content extraction tool such as Azure form recognizer.</p>



<p class="wp-block-paragraph">The code then opens a specific PDF file (&#8216;zurich_tax_info_2023.pdf&#8217;) and slices it into records, each containing no more than a certain number of sentences (as defined by&#8217;max_sentences&#8217;). After that, the function inserts these records into the vector database. Finally, we print the count of documents in the database collection. This shows how many pieces of data are already stored in this specific part of the database.</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># document cracking function to insert data from the excel sheet
def split_text_into_paragraphs(text):
    paragraphs = re.split(r'\n{2,}', text)
    return paragraphs
def slice_pdf_into_records(pdf_path, max_sentences):
    records = []
    
    with open(pdf_path, 'rb') as file:
        reader = PyPDF2.PdfReader(file)
        
        for page in reader.pages:
            text = page.extract_text()
            paragraphs = split_text_into_paragraphs(text)
            
            current_record = ''
            sentence_count = 0
            
            for paragraph in paragraphs:
                sentences = re.split(r'(?&lt;=[.!?])\s+', paragraph)
                
                for sentence in sentences:
                    current_record += sentence
                    
                    sentence_count += 1
                    
                    if sentence_count &gt;= max_sentences:
                        records.append(current_record)
                        current_record = ''
                        sentence_count = 0
                
                if sentence_count &lt; max_sentences:
                    current_record += ' '  # Add space between paragraphs
            
            # If there is remaining text after the loop, add it as a record
            if current_record:
                records.append(current_record)
    
    return records
# get file from root/data folder
pdf_path = '../data/vector_db_data/zurich_tax_info_2023.pdf'
max_sentences = 20  # Adjust the slice size as per your requirement
result = slice_pdf_into_records(pdf_path, max_sentences)
# print the length of result
print(f'{len(result)} vectors created with maximum {max_sentences} sentences each.')
# Print the sliced records
for i, record in enumerate(result):
    insert_requests(record)
    if i &lt; 5:
        print(record[0:100])
        print('-------------------')
create_index()
print(f'number of records in the vector DB: {collection.count_documents({})}')</pre></div>



<p class="wp-block-paragraph">After slicing the document and inserting the embeddings into the vector database, we can proceed with functions for similarity search and prompting. </p>



<h3 class="wp-block-heading">Step #4 Functions for Similarity Search and Prompts to ChatGPT</h3>



<p class="wp-block-paragraph">This section of code provides a set of functions to perform a vector search in the Cosmos DB, make a request to the ChatGPT 3.5 Turbo model for generating responses, and create prompts for the OpenAI model to use in generating those responses.</p>



<h4 class="wp-block-heading">#4.1 How the Search Part Works </h4>



<p class="wp-block-paragraph"><br>Allow me to provide a concise explanation of how the search process operates. We have now reached the stage where a user poses a question, and we utilize the OpenAI model to supply an answer, drawing from our vector database. Here, it&#8217;s vital to understand that the model transforms the question into embeddings and subsequently scours the knowledge base for similar embeddings that align with the information requested in the user&#8217;s prompt. </p>



<p class="wp-block-paragraph">The vector database yields the most suitable results and inserts them into another prompt tailored for ChatGPT. This model, distinct from the embedding model, generates text. Thus, the final interaction with the ChatGPT model incorporates both the user&#8217;s question and the results from the vector database, which are the most fitting responses to the question. This combination should ideally aid the model in providing the appropriate answer. Now, let&#8217;s turn our attention to the corresponding code.</p>



<h4 class="wp-block-heading">#4.2 Setting up the Functions for Vector Search</h4>



<p class="wp-block-paragraph">The vector_search function takes as input a query vector (representing a user&#8217;s question in vector form) and an optional parameter to limit the number of results. It then conducts a search in the Cosmos DB, looking for entries whose vector content is most similar to the query vector.</p>



<p class="wp-block-paragraph">Next, the openai_request function makes a request to OpenAI&#8217;s ChatGPT 3.5 Turbo model to generate a response. This function takes a formatted conversation history (or &#8216;prompt&#8217;) and sends it to the model, which then generates a response. The content of the generated response is then returned.</p>



<p class="wp-block-paragraph">The create_tweet_prompt function constructs the conversation history for the OpenAI model. This function takes the user&#8217;s question and a JSON object containing results from a database search and constructs a list of system and user messages. This list will then serve as the prompt for the ChatGPT model, instructing it to generate a response that answers the user&#8217;s question about tax, with the added guideline that the response should be in the same language as the question. The constructed prompt is then returned by the function.</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Cosmos DB Vector Search API Command
def vector_search(vector_query, max_number_of_results=2):
  results = collection.aggregate([
    {
      '$search': {
        &quot;cosmosSearch&quot;: {
          &quot;vector&quot;: vector_query,
          &quot;path&quot;: &quot;vectorContent&quot;,
          &quot;k&quot;: max_number_of_results
        },
      &quot;returnStoredSource&quot;: True
      }
    }
  ])
  return results
# openAI request - ChatGPT 3.5 Turbo Model
def openai_request(prompt, model_engine='gpt-3.5-turbo'):
    completion = openai.ChatCompletion.create(model=model_engine, messages=prompt, temperature=0.2, max_tokens=500)
    return completion.choices[0].message.content
# define OpenAI Prompt for News Tweet
def create_prompt(user_question, result_json):
    instructions = f'You are an assistant that answers questions based on sources provided. \
    If the information is not in the provided source, you answer with &quot;I don\'t know&quot;. '
    task = f&quot;{user_question} Translate the response to english /n \
    source: {result_json}&quot;
    
    prompt = [{&quot;role&quot;: &quot;system&quot;, &quot;content&quot;: instructions }, 
              {&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: task }]
    return prompt</pre></div>



<p class="wp-block-paragraph">You can easily change the voice and tone in which the ChatGPT answers questions by including the respective instructions in the create_prompt function. </p>



<p class="wp-block-paragraph">Also: <a href="https://www.relataly.com/chatgpt-style-guide-understanding-voice-and-tone-options-for-engaging-conversations/13065/">ChatGPT Style Guide: Understanding Voice and Tone Prompt Options for Engaging Conversations</a></p>



<h3 class="wp-block-heading">Step #5 Testing the Custom ChatGPT Solution</h3>



<p class="wp-block-paragraph">This part of the code works with the previous functions to facilitate a complete question-answering cycle with Cosmos DB and OpenAI&#8217;s ChatGPT 3.5 Turbo model.</p>



<p class="wp-block-paragraph">Now comes the most exciting part. Testing the solution, you can define a question and then execute the code below to run the search process. </p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># define OpenAI Prompt 
users_question = &quot;When do I have to submit my tax return?&quot;
# generate embeddings for the question
user_question_embeddings = create_embeddings_with_openai(user_question)
# search for the question in the cosmos db
search_results = vector_search(user_question_embeddings, 1)
print(search_results)
# prepare the results for the openai prompt
result_json = []
# print each document in the result
# remove all empty values from the results json
search_results = [x for x in search_results if x]
for doc in search_results:
    display(doc.get('_id'), doc.get('textContent'), doc.get('vectorContent')[0:5])
    result_json.append(doc.get('textContent'))
# create the prompt
prompt = create_prompt(user_question, result_json)
display(prompt)
# generate the response
response = openai_request(prompt)
display(f'User question: {users_question}')
display(f'OpenAI response: {response}')</pre></div>



<p class="wp-block-paragraph">&#8216;User question: When do I have to submit my tax return?&#8217;</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:false,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;null&quot;,&quot;mime&quot;:&quot;text/plain&quot;,&quot;theme&quot;:&quot;3024-day&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;disableCopy&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Plain Text&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;text&quot;}">'OpenAI response: When do I have to submit my tax return? \n\nAll natural persons who had their residence in the canton of Zurich on December 31, 2022, or who owned properties or business premises (or business operations) in the canton of Zurich, must submit a tax return for 2022 in the calendar year 2023. Taxpayers with a residence in another canton also have to submit a tax return for 2022 in the calendar year 2023 if they ended their tax liability in the canton of Zurich by giving up a property or business premises during the calendar year 2022. If you turned 18 in the tax period 2022 (persons born in 2004), you must submit your own tax return (for the tax period 2022) for the first time in the calendar year 2023.'</pre></div>



<p class="wp-block-paragraph">As of Mai 2023, the knowledge base of ChatGPT 3.5 is limited to the timeframe before September 2021. So it&#8217;s evident that the response of our custom ChatGPT solution is based on the individual information provided in the vector database. Remember that we did not fine-tune the GPT model, so the model itself does not inherently know anything about your private data and instead uses the data that was dynamically provided to it as part of the prompt. </p>



<h2 class="wp-block-heading">Real-world Applications of Chat with your data</h2>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">Custom ChatGPT boosts efficiency, personalizes services, and improves experiences across industries. Here are some examples:</p>



<ul class="wp-block-list">
<li><strong>Customer Support:</strong> Companies can use ChatGPT for 24/7 customer service. With data from manuals, FAQs, and support docs, it delivers fast, accurate answers, enhancing customer satisfaction and lessening staff workload.</li>



<li><strong>Healthcare</strong>: ChatGPT can respond to patient questions using medical texts and care guidelines. It offers data on symptoms, treatments, side effects, and preventive care, helping both healthcare providers and patients.</li>



<li><strong>Legal Sector</strong>: Law firms can use ChatGPT with legal texts, court decisions, and case studies for answering legal questions, offering case references, or explaining legal terms.</li>



<li><strong>Financial Services:</strong> Banks can use ChatGPT to extend their customer service and give customers advice based on their individual financial situation.</li>



<li><strong>E-Learning:</strong> Schools and e-learning platforms can use ChatGPT to tutor students. Using textbooks, notes, and research papers, it helps students understand complex topics, solve problems, or guide them through a course.</li>
</ul>



<p class="wp-block-paragraph">In short, any sector needing a large information database for queries or services can use custom ChatGPT. It enhances engagement and efficiency by offering personalized experiences.</p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%"></div>
</div>



<h2 class="wp-block-heading">Summary</h2>



<p class="wp-block-paragraph">In this comprehensive guide, we&#8217;ve journeyed through the fascinating process of creating a customized ChatGPT that lets users chat with your business data. We started with understanding the immense value a tailored ChatGPT brings to the table and dove into its ability to produce specialized responses sourced from a custom knowledge base. This tailored approach enhances user experiences, saves time, and bolsters productivity.</p>



<p class="wp-block-paragraph">We went behind the scenes to reveal the vital elements of crafting a custom ChatGPT: OpenAI&#8217;s GPT models, data embeddings, and vector databases like Cosmos DB for Mongo DB vCore. We clarified how these components synergize to transcend the token limitations inherent to GPT models. By integrating the components in Python, we broadened ChatGPT&#8217;s ability to answer queries based on your private knowledgebase, thereby offering contextually appropriate responses.</p>



<p class="wp-block-paragraph">I hope this tutorial was able to illustrate the business value of ChatGPT and its versatile utility across a variety of sectors, including customer service, healthcare, legal services, finance, e-learning, and CRM data analytics. Each instance emphasized the transformative potential of a personalized ChatGPT in delivering efficient, targeted solutions.</p>



<p class="wp-block-paragraph">I hope you found this helpful article. If you have any questions or remarks, please drop them in the comment section.</p>



<h2 class="wp-block-heading">Sources and Further Reading</h2>



<ul class="wp-block-list">
<li><a href="https://learn.microsoft.com/en-us/azure/cosmos-db/introduction" target="_blank" rel="noreferrer noopener">Azure Cosmos DB</a></li>



<li><a href="https://openai.com/pricing" target="_blank" rel="noreferrer noopener">OpenAI pricing</a></li>



<li><a href="https://learn.microsoft.com/en-us/azure/cognitive-services/openai/">Azure OpenAI</a></li>



<li><a href="https://azure.microsoft.com/en-au/products/cognitive-services/openai-service" target="_blank" rel="noreferrer noopener">Semantic search</a></li>



<li><a href="https://platform.openai.com/docs/guides/embeddings" target="_blank" rel="noreferrer noopener">What are embeddings?</a></li>



<li><a href="https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/vector-search" target="_blank" rel="noreferrer noopener">Using vector search on embeddings in Azure Cosmos DB for MongoDB vCore</a></li>



<li>OpenAI ChatGPT helped to revise this article</li>



<li>Images created with <a href="https://www.midjourney.com/home/?callbackUrl=%2Fapp%2F" target="_blank" rel="noreferrer noopener">Midjourney</a></li>
</ul>
<p>The post <a href="https://www.relataly.com/step-by-step-guide-to-building-your-own-chatgpt-on-a-custom-knowledge-base-in-python-leveraging-mongo-db-and-embeddings/13687/">Building &#8220;Chat with your Data&#8221; Apps using Embeddings, ChatGPT, and Cosmos DB for Mongo DB vCore</a> appeared first on <a href="https://www.relataly.com">relataly.com</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.relataly.com/step-by-step-guide-to-building-your-own-chatgpt-on-a-custom-knowledge-base-in-python-leveraging-mongo-db-and-embeddings/13687/feed/</wfw:commentRss>
			<slash:comments>3</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">13687</post-id>	</item>
		<item>
		<title>What is the Business Value of ChatGPT and other Large Generative Language Models?</title>
		<link>https://www.relataly.com/openai-gpt-chatgpt-in-a-business-context-whats-the-value-proposition/12282/</link>
					<comments>https://www.relataly.com/openai-gpt-chatgpt-in-a-business-context-whats-the-value-proposition/12282/#respond</comments>
		
		<dc:creator><![CDATA[Florian Follonier]]></dc:creator>
		<pubDate>Sat, 25 Feb 2023 17:30:59 +0000</pubDate>
				<category><![CDATA[Finance]]></category>
		<category><![CDATA[Healthcare]]></category>
		<category><![CDATA[Insurance]]></category>
		<category><![CDATA[Language Generation]]></category>
		<category><![CDATA[Logistics]]></category>
		<category><![CDATA[Manufacturing]]></category>
		<category><![CDATA[OpenAI]]></category>
		<category><![CDATA[Sentiment Analysis]]></category>
		<category><![CDATA[Topic Modelling]]></category>
		<category><![CDATA[Use Cases]]></category>
		<category><![CDATA[AI in E-Commerce]]></category>
		<category><![CDATA[AI in Finance]]></category>
		<category><![CDATA[AI in Insurance]]></category>
		<category><![CDATA[AI in Logistics]]></category>
		<category><![CDATA[AI in Marketing]]></category>
		<guid isPermaLink="false">https://www.relataly.com/?p=12282</guid>

					<description><![CDATA[<p>OpenAI&#8217;s GPT models, such as Davinci and ChatGPT, have gained recognition for their impressive language generation abilities. However, many of the tasks that GPT models can perform are not entirely new and could have been accomplished by traditional neural network models for some time. In specific tasks such as sentiment analysis, more specialized models could ... <a title="What is the Business Value of ChatGPT and other Large Generative Language Models?" class="read-more" href="https://www.relataly.com/openai-gpt-chatgpt-in-a-business-context-whats-the-value-proposition/12282/" aria-label="Read more about What is the Business Value of ChatGPT and other Large Generative Language Models?">Read more</a></p>
<p>The post <a href="https://www.relataly.com/openai-gpt-chatgpt-in-a-business-context-whats-the-value-proposition/12282/">What is the Business Value of ChatGPT and other Large Generative Language Models?</a> appeared first on <a href="https://www.relataly.com">relataly.com</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">OpenAI&#8217;s GPT models, such as Davinci and ChatGPT, have gained recognition for their impressive language generation abilities. However, many of the tasks that GPT models can perform are not entirely new and could have been accomplished by traditional neural network models for some time. In specific tasks such as sentiment analysis, more specialized models could outperform GPT-3. So, what distinguishes GPT from other models, and why is it creating so much hype? This question is particularly significant to business stakeholders who are curious about generative AI but are still seeking relevant applications. Understanding GPT&#8217;s value proposition will enable them to articulate the significance of generative AI use cases.</p>



<p class="wp-block-paragraph">This article aims to dismantle the value proposition of generative language models such as ChatGPT by discussing it along four dimensions: capabilities (1), versatility (2), simplification (3), and ease of use (4). So if you want to understand why your business should care about GPT, this article is for you!</p>



<p class="wp-block-paragraph">It is worth mentioning that this article does not differentiate between the different GPT models. The term GPT, in this article, refers to ChatGPT and Davinci, which have comparable capabilities. A key difference is that ChatGPT considers the conversation history, while Davinci treats requests entirely isolated from one another.</p>



<h4 class="wp-block-heading">Related articles </h4>



<p class="wp-block-paragraph">Also: </p>



<ul class="wp-block-list">
<li><a href="https://www.relataly.com/business-use-cases-for-openai-gpt-models-chatgpt-davinci/12200/" target="_blank" rel="noreferrer noopener">9 Powerful Applications of OpenAI’s ChatGPT and Davinci for Your Business</a> </li>



<li><a href="https://www.relataly.com/exploring-the-journey-of-the-swiss-economy-in-adopting-openai-chatgpt-and-co/13486/" target="_blank" rel="noreferrer noopener">Exploring the Journey of the Swiss Economy in Adopting OpenAI&#8217;s ChatGPT and Co</a></li>
</ul>



<p class="wp-block-paragraph">And if you are interested in implementing OpenAI, check out these Python tutorials:</p>



<ul class="wp-block-list">
<li><a href="https://www.relataly.com/using-chatgpt-and-other-openai-models-via-apis-in-python/12068/" target="_blank" rel="noreferrer noopener">Using OpenAI GPT with Python</a></li>



<li><a href="https://www.relataly.com/automated-prompt-generation-for-dall-e-using-chatgpt-in-python-a-step-by-step-api-tutorial/12143/" target="_blank" rel="noreferrer noopener">Integrating Dall-E with GPT for Prompt Generation using Python</a></li>
</ul>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%">
<figure class="wp-block-image size-full"><img decoding="async" width="510" height="507" data-attachment-id="12502" data-permalink="https://www.relataly.com/business-use-cases-for-openai-gpt-models-chatgpt-davinci/12200/diamond-value-business-proposition-python-machine-learning-relataly/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2023/02/diamond-value-business-proposition-python-machine-learning-relataly.png" data-orig-size="510,507" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="diamond value business proposition python machine learning relataly" data-image-description="&lt;p&gt;What&amp;#8217;s the value proposition of OpenAI GPT-3? Midjourney relataly.com &lt;/p&gt;
" data-image-caption="&lt;p&gt;What&amp;#8217;s the value proposition of OpenAI GPT-3? Midjourney relataly.com &lt;/p&gt;
" data-large-file="https://www.relataly.com/wp-content/uploads/2023/02/diamond-value-business-proposition-python-machine-learning-relataly.png" src="https://www.relataly.com/wp-content/uploads/2023/02/diamond-value-business-proposition-python-machine-learning-relataly.png" alt="What's the value proposition of OpenAI GPT-3? Midjourney relataly.com " class="wp-image-12502" srcset="https://www.relataly.com/wp-content/uploads/2023/02/diamond-value-business-proposition-python-machine-learning-relataly.png 510w, https://www.relataly.com/wp-content/uploads/2023/02/diamond-value-business-proposition-python-machine-learning-relataly.png 300w, https://www.relataly.com/wp-content/uploads/2023/02/diamond-value-business-proposition-python-machine-learning-relataly.png 140w" sizes="(max-width: 510px) 100vw, 510px" /><figcaption class="wp-element-caption">What&#8217;s the value proposition of OpenAI GPT? Image created with <a href="http://www.Midjourney.com" target="_blank" rel="noreferrer noopener">Midjourney</a>.</figcaption></figure>
</div>
</div>



<h2 class="wp-block-heading">What&#8217;s the Deal with Large Generative Language Models á la ChatGPT?</h2>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">To understand the value proposition of large generative models, it can be helpful to compare them to Bidirectional Encoder Representations from Transformers <a href="https://towardsdatascience.com/bert-explained-state-of-the-art-language-model-for-nlp-f8b21a9b6270" target="_blank" rel="noreferrer noopener">(BERT) and its variations (ROBERTA, etc.)</a>. </p>



<p class="wp-block-paragraph">BERT is a powerful and widely used pre-trained language model in natural language processing (NLP). It was developed by Google and released in 2018, and it quickly became one of the most influential NLP models in the field. We can consider it the predecessor of ChatGPT.</p>



<p class="wp-block-paragraph">One key difference between the two models is in how they process data. GPT&#8217;s sequential processing of input sequences, one token at a time, gives it an advantage over BERT in handling longer and more complex input sequences. This makes GPT better suited for tasks requiring more intricate outputs, such as language translation and dialogue systems. Consequently, GPT is better equipped than BERT for tasks that require generating lengthier and more complex outputs.</p>



<h2 class="wp-block-heading">Generative vs. Discriminative Models</h2>



<p class="wp-block-paragraph">Generative models like GPT and discriminative models like BERT <a href="https://symbl.ai/blog/gpt-3-versus-bert-a-high-level-comparison/" target="_blank" rel="noreferrer noopener">have fundamental differences in their approach to language processing</a>. While GPT is a large generative language model that generates new text based on input, BERT is a discriminative model that classifies text into predefined categories. While both models have unique strengths and weaknesses, their performance varies based on the task and dataset.</p>



<p class="wp-block-paragraph">BERT is particularly adept at question answering, text classification, and sentiment analysis, but it may not perform as well at generating new text. On the other hand, GPT is better suited for generating new text and capturing complex language dependencies. This makes it ideal for content generation, language translation, summarization, and question-answering. </p>



<h2 class="wp-block-heading">Pretraining</h2>



<p class="wp-block-paragraph">Another important aspect of GPT&#8217;s training methodology is pre-training. Before being fine-tuned for specific tasks, GPT is pre-trained on a vast amount of data, learning to generate text by predicting the next word in a sentence. </p>



<p class="wp-block-paragraph">This pre-training phase helps GPT learn grammar, facts about the world, and gives the model even reasoning abilities. This general language understanding serves as a strong foundation for GPT when it comes to solving specific natural language tasks later on. By leveraging this pre-training, GPT can easily adapt to new tasks with relatively less task-specific data. This process, called transfer learning, enables GPT to perform better than other models in various tasks.</p>



<h2 class="wp-block-heading">Performance vs. Capabilities</h2>



<p class="wp-block-paragraph">Performance and capabilities are distinct factors when evaluating language models. While BERT excels in some applications, GPT&#8217;s strengths lie in its capabilities across various fields, particularly with few-shot or zero-shot learning. By fine-tuning GPT to specific tasks, its performance can be further improved and may likely outperform BERT.</p>



<p class="wp-block-paragraph">Although GPT is proficient at basic NLP tasks like sentiment analysis and text classification, <a href="https://analyticsindiamag.com/gpt-3-vs-bert-for-nlp-tasks/" target="_blank" rel="noreferrer noopener">performance comparisons</a> show that BERT can achieve similar or better results with less computational complexity in fundamental NLP tasks. However, the performance of GPT-4, which is yet to be seen, may likely outperform BERT in almost any discipline, even without fine-tuning.</p>



<h2 class="wp-block-heading">Unmantling GPTs Value Proposition</h2>



<p class="wp-block-paragraph">Despite the impressive capabilities of generative language models like ChatGPT, examining their value proposition in more detail is essential. Therefore, this article aims to provide a more nuanced understanding of the value that these models can provide.</p>



<p class="wp-block-paragraph">Also: <a href="https://www.relataly.com/eliminating-friction-how-openais-gpt-streamlines-online-experiences-and-reduces-the-need-for-google-searches/13171/" target="_blank" rel="noreferrer noopener">Eliminating Friction: How OpenAI’s GPT Streamlines Online Experiences and Reduces the Need for Traditional Search</a></p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%"></div>
</div>



<h3 class="wp-block-heading">#1 Performance</h3>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">Large generative language models like ChatGPT offer valuable benefits to businesses through their ability to generate natural language responses similar to those produced by humans. This technology can be used in various ways, such as content generation, customer service, and marketing.</p>



<p class="wp-block-paragraph">Businesses can use generative language models to produce high-quality content quickly and efficiently. For example, a news organization could use ChatGPT to generate news articles or summaries based on current events. Similarly, a company could use this technology to create product descriptions, emails, or even social media posts.</p>



<p class="wp-block-paragraph">Generative language models can also be employed to provide customers with instant responses to their queries, which could be particularly useful for businesses that receive a high volume of customer inquiries or support requests. ChatGPT can be trained to provide accurate and helpful responses to frequently asked questions or to engage in more complex conversations with customers.</p>



<p class="wp-block-paragraph">In marketing, generative language models can be used to create personalized content for customers by analyzing customer data to generate customized marketing messages or entire campaigns that resonate with individual customers&#8217; preferences and interests.</p>



<p class="wp-block-paragraph">ChatGPT&#8217;s ability to handle longer input sequences enables it to maintain context and understand the sentiment behind a piece of text more effectively. The use of self-attention mechanisms allows ChatGPT to focus on the most relevant parts of the input when generating its predictions, leading to more accurate results in tasks like sentiment analysis. Additionally, ChatGPT&#8217;s increased capacity allows it to learn more complex patterns and representations, resulting in improved performance across various natural language tasks.</p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%">
<figure class="wp-block-image size-large"><img decoding="async" width="502" height="512" data-attachment-id="13200" data-permalink="https://www.relataly.com/openai-gpt-chatgpt-in-a-business-context-whats-the-value-proposition/12282/flo7up_a_running_robot_winning_a_marathon/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2023/03/Flo7up_a_running_robot_winning_a_marathon.png" data-orig-size="1008,1028" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="Flo7up_a_running_robot_winning_a_marathon" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2023/03/Flo7up_a_running_robot_winning_a_marathon.png" src="https://www.relataly.com/wp-content/uploads/2023/03/Flo7up_a_running_robot_winning_a_marathon-502x512.png" alt="Generative language models such as ChatGPT have several advantages over traditional machine learning approaches, including their ability to handle longer inputs." class="wp-image-13200" srcset="https://www.relataly.com/wp-content/uploads/2023/03/Flo7up_a_running_robot_winning_a_marathon.png 502w, https://www.relataly.com/wp-content/uploads/2023/03/Flo7up_a_running_robot_winning_a_marathon.png 294w, https://www.relataly.com/wp-content/uploads/2023/03/Flo7up_a_running_robot_winning_a_marathon.png 768w, https://www.relataly.com/wp-content/uploads/2023/03/Flo7up_a_running_robot_winning_a_marathon.png 1008w" sizes="(max-width: 502px) 100vw, 502px" /><figcaption class="wp-element-caption">Generative language models such as ChatGPT have several advantages over traditional machine learning approaches, including their ability to handle longer inputs. Created with <a href="https://www.midjourney.com/app/" target="_blank" rel="noreferrer noopener">Midjourney</a>.</figcaption></figure>
</div>
</div>



<h3 class="wp-block-heading">#2 Versatility</h3>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">For smaller organizations with limited data science resources, implementing AI in their processes can be a significant challenge. Developing specialized models for tasks such as summarization, classification, and translation requires substantial expertise and training data. In many organizations, these resources are not readily available, which can slow down development processes and hinder innovation.</p>



<p class="wp-block-paragraph">GPT&#8217;s versatility addresses this challenge by offering a single API that can perform <a href="https://www.relataly.com/business-use-cases-for-openai-gpt-models-chatgpt-davinci/12200/" target="_blank" rel="noreferrer noopener">these tasks and more</a>. This enables smaller organizations to benefit from AI without the need to invest in extensive data science resources. By automating and streamlining their workflows, these organizations can save time and resources, allowing them to focus on their core activities. </p>



<p class="wp-block-paragraph">A lot of the versatility comes from GPT, allowing for zero-shot or few-shot predictions. Zero-shot learning is a technique where a model is able to perform a task without any explicit training examples. This is possible because GPT was pre-trained on almost the entire text available from the public internet. It allows the model to make inferences based on the patterns it has learned from the data. Few-shot learning, on the other hand, involves training a model on a small amount of data. </p>



<p class="wp-block-paragraph">It&#8217;s important to note that using GPT also poses potential risks, such as biases and inaccuracies. Smaller organizations may lack the resources to address these risks and, therefore, must evaluate GPT&#8217;s performance carefully before integrating it into their processes. Nonetheless, the availability of GPT represents a significant opportunity for smaller organizations to leverage AI in their operations and remain competitive in their respective markets.</p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%">
<figure class="wp-block-image size-large"><img decoding="async" width="512" height="283" data-attachment-id="13243" data-permalink="https://www.relataly.com/openai-gpt-chatgpt-in-a-business-context-whats-the-value-proposition/12282/flo7up_a_robot_octopus_wielding_tools_in_his_various_hands_in_f_27e54910-2ee1-431c-96e5-0845e0415ffe-copy-min/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2023/03/Flo7up_a_robot_octopus_wielding_tools_in_his_various_hands_in_f_27e54910-2ee1-431c-96e5-0845e0415ffe-Copy-min.png" data-orig-size="1426,788" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="Flo7up_a_robot_octopus_wielding_tools_in_his_various_hands_in_f_27e54910-2ee1-431c-96e5-0845e0415ffe-Copy-min" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2023/03/Flo7up_a_robot_octopus_wielding_tools_in_his_various_hands_in_f_27e54910-2ee1-431c-96e5-0845e0415ffe-Copy-min.png" src="https://www.relataly.com/wp-content/uploads/2023/03/Flo7up_a_robot_octopus_wielding_tools_in_his_various_hands_in_f_27e54910-2ee1-431c-96e5-0845e0415ffe-Copy-min-512x283.png" alt="OpenAI GPT-3 is highly versatile and makes it easy to leverage the power of AI for various tasks. Image Source: Created with Midjourney - An AI that creates images using text." class="wp-image-13243" srcset="https://www.relataly.com/wp-content/uploads/2023/03/Flo7up_a_robot_octopus_wielding_tools_in_his_various_hands_in_f_27e54910-2ee1-431c-96e5-0845e0415ffe-Copy-min.png 512w, https://www.relataly.com/wp-content/uploads/2023/03/Flo7up_a_robot_octopus_wielding_tools_in_his_various_hands_in_f_27e54910-2ee1-431c-96e5-0845e0415ffe-Copy-min.png 300w, https://www.relataly.com/wp-content/uploads/2023/03/Flo7up_a_robot_octopus_wielding_tools_in_his_various_hands_in_f_27e54910-2ee1-431c-96e5-0845e0415ffe-Copy-min.png 768w, https://www.relataly.com/wp-content/uploads/2023/03/Flo7up_a_robot_octopus_wielding_tools_in_his_various_hands_in_f_27e54910-2ee1-431c-96e5-0845e0415ffe-Copy-min.png 1426w" sizes="(max-width: 512px) 100vw, 512px" /><figcaption class="wp-element-caption">OpenAI GPT is highly versatile and makes it easy to leverage the power of AI for various tasks. Image Source: Created with <a href="https://www.midjourney.com/app/" target="_blank" rel="noreferrer noopener">Midjourney</a>.</figcaption></figure>
</div>
</div>



<h3 class="wp-block-heading" id="h-3-simplifying-complex-processes">#3 Simplifying Complex Processes</h3>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">One of the major benefits of ChatGPT and Davinci is their ability to perform multiple tasks within a single request. For instance, a prompt to a GPT model that asks for a summary in five sentences and a German translation can effectively combine the tasks of summarization and translation. This multi-tasking capability streamlines the development process and simplifies complex procedures.</p>



<h4 class="wp-block-heading">GPT &#8211; the Swiss Army Knife of AI</h4>



<p class="wp-block-paragraph">Imagine a situation where a process involves several tasks like translating customer requests, checking specific information, categorizing, and summarizing them. Traditional models would need the creation, integration, security, and maintenance of four separate models. However, a multi-purpose language model like GPT can handle all these tasks in just one request all at once.</p>



<p class="wp-block-paragraph">While other models like BERT can perform tasks such as language translation and text classification, the ability of ChatGPT and Davinci to execute multiple tasks at once sets them apart. By moving some of the complexity into a prompt for a model, organizations can adapt more easily to changing requirements and become more agile.</p>



<p class="wp-block-paragraph">ChatGPT and Davinci can be seen as the Swiss Army Knives of AI language models. They offer versatile and adaptable solutions for a wide range of tasks. Much like a Swiss Army Knife, these multi-purpose models provide organizations with a valuable tool that simplifies and streamlines complex procedures, making them an essential asset in today&#8217;s rapidly evolving world.</p>



<h4 class="wp-block-heading">An Ongoing shift Toward AI</h4>



<p class="wp-block-paragraph">As generative AI technology continues to advance, an increasing number of organizations are likely to rely on these models to help simplify their complex processes. This shift can lead to improved efficiency, cost savings, and enhanced accuracy, enabling businesses to focus on their strategic objectives. However, this transition also brings potential risks and challenges, such as ensuring ethical AI usage and addressing the possibility of job displacement. Organizations must carefully consider these factors as they integrate AI into their operations.</p>



<p class="wp-block-paragraph">The multi-tasking abilities of ChatGPT and Davinci offer a distinct advantage for organizations aiming to streamline intricate processes and boost efficiency. By delegating some of the process complexity to these models, businesses can adapt more rapidly to evolving requirements and improve their overall agility. Nevertheless, it is essential for organizations to assess the potential challenges, ethical considerations, and workforce implications as they incorporate AI into their operations. By doing so, they can make well-informed decisions and develop a balanced approach to harnessing the power of generative AI models, ultimately ensuring sustainable growth and responsible AI integration.</p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%">
<figure class="wp-block-image size-full"><img decoding="async" width="837" height="837" data-attachment-id="12234" data-permalink="https://www.relataly.com/flip7up_a_craftsman_robot_with_a_hammer_striking_a_nail_3c6f0c2e-4c35-4ff0-bb8c-cc7552d74cee/" data-orig-file="https://www.relataly.com/wp-content/uploads/2023/02/flip7up_a_craftsman_robot_with_a_hammer_striking_a_nail_3c6f0c2e-4c35-4ff0-bb8c-cc7552d74cee.png" data-orig-size="837,837" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="flip7up_a_craftsman_robot_with_a_hammer_striking_a_nail_3c6f0c2e-4c35-4ff0-bb8c-cc7552d74cee" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2023/02/flip7up_a_craftsman_robot_with_a_hammer_striking_a_nail_3c6f0c2e-4c35-4ff0-bb8c-cc7552d74cee.png" src="https://www.relataly.com/wp-content/uploads/2023/02/flip7up_a_craftsman_robot_with_a_hammer_striking_a_nail_3c6f0c2e-4c35-4ff0-bb8c-cc7552d74cee.png" alt="OpenAI GPT-3 offers a unique value proposition that sets it apart from other models. Image created with Midjourney - An AI that creates images using text." class="wp-image-12234" srcset="https://www.relataly.com/wp-content/uploads/2023/02/flip7up_a_craftsman_robot_with_a_hammer_striking_a_nail_3c6f0c2e-4c35-4ff0-bb8c-cc7552d74cee.png 837w, https://www.relataly.com/wp-content/uploads/2023/02/flip7up_a_craftsman_robot_with_a_hammer_striking_a_nail_3c6f0c2e-4c35-4ff0-bb8c-cc7552d74cee.png 300w, https://www.relataly.com/wp-content/uploads/2023/02/flip7up_a_craftsman_robot_with_a_hammer_striking_a_nail_3c6f0c2e-4c35-4ff0-bb8c-cc7552d74cee.png 140w, https://www.relataly.com/wp-content/uploads/2023/02/flip7up_a_craftsman_robot_with_a_hammer_striking_a_nail_3c6f0c2e-4c35-4ff0-bb8c-cc7552d74cee.png 768w" sizes="(max-width: 837px) 100vw, 837px" /><figcaption class="wp-element-caption">OpenAI GPT offers a unique value proposition that sets it apart from other models. Image created with <a href="https://www.midjourney.com/app/" target="_blank" rel="noreferrer noopener">Midjourney</a> &#8211; An AI that creates images using text.</figcaption></figure>
</div>
</div>



<h3 class="wp-block-heading" id="h-4-ease-of-use">#4 Ease of Use</h3>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">A major advantage of OpenAI, including GPT, is its ability to lower the entry barrier for organizations using AI. GPT is accessible to developers and data scientists of all skill levels, making it easier for organizations to automate activities without extensive expertise. Its capacity to generalize to new cases (zero or few-shot learning) allows users to start with OpenAI even with little or no data. This is particularly beneficial for smaller customers. They may lack resources for in-house predictive model development, as well as larger customers who can speed up their development processes using a single multi-purpose AI.</p>



<p class="wp-block-paragraph">Moreover, OpenAI operates as a cloud service, eliminating the need for organizations to build and maintain their own AI infrastructure for GPT model development and hosting. Instead, they can utilize the cloud-based service provided by OpenAI, making it more convenient and cost-effective to begin using AI. This approach allows businesses to concentrate on their core competencies while leveraging GPT&#8217;s power to enhance operations and drive innovation.</p>



<p class="wp-block-paragraph">The scalability of Azure OpenAI also empowers businesses to start with a proof-of-concept project and scale up as required. This approach enables organizations to experiment with AI without committing to a large initial investment. Utilizing a single model for various purposes significantly accelerates the creation of POCs. Once a solution demonstrates its value, organizations can later fine-tune the process using more specialized models.</p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%">
<figure class="wp-block-image size-full"><img decoding="async" width="511" height="511" data-attachment-id="12304" data-permalink="https://www.relataly.com/openai-gpt-chatgpt-in-a-business-context-whats-the-value-proposition/12282/gilgerardo_web_designer_working_on_a_laptop_on_a_desk_in_the_ci_d4742d61-651b-4ccc-83be-402cce11cc35/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2023/02/gilgerardo_web_designer_working_on_a_laptop_on_a_desk_in_the_ci_d4742d61-651b-4ccc-83be-402cce11cc35.png" data-orig-size="511,511" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="gilgerardo_web_designer_working_on_a_laptop_on_a_desk_in_the_ci_d4742d61-651b-4ccc-83be-402cce11cc35" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2023/02/gilgerardo_web_designer_working_on_a_laptop_on_a_desk_in_the_ci_d4742d61-651b-4ccc-83be-402cce11cc35.png" src="https://www.relataly.com/wp-content/uploads/2023/02/gilgerardo_web_designer_working_on_a_laptop_on_a_desk_in_the_ci_d4742d61-651b-4ccc-83be-402cce11cc35.png" alt="Getting started with OpenAI GPT-3 is easy, as it allows developers to interact with the models using natural language prompts. Image created with Midjourney - An AI that creates images using text." class="wp-image-12304" srcset="https://www.relataly.com/wp-content/uploads/2023/02/gilgerardo_web_designer_working_on_a_laptop_on_a_desk_in_the_ci_d4742d61-651b-4ccc-83be-402cce11cc35.png 511w, https://www.relataly.com/wp-content/uploads/2023/02/gilgerardo_web_designer_working_on_a_laptop_on_a_desk_in_the_ci_d4742d61-651b-4ccc-83be-402cce11cc35.png 300w, https://www.relataly.com/wp-content/uploads/2023/02/gilgerardo_web_designer_working_on_a_laptop_on_a_desk_in_the_ci_d4742d61-651b-4ccc-83be-402cce11cc35.png 140w" sizes="(max-width: 511px) 100vw, 511px" /><figcaption class="wp-element-caption">Getting started with OpenAI GPT is easy, as it allows developers to interact with the models using natural language prompts. Image created with <a href="https://www.midjourney.com/app/" target="_blank" rel="noreferrer noopener">Midjourney </a>&#8211; An AI that creates images using text.</figcaption></figure>
</div>
</div>



<h2 class="wp-block-heading">Summary</h2>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">This article has explored the unique value proposition of OpenAI&#8217;s GPT in a business context, highlighting its enhanced language capabilities (1), versatility in use (2), complexity reduction (3), and lower entry barriers for AI adoption (4). These aspects make GPT a groundbreaking development in the field of artificial intelligence, particularly within natural language processing (NLP).</p>



<p class="wp-block-paragraph">GPT has demonstrated impressive performance across a wide array of applications, such as chatbots, personalized content generation, question-answering systems, and intricate data interpretation. While other NLP models can accomplish some tasks carried out by GPT, its extensive pre-training on large data sets and ability to manage various domains and tasks render it more flexible and powerful. Consequently, GPT&#8217;s potential to streamline workflows and reduce costs is indisputable.</p>



<p class="wp-block-paragraph">As OpenAI continues to advance and refine its technology, we can anticipate even more innovative use cases for GPT in the future. This ongoing evolution will undoubtedly contribute to the growing significance of GPT in shaping the AI landscape and revolutionizing the way businesses harness the power of artificial intelligence.</p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%">
<figure class="wp-block-image size-large"><img decoding="async" width="1024" height="1024" data-attachment-id="12307" data-permalink="https://www.relataly.com/openai-gpt-chatgpt-in-a-business-context-whats-the-value-proposition/12282/openai-sets-sail-for-wide-spread-adoption-2/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2023/02/OpenAI-sets-sail-for-wide-spread-adoption-2.png" data-orig-size="1024,1024" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="OpenAI-sets-sail-for-wide-spread-adoption-2" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2023/02/OpenAI-sets-sail-for-wide-spread-adoption-2.png" src="https://www.relataly.com/wp-content/uploads/2023/02/OpenAI-sets-sail-for-wide-spread-adoption-2-1024x1024.png" alt="OpenAI sets sail to transform various industries, as it offers a strong value proposition. " class="wp-image-12307" srcset="https://www.relataly.com/wp-content/uploads/2023/02/OpenAI-sets-sail-for-wide-spread-adoption-2.png 1024w, https://www.relataly.com/wp-content/uploads/2023/02/OpenAI-sets-sail-for-wide-spread-adoption-2.png 300w, https://www.relataly.com/wp-content/uploads/2023/02/OpenAI-sets-sail-for-wide-spread-adoption-2.png 140w, https://www.relataly.com/wp-content/uploads/2023/02/OpenAI-sets-sail-for-wide-spread-adoption-2.png 768w" sizes="(max-width: 1024px) 100vw, 1024px" /><figcaption class="wp-element-caption">OpenAI&#8217;s GPT offers businesses a substantial value proposition, thus setting sail for massive adoption in various industries. Image Source: Created with <a href="https://www.midjourney.com/app/" target="_blank" rel="noreferrer noopener">Midjourney</a>.</figcaption></figure>
</div>
</div>



<h2 class="wp-block-heading">Sources and Further Reading</h2>



<ul class="wp-block-list">
<li><a href="https://www.reuters.com/technology/chatgpt-sets-record-fastest-growing-user-base-analyst-note-2023-02-01/" target="_blank" rel="noreferrer noopener">Reuter.com/chatgpt-sets-record-fastest-growing-user-base-analyst/</a></li>



<li><a href="http://analyticsindiamag.com/gpt-3-vs-bert-for-nlp-tasks/" target="_blank" rel="noreferrer noopener">Analyticsindiamag.com/gpt-vs-bert-for-nlp-tasks/</a></li>



<li><a href="https://symbl.ai/blog/gpt-3-versus-bert-a-high-level-comparison/" target="_blank" rel="noreferrer noopener">Symbl.ai/blog/gpt-versus-bert-a-high-level-comparison/</a></li>



<li><a href="https://platform.openai.com/docs/guides/completion/prompt-design" target="_blank" rel="noreferrer noopener">OpenAI.com/prompt-design</a></li>



<li><a href="https://www.relataly.com/using-chatgpt-and-other-openai-models-via-apis-in-python/12068/" target="_blank" rel="noreferrer noopener">Relataly.com &#8211; Using OpenAI GPT-3 with Python</a></li>



<li>OpenAI ChatGPT was used to revise this article</li>



<li><a href="https://www.relataly.com/automated-prompt-generation-for-dall-e-using-chatgpt-in-python-a-step-by-step-api-tutorial/12143/" target="_blank" rel="noreferrer noopener">Relataly.com &#8211; Integrating Dall-E with GPT-3 for Prompt Generation using Python</a></li>



<li>Images generated with <a href="https://www.midjourney.com/app/">Midjourney</a> </li>
</ul>
<p>The post <a href="https://www.relataly.com/openai-gpt-chatgpt-in-a-business-context-whats-the-value-proposition/12282/">What is the Business Value of ChatGPT and other Large Generative Language Models?</a> appeared first on <a href="https://www.relataly.com">relataly.com</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.relataly.com/openai-gpt-chatgpt-in-a-business-context-whats-the-value-proposition/12282/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">12282</post-id>	</item>
		<item>
		<title>Training a Sentiment Classifier with Naive Bayes and Logistic Regression in Python</title>
		<link>https://www.relataly.com/simple-sentiment-analysis-using-naive-bayes-and-logistic-regression/2007/</link>
					<comments>https://www.relataly.com/simple-sentiment-analysis-using-naive-bayes-and-logistic-regression/2007/#respond</comments>
		
		<dc:creator><![CDATA[Florian Follonier]]></dc:creator>
		<pubDate>Sat, 20 Jun 2020 21:49:05 +0000</pubDate>
				<category><![CDATA[Algorithms]]></category>
		<category><![CDATA[Classification (multi-class)]]></category>
		<category><![CDATA[Finance]]></category>
		<category><![CDATA[Insurance]]></category>
		<category><![CDATA[Logistic Regression]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Naive Bayes]]></category>
		<category><![CDATA[Natural Language Processing]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Retail]]></category>
		<category><![CDATA[Scikit-Learn]]></category>
		<category><![CDATA[Seaborn]]></category>
		<category><![CDATA[Sentiment Analysis]]></category>
		<category><![CDATA[Telecommunications]]></category>
		<category><![CDATA[AI in Business]]></category>
		<category><![CDATA[AI in E-Commerce]]></category>
		<category><![CDATA[Beginner Tutorials]]></category>
		<category><![CDATA[Classic Machine Learning]]></category>
		<category><![CDATA[Digital Transformation]]></category>
		<category><![CDATA[Social Media Data]]></category>
		<category><![CDATA[Supervised Learning]]></category>
		<guid isPermaLink="false">https://www.relataly.com/?p=2007</guid>

					<description><![CDATA[<p>Are you ready to learn about the exciting world of social media sentiment analysis using Python? In this article, we&#8217;ll dive into how companies are leveraging machine learning to extract insights from Twitter comments, and how you can do the same. By comparing two popular classification models &#8211; Naive Bayes and Logistic Regression &#8211; we&#8217;ll ... <a title="Training a Sentiment Classifier with Naive Bayes and Logistic Regression in Python" class="read-more" href="https://www.relataly.com/simple-sentiment-analysis-using-naive-bayes-and-logistic-regression/2007/" aria-label="Read more about Training a Sentiment Classifier with Naive Bayes and Logistic Regression in Python">Read more</a></p>
<p>The post <a href="https://www.relataly.com/simple-sentiment-analysis-using-naive-bayes-and-logistic-regression/2007/">Training a Sentiment Classifier with Naive Bayes and Logistic Regression in Python</a> appeared first on <a href="https://www.relataly.com">relataly.com</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow">
<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">Are you ready to learn about the exciting world of social media sentiment analysis using Python? In this article, we&#8217;ll dive into how companies are leveraging machine learning to extract insights from Twitter comments, and how you can do the same. By comparing two popular classification models &#8211; Naive Bayes and Logistic Regression &#8211; we&#8217;ll help you identify which one best fits your needs.</p>



<p class="wp-block-paragraph">Businesses are using sentiment analysis to make better sense of the vast amounts of data available online and on social media platforms. Understanding customer opinions and feedback can help companies identify trends and make more informed decisions. Whether you&#8217;re a business professional looking to leverage the power of social media data or a machine learning enthusiast, this article has everything you need to get started.</p>



<p class="wp-block-paragraph">We&#8217;ll begin with an introduction to the concept of sentiment analysis and its theoretical foundations. Then, we&#8217;ll guide you through the practical steps of implementing a sentiment classifier in Python. Our model will analyze text snippets and categorize them into one of three sentiment categories: &#8220;positive,&#8221; &#8220;neutral,&#8221; or &#8220;negative.&#8221; Finally, we&#8217;ll compare the performance of Naive Bayes and Logistic Regression classifiers.</p>



<p class="wp-block-paragraph">By the end of this article, you&#8217;ll have the skills and knowledge to perform sentiment analysis on social media data and apply these insights to your business or personal projects. So let&#8217;s jump right in!</p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%">
<figure class="wp-block-image size-large"><img decoding="async" width="512" height="423" data-attachment-id="13349" data-permalink="https://www.relataly.com/simple-sentiment-analysis-using-naive-bayes-and-logistic-regression/2007/flo7up_a_person_talking_to_a_virtual_assistant-_colorful_popart_2f34967a-ce4e-420d-bc01-75a4e47c1181-copy/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2023/03/Flo7up_a_person_talking_to_a_virtual_assistant._Colorful_popart_2f34967a-ce4e-420d-bc01-75a4e47c1181-Copy.png" data-orig-size="920,760" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="Flo7up_a_person_talking_to_a_virtual_assistant._Colorful_popart_2f34967a-ce4e-420d-bc01-75a4e47c1181-Copy" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2023/03/Flo7up_a_person_talking_to_a_virtual_assistant._Colorful_popart_2f34967a-ce4e-420d-bc01-75a4e47c1181-Copy.png" src="https://www.relataly.com/wp-content/uploads/2023/03/Flo7up_a_person_talking_to_a_virtual_assistant._Colorful_popart_2f34967a-ce4e-420d-bc01-75a4e47c1181-Copy-512x423.png" alt="Sentiment analysis has various use cases from analyzing social media to reviewing customer feedback in call centers." class="wp-image-13349" srcset="https://www.relataly.com/wp-content/uploads/2023/03/Flo7up_a_person_talking_to_a_virtual_assistant._Colorful_popart_2f34967a-ce4e-420d-bc01-75a4e47c1181-Copy.png 512w, https://www.relataly.com/wp-content/uploads/2023/03/Flo7up_a_person_talking_to_a_virtual_assistant._Colorful_popart_2f34967a-ce4e-420d-bc01-75a4e47c1181-Copy.png 300w, https://www.relataly.com/wp-content/uploads/2023/03/Flo7up_a_person_talking_to_a_virtual_assistant._Colorful_popart_2f34967a-ce4e-420d-bc01-75a4e47c1181-Copy.png 768w, https://www.relataly.com/wp-content/uploads/2023/03/Flo7up_a_person_talking_to_a_virtual_assistant._Colorful_popart_2f34967a-ce4e-420d-bc01-75a4e47c1181-Copy.png 920w" sizes="(max-width: 512px) 100vw, 512px" /><figcaption class="wp-element-caption">Sentiment analysis has various use cases from analyzing social media to reviewing customer feedback in call centers.</figcaption></figure>
</div>
</div>



<p class="wp-block-paragraph">Also: <a href="https://www.relataly.com/predicting-the-purchase-intention-of-online-shoppers/982/" target="_blank" rel="noreferrer noopener">Classifying Purchase Intention of Online Shoppers with Python</a></p>
</div>
</div>



<h2 class="wp-block-heading">What is Sentiment Analysis?</h2>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">Sentiment analysis is the process of identifying the sentiment, or emotional tone, of a piece of text. This can be useful for a wide range of applications, such as identifying customer sentiment towards a product or service, or detecting the overall sentiment of a social media post or news article.</p>



<p class="wp-block-paragraph">Sentiment analysis is typically performed using natural language processing (NLP) techniques and machine learning algorithms. These tools allow computers to &#8220;understand&#8221; the meaning of text and identify the sentiment it contains. Sentiment analysis can be performed at various levels of granularity, from identifying the sentiment of an entire document to identifying the sentiment of individual words or phrases within a document.</p>



<h2 class="wp-block-heading">How Sentiment Classification Works</h2>



<figure class="wp-block-image size-large is-resized"><img decoding="async" data-attachment-id="4767" data-permalink="https://www.relataly.com/simple-sentiment-analysis-using-naive-bayes-and-logistic-regression/2007/image-61-3/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2021/06/image-61.png" data-orig-size="1171,492" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-61" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2021/06/image-61.png" src="https://www.relataly.com/wp-content/uploads/2021/06/image-61-1024x430.png" alt="sentiment classification using bayes and logistic regression in python" class="wp-image-4767" width="772" height="326" srcset="https://www.relataly.com/wp-content/uploads/2021/06/image-61.png 300w, https://www.relataly.com/wp-content/uploads/2021/06/image-61.png 768w" sizes="(max-width: 772px) 100vw, 772px" /><figcaption class="wp-element-caption">A sentiment classifier with three classes</figcaption></figure>



<p class="wp-block-paragraph">There are many different approaches to sentiment analysis, and the specific methods used can vary depending on the specific application and the type of text being analyzed. Some common techniques for performing sentiment analysis include using machine learning algorithms to classify text as positive, negative, or neutral, and using lexicons, or lists of words with pre-defined sentiment, to identify the sentiment of individual words or phrases. In this way, it is possible to measure the emotions towards a specific topic, e.g., products, brands, political parties, services, or trends. </p>



<p class="wp-block-paragraph">We can show how sentiment analysis works with a simple example:</p>



<ul class="wp-block-list">
<li>&#8220;This product is excellent!&#8221;</li>



<li>&#8220;I don&#8217;t like this ice cream at all.&#8221;</li>



<li>&#8220;Yesterday, I&#8217;ve seen a dolphin.&#8221;</li>
</ul>



<p class="wp-block-paragraph">While the first sentence denotes a positive sentiment, the second sentence is negative, and in the third sentence, the sentiment is neutral. A sentiment classifier can automatically label these sentences:</p>



<figure class="wp-block-table"><table><tbody><tr><td><strong>Text Sequence</strong></td><td class="has-text-align-left" data-align="left"><strong>Sentiment Label</strong></td></tr><tr><td>This product is great!</td><td class="has-text-align-left" data-align="left">POSITIVE</td></tr><tr><td>I wouldn&#8217;t say I like this ice cream at all.</td><td class="has-text-align-left" data-align="left">NEGATIVE</td></tr><tr><td>Yesterday I saw a dolphin.</td><td class="has-text-align-left" data-align="left">NEUTRAL</td></tr></tbody></table><figcaption class="wp-element-caption">Sentiment Labels of Text Sequences </figcaption></figure>



<p class="wp-block-paragraph">Predicting sentiment classes opens the door to more advanced statistical analysis and automated text processing. </p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%"></div>
</div>



<h3 class="wp-block-heading" id="h-use-cases-for-sentiment-analysis">Use Cases for Sentiment Analysis</h3>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">Sentiment analysis is used in various application domains, including the following:</p>



<ul class="wp-block-list">
<li>Sentiment analysis can lead to more efficient customer service by prioritizing customer requests. For example, when customers complain about services or products, an algorithm can identify and prioritize these messages so that sales agents answer them first. This can increase customer satisfaction and reduce the churn rate. </li>



<li>Twitter and Amazon reviews have become the first port of call for many customers today when exchanging information about products, brands, and trends or expressing their own opinions. A sentiment classifier systematically enables businesses to evaluate this information. It can collect data from social media posts and product reviews in real-time. For example, marketing managers can quickly obtain feedback on how well customers perceive campaigns and ads.</li>



<li>In stock market prediction, analyze the sentiment of social media or news feeds towards stocks or brands. The sentiment is then used as an additional feature alongside price data to create better forecasting models. Some forecasting also approaches exclusively rely on sentiment.</li>
</ul>



<p class="wp-block-paragraph">Sentiment Analysis will find further adoption in the coming years. Especially in marketing and customer service, companies will increasingly use sentiment analysis to automate business processes and offer their customers a better customer experience.</p>



<h3 class="wp-block-heading" id="h-how-sentiment-analysis-works-feature-modelling">How Sentiment Analysis Works: Feature Modelling</h3>



<p class="wp-block-paragraph">An essential step in the development of the Sentiment Classifier is language modeling. Before we can train a machine learning model, we need to bring the natural text into a structured format that the model can statistically assess in the training process. Various modeling techniques exist for this purpose. The two most common models are <strong>bag-of-words </strong>and <strong>n-grams</strong>.</p>



<p class="wp-block-paragraph">Also: <a href="https://www.relataly.com/business-use-cases-for-openai-gpt-models-chatgpt-davinci/12200/" target="_blank" rel="noreferrer noopener">9 Powerful Applications of OpenAI&#8217;s ChatGPT and Davinci</a></p>



<h4 class="wp-block-heading" id="h-bag-of-word-model">Bag-of-word Model</h4>



<p class="wp-block-paragraph">The bag-of-word model calculates probability distributions over the number of unique words. This approach converts individual words into individual features. Fill words with low predictive power, such as &#8220;the&#8221; or &#8220;a,&#8221; will be filtered out. Consider the following text sample: </p>



<p class="wp-block-paragraph"><em>&#8220;Bob likes to play basketball. But his friend Daniel prefers to play soccer. &#8220;</em></p>



<p class="wp-block-paragraph">Through filtering of fill words, we convert his sample to: </p>



<p class="wp-block-paragraph"><em>&#8220;Bob&#8221;, &#8220;likes&#8221;, &#8220;play&#8221;, &#8220;basketball&#8221;, &#8220;friend&#8221;, &#8220;Daniel&#8221;, &#8220;play&#8221;, &#8220;soccer&#8221;</em>.</p>



<p class="wp-block-paragraph">In the next step, the algorithm converts these words into a normalized form, where each word becomes a column:</p>



<figure class="wp-block-image size-large is-resized"><img decoding="async" data-attachment-id="4227" data-permalink="https://www.relataly.com/simple-sentiment-analysis-using-naive-bayes-and-logistic-regression/2007/image-15-9/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2021/05/image-15.png" data-orig-size="1046,134" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-15" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2021/05/image-15.png" src="https://www.relataly.com/wp-content/uploads/2021/05/image-15-1024x131.png" alt="" class="wp-image-4227" width="613" height="77" srcset="https://www.relataly.com/wp-content/uploads/2021/05/image-15.png 1024w, https://www.relataly.com/wp-content/uploads/2021/05/image-15.png 300w, https://www.relataly.com/wp-content/uploads/2021/05/image-15.png 768w" sizes="(max-width: 613px) 100vw, 613px" /><figcaption class="wp-element-caption">Text sample after transformation</figcaption></figure>



<p class="wp-block-paragraph">The bag-of-word model is easy to implement. However, it does not consider grammar or word order.</p>



<h4 class="wp-block-heading" id="h-what-is-an-n-gram-model">What is an N-gram Model?</h4>



<p class="wp-block-paragraph">The n-gram model considers multiple consecutive words in a text sequence and thus captures word sequence. The n stands for the number of words considered. </p>



<p class="wp-block-paragraph">For example, in a 2-gram model, the sentence <em>&#8220;Bob likes to play basketball. But his friend Daniel prefers to play soccer.&#8221;</em> will be converted to the following model: </p>



<p class="wp-block-paragraph">&#8220;Bob likes,&#8221; &#8220;likes to,&#8221; &#8220;to play,&#8221; &#8220;play basketball,&#8221; and so on. The n-gram model is often used to supplement the bag-of-word model. It is also possible to combine different n-gram models. For a 3-gram model, the text would be converted to &#8220;Bob likes to,&#8221; &#8220;likes to play,&#8221; &#8220;to play basketball,&#8221; and so on. Combining multiple n-gram models, however, can quickly increase model complexity.</p>



<h3 class="wp-block-heading" id="h-sentiment-classes-and-model-training">Sentiment Classes and Model Training </h3>



<p class="wp-block-paragraph">The training of sentiment classifiers traditionally takes place in a supervised learning process. For this purpose, a training data set is used, which contains text sections with associated sentiment tendencies as prediction labels. Depending on which labels we provide and the training data, the classifier will learn to predict sentiment on a more or less fine-grained scale. Capturing neutral sentiment requires choosing an odd number of classes. </p>



<p class="wp-block-paragraph">More advanced classifiers can detect different sorts of emotions and, for example, detect whether someone expresses anger, happiness, sadness, and so on. It basically comes down to which prediction labels you provide with the training data.</p>



<p class="wp-block-paragraph">When the classifier is trained on a one-gram model, the classifier will learn that certain words such as &#8220;good&#8221; or &#8220;great&#8221; increase the probability that a text is associated with a positive sentiment. Consequently, when the classifier encounters these words in a new text sample, it will predict a higher probability of positive sentiment. On the other hand, the classifier will learn that words such as &#8220;hate&#8221; or &#8220;dislike&#8221; are often used to express negative opinions and thus increase the probability of negative sentiment.</p>



<h3 class="wp-block-heading" id="h-language-complications">Language Complications </h3>



<p class="wp-block-paragraph">Is sentiment analysis that simple? Well, not quite. The cases described so far were deliberately chosen to be very simple. However, human language is very complex, and many peculiarities make it more difficult in practice to identify the sentiment in a sentence or paragraph. Here are some examples:</p>



<ul class="wp-block-list">
<li>Inversions: &#8220;this product is not so great.&#8221;</li>



<li>Typos: &#8220;I live this product!&#8221;</li>



<li>Comparisons: &#8220;Product a is better than product z.&#8221;</li>



<li>In a text passage, expression of pros and cons: &#8220;An advantage is that. But on the other hand&#8230;&#8221; </li>



<li>Unknown vocabulary: &#8220;This product is just whuopii!&#8221;</li>



<li>Missing words: &#8220;How can you not  this product?&#8221;</li>
</ul>



<p class="wp-block-paragraph">Fortunately, there are methods to solve the complications mentioned above. I will explain more about them in one of my future articles. But for now, let&#8217;s stay with the basics and implement a simple classifier.</p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%"></div>
</div>



<h2 class="wp-block-heading" id="h-training-a-sentiment-classifier-using-twitter-data-in-python">Training a Sentiment Classifier Using Twitter Data in Python</h2>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<p class="wp-block-paragraph">Venturing into the practical aspects of sentiment classification, our aim in this tutorial is to create an efficient sentiment classifier. Our focus will be on a dataset provided by Kaggle, comprising tens of thousands of tweets, each categorized as positive, neutral, or negative.</p>



<p class="wp-block-paragraph">Our objective is to design a classifier capable of assigning one of these three sentiment categories to new text sequences. To this end, we will employ two distinct algorithms &#8211; Logistic Regression and Naive Bayes &#8211; as our estimators.</p>



<p class="wp-block-paragraph">The tutorial culminates with a comparative analysis of the prediction performance of both models, followed by a set of test predictions. Through this hands-on approach, you will gain an understanding of the nuances of sentiment classification and its application in understanding public opinion, especially on social media platforms like Twitter.</p>



<p class="wp-block-paragraph">Boost your sentiment analysis skills with our step-by-step guide, and learn to leverage machine learning tools for precise sentiment prediction.</p>



<p class="wp-block-paragraph">The code is available on the GitHub repository.</p>



<div class="wp-block-kadence-advancedbtn kb-buttons-wrap kb-btns_9fa82e-91"><a class="kb-button kt-button button kb-btn_c78218-5c kt-btn-size-standard kt-btn-width-type-full kb-btn-global-inherit kt-btn-has-text-true kt-btn-has-svg-true wp-block-button__link wp-block-kadence-singlebtn" href="https://github.com/flo7up/relataly-public-python-tutorials/blob/master/08%20Natural%20Language%20Processing/700%20NLP%20-%20Simple%20Sentiment%20Analysis%20using%20Bayes%20and%20Logistic%20Regression.ipynb" target="_blank" rel="noreferrer noopener"><span class="kb-svg-icon-wrap kb-svg-icon-fe_eye kt-btn-icon-side-left"><svg viewBox="0 0 24 24"  fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"  aria-hidden="true"><path d="M1 12s4-8 11-8 11 8 11 8-4 8-11 8-11-8-11-8z"/><circle cx="12" cy="12" r="3"/></svg></span><span class="kt-btn-inner-text">View on GitHub </span></a>

<a class="kb-button kt-button button kb-btn_41bac4-d5 kt-btn-size-standard kt-btn-width-type-full kb-btn-global-inherit kt-btn-has-text-true kt-btn-has-svg-true wp-block-button__link wp-block-kadence-singlebtn" href="https://github.com/flo7up/relataly-public-python-API-tutorials" target="_blank" rel="noreferrer noopener"><span class="kb-svg-icon-wrap kb-svg-icon-fa_github kt-btn-icon-side-left"><svg viewBox="0 0 496 512"  fill="currentColor" xmlns="http://www.w3.org/2000/svg"  aria-hidden="true"><path d="M165.9 397.4c0 2-2.3 3.6-5.2 3.6-3.3.3-5.6-1.3-5.6-3.6 0-2 2.3-3.6 5.2-3.6 3-.3 5.6 1.3 5.6 3.6zm-31.1-4.5c-.7 2 1.3 4.3 4.3 4.9 2.6 1 5.6 0 6.2-2s-1.3-4.3-4.3-5.2c-2.6-.7-5.5.3-6.2 2.3zm44.2-1.7c-2.9.7-4.9 2.6-4.6 4.9.3 2 2.9 3.3 5.9 2.6 2.9-.7 4.9-2.6 4.6-4.6-.3-1.9-3-3.2-5.9-2.9zM244.8 8C106.1 8 0 113.3 0 252c0 110.9 69.8 205.8 169.5 239.2 12.8 2.3 17.3-5.6 17.3-12.1 0-6.2-.3-40.4-.3-61.4 0 0-70 15-84.7-29.8 0 0-11.4-29.1-27.8-36.6 0 0-22.9-15.7 1.6-15.4 0 0 24.9 2 38.6 25.8 21.9 38.6 58.6 27.5 72.9 20.9 2.3-16 8.8-27.1 16-33.7-55.9-6.2-112.3-14.3-112.3-110.5 0-27.5 7.6-41.3 23.6-58.9-2.6-6.5-11.1-33.3 2.6-67.9 20.9-6.5 69 27 69 27 20-5.6 41.5-8.5 62.8-8.5s42.8 2.9 62.8 8.5c0 0 48.1-33.6 69-27 13.7 34.7 5.2 61.4 2.6 67.9 16 17.7 25.8 31.5 25.8 58.9 0 96.5-58.9 104.2-114.8 110.5 9.2 7.9 17 22.9 17 46.4 0 33.7-.3 75.4-.3 83.6 0 6.5 4.6 14.4 17.3 12.1C428.2 457.8 496 362.9 496 252 496 113.3 383.5 8 244.8 8zM97.2 352.9c-1.3 1-1 3.3.7 5.2 1.6 1.6 3.9 2.3 5.2 1 1.3-1 1-3.3-.7-5.2-1.6-1.6-3.9-2.3-5.2-1zm-10.8-8.1c-.7 1.3.3 2.9 2.3 3.9 1.6 1 3.6.7 4.3-.7.7-1.3-.3-2.9-2.3-3.9-2-.6-3.6-.3-4.3.7zm32.4 35.6c-1.6 1.3-1 4.3 1.3 6.2 2.3 2.3 5.2 2.6 6.5 1 1.3-1.3.7-4.3-1.3-6.2-2.2-2.3-5.2-2.6-6.5-1zm-11.4-14.7c-1.6 1-1.6 3.6 0 5.9 1.6 2.3 4.3 3.3 5.6 2.3 1.6-1.3 1.6-3.9 0-6.2-1.4-2.3-4-3.3-5.6-2z"/></svg></span><span class="kt-btn-inner-text">Relataly GitHub Repo </span></a></div>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:33.33%"></div>
</div>



<h3 class="wp-block-heading" id="h-prerequisites">Prerequisites</h3>



<p class="wp-block-paragraph">Before starting the coding part, make sure that you have set up your <a href="https://www.python.org/downloads/" target="_blank" rel="noreferrer noopener">Python 3</a> environment and required packages. If you don&#8217;t have an environment, follow&nbsp;<a href="https://www.relataly.com/category/data-science/setup-anaconda-environment/" target="_blank" rel="noreferrer noopener">this tutorial</a>&nbsp;to set up the&nbsp;<a href="https://www.anaconda.com/products/individual" target="_blank" rel="noreferrer noopener">Anaconda environment</a>. Also, make sure you install all required packages. In this tutorial, we will be working with the following standard packages:&nbsp;</p>



<ul class="wp-block-list">
<li><em><a href="https://pandas.pydata.org/" target="_blank" rel="noreferrer noopener">pandas</a></em></li>



<li><em><a href="https://numpy.org/" target="_blank" rel="noreferrer noopener">NumPy</a></em></li>



<li><a href="https://docs.python.org/3/library/math.html" target="_blank" rel="noreferrer noopener">math</a></li>



<li><em><a href="https://matplotlib.org/" target="_blank" rel="noreferrer noopener">matplotlib</a></em></li>
</ul>



<p class="wp-block-paragraph">In addition, we will be using the machine learning libraries <a href="https://scikit-learn.org/stable/" target="_blank" rel="noreferrer noopener">scikit-learn</a> and <a href="https://seaborn.pydata.org/" target="_blank" rel="noreferrer noopener">seaborn</a> for visualization. </p>



<p class="wp-block-paragraph">You can install packages using console commands:</p>



<ul class="wp-block-list">
<li><em>pip install &lt;package name&gt;</em></li>



<li><em>conda install &lt;package name&gt;</em>&nbsp;(if you are using the anaconda packet manager)</li>
</ul>



<h3 class="wp-block-heading" id="h-about-the-sentiment-dataset">About the Sentiment Dataset</h3>



<p class="wp-block-paragraph">Let&#8217;s begin with the technical part. First, we will download the data from the <a href="https://www.kaggle.com/c/liverpool-ion-switching/data">Twitter sentiment example</a> on Kaggle.com. If you are working with the Kaggle Python environment, you can also directly save the data into your Python project. </p>



<p class="wp-block-paragraph">We will only use the following two CSV files:</p>



<ul class="wp-block-list">
<li><strong>train.csv:</strong> contains 27480 text samples.</li>



<li><strong>test.csv:</strong> contains 3533 text samples for validation purposes</li>
</ul>



<p class="wp-block-paragraph">The two files contain four columns:</p>



<ul class="wp-block-list">
<li>textID: An identifier</li>



<li>text: The raw text</li>



<li>selected_text: Contains a selected part of the original text</li>



<li>sentiment: Contains the prediction label</li>
</ul>



<p class="wp-block-paragraph">We will copy the two files (train.csv and test.csv) into a folder that you can access from your Python environment. For simplicity, I recommend putting these files directly into the folder of your Python notebook. If you put them somewhere else, don&#8217;t forget to adjust the file path when loading the data.</p>



<h3 class="wp-block-heading" id="h-step-1-load-the-data">Step #1 Load the Data</h3>



<p class="wp-block-paragraph">Assuming that you have copied the files into your Python environment, the next step is to load the data into your Python project and convert it into a Pandas DataFrame. The following code performs these steps and then prints a data summary.</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}">import math 
import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt 
import matplotlib

from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import classification_report, multilabel_confusion_matrix
import scikitplot as skplt

import seaborn as sns

# Load the train data
train_path = &quot;train.csv&quot;
train_df = pd.read_csv(train_path) 

# Load the test data
sub_test_path = &quot;test.csv&quot;
test_df = pd.read_csv(sub_test_path) 

# Print a Summary of the data
print(train_df.shape, test_df.shape)
print(train_df.head(5))</pre></div>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:false,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;null&quot;,&quot;mime&quot;:&quot;text/plain&quot;,&quot;theme&quot;:&quot;3024-day&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Plain Text&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;text&quot;}">			textID		text												selected_text									sentiment
0			cb774db0d1	I`d have responded, if I were going					I`d have responded, if I were going				neutral
1			549e992a42	Sooo SAD I will miss you here in San Diego!!!		Sooo SAD										negative
2			088c60f138	my boss is bullying me...							bullying me										negative
3			9642c003ef	what interview! leave me alone						leave me alone									negative
4			358bd9e861	Sons of ****, why couldn`t they put them on t...	Sons of ****,									negative
...	
27481 rows × 4 columns</pre></div>



<h3 class="wp-block-heading" id="h-step-2-clean-and-preprocess-the-data">Step #2 Clean and Preprocess the Data</h3>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow">
<p class="wp-block-paragraph">Next, let&#8217;s quickly clean and preprocess the data. First, as a best practice, we will transform the sentiment labels of the train and the test data into numeric values.</p>



<p class="wp-block-paragraph">In addition, we will add a column in which we store the length of the text samples.</p>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow">
<figure class="wp-block-image size-large is-resized"><img decoding="async" data-attachment-id="2042" data-permalink="https://www.relataly.com/simple-sentiment-analysis-using-naive-bayes-and-logistic-regression/2007/image-12-3/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2020/06/image-12.png" data-orig-size="389,108" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-12" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2020/06/image-12.png" src="https://www.relataly.com/wp-content/uploads/2020/06/image-12.png" alt="" class="wp-image-2042" width="248" height="68" srcset="https://www.relataly.com/wp-content/uploads/2020/06/image-12.png 389w, https://www.relataly.com/wp-content/uploads/2020/06/image-12.png 300w" sizes="(max-width: 248px) 100vw, 248px" /><figcaption class="wp-element-caption">Three-class sentiment scale</figcaption></figure>
</div>
</div>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Define Class Integer Values
cleanup_nums = {&quot;sentiment&quot;: {&quot;negative&quot;: 1, &quot;neutral&quot;: 2, &quot;positive&quot;: 3}}

# Replace the Classes with Integer Values
train_df = train_base_df.copy()
train_df.replace(cleanup_nums, inplace=True)

# Clean the Test Data
test_df = test_base_df.copy()
test_df.replace(cleanup_nums, inplace=True)

# Create a Feature based on Text Length
train_df['text_length'] = train_df['text'].str.len() # Store string length of each sample
train_df = train_df.sort_values(['text_length'], ascending=True)
train_df = train_df.dropna()
train_df </pre></div>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:false,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;null&quot;,&quot;mime&quot;:&quot;text/plain&quot;,&quot;theme&quot;:&quot;3024-day&quot;,&quot;lineNumbers&quot;:false,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Plain Text&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;text&quot;}">			textID			text			selected_text	sentiment	text_length
14339		5c6abc28a1		ow				ow				2			3.0
26005		0b3fe0ca78		?				?				2			3.0
11524		4105b6a05d		aw				aw				2			3.0
641			5210cc55ae		no				no				2			3.0
25699		ee8ee67cb3		ME				ME				2			3.0
...</pre></div>



<h3 class="wp-block-heading" id="h-step-3-explore-the-data">Step #3 Explore the Data</h3>



<p class="wp-block-paragraph">It&#8217;s always good to check the label distribution for a potential imbalance. We do this by plotting the distribution of labels in the text samples. This is important because it helps ensure that the trained model can make accurate predictions on new data. If the class labels are unbalanced, then the model is more likely to be biased toward the more common classes, which can lead to poor performance on less common classes. </p>



<p class="wp-block-paragraph">Also: <a href="https://www.relataly.com/exploratory-feature-preparation-for-regression-with-python-and-scikit-learn/8832/" target="_blank" rel="noreferrer noopener">Feature Engineering and Selection for Regression Models</a></p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Print the Distribution of Sentiment Labels
sns.set_theme(style=&quot;whitegrid&quot;)
ax = train_df['sentiment'].value_counts(sort=False).plot(kind='barh', color='b')
ax.set_xlabel('Count')
ax.set_ylabel('Labels')</pre></div>



<figure class="wp-block-image size-large"><img decoding="async" width="378" height="265" data-attachment-id="4636" data-permalink="https://www.relataly.com/simple-sentiment-analysis-using-naive-bayes-and-logistic-regression/2007/image-42-4/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2021/06/image-42.png" data-orig-size="378,265" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-42" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2021/06/image-42.png" src="https://www.relataly.com/wp-content/uploads/2021/06/image-42.png" alt="Balance of Class Labels in a Machine Learning Use Case" class="wp-image-4636" srcset="https://www.relataly.com/wp-content/uploads/2021/06/image-42.png 378w, https://www.relataly.com/wp-content/uploads/2021/06/image-42.png 300w" sizes="(max-width: 378px) 100vw, 378px" /></figure>



<p class="wp-block-paragraph">As we can see, our data is a bit imbalanced, but the differences are still within an acceptable range. </p>



<p class="wp-block-paragraph"> Let&#8217;s also quickly take a look at the distribution of text length. </p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Visualize a distribution of text_length
sns.histplot(data=train_df, x='text_length', bins='auto', color='darkblue');
plt.title('Text Length Distribution')</pre></div>



<figure class="wp-block-image size-large is-resized"><img decoding="async" data-attachment-id="4635" data-permalink="https://www.relataly.com/simple-sentiment-analysis-using-naive-bayes-and-logistic-regression/2007/image-41-5/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2021/06/image-41.png" data-orig-size="397,279" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-41" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2021/06/image-41.png" src="https://www.relataly.com/wp-content/uploads/2021/06/image-41.png" alt="" class="wp-image-4635" width="468" height="329" srcset="https://www.relataly.com/wp-content/uploads/2021/06/image-41.png 397w, https://www.relataly.com/wp-content/uploads/2021/06/image-41.png 300w" sizes="(max-width: 468px) 100vw, 468px" /></figure>



<h3 class="wp-block-heading" id="h-step-4-train-a-sentiment-classifier">Step #4 Train a Sentiment Classifier </h3>



<p class="wp-block-paragraph">Next, we will prepare the data and train a classification model. We will use the pipeline class of the scikit-learn framework and a bag-of-word model to keep things simple. In NLP, we typically have to transform and split up the text into sentences and words. The pipeline class is thus instrumental in NLP because it allows us to perform multiple actions on the same data in a row.</p>



<p class="wp-block-paragraph">The pipeline contains transformation activities and a prediction algorithm, the final estimator. In the following, we create two pipelines that use two different prediction algorithms: </p>



<ul class="wp-block-list">
<li>Logistic Regression </li>



<li>Naive Bayes</li>
</ul>



<h4 class="wp-block-heading" id="h-4a-sentiment-classification-using-logistic-regression">4a) Sentiment Classification using Logistic Regression </h4>



<p class="wp-block-paragraph">The first model that we will train uses the logistic regression algorithm. We create a new pipeline. Then we add two transformers and the logistic regression estimator. The pipeline will perform the following activities. </p>



<ul class="wp-block-list">
<li><strong><a href="https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.CountVectorizer.html">CountVectorizer</a></strong>: The vectorizer counts the number of words in each text sequence and creates the bag-of-word models. </li>



<li><strong><a href="https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.TfidfTransformer.html">TfidfTransformer</a></strong>: The &#8220;Term Frequency Transformer&#8221; scales down the impact of words that occur very often in the training data and are thus less informative for the estimator than words that occur in a smaller fraction of the text samples. Examples are words such as &#8220;to&#8221; or &#8220;a.&#8221;</li>



<li><a href="https://www.relataly.com/category/machine-learning-algorithms/logistic-regression/" target="_blank" rel="noreferrer noopener">Logistic Regression</a>: By defining the multi_class as &#8216;auto,&#8217; we will use logistic regression in a one-vs-all approach. This approach will split our three-class prediction problem into two two-class problems. Our model differentiates between one class and all other classes in the first step. Then all observations that do not fall into the first class enter a second model that predicts whether it is class two or three. </li>
</ul>



<p class="wp-block-paragraph">Our pipeline will transform the data and fit the logistic regression model to the training data. After executing the pipeline, we will directly evaluate the model&#8217;s performance. We will do this by defining a function that generates predictions on the test dataset and then evaluating the performance of our model. The function will print the performance results and store them in a dataframe. Later, when we want to compare the models, we can access the results from the dataframe. </p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Create a transformation pipeline
# The pipeline sequentially applies a list of transforms and as a final estimator logistic regression 
pipeline_log = Pipeline([
                ('count', CountVectorizer()),
                ('tfidf', TfidfTransformer()),
                ('clf', LogisticRegression(solver='liblinear', multi_class='auto')),
        ])

# Train model using the created sklearn pipeline
model_name = 'logistic regression classifier'
model_lgr = pipeline_log.fit(train_df['text'], train_df['sentiment'])

def evaluate_results(model, test_df):
    # Predict class labels using the learner function
    test_df['pred'] = model.predict(test_df['text'])
    y_true = test_df['sentiment']
    y_pred = test_df['pred']
    target_names = ['negative', 'neutral', 'positive']

    # Print the Confusion Matrix
    results_log = classification_report(y_true, y_pred, target_names=target_names, output_dict=True)
    results_df_log = pd.DataFrame(results_log).transpose()
    print(results_df_log)
    matrix = confusion_matrix(y_true,  y_pred)
    sns.heatmap(pd.DataFrame(matrix), 
                annot=True, fmt=&quot;d&quot;, linewidths=.5, cmap=&quot;YlGnBu&quot;)
    plt.xlabel('Predictions')
    plt.xlabel('Actual')
    
    model_score = score(y_pred, y_true, average='macro')
    return model_score

    
# Evaluate model performance
model_score = evaluate_results(model_lgr, test_df)
performance_df = pd.DataFrame().append({'model_name': model_name, 
                                    'f1_score': model_score[0], 
                                    'precision': model_score[1], 
                                    'recall': model_score[2]}, ignore_index=True)</pre></div>



<figure class="wp-block-image size-large is-resized"><img decoding="async" data-attachment-id="4643" data-permalink="https://www.relataly.com/simple-sentiment-analysis-using-naive-bayes-and-logistic-regression/2007/image-47-3/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2021/06/image-47.png" data-orig-size="634,555" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-47" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2021/06/image-47.png" src="https://www.relataly.com/wp-content/uploads/2021/06/image-47.png" alt="performance of our logistic regression sentiment classifier" class="wp-image-4643" width="562" height="492" srcset="https://www.relataly.com/wp-content/uploads/2021/06/image-47.png 634w, https://www.relataly.com/wp-content/uploads/2021/06/image-47.png 300w" sizes="(max-width: 562px) 100vw, 562px" /></figure>



<h4 class="wp-block-heading" id="h-4b-sentiment-classification-using-naive-bayes">4b) Sentiment Classification using Naive Bayes</h4>



<p class="wp-block-paragraph">We will reuse the code from the last step to create another pipeline. However, we will exchange the Logistic Regressor with Naive Bayes (&#8220;MultinomialNB&#8221;). Naive Bayes is commonly used in natural language processing. The algorithm calculates the probability of each tag for a text sequence and then outputs the tag with the highest score. For example, the probabilities of the appearance of the words &#8220;likes&#8221; and &#8220;good&#8221; in texts within the category &#8220;positive sentiment&#8221; are higher than the probabilities of formation within the &#8220;negative&#8221; or &#8220;neutral&#8221; categories. In this way, the model predicts how likely it is for an unknown text that contains those words to be associated with either category. </p>



<p class="wp-block-paragraph">We will reuse the previously defined function to print a classification report and plot the results in a confusion matrix. </p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Create a pipeline which transforms phrases into normalized feature vectors and uses a bayes estimator
model_name = 'bayes classifier'

pipeline_bayes = Pipeline([
                ('count', CountVectorizer()),
                ('tfidf', TfidfTransformer()),
                ('gnb', MultinomialNB()),
                ])

# Train model using the created sklearn pipeline
model_bayes = pipeline_bayes.fit(train_df['text'], train_df['sentiment'])

# Evaluate model performance
model_score = evaluate_results(model_bayes, test_df)
performance_df = performance_df.append({'model_name': model_name, 
                                    'f1_score': model_score[0], 
                                    'precision': model_score[1], 
                                    'recall': model_score[2]}, ignore_index=True)</pre></div>



<h3 class="wp-block-heading" id="h-step-5-measuring-multi-class-performance">Step #5 Measuring Multi-class Performance</h3>



<p class="wp-block-paragraph">So which classifier achieved better performance? It&#8217;s not so easy to say because it depends on the metrics. We will compare the classification performance of our two classifiers using the following metrics:</p>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-8f761849 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow" style="flex-basis:66.66%">
<ul class="wp-block-list">
<li><strong>Accuracy </strong>is calculated as the ratio between correctly predicted observations and total observations.</li>



<li><strong>Precision</strong> is calculated as the ratio between correctly labeled values and the sum of the correctly and incorrectly labeled positive observations.</li>



<li>The formula for<strong> Recall </strong>is the ratio between correctly predicted observations and the sum of falsely classified observations. </li>



<li><strong>F1-Score</strong> takes all falsely labeled observations into account. It is, therefore, useful when you have an unequal class distribution.</li>
</ul>
</div>
</div>



<p class="wp-block-paragraph">You may wonder which of our three classes is the positive class. The answer is that we have to determine the positive class ourselves. By defining the positive class, we can consider that some classes may be more important than others. The other classes will then be counted as negative. You can see this in the confusion matrix in sections 5 and 6, containing separate metrics for each label. </p>



<p class="wp-block-paragraph">Another option is to define a weighted average (see confusion matrix) that weights the quantity of the different labels in the overall dataset. For example, the negative label is weighted a bit higher than the neutral label because fewer observations with negative and positive labels are present in the data. Because our classes are equally important, I decided to use the weighted average. </p>



<h3 class="wp-block-heading" id="h-step-6-comparing-model-performance">Step #6 Comparing Model Performance</h3>



<p class="wp-block-paragraph">The following code calculates the performance metrics for the two classifiers and then creates a barplot to illustrate the results. In this specific case, the recall equals the accuracy. </p>



<p class="wp-block-paragraph">If you want to learn more about measuring classification performance, check out<a href="https://www.relataly.com/measuring-classification-performance-with-python-and-scikit-learn/846/" target="_blank" rel="noreferrer noopener"> this article</a>.</p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}"># Compare model performance
print(performance_df)

performance_df = performance_df.sort_values('model_name')
fig, ax = plt.subplots(figsize=(12, 4))
tidy = performance_df.melt(id_vars='model_name').rename(columns=str.title)
sns.barplot(y='Model_Name', x='Value', hue='Variable', data=tidy, ax=ax, palette='husl',  linewidth=1, edgecolor=&quot;w&quot;)
plt.title('Model Outlier Detection Performance (Macro)')</pre></div>



<figure class="wp-block-image size-large is-resized"><img decoding="async" data-attachment-id="4639" data-permalink="https://www.relataly.com/simple-sentiment-analysis-using-naive-bayes-and-logistic-regression/2007/image-44-4/#main" data-orig-file="https://www.relataly.com/wp-content/uploads/2021/06/image-44.png" data-orig-size="1164,510" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-44" data-image-description="" data-image-caption="" data-large-file="https://www.relataly.com/wp-content/uploads/2021/06/image-44.png" src="https://www.relataly.com/wp-content/uploads/2021/06/image-44-1024x449.png" alt="Performance comparison of the bayes classification model and the logistic regression classifier" class="wp-image-4639" width="744" height="326" srcset="https://www.relataly.com/wp-content/uploads/2021/06/image-44.png 1024w, https://www.relataly.com/wp-content/uploads/2021/06/image-44.png 300w, https://www.relataly.com/wp-content/uploads/2021/06/image-44.png 768w, https://www.relataly.com/wp-content/uploads/2021/06/image-44.png 1164w" sizes="(max-width: 744px) 100vw, 744px" /></figure>



<p class="wp-block-paragraph">So we see that our Logistic Regression model performs slightly better than the Naive Bayes model. Of course, there are still many possibilities to improve the models further. In addition, there are several other methods and algorithms with which the performance could be significantly increased.</p>



<h3 class="wp-block-heading" id="h-step-7-make-test-predictions">Step #7 Make Test Predictions</h3>



<p class="wp-block-paragraph">Finally, we use the Bayes classifier to generate some test predictions. Feel free to try it out! Change the text in the text phrases array and convince yourself that the classifier works. </p>



<div class="wp-block-codemirror-blocks-code-block code-block"><pre class="CodeMirror" data-setting="{&quot;showPanel&quot;:true,&quot;languageLabel&quot;:false,&quot;fullScreenButton&quot;:true,&quot;copyButton&quot;:true,&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text/x-python&quot;,&quot;theme&quot;:&quot;monokai&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:true,&quot;readOnly&quot;:true,&quot;fileName&quot;:&quot;&quot;,&quot;language&quot;:&quot;Python&quot;,&quot;maxHeight&quot;:&quot;400px&quot;,&quot;modeName&quot;:&quot;python&quot;}">testphrases = ['Mondays just suck!', 'I love this product', 'That is a tree', 'Terrible service']
for testphrase in testphrases:
    resultx = model_lgr.predict([testphrase]) # use model_bayes for predictions with the other model
    dict = {1: 'Negative', 2: 'Neutral', 3: 'Positive'}
    print(testphrase + '-&gt; ' + dict[resultx[0]])</pre></div>



<ul class="wp-block-list">
<li>Mondays suck!-&gt; Negative </li>



<li>I love this product-&gt; Positive </li>



<li>That is a tree-&gt; Neutral </li>



<li>Terrible service-&gt; Negative</li>
</ul>



<h2 class="wp-block-heading" id="h-summary">Summary</h2>



<p class="wp-block-paragraph">That&#8217;s it! In this tutorial, you have learned to build a simple sentiment classifier that can detect sentiment expressed through text on a three-class scale. We have trained and tested two standard classification algorithms &#8211; Logistic Regression and Naive Bayes. Finally, we have compared the performance of the two algorithms and made some test predictions. </p>



<p class="wp-block-paragraph">The best way to deepen your knowledge of sentiment analysis is to apply it in practice. I thus want to encourage you to use your knowledge by tackling other NLP challenges. For example, you could build a sentiment classifier that assigns text phrases to labels such as sports, fashion, cars, technology, etc. If you are still looking for data you can use for such a project, you will find exciting ones on Kaggle.com.</p>



<p class="wp-block-paragraph">Let me know if you found this tutorial helpful. I appreciate your feedback!</p>



<h2 class="wp-block-heading">Sources and Further Reading</h2>



<div style="display: inline-block;">
  <iframe sandbox="allow-popups allow-scripts allow-modals allow-forms allow-same-origin" style="width:120px;height:240px;" marginwidth="0" marginheight="0" scrolling="no" frameborder="0" src="//ws-eu.amazon-adsystem.com/widgets/q?ServiceVersion=20070822&amp;OneJS=1&amp;Operation=GetAdHtml&amp;MarketPlace=DE&amp;source=ss&amp;ref=as_ss_li_til&amp;ad_type=product_link&amp;tracking_id=flo7up-21&amp;language=de_DE&amp;marketplace=amazon&amp;region=DE&amp;placement=3030181162&amp;asins=3030181162&amp;linkId=669e46025028259138fbb5ccec12dfbe&amp;show_border=true&amp;link_opens_in_new_window=true"></iframe>
<iframe sandbox="allow-popups allow-scripts allow-modals allow-forms allow-same-origin" style="width:120px;height:240px;" marginwidth="0" marginheight="0" scrolling="no" frameborder="0" src="//ws-eu.amazon-adsystem.com/widgets/q?ServiceVersion=20070822&amp;OneJS=1&amp;Operation=GetAdHtml&amp;MarketPlace=DE&amp;source=ss&amp;ref=as_ss_li_til&amp;ad_type=product_link&amp;tracking_id=flo7up-21&amp;language=de_DE&amp;marketplace=amazon&amp;region=DE&amp;placement=1999579577&amp;asins=1999579577&amp;linkId=91d862698bf9010ff4c09539e4c49bf4&amp;show_border=true&amp;link_opens_in_new_window=true"></iframe>
<iframe sandbox="allow-popups allow-scripts allow-modals allow-forms allow-same-origin" style="width:120px;height:240px;" marginwidth="0" marginheight="0" scrolling="no" frameborder="0" src="//ws-eu.amazon-adsystem.com/widgets/q?ServiceVersion=20070822&amp;OneJS=1&amp;Operation=GetAdHtml&amp;MarketPlace=DE&amp;source=ss&amp;ref=as_ss_li_til&amp;ad_type=product_link&amp;tracking_id=flo7up-21&amp;language=de_DE&amp;marketplace=amazon&amp;region=DE&amp;placement=1839217715&amp;asins=1839217715&amp;linkId=356ba074068849ff54393f527190825d&amp;show_border=true&amp;link_opens_in_new_window=true"></iframe>
<iframe sandbox="allow-popups allow-scripts allow-modals allow-forms allow-same-origin" style="width:120px;height:240px;" marginwidth="0" marginheight="0" scrolling="no" frameborder="0" src="//ws-eu.amazon-adsystem.com/widgets/q?ServiceVersion=20070822&amp;OneJS=1&amp;Operation=GetAdHtml&amp;MarketPlace=DE&amp;source=ss&amp;ref=as_ss_li_til&amp;ad_type=product_link&amp;tracking_id=flo7up-21&amp;language=de_DE&amp;marketplace=amazon&amp;region=DE&amp;placement=1492032646&amp;asins=1492032646&amp;linkId=2214804dd039e7103577abd08722abac&amp;show_border=true&amp;link_opens_in_new_window=true"></iframe>
</div>



<p class="has-contrast-2-color has-base-3-background-color has-text-color has-background wp-block-paragraph"><em>The links above to Amazon are affiliate links. By buying through these links, you support the Relataly.com blog and help to cover the hosting costs. Using the links does not affect the price.</em></p>
<p>The post <a href="https://www.relataly.com/simple-sentiment-analysis-using-naive-bayes-and-logistic-regression/2007/">Training a Sentiment Classifier with Naive Bayes and Logistic Regression in Python</a> appeared first on <a href="https://www.relataly.com">relataly.com</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.relataly.com/simple-sentiment-analysis-using-naive-bayes-and-logistic-regression/2007/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">2007</post-id>	</item>
	</channel>
</rss>
