This post is also published on blog.saeloun.com.
What is ChatGPT?
ChatGPT is an artificial intelligence chatbot developed by OpenAI that allows us to have human-like conversations and generate images based on a text description. It is one of the greatest leaps in natural language processing.
Integrating the OpenAI API in a Ruby application:
By integrating the OpenAI API, we can implement ChatGPT-like features in a Ruby application and make it more engaging for users. For this, we are using the ruby-openai gem, which lets us use the various OpenAI models and pick one based on the use case.
Add the ruby-openai gem to the Gemfile.
Run bundle install to install the gem.
Get access key:
To get a response back from the API, we have to generate an access key. Visit the API keys page and create a new secret key.
Copy the secret key and assign it to the OPENAI_ACCESS_TOKEN environment variable.
Configure Ruby OpenAI:
If the account is tied to an organization, set the organization ID in the OPENAI_ORGANIZATION_ID environment variable. We can find the organization ID on the Settings page.
Then to create a client,
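A minimal sketch of this step, assuming the two environment variables described above. The helper names openai_settings and build_openai_client are hypothetical. OpenAI::Client.new comes from the ruby-openai gem, which is required lazily here so that the settings helper can be exercised without the gem installed.

```ruby
# Hypothetical helper: collects the OpenAI settings from an environment-like
# object (ENV by default, or a plain Hash).
def openai_settings(env = ENV)
  {
    access_token: env.fetch("OPENAI_ACCESS_TOKEN"), # required, raises KeyError if missing
    organization_id: env["OPENAI_ORGANIZATION_ID"]  # optional, dropped when not set
  }.compact
end

# Hypothetical helper: builds the client from the settings above.
def build_openai_client(settings = openai_settings)
  require "openai" # provided by the ruby-openai gem
  OpenAI::Client.new(
    access_token: settings[:access_token],
    organization_id: settings[:organization_id]
  )
end
```

Calling build_openai_client with no arguments reads both values from the process environment.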
Choosing the right model:
Before diving into the models, we have to understand what a token is.
This is the explanation from an OpenAI article:
Tokens can be thought of as pieces of words. Before the API processes the prompts, the input is broken down into tokens. These tokens are not cut up exactly where the words start or end - tokens can include trailing spaces and even sub-words.
- 1 token ~= 4 chars in English
- 1 token ~= ¾ words
- 100 tokens ~= 75 words
As a rule of thumb, we can treat approximately 4 characters as one token.
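The rule of thumb above can be turned into a rough estimator. This is only a sketch based on the quoted ratios (4 characters per token, ¾ of a word per token); it is not the tokenizer the API actually uses, and the method names are hypothetical.

```ruby
# Approximate ratios quoted above (English text only).
CHARS_PER_TOKEN = 4.0
WORDS_PER_TOKEN = 0.75

# Rough token count for a piece of text, rounded up.
def estimate_tokens(text)
  (text.length / CHARS_PER_TOKEN).ceil
end

# Rough number of English words that a token budget covers.
def estimate_words(token_count)
  (token_count * WORDS_PER_TOKEN).round
end
```

An estimate like this is useful for a quick check of whether a prompt is likely to fit within a model's maximum token limit before sending it.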
The OpenAI API offers several models in each generation, and they suit different use cases.
GPT-4 models solve complex problems with greater accuracy and are much more capable than the previous models. For most basic tasks, there is no significant difference between the GPT-4 and GPT-3.5 models.
- gpt-4 can handle complex tasks and is optimized for chat. It supports a maximum of 8,192 tokens, and its training data goes up to Sep 2021.
- gpt-4-32k has the same capabilities as gpt-4 but supports a maximum of 32,768 tokens. Its training data goes up to Sep 2021.
GPT-3.5 models can understand and generate natural language or code. gpt-3.5-turbo is optimized for chat but also works well for traditional completion tasks.
- gpt-3.5-turbo is the most capable GPT-3.5 model. It is optimized for chat, supports a maximum of 4,096 tokens, and its training data goes up to Sep 2021.
- text-davinci-003 can do any language task with better quality, longer output, and consistent instruction-following. It supports a maximum of 4,096 tokens, and its training data goes up to Jun 2021.
- text-davinci-002 has similar capabilities to text-davinci-003 but was trained with supervised fine-tuning. It supports a maximum of 4,096 tokens, and its training data goes up to Jun 2021.
- code-davinci-002 is optimized for code completion tasks. It supports a maximum of 8,001 tokens, and its training data goes up to Jun 2021.
GPT-3 models can understand and generate natural language. These models were superseded by the more powerful GPT-3.5 generation models. All of them support a maximum of 2,049 tokens, and their training data goes up to Oct 2019.
- davinci is the most capable GPT-3 model and can do any task with higher quality than the other models.
- curie is very capable, but faster and lower in cost than davinci.
- babbage is capable of straightforward tasks, is very fast, and has a lower cost.
- ada is capable of very simple tasks, is the fastest GPT-3 model, and has the lowest cost.
The DALL-E model can generate and edit images from a description in natural language.
The Whisper model can convert audio into text. It can perform multilingual speech recognition, speech translation, and language identification.
Embeddings are a numerical representation of text that can be used to measure the relatedness between two pieces of text. These models are useful for search, clustering, recommendations, anomaly detection, and classification tasks.
The Moderation model is a fine-tuned model that can detect whether text may be sensitive or unsafe. It checks whether the passed content complies with the OpenAI usage policies.
We are using the gpt-3.5-turbo model, as gpt-4 has only limited access at the time of writing this post.
In a request, we have to pass two required parameters, model and messages. Inside the messages parameter, we should pass the role and content values.
In the temperature (optional) parameter, we can pass a value between 0 and 2. A higher temperature results in more unpredictable and diverse responses, and a lower temperature results in more predictable and conservative responses.
The OpenAI API supports three roles:
system - The system instruction helps set the behavior of the assistant (the OpenAI response). It is the high-level instruction given for the conversation.
user - Instruction passed by the end user.
assistant - The assistant messages help store prior responses.
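The request parameters and roles described above could be assembled as follows. This is a sketch, not the post's original snippet: build_chat_parameters is a hypothetical helper, and the default temperature of 0.7 is an arbitrary assumption. The resulting hash is the kind of value passed to client.chat(parameters: ...) in the ruby-openai gem.

```ruby
# Allowed range for the optional temperature parameter.
TEMPERATURE_RANGE = (0..2)

# Hypothetical helper: builds the parameters for a chat request with an
# optional system instruction followed by the end user's message.
def build_chat_parameters(user_message, system_message: nil,
                          model: "gpt-3.5-turbo", temperature: 0.7)
  unless TEMPERATURE_RANGE.cover?(temperature)
    raise ArgumentError, "temperature must be between 0 and 2"
  end

  messages = []
  messages << { role: "system", content: system_message } if system_message
  messages << { role: "user", content: user_message }

  { model: model, messages: messages, temperature: temperature }
end
```

Validating the temperature locally gives a clearer error than waiting for the API to reject the request.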
In response, the Ruby OpenAI client returns an object with the following fields:
id - Chat ID.
object - Name of the API that returns the response.
created - Response created at timestamp.
model - Model used to generate the response.
usage - Usage returns the number of tokens passed and generated.
choices - Message generated by the model and the status of the result.
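To read these fields, we can dig into the returned hash. The sketch below assumes the response is a Hash with string keys, as ruby-openai returns for chat requests; summarize_chat_response is a hypothetical helper.

```ruby
# Hypothetical helper: pulls the most commonly needed values out of a chat
# response hash. Missing keys yield nil instead of raising.
def summarize_chat_response(response)
  {
    content: response.dig("choices", 0, "message", "content"), # generated message
    finish_reason: response.dig("choices", 0, "finish_reason"), # status of the result
    total_tokens: response.dig("usage", "total_tokens")         # tokens passed + generated
  }
end
```

Using dig keeps the lookup safe when the API returns an error body that has no choices key.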
In this case, we have set the role as user and passed the message or question in the content parameter.
Example 1: Solving Problems:
In the example below, we asked the OpenAI API to calculate the time taken for a spaceship to reach the Sun from the Earth. It returns a step-by-step calculation.
Example 2: Technical Questions:
In the example below, we asked the OpenAI API to explain the use of <></> in React.
Example 3: Back and forth conversation:
In the example below, we pass the prior conversation history along with the new message to have a more interactive and dynamic conversation.
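One way to keep the history is a small wrapper that records each turn and rebuilds the messages parameter. This is a sketch: the Conversation class is hypothetical and not part of the ruby-openai gem.

```ruby
# Hypothetical helper class: accumulates the conversation history so that
# every request carries the prior user and assistant messages.
class Conversation
  attr_reader :messages

  def initialize(system_message = nil)
    @messages = []
    add("system", system_message) if system_message
  end

  # Records a message written by the end user.
  def add_user(content)
    add("user", content)
  end

  # Records a prior response generated by the model.
  def add_assistant(content)
    add("assistant", content)
  end

  # Parameters for the next chat request, including the full history.
  def to_parameters(model: "gpt-3.5-turbo")
    { model: model, messages: messages.dup }
  end

  private

  def add(role, content)
    @messages << { role: role, content: content }
    self # return self so calls can be chained
  end
end
```

After each response, the generated content is stored with add_assistant, so that the next question is answered in context.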
Stream the response:
To make the application more engaging for users, we can stream the response in chunks.
For this, we have to pass a stream parameter along with the content to stream the result.
In the stream parameter, we can pass a proc that prints the response chunks as they are generated. With this, we can set up a ChatGPT-like messaging stream in the Rails app by following this guide.
In the example below, we asked the OpenAI API to explain color theory. The result is a detailed explanation, so instead of waiting for the complete result, we can stream chunks to the user and improve the user experience.
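A sketch of such a proc is shown below. It assumes each streamed chunk is a Hash whose new text lives under choices, 0, delta, content, which is the shape ruby-openai passes to the stream proc for chat requests. build_stream_collector is a hypothetical helper that prints each piece and also keeps the full text.

```ruby
# Hypothetical helper: returns a proc suitable for the stream parameter and
# a buffer that collects the complete response text.
def build_stream_collector(output = $stdout)
  buffer = +""
  handler = proc do |chunk, _bytesize|
    piece = chunk.dig("choices", 0, "delta", "content")
    next if piece.nil? # the first and last chunks carry no content

    buffer << piece
    output.print(piece) # show the piece to the user immediately
  end
  [handler, buffer]
end
```

The handler would be passed as stream: handler inside the chat parameters, and once the request finishes, buffer holds the whole answer.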
We will be using the GPT-3.5 text-davinci-003 model to complete the text.
We have to pass the content in the prompt parameter, which the model uses to complete the text. We can also pass the maximum number of tokens to generate for the completion.
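As a sketch, the completion parameters and the lookup of the completed text might look like this. Both helper names are hypothetical, and the default of 100 tokens is an arbitrary assumption. In ruby-openai, the parameters hash is passed to client.completions(parameters: ...), and the generated text is found under choices, 0, text.

```ruby
# Hypothetical helper: builds the parameters for a text completion request.
def build_completion_parameters(prompt, max_tokens: 100, model: "text-davinci-003")
  raise ArgumentError, "max_tokens must be positive" unless max_tokens.positive?

  { model: model, prompt: prompt, max_tokens: max_tokens }
end

# Hypothetical helper: extracts the completed text from the response hash.
def completion_text(response)
  response.dig("choices", 0, "text")
end
```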
Example 1: Social media description:
In the example below, we asked the OpenAI API to complete a social media description.
Example 2: Ask the OpenAI API to complete the code:
In the example below, we asked the OpenAI API to complete a simple Ruby addition snippet.
We will be using the text-davinci-edit-001 model to edit the text.
We have to pass the content in the input parameter and a description of the task in the instruction parameter.
Example 1: Translate code to a different programming language:
In the example below, we instructed the OpenAI API to translate a code snippet to C. The result is the entire C program, and the interesting part is that I didn't even mention the programming language of the code snippet that I passed.
Example 2: Find and replace, and formatting:
In the example below, we passed an input and instructed the OpenAI API to replace a word and capitalize each word in the sentence.
We can moderate text with the OpenAI API. It checks whether the passed content complies with the OpenAI usage policies.
There are seven categories, and the OpenAI API generates a score for each of them. Each score is between 0 and 1, and a higher value denotes higher confidence.
In the example below, we use the text-moderation-stable model, and the score of the hate category is returned as the result.
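A sketch of reading the scores is shown below. It assumes the moderation response is a Hash with the scores under results, 0, category_scores, as returned by client.moderations(parameters: ...) in ruby-openai. The helper names and the 0.5 threshold are hypothetical choices for illustration, not values recommended by OpenAI.

```ruby
# Hypothetical helper: returns the score (0 to 1) of a single category.
def category_score(response, category)
  response.dig("results", 0, "category_scores", category)
end

# Hypothetical helper: lists the categories whose score reaches the threshold.
def flagged_categories(response, threshold: 0.5)
  scores = response.dig("results", 0, "category_scores") || {}
  scores.select { |_category, score| score >= threshold }.keys.sort
end
```

For the hate category, category_score(response, "hate") returns the single score described above.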
Using the DALL-E model, we can generate an image or artwork by describing it in natural language.
In the prompt parameter, we describe the image that needs to be generated, and in the size parameter we pass the resolution. An image can be generated in 256x256, 512x512, or 1024x1024. If the size parameter is not passed, 1024x1024 is used as the default.
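The size rules above can be checked before the request is sent. This is a sketch with a hypothetical helper name. The resulting hash is the kind of value passed to client.images.generate(parameters: ...) in ruby-openai.

```ruby
# Resolutions listed above, and the default used when size is not passed.
IMAGE_SIZES = %w[256x256 512x512 1024x1024].freeze
DEFAULT_IMAGE_SIZE = "1024x1024"

# Hypothetical helper: builds the parameters for an image generation request.
def build_image_parameters(prompt, size: nil)
  size ||= DEFAULT_IMAGE_SIZE
  unless IMAGE_SIZES.include?(size)
    raise ArgumentError, "size must be one of #{IMAGE_SIZES.join(', ')}"
  end

  { prompt: prompt, size: size }
end
```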
This is the image generated,
We can also edit images, but for that we have to mask the image with a transparent section. The masked section can then be altered based on the description.
This is the tree image that’s used for testing,
The image generated,
We can use the whisper-1 model to transcribe audio.
Integrating the OpenAI API opens up many possibilities for improving the user experience and making the site more engaging. It provides versatile language models that can simplify most traditional tasks and also solve complex problems. We can use it as a chatbot, to translate or transcribe audio, to write or debug code, to generate or edit images, and much more.