chapter improvement

pull/343/head
Elvis Saravia 5 months ago
parent 9ff2eb3b8d
commit 40fc3bc2b5

Binary image file added (not shown); size: 56 KiB.

@@ -6,4 +6,4 @@ Prompt engineering skills help to better understand the capabilities and limitat
This comprehensive guide covers the theory and practical aspects of prompt engineering and how to leverage the best prompting techniques to interact and build with LLMs.
All examples are tested with `gpt-3.5-turbo` using the [OpenAI's playground](https://platform.openai.com/playground) unless otherwise specified. The model uses the default configurations, i.e., `temperature=0.7` and `top-p=1`. The prompts should work with other models that have similar capabilities as `gpt-3.5-turbo` but you might get completely different results.
All examples are tested with `gpt-3.5-turbo` using [OpenAI's Playground](https://platform.openai.com/playground) unless otherwise specified. The model uses the default configurations, i.e., `temperature=1` and `top_p=1`. The prompts should also work with other models that have similar capabilities as `gpt-3.5-turbo`, but the model responses may vary.

@@ -1,9 +1,11 @@
# Basics of Prompting
import {Screenshot} from 'components/screenshot'
import INTRO1 from '../../img/introduction/sky.png'
## Basic Prompts
## Prompting an LLM
You can achieve a lot with simple prompts, but the quality of results depends on how much information you provide it and how well-crafted it is. A prompt can contain information like the *instruction* or *question* you are passing to the model and include other details such as *context*, *inputs*, or *examples*. You can use these elements to instruct the model better and as a result get better results.
You can achieve a lot with simple prompts, but the quality of results depends on how much information you provide the model and how well-crafted the prompt is. A prompt can contain information like the *instruction* or *question* you are passing to the model and include other details such as *context*, *inputs*, or *examples*. You can use these elements to instruct the model more effectively to improve the quality of results.
Let's get started by going over a basic example of a simple prompt:
@@ -15,14 +17,16 @@ The sky is
*Output:*
```md
blue
The sky is blue on a clear day. On a cloudy day, the sky may be gray or white.
blue.
```
As you can see, the language model outputs a continuation of strings that make sense given the context `"The sky is"`. The output might be unexpected or far from the task you want to accomplish.
If you are using the OpenAI Playground or any other LLM playground, you can prompt the model as shown in the following screenshot:
<Screenshot src={INTRO1} alt="INTRO1" />
This basic example also highlights the necessity to provide more context or instructions on what specifically you want to achieve.
Something to note is that when using the OpenAI chat models like `gpt-3.5-turbo` or `gpt-4`, you can structure your prompt using three different roles: `system`, `user`, and `assistant`. The system message is not required but helps to set the overall behavior of the assistant. The example above only includes a user message, which you can use to directly prompt the model. For simplicity, all of the examples, unless explicitly mentioned otherwise, will use only the `user` message to prompt the `gpt-3.5-turbo` model. The `assistant` message in the example above corresponds to the model response. You can also define an `assistant` message to pass examples of the desired behavior. You can learn more about working with chat models [here](https://www.promptingguide.ai/models/chatgpt).
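For illustration, here is a minimal sketch of how these roles map onto a chat request using the pre-1.0 `openai` Python package (the system and user contents are just examples):

```python
# Minimal sketch: the three chat roles with gpt-3.5-turbo.
# Assumes the pre-1.0 `openai` package and OPENAI_API_KEY set in the environment.
import openai

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        # Optional: the system message sets the assistant's overall behavior
        {"role": "system", "content": "You are a helpful assistant."},
        # The user message carries the prompt itself
        {"role": "user", "content": "The sky is"},
    ],
)

# The reply comes back as an `assistant` message
print(response["choices"][0]["message"]["content"])
```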
You can observe from the prompt example above that the language model responds with a sequence of tokens that make sense given the context `"The sky is"`. The output might be unexpected or far from the task you want to accomplish. In fact, this basic example highlights the necessity to provide more context or instructions on what specifically you want to achieve with the system. This is what prompt engineering is all about.
Let's try to improve it a bit:
@@ -36,10 +40,10 @@ The sky is
*Output:*
```
so beautiful today.
blue during the day and dark at night.
```
Is that better? Well, you told the model to complete the sentence so the result looks a lot better as it follows exactly what you told it to do ("complete the sentence"). This approach of designing optimal prompts to instruct the model to perform a task is what's referred to as **prompt engineering**.
Is that better? Well, with the prompt above you are instructing the model to complete the sentence, so the result looks a lot better because it follows exactly what you told it to do ("complete the sentence"). This approach of designing effective prompts to instruct the model to perform a desired task is what's referred to as **prompt engineering** in this guide.
The example above is a basic illustration of what's possible with LLMs. Today's LLMs can perform all kinds of advanced tasks, ranging from text summarization to mathematical reasoning to code generation.
@@ -64,7 +68,22 @@ Q: <Question>?
A:
```
When prompting like the above, it's also referred to as *zero-shot prompting*, i.e., you are directly prompting the model for a response without any examples or demonstrations about the task you want it to achieve. Some large language models do have the ability to perform zero-shot prompting but it depends on the complexity and knowledge of the task at hand.
When prompting like the above, it's also referred to as *zero-shot prompting*, i.e., you are directly prompting the model for a response without any examples or demonstrations about the task you want it to achieve. Some large language models have the ability to perform zero-shot prompting, but it depends on the complexity and knowledge of the task at hand and the tasks the model was trained to perform well on.
A concrete prompt example is as follows:
*Prompt:*
```
Q: What is prompt engineering?
```
With some of the more recent models, you can skip the "Q:" part, as the model interprets the sequence as a question-answering task based on how it is composed. In other words, the prompt could be simplified as follows:
*Prompt:*
```
What is prompt engineering?
```
Given the standard format above, one popular and effective prompting technique is referred to as *few-shot prompting*, where you provide exemplars (i.e., demonstrations). You can format few-shot prompts as follows:
@@ -98,7 +117,7 @@ Q: <Question>?
A:
```
Keep in mind that it's not required to use QA format. The prompt format depends on the task at hand. For instance, you can perform a simple classification task and give exemplars that demonstrate the task as follows:
Keep in mind that it's not required to use the QA format. The prompt format depends on the task at hand. For instance, you can perform a simple classification task and give exemplars that demonstrate the task as follows:
*Prompt:*
```
@@ -113,4 +132,4 @@ What a horrible show! //
Negative
```
Few-shot prompts enable in-context learning, which is the ability of language models to learn tasks given a few demonstrations.
Few-shot prompts enable in-context learning, which is the ability of language models to learn tasks given a few demonstrations. We discuss zero-shot prompting and few-shot prompting more extensively in upcoming sections.
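As a quick illustration of how exemplars become a few-shot prompt, here is a small sketch; the helper function and exemplars are hypothetical and only meant to show the format:

```python
# Illustrative sketch: assembling a few-shot classification prompt.
# The helper and exemplars are hypothetical; adapt the format to your task.
def build_few_shot_prompt(exemplars, new_input):
    # One "<text> // <label>" line per exemplar, then the new input with no label
    lines = [f"{text} // {label}" for text, label in exemplars]
    lines.append(f"{new_input} //")
    return "\n".join(lines)

exemplars = [
    ("This is awesome!", "Positive"),
    ("This is bad!", "Negative"),
    ("Wow that movie was rad!", "Positive"),
]

# The model is expected to continue the last line with a label
print(build_few_shot_prompt(exemplars, "What a horrible show!"))
```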

@@ -12,4 +12,20 @@ A prompt contains any of the following elements:
**Output Indicator** - the type or format of the output.
To demonstrate the prompt elements better, here is a simple prompt that aims to perform a text classification task:
*Prompt:*
```
Classify the text into neutral, negative, or positive
Text: I think the food was okay.
Sentiment:
```
In the prompt example above, the instruction corresponds to the classification task, "Classify the text into neutral, negative, or positive". The input data corresponds to the "I think the food was okay." part, and the output indicator used is "Sentiment:". Note that this basic example doesn't use context, but context can also be provided as part of the prompt. For instance, the context for this text classification prompt could be additional examples included in the prompt to help the model better understand the task and steer the type of outputs you expect.
You do not need all four elements for a prompt, and the format depends on the task at hand. We will touch on more concrete examples in upcoming guides.
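To make the elements concrete, here is a small sketch that assembles the classification prompt above from its parts; the variable names are just illustrative:

```python
# Illustrative sketch: composing a prompt from its elements.
instruction = "Classify the text into neutral, negative, or positive"
context = ""  # optional: e.g., background information or extra exemplars
input_data = "Text: I think the food was okay."
output_indicator = "Sentiment:"

# Join only the elements you actually use, skipping empty ones
prompt = "\n".join(part for part in [instruction, context, input_data, output_indicator] if part)
print(prompt)
```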

@@ -1,21 +1,19 @@
# LLM Settings
When designing and testing prompts, you typically interact with the LLM via an API. You can configure a few parameters to get different results for your prompts. Tweaking these settings are important to improve reliability and desirability of responses and it takes experimentation to figure out the proper settings for your use cases. Below are the common settings you will come across when using different LLM providers:
When designing and testing prompts, you typically interact with the LLM via an API. You can configure a few parameters to get different results for your prompts. Tweaking these settings is important to improve the reliability and desirability of responses, and it takes a bit of experimentation to figure out the proper settings for your use cases. Below are the common settings you will come across when using different LLM providers; a short sketch after this list shows how they map to typical API parameters:
**Temperature** - In short, the lower the `temperature`, the more deterministic the results, in the sense that the most probable next token is always picked. Increasing the temperature could lead to more randomness, which encourages more diverse or creative outputs. You are essentially increasing the weights of the other possible tokens. In terms of application, you might want to use a lower temperature value for tasks like fact-based QA to encourage more factual and concise responses. For poem generation or other creative tasks, it might be beneficial to increase the temperature value.
**Top_p** - Similarly, with `top_p`, a sampling technique with temperature called nucleus sampling, you can control how deterministic the model is at generating a response. If you are looking for exact and factual answers keep this low. If you are looking for more diverse responses, increase to a higher value.
**Top P** - A sampling technique used with temperature, called nucleus sampling, that lets you control how deterministic the model is. If you are looking for exact and factual answers, keep this low. If you are looking for more diverse responses, increase it to a higher value. If you use Top P, it means that only the tokens comprising the `top_p` probability mass are considered for responses, so a low `top_p` value selects the most confident responses. This means that a high `top_p` value will enable the model to look at more possible words, including less likely ones, leading to more diverse outputs. The general recommendation is to alter temperature or Top P, but not both.
The general recommendation is to alter temperature or `top_p`, not both.
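As a toy illustration of how these two settings shape the next-token distribution (this is not how any provider implements sampling internally; the vocabulary and logits are invented):

```python
# Toy illustration of temperature scaling and top_p (nucleus) filtering.
# The candidate tokens and logits are made up for demonstration only.
import math

logits = {"blue": 4.0, "clear": 2.5, "falling": 1.0, "green": 0.2}

def softmax(scores, temperature):
    exps = {tok: math.exp(s / temperature) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: v / total for tok, v in exps.items()}

# Lower temperature concentrates probability on the most likely token
print(softmax(logits, temperature=0.2))  # "blue" dominates
print(softmax(logits, temperature=1.0))  # mass spreads to other tokens

def top_p_filter(probs, top_p):
    # Keep the smallest set of tokens whose cumulative probability reaches top_p
    kept, cumulative = {}, 0.0
    for tok, p in sorted(probs.items(), key=lambda kv: -kv[1]):
        kept[tok] = p
        cumulative += p
        if cumulative >= top_p:
            break
    return kept

# A low top_p keeps only the most confident candidates
print(top_p_filter(softmax(logits, temperature=1.0), top_p=0.5))
```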
**Max Length** - You can manage the number of tokens the model generates by adjusting the `max length`. Specifying a max length helps you prevent long or irrelevant responses and control costs.
**Max Length** - You can manage the number of tokens the model generates by adjusting the `max length`. Specifying a max length helps you prevent long or irrelevant responses and control costs.
**Stop Sequences** - A `stop sequence` is a string that stops the model from generating tokens. Specifying stop sequences is another way to control the length and structure of the model's response. For example, you can tell the model to generate lists that have no more than 10 items by adding "11" as a stop sequence.
**Stop Sequences** - A `stop sequence` is a string that stops the model from generating tokens. Specifying stop sequences is another way to control the length and structure of the model's response. For example, you can tell the model to generate lists that have no more than 10 items by adding "11" as a stop sequence.
**Frequency Penalty** - The `frequency penalty` applies a penalty on the next token proportional to how many times that token already appeared in the response and prompt. The higher the frequency penalty, the less likely a word will appear again. This setting reduces the repetition of words in the model's response by giving tokens that appear more a higher penalty.
**Frequency Penalty** - The `frequency penalty` applies a penalty on the next token proportional to how many times that token already appeared in the response and prompt. The higher the frequency penalty, the less likely a word will appear again. This setting reduces the repetition of words in the model's response by giving tokens that appear more often a higher penalty.
**Presence Penalty** - The `presence penalty` also applies a penalty on repeated tokens but, unlike the frequency penalty, the penalty is the same for all repeated tokens. A token that appears twice and a token that appears 10 times are penalized the same. This setting prevents the model from repeating phrases too often in its response. If you want the model to generate diverse or creative text, you might want to use a higher presence penalty. Or, if you need the model to stay focused, try using a lower presence penalty.
**Presence Penalty** - The `presence penalty` also applies a penalty on repeated tokens but, unlike the frequency penalty, the penalty is the same for all repeated tokens. A token that appears twice and a token that appears 10 times are penalized the same. This setting prevents the model from repeating phrases too often in its response. If you want the model to generate diverse or creative text, you might want to use a higher presence penalty. Or, if you need the model to stay focused, try using a lower presence penalty.
Similar to temperature and top_p, the general recommendation is to alter the frequency or presence penalty, not both.
Similar to `temperature` and `top_p`, the general recommendation is to alter the frequency or presence penalty but not both.
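To make the settings above concrete, here is a minimal sketch of how they are typically passed as API parameters, again using the pre-1.0 `openai` package; names and defaults vary across providers, and the values below are arbitrary examples:

```python
# Sketch: common LLM settings as request parameters (pre-1.0 `openai` package).
import openai

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "List 5 fruits, numbered."}],
    temperature=0.7,      # sampling randomness
    top_p=1,              # nucleus sampling (alter this or temperature, not both)
    max_tokens=100,       # max length of the generated response
    stop=["6."],          # stop sequence: cut the list off after 5 items
    frequency_penalty=0,  # penalty grows with how often a token has appeared
    presence_penalty=0,   # flat penalty for any token that has appeared at all
)

print(response["choices"][0]["message"]["content"])
```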
Before starting with some basic examples, keep in mind that your results may vary depending on the version of LLM you use.
