removes outdated, pre-ChatGPT guides

pull/1077/head
Ted Sanders 1 year ago
parent 43e1d529d8
commit a75d795f1e

@ -1,63 +0,0 @@
# Code editing example
OpenAI's [edits](https://openai.com/blog/gpt-3-edit-insert/) endpoint is particularly useful for editing code.
Unlike completions, edits takes two inputs: the text to edit and an instruction.
For example, if you wanted to edit a Python function, you could supply the text of the function and an instruction like "add a docstring".
Example text input to `code-davinci-edit-001`:
```python
def tribonacci(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    elif n == 2:
        return 1
    elif n == 3:
        return 2
    else:
        return tribonacci(n-1) + tribonacci(n-2) + tribonacci(n-3)
```
Example instruction inputs:
```text
add a docstring
```
```text
Add typing, using Python 3.9 conventions
```
```text
improved the runtime
```
```text
Add a test.
```
```text
Translate to JavaScript (or Rust or Lisp or any language you like)
```
Example output after improving the runtime and translating to JavaScript:
```JavaScript
function tribonacci(n) {
  let a = 0;
  let b = 1;
  let c = 1;
  for (let i = 0; i < n; i++) {
    [a, b, c] = [b, c, a + b + c];
  }
  return a;
}
```
In this example, `code-davinci-edit-001` reduced the function's runtime from exponential to linear and converted it from Python to JavaScript.
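Edited code is worth checking against the original before you rely on it. The following is a minimal sketch of such a check: it ports the JavaScript output back to Python (the names `tribonacci_recursive` and `tribonacci_iterative` are ours, not the model's) and compares the two versions on small inputs.

```python
def tribonacci_recursive(n):
    """Exponential-time version, as in the example text input."""
    if n == 0:
        return 0
    elif n == 1:
        return 1
    elif n == 2:
        return 1
    elif n == 3:
        return 2
    else:
        return (tribonacci_recursive(n - 1)
                + tribonacci_recursive(n - 2)
                + tribonacci_recursive(n - 3))


def tribonacci_iterative(n):
    """Python port of the linear-time JavaScript output."""
    a, b, c = 0, 1, 1
    for _ in range(n):
        # Slide the three-value window forward by one position.
        a, b, c = b, c, a + b + c
    return a


# The two versions should agree wherever the slow one is still cheap to run.
for n in range(15):
    assert tribonacci_recursive(n) == tribonacci_iterative(n)
```

The recursive version recomputes the same subproblems many times, whereas the loop does a constant amount of work per step, which is where the exponential-to-linear improvement comes from.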
Experiment with code editing using `code-davinci-edit-001` in the [OpenAI Playground](https://beta.openai.com/playground?mode=edit&model=code-davinci-edit-001).

@ -1,41 +0,0 @@
# Code explanation examples
GPT's understanding of code can be applied to many use cases, e.g.:
* Generating in-code documentation (e.g., Python docstrings, git commit messages)
* Generating out-of-code documentation (e.g., man pages)
* An interactive code exploration tool
* Communicating program results back to users via a natural language interface
For example, if you wanted to understand a SQL query, you could give `code-davinci-002` the following example prompt:
````text
A SQL query:
```
SELECT c.customer_id
FROM Customers c
JOIN Streaming s
ON c.customer_id = s.customer_id
WHERE c.signup_date BETWEEN '2020-03-01' AND '2020-03-31'
AND s.watch_date BETWEEN c.signup_date AND DATE_ADD(c.signup_date, INTERVAL 30 DAY)
GROUP BY c.customer_id
HAVING SUM(s.watch_minutes) > 50 * 60
```
Questions:
1. What does the SQL query do?
2. Why might someone be interested in this time period?
3. Why might a company be interested in this SQL query?
Answers:
````
[Output](https://beta.openai.com/playground/p/Sv1VQKbJV1TZKmiTK9r6nlj3):
```text
1. The SQL query finds all customers who signed up in March 2020 and watched more than 50 hours of content in the first 30 days after signing up.
2. The time period is interesting because it is the first month of the COVID-19 pandemic.
3. A company might be interested in this SQL query because it can help them understand how the pandemic has affected their business.
```
Note that `code-davinci-002` is not trained to follow instructions and therefore usually needs examples or other structure to help steer its output, as well as stop sequences to stop generating. For easier prompting, try `text-davinci-003`.
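As a rough sketch of what "structure plus stop sequences" can look like in practice, the helper below assembles a prompt in the same shape as the example above and suggests a stop sequence. The function name, the returned dictionary, and the choice of stop sequence (the number of a question that was never asked) are illustrative assumptions, not part of any OpenAI library.

```python
def build_explanation_prompt(code, questions, label="A SQL query"):
    """Assemble a structured prompt and a matching stop sequence.

    Hypothetical helper: mirrors the layout of the example prompt above.
    """
    fence = "`" * 3  # built at runtime so this listing stays a single block
    numbered = "\n".join(
        f"{i}. {question}" for i, question in enumerate(questions, start=1)
    )
    prompt = (
        f"{label}:\n{fence}\n{code}\n{fence}\n"
        f"Questions:\n{numbered}\n"
        "Answers:\n"
    )
    # Assumption: answers are numbered like the questions, so generation can
    # stop as soon as the model starts an answer nobody asked for.
    stop = [f"\n{len(questions) + 1}."]
    return {"prompt": prompt, "stop": stop}


request = build_explanation_prompt(
    "SELECT 1",
    [
        "What does the SQL query do?",
        "Why might someone be interested in this time period?",
        "Why might a company be interested in this SQL query?",
    ],
)
print(request["prompt"])
print(request["stop"])
```

Ending the prompt with `Answers:` nudges a base model to continue with the answers rather than more questions, and the stop sequence keeps it from running on past the last one.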

@ -1,31 +0,0 @@
# Code writing examples
GPT-3 is able to write code as well as text.
Here's an example of `code-davinci-002` writing a SQL query:
````text
SQL tables (and columns):
* Customers(customer_id, signup_date)
* Streaming(customer_id, video_id, watch_date, watch_minutes)
A well-written SQL query that lists customers who signed up during March 2020 and watched more than 50 hours of video in their first 30 days:
```
````
[Output](https://beta.openai.com/playground/p/r2mw99cANoa0TJHok725CeaC):
```sql
SELECT c.customer_id
FROM Customers c
JOIN Streaming s
ON c.customer_id = s.customer_id
WHERE c.signup_date BETWEEN '2020-03-01' AND '2020-03-31'
AND s.watch_date BETWEEN c.signup_date AND DATE_ADD(c.signup_date, INTERVAL 30 DAY)
GROUP BY c.customer_id
HAVING SUM(s.watch_minutes) > 50 * 60
```
Helpfully, `code-davinci-002` is able to make inferences from variable names; for example, it infers that `watch_minutes` has units of minutes and therefore needs to be converted by a factor of 60 before being compared with 50 hours.
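Generated SQL is easiest to trust after running it against a few rows where you already know the answer. The sketch below does that with Python's built-in `sqlite3` module. Two assumptions to note: the rows are made up for illustration, and because `DATE_ADD(..., INTERVAL 30 DAY)` is MySQL syntax, the query substitutes SQLite's `date(..., '+30 days')` for it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers(customer_id INTEGER, signup_date TEXT);
CREATE TABLE Streaming(customer_id INTEGER, video_id INTEGER,
                       watch_date TEXT, watch_minutes INTEGER);
-- Made-up rows for illustration only.
INSERT INTO Customers VALUES
    (1, '2020-03-05'), (2, '2020-03-10'), (3, '2020-04-01');
INSERT INTO Streaming VALUES
    (1, 10, '2020-03-06', 2000),
    (1, 11, '2020-03-20', 1500),
    (2, 10, '2020-03-11', 2900),
    (2, 12, '2020-05-01', 500),
    (3, 10, '2020-04-02', 4000);
""")

# Same query as the output above, with DATE_ADD swapped for SQLite's date().
QUERY = """
SELECT c.customer_id
FROM Customers c
JOIN Streaming s
ON c.customer_id = s.customer_id
WHERE c.signup_date BETWEEN '2020-03-01' AND '2020-03-31'
AND s.watch_date BETWEEN c.signup_date AND date(c.signup_date, '+30 days')
GROUP BY c.customer_id
HAVING SUM(s.watch_minutes) > 50 * 60
"""

qualifying = [row[0] for row in conn.execute(QUERY)]
print(qualifying)
```

With these rows, customer 1 watched 3,500 minutes inside the window, customer 2 watched only 2,900 minutes inside it, and customer 3 signed up in April, so only customer 1 should clear the 50 * 60 = 3,000 minute threshold.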
For easier prompting, you can also try `text-davinci-003`.

@ -1,86 +0,0 @@
# Text editing examples
In addition to the [completions API endpoint][Completion API Docs], OpenAI offers an [edits API endpoint][Edit API Docs]. Read more at:
* [Blog post announcement (Mar 2022)][GPT3 Edit Blog Post]
* [Edit API documentation][Edit API Docs]
In contrast to completions, which only take a single text input, edits take two text inputs: the instruction and the text to be modified. For example:
Instruction input:
```text
Fix the OCR errors
```
Text input:
```text
Therewassomehostilityntheenergybehindthe researchreportedinPerceptrons....Part of ourdrivecame,aswequiteplainlyacknoweldgednourbook,fromhe facthatfundingndresearchnergywerebeingdissipatedon. . .misleadingttemptsouseconnectionistmethodsnpracticalappli-cations.
```
[Output](https://beta.openai.com/playground/p/5W5W6HHlHrGsLu1cpx0VF4qu):
```text
There was some hostility in the energy behind the research reported in Perceptrons....Part of our drive came, as we quite plainly acknowledged in our book, from the fact that funding and research energy were being dissipated on...misleading attempts to use connectionist methods in practical applications.
```
In general, instructions can be imperative, present tense, or past tense. Experiment to see what works best for your use case.
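To make the two-input shape concrete, here is a minimal sketch of a request body for the edits endpoint. It assumes the JSON fields `model`, `input`, and `instruction` from the Edit API documentation linked above, and it assumes `text-davinci-edit-001` as the model name; `build_edit_request` is a hypothetical helper, and the sketch only builds the body rather than sending it.

```python
import json


def build_edit_request(text, instruction, model="text-davinci-edit-001"):
    """Build a request body for the edits endpoint (hypothetical helper).

    Assumes the `model`, `input`, and `instruction` fields described in
    the Edit API documentation; the default model name is an assumption.
    """
    return {"model": model, "input": text, "instruction": instruction}


body = build_edit_request(
    text="Therewassomehostilityntheenergybehindthe research",
    instruction="Fix the OCR errors",
)
print(json.dumps(body, indent=2))
```

A completions request, by contrast, would carry a single `prompt` string, so any instruction has to be written into the prompt text itself.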
## Translation
One application of the edit API is translation.
Large language models are excellent at translating across common languages. In 2021, [GPT-3 set](https://arxiv.org/abs/2110.05448) a new state-of-the-art record in unsupervised translation on the WMT14 English-French benchmark.
Here's an example of how to translate text using the edits endpoint:
Instruction input:
```text
translation into French
```
Text input:
```text
That's life.
```
[Output](https://beta.openai.com/playground/p/6JWAH8a4ZbEafSDyRsSVdgKr):
```text
C'est la vie.
```
Of course, many tasks that can be accomplished with the edits endpoint can also be done with the completions endpoint. For example, you can request a translation by prepending an instruction as follows:
```text
Translate the following text from English to French.
English: That's life.
French:
```
[Output](https://beta.openai.com/playground/p/UgaPfgjBNTRRPeNcMSNtGzcu):
```text
C'est la vie.
```
Tips for translation:
* Performance is best on the most common languages
* We've seen better performance when the instruction is given in the target language (so if translating into French, give the instruction `Traduire le texte de l'anglais au français.` rather than `Translate the following text from English to French.`)
* Backtranslation (as described [here](https://arxiv.org/abs/2110.05448)) can also increase performance
* Text with colons and heavy punctuation can trip up the instruction-following models, especially if the instruction uses colons (e.g., `English: {english text} French:`)
* The edits endpoint sometimes repeats the original text input alongside the translation, which can be monitored and filtered
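The last tip, filtering out a repeated copy of the input, can be sketched as a small post-processing step. This is one simple line-based approach under the assumption that the echo is verbatim; `strip_echoed_input` is a hypothetical helper, not part of any API.

```python
def strip_echoed_input(source, output):
    """Drop lines of `output` that merely repeat lines of `source`.

    Hypothetical helper: assumes the echo is a verbatim, line-for-line copy.
    """
    source_lines = {line.strip() for line in source.splitlines() if line.strip()}
    kept = [
        line
        for line in output.splitlines()
        if line.strip() and line.strip() not in source_lines
    ]
    return "\n".join(kept)


cleaned = strip_echoed_input("That's life.", "That's life.\nC'est la vie.")
print(cleaned)
```

An exact-match filter like this will miss echoes that differ in whitespace within a line or in punctuation, so a fuzzier comparison may be needed for real traffic.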
Large language models particularly shine at combining translation with other instructions. For example, you can ask GPT-3 to translate Slovenian to English but keep all LaTeX typesetting commands unchanged. The following notebook details how we translated a Slovenian math book into English:
[Translation of a Slovenian math book into English](examples/book_translation/translate_latex_book.ipynb)
[Edit API Docs]: https://beta.openai.com/docs/api-reference/edits
[Completion API Docs]: https://beta.openai.com/docs/api-reference/completions
[GPT3 Edit Blog Post]: https://openai.com/blog/gpt-3-edit-insert/

@ -1,108 +0,0 @@
# Text explanation examples
Large language models are useful for distilling information from long texts. Applications include:
* Answering questions about a piece of text, e.g.:
  * Querying a knowledge base to help people look up things they don't know
  * Querying an unfamiliar document to understand what it contains
  * Querying a document with structured questions in order to extract tags, classes, entities, etc.
* Summarizing text, e.g.:
  * Summarizing long documents
  * Summarizing back-and-forth emails or message threads
  * Summarizing detailed meeting notes with key points and next steps
* Classifying text, e.g.:
  * Classifying customer feedback messages by topic or type
  * Classifying documents by topic or type
  * Classifying the tone or sentiment of text
* Extracting entities, e.g.:
  * Extracting contact information from a customer message
  * Extracting names of people or companies or products from a document
  * Extracting things mentioned in customer reviews or feedback
Below are some simple examples of each.
## Answering questions about a piece of text
Here's an example prompt for answering questions about a piece of text:
```text
Using the following text, answer the following question. If the answer is not contained within the text, say "I don't know."
Text:
"""
Oklo Mine (sometimes Oklo Reactor or Oklo Mines), located in Oklo, Gabon on the west coast of Central Africa, is believed to be the only natural nuclear fission reactor. Oklo consists of 16 sites at which self-sustaining nuclear fission reactions are thought to have taken place approximately 1.7 billion years ago, and ran for hundreds of thousands of years. It is estimated to have averaged under 100 kW of thermal power during that time.
"""
Question: How many natural fission reactors have ever been discovered?
Answer:
```
[Output](https://beta.openai.com/playground/p/c8ZL7ioqKK7zxrMT2T9Md3gJ):
```text
One. Oklo Mine is believed to be the only natural nuclear fission reactor.
```
If the text you wish to ask about is longer than the token limit (~4,000 tokens for `text-davinci-002`/`-003` and ~2,000 tokens for earlier models), you can split the text into smaller pieces, rank them by relevance, and then ask your question only using the most-relevant-looking pieces. This is demonstrated in [Question_answering_using_embeddings.ipynb](examples/Question_answering_using_embeddings.ipynb).
In the same way that students do better on tests when allowed to access notes, GPT-3 does better at answering questions when it's given text containing the answer.
Without notes, GPT-3 has to rely on its own long-term memory (i.e., its internal weights), which is more prone to producing confabulated or hallucinated answers.
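The split, rank, and ask approach described above can be sketched in a few lines. Note the stand-in: the linked notebook ranks pieces with embeddings, while this sketch uses plain word overlap so that it runs without any API calls. All function names here are hypothetical.

```python
import re


def split_into_chunks(text, max_words=50):
    """Split text into consecutive pieces of at most `max_words` words."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank_chunks(question, chunks):
    """Order chunks by how many words they share with the question.

    Word overlap is a crude stand-in for embedding similarity.
    """
    question_words = tokenize(question)
    return sorted(
        chunks,
        key=lambda chunk: len(question_words & tokenize(chunk)),
        reverse=True,
    )


def build_qa_prompt(question, chunks, top_k=2):
    """Build a question-answering prompt from the top-ranked chunks only."""
    context = "\n".join(rank_chunks(question, chunks)[:top_k])
    return (
        "Using the following text, answer the following question. "
        'If the answer is not contained within the text, say "I don\'t know."\n'
        'Text:\n"""\n' + context + '\n"""\n'
        f"Question: {question}\nAnswer:"
    )


chunks = [
    "Oklo is a natural nuclear fission reactor in Gabon.",
    "The Higgs boson was announced at CERN.",
    "AT&T was split into regional companies.",
]
question = "Where is the natural fission reactor?"
print(build_qa_prompt(question, chunks, top_k=1))
```

Keeping `top_k` small is what keeps the prompt under the token limit; the trade-off is that a poor ranking can leave the answer out of the prompt entirely.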
## Summarization
Here's a simple example prompt to summarize a piece of text:
```text
Summarize the following text.
Text:
"""
Two independent experiments reported their results this morning at CERN, Europe's high-energy physics laboratory near Geneva in Switzerland. Both show convincing evidence of a new boson particle weighing around 125 gigaelectronvolts, which so far fits predictions of the Higgs previously made by theoretical physicists.
"As a layman I would say: 'I think we have it'. Would you agree?" Rolf-Dieter Heuer, CERN's director-general, asked the packed auditorium. The physicists assembled there burst into applause.
"""
Summary:
```
[Output](https://beta.openai.com/playground/p/pew7DNB908TkUYiF0ZOdaIGc):
```text
CERN's director-general asked a packed auditorium if they agreed that two independent experiments had found convincing evidence of a new boson particle that fits predictions of the Higgs, to which the physicists assembled there responded with applause.
```
The triple quotation marks `"""` used in these example prompts aren't special; GPT-3 can recognize most delimiters, including `<>`, `{}`, or `###`. For long pieces of text, we recommend using some kind of delimiter to help disambiguate where one section of text ends and the next begins.
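One practical wrinkle is that the text itself may contain the delimiter. A minimal sketch of one way to handle this is to pick the first candidate delimiter that does not appear in the text; the helper names and the fallback candidate `~~~` are our own assumptions.

```python
def pick_delimiter(text, candidates=('"""', "###", "~~~")):
    """Return the first candidate delimiter that does not occur in `text`."""
    for delimiter in candidates:
        if delimiter not in text:
            return delimiter
    raise ValueError("every candidate delimiter appears in the text")


def build_summary_prompt(text):
    """Wrap `text` in an unambiguous delimiter inside a summarization prompt."""
    delimiter = pick_delimiter(text)
    return (
        "Summarize the following text.\n"
        f"Text:\n{delimiter}\n{text}\n{delimiter}\n"
        "Summary:"
    )


print(build_summary_prompt('The docstring starts with """ on its own line.'))
```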
## Classification
If you want to classify the text, the best approach depends on whether the classes are known in advance.
If your classes _are_ known in advance, classification is often best done with a fine-tuned model, as demonstrated in [Fine-tuned_classification.ipynb](examples/Fine-tuned_classification.ipynb).
If your classes _are not_ known in advance (e.g., they are set by a user or generated on the fly), you can try zero-shot classification, either by giving an instruction that lists the classes or by using embeddings to see which class label (or other classified text) is most similar to the text, as demonstrated in [Zero-shot_classification_with_embeddings.ipynb](examples/Zero-shot_classification_with_embeddings.ipynb).
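For the instruction-based variant of zero-shot classification, a minimal sketch might look like the following. The prompt wording, the helper names, and the example classes are illustrative assumptions; the parser simply maps a completion back onto one of the allowed classes.

```python
def build_classification_prompt(text, classes):
    """Build a zero-shot prompt that lists the allowed classes."""
    options = ", ".join(classes)
    return (
        f"Classify the following text as one of: {options}.\n"
        f'Text:\n"""\n{text}\n"""\n'
        "Class:"
    )


def parse_class(completion, classes):
    """Map a raw completion onto one of the allowed classes, or None."""
    cleaned = completion.strip().lower()
    for name in classes:
        if cleaned.startswith(name.lower()):
            return name
    return None


classes = ["Billing", "Shipping", "Other"]
print(build_classification_prompt("My card was charged twice.", classes))
print(parse_class(" Billing\n", classes))
```

Returning `None` for anything outside the list makes off-list answers easy to detect and retry, rather than silently accepting a class the user never defined.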
## Entity extraction
Here's an example prompt for entity extraction:
```text
From the text below, extract the following entities in the following format:
Companies: <comma-separated list of companies mentioned>
People & titles: <comma-separated list of people mentioned (with their titles or roles appended in parentheses)>
Text:
"""
In March 1981, United States v. AT&T came to trial under Assistant Attorney General William Baxter. AT&T chairman Charles L. Brown thought the company would be gutted. He realized that AT&T would lose and, in December 1981, resumed negotiations with the Justice Department. Reaching an agreement less than a month later, Brown agreed to divestiture—the best and only realistic alternative. AT&T's decision allowed it to retain its research and manufacturing arms. The decree, titled the Modification of Final Judgment, was an adjustment of the Consent Decree of 14 January 1956. Judge Harold H. Greene was given the authority over the modified decree....
In 1982, the U.S. government announced that AT&T would cease to exist as a monopolistic entity. On 1 January 1984, it was split into seven smaller regional companies, Bell South, Bell Atlantic, NYNEX, American Information Technologies, Southwestern Bell, US West, and Pacific Telesis, to handle regional phone services in the U.S. AT&T retains control of its long distance services, but was no longer protected from competition.
"""
```
[Output](https://beta.openai.com/playground/p/of47T7N5CtHF4RlvwFkTu3pN):
```text
Companies: AT&T, Bell South, Bell Atlantic, NYNEX, American Information Technologies, Southwestern Bell, US West, Pacific Telesis
People & titles: William Baxter (Assistant Attorney General), Charles L. Brown (AT&T chairman), Harold H. Greene (Judge)
```

@ -1,48 +0,0 @@
# Text writing examples
Large language models are excellent at writing. They can assist with all sorts of tasks:
* Blog posts
* Email copy
* Ad copy
* Website copy
* Product descriptions
* Memos
* Storytelling
* Brainstorming
* Question generation
* etc.
The easiest way to prompt GPT-3 is to tell it what you'd like. For example, if you want it to write an email, you could use a prompt like this:
```text
Write an email to a colleague named Jill congratulating her on her promotion. The tone should be warm yet professional. Mention how you admire the work she's been putting in. Include a joke about how her pet lizard Max enjoys eating grasshoppers. Mention how you're looking forward to the team off-site next week.
```
[Output](https://beta.openai.com/playground/p/fUeh5saVLI2dhU0WU3nZtIiU):
```text
Subject: Congratulations on Your Promotion, Jill!
Dear Jill,
Congratulations on your promotion! I'm so proud of you and all the hard work you've been putting in. It's been inspiring to watch you grow in your career.
I'm sure Max the lizard is also celebrating with you. I hear he loves eating grasshoppers, so you should treat him to a special snack!
I'm looking forward to seeing you at the team off-site next week. I'm sure it will be a great opportunity to celebrate your success.
All the best,
[Your Name]
```
In general, writing can work with any style of prompt, whether it's an instruction or a series of examples to follow. Experiment to see what works best for your use case.
Writing also works with any type of model, though they each have strengths and weaknesses.
| | Advantages | Disadvantages |
| ---------------------------------------------------------- | ----------------------------------------------------------------------------- | -------------------------------------------------------------------------------- |
| Instruction-following models<br>(e.g., `text-davinci-003`) | Easiest to use | Less diverse; less creative; sometimes harder to steer tone, style, etc. |
| Base models<br>(e.g., `davinci`) | Potentially more creative and diverse | Harder to prompt well, more expensive (as examples in the prompt cost extra tokens) |
| Fine-tuned models | Can train off of many examples; cheaper than including examples in the prompt | Hard to gather training data; training makes iteration slower and more expensive |
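To illustrate why examples in the prompt cost extra tokens with base models, here is a minimal sketch of a few-shot prompt builder. The helper name, the labels, and the product examples are made up for illustration.

```python
def build_few_shot_prompt(examples, new_input,
                          input_label="Product", output_label="Description"):
    """Build a few-shot prompt from (input, output) example pairs."""
    blocks = [
        f"{input_label}: {example_input}\n{output_label}: {example_output}"
        for example_input, example_output in examples
    ]
    # Leave the final output empty so the model completes it.
    blocks.append(f"{input_label}: {new_input}\n{output_label}:")
    return "\n\n".join(blocks)


# Made-up examples for illustration only.
examples = [
    ("Steel water bottle", "Keeps drinks cold all day, wherever you go."),
    ("Wool socks", "Soft, warm, and built for long winter walks."),
]
prompt = build_few_shot_prompt(examples, "Canvas backpack")
print(prompt)
print(len(prompt.split()), "words in the prompt")
```

Every example is resent with each request, so the prompt (and its cost) grows with the number of examples, which is the trade-off the table notes against fine-tuned models.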