added new techniques

pull/189/head
Elvis Saravia 12 months ago
parent d27d18ff1b
commit 1e4208e419

@ -222,6 +222,7 @@ The current recommendation for `gpt-3.5-turbo-0301` is to add instructions in th
- [Can AI Put Gamma-Ray Astrophysicists Out of a Job?](https://arxiv.org/abs/2303.17853) (March 2023)
- [Comparing Abstractive Summaries Generated by ChatGPT to Real Summaries Through Blinded Reviewers and Text Classification Algorithms](https://arxiv.org/abs/2303.17650) (March 2023)
- [HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace](https://arxiv.org/abs/2303.17580) (March 2023)
- [SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models](https://arxiv.org/abs/2303.08896) (March 2023)
- [WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research](https://arxiv.org/abs/2303.17395) (March 2023)
- [How well do Large Language Models perform in Arithmetic tasks?](https://arxiv.org/abs/2304.02015) (March 2023)
- [Assessing Cross-Cultural Alignment between ChatGPT and Human Societies: An Empirical Study](https://arxiv.org/abs/2303.17466) (March 2023)

@ -74,6 +74,7 @@ The following are the latest papers (sorted by release date) on prompt engineeri
- [Larger language models do in-context learning differently](https://arxiv.org/abs/2303.03846) (March 2023)
- [OpenICL: An Open-Source Framework for In-context Learning](https://arxiv.org/abs/2303.02913) (March 2023)
- [Dynamic Prompting: A Unified Framework for Prompt Tuning](https://arxiv.org/abs/2303.02909) (March 2023)
- [ART: Automatic multi-step reasoning and tool-use for large language models](https://arxiv.org/abs/2303.09014) (March 2023)
- [Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning](https://arxiv.org/abs/2303.02861) (March 2023)
- [Effectiveness of Data Augmentation for Prefix Tuning with Limited Data](https://arxiv.org/abs/2303.02577) (March 2023)
- [Mixture of Soft Prompts for Controllable Data Generation](https://arxiv.org/abs/2303.01580) (March 2023)
@ -122,6 +123,7 @@ The following are the latest papers (sorted by release date) on prompt engineeri
- [ReAct: Synergizing Reasoning and Acting in Language Models](https://arxiv.org/abs/2210.03629) (Oct 2022)
- [Prompting GPT-3 To Be Reliable](https://arxiv.org/abs/2210.09150) (Oct 2022)
- [Decomposed Prompting: A Modular Approach for Solving Complex Tasks](https://arxiv.org/abs/2210.02406) (Oct 2022)
- [Automatic Chain of Thought Prompting in Large Language Models](https://arxiv.org/abs/2210.03493) (Oct 2022)
- [Language Models Are Greedy Reasoners: A Systematic Formal Analysis of Chain-of-Thought](https://arxiv.org/abs/2210.01240v3) (Oct 2022)
- [Evaluating the Susceptibility of Pre-Trained Language Models via Handcrafted Adversarial Examples](https://arxiv.org/abs/2209.02128) (Sep 2022)
- [Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning](https://arxiv.org/abs/2209.14610) (Sep 2022)
@ -172,6 +174,7 @@ The following are the latest papers (sorted by release date) on prompt engineeri
- [Layout and Task Aware Instruction Prompt for Zero-shot Document Image Question Answering](https://arxiv.org/abs/2306.00526) (June 2023)
- [Chain-Of-Thought Prompting Under Streaming Batch: A Case Study](https://arxiv.org/abs/2306.00550) (June 2023)
- [Red Teaming Language Model Detectors with Language Models](https://arxiv.org/abs/2305.19713) (May 2023)
- [Gorilla: Large Language Model Connected with Massive APIs](https://shishirpatil.github.io/gorilla/) (May 2023)
- [Deliberate then Generate: Enhanced Prompting Framework for Text Generation](https://arxiv.org/abs/2305.19835) (May 2023)
- [What does the Failure to Reason with "Respectively" in Zero/Few-Shot Settings Tell Us about Language Models?](https://arxiv.org/abs/2305.19597) (May 2023)
- [ScoNe: Benchmarking Negation Reasoning in Language Models With Fine-Tuning and In-Context Learning](https://arxiv.org/abs/2305.19426) (May 2023)
@ -356,6 +359,7 @@ The following are the latest papers (sorted by release date) on prompt engineeri
- [The Capacity for Moral Self-Correction in Large Language Models](https://arxiv.org/abs/2302.07459) (Feb 2023)
- [Prompting for Multimodal Hateful Meme Classification](https://arxiv.org/abs/2302.04156) (Feb 2023)
- [PLACES: Prompting Language Models for Social Conversation Synthesis](https://arxiv.org/abs/2302.03269) (Feb 2023)
- [Toolformer: Language Models Can Teach Themselves to Use Tools](https://arxiv.org/abs/2302.04761) (Feb 2023)
- [Commonsense-Aware Prompting for Controllable Empathetic Dialogue Generation](https://arxiv.org/abs/2302.01441) (Feb 2023)
- [Crawling the Internal Knowledge-Base of Language Models](https://arxiv.org/abs/2301.12810) (Jan 2023)
- [Legal Prompt Engineering for Multilingual Legal Judgement Prediction](https://arxiv.org/abs/2212.02199) (Dec 2022)

@ -63,6 +63,7 @@
- [Learn Prompting](https://learnprompting.org)
- [Learning Prompt](https://github.com/thinkingjimmy/Learning-Prompt)
- [LINGO : Visually Debiasing Natural Language Instructions to Support Task Diversity](https://arxiv.org/abs/2304.06184)
- [Make PowerPoint presentations with ChatGPT](https://www.reddit.com/r/AIAssisted/comments/13xf8pq/make_powerpoint_presentations_with_chatgpt/)
- [Meet Claude: Anthropic's Rival to ChatGPT](https://scale.com/blog/chatgpt-vs-claude)
- [Methods of prompt programming](https://generative.ink/posts/methods-of-prompt-programming)
- [Mysteries of mode collapse](https://www.lesswrong.com/posts/t9svvNPNmFf5Qa3TA/mysteries-of-mode-collapse)
@ -76,6 +77,7 @@
- [Prompt Engineer: Tech's hottest job title?](https://www.peoplematters.in/article/talent-management/is-prompt-engineering-the-hottest-job-in-ai-today-37036)
- [Prompt Engineering by Lilian Weng](https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/)
- [Prompt Engineering 101 - Introduction and resources](https://www.linkedin.com/pulse/prompt-engineering-101-introduction-resources-amatriain)
- [Prompt Engineering 201: Advanced prompt engineering and toolkits](https://amatriain.net/blog/prompt201)
- [Prompt Engineering 101: Autocomplete, Zero-shot, One-shot, and Few-shot prompting](https://youtube.com/watch?v=v2gD8BHOaX4&feature=shares)
- [Prompt Engineering 101](https://humanloop.com/blog/prompt-engineering-101)
- [Prompt Engineering - A new profession ?](https://www.youtube.com/watch?v=w102J3_9Bcs&ab_channel=PatrickDebois)

@ -4,6 +4,8 @@
"cot": "Prompt cadena de pensament (CoT)",
"consistency": "Autoconsistència",
"knowledge": "Prompt de coneixement generat",
"tot": "Tree of Thoughts",
"art": "Automatic Reasoning and Tool-use",
"ape": "Enginyeria de prompts automàtic (APE)",
"activeprompt": "Prompt actiu",
"dsp": "Prompt d'Estímul dirigit",

@ -4,6 +4,8 @@
"cot": "Chain-of-Thought Prompting",
"consistency": "Self-Consistency",
"knowledge": "Generate Knowledge Prompting",
"tot": "Tree of Thoughts",
"art": "Automatic Reasoning and Tool-use",
"ape": "Automatic Prompt Engineer",
"activeprompt": "Active-Prompt",
"dsp": "Directional Stimulus Prompting",

@ -4,6 +4,8 @@
"cot": "Prompt cadena de pensamiento (CoT)",
"consistency": "Auto-consistencia",
"knowledge": "Prompt de conocimiento generado",
"tot": "Tree of Thoughts",
"art": "Automatic Reasoning and Tool-use",
"ape": "Ingeniería de prompts automático (APE)",
"activeprompt": "Prompt activo",
"dsp": "Prompt de Estímulo direccional",

@ -4,6 +4,8 @@
"cot": "Chain-of-Thought Prompting",
"consistency": "Self-Consistency",
"knowledge": "Generate Knowledge Prompting",
"tot": "Tree of Thoughts",
"art": "Automatic Reasoning and Tool-use",
"ape": "Automatic Prompt Engineer",
"activeprompt": "Active-Prompt",
"dsp": "Directional Stimulus Prompting",

@ -4,6 +4,8 @@
"cot": "Chain-of-Thought Prompting",
"consistency": "Self-Consistency",
"knowledge": "Generate Knowledge Prompting",
"tot": "Tree of Thoughts",
"art": "Automatic Reasoning and Tool-use",
"ape": "Automatic Prompt Engineer",
"activeprompt": "Active-Prompt",
"dsp": "Directional Stimulus Prompting",

@ -4,6 +4,8 @@
"cot": "Prompt Chain-of-Thought",
"consistency": "Self-Consistency",
"knowledge": "Prompt Generate Knowledge",
"tot": "Tree of Thoughts",
"art": "Automatic Reasoning and Tool-use",
"ape": "Automatic Prompt Engineer",
"activeprompt": "Prompt Attivo",
"dsp": "Prompt Directional Stimulus",

@ -4,6 +4,8 @@
"cot": "Chain-of-Thoughtプロンプティング",
"consistency": "自己整合性Self-Consistency",
"knowledge": "知識生成プロンプティング",
"tot": "Tree of Thoughts",
"art": "Automatic Reasoning and Tool-use",
"ape": "自動プロンプトエンジニア",
"activeprompt": "アクティブプロンプト",
"dsp": "方向性刺激プロンプティング",

@ -4,6 +4,8 @@
"cot": "Chain-of-Thought Prompting",
"consistency": "Self-Consistency",
"knowledge": "Generate Knowledge Prompting",
"tot": "Tree of Thoughts",
"art": "Automatic Reasoning and Tool-use",
"ape": "Automatic Prompt Engineer",
"activeprompt": "Active-Prompt",
"dsp": "Directional Stimulus Prompting",

@ -4,6 +4,8 @@
"cot": "Chain-of-Thought Prompting",
"consistency": "Self-Consistency",
"knowledge": "Generate Knowledge Prompting",
"tot": "Tree of Thoughts",
"art": "Automatic Reasoning and Tool-use",
"ape": "Automatic Prompt Engineer",
"activeprompt": "Active-Prompt",
"dsp": "Directional Stimulus Prompting",

@ -4,6 +4,8 @@
"cot": "Chain-of-Thought Prompting",
"consistency": "Self-Consistency",
"knowledge": "Generate Knowledge Prompting",
"tot": "Tree of Thoughts",
"art": "Automatic Reasoning and Tool-use",
"ape": "Automatic Prompt Engineer",
"activeprompt": "Active-Prompt",
"dsp": "Directional Stimulus Prompting",

@ -4,6 +4,8 @@
"cot": "链式思考CoT提示",
"consistency": "自我一致性",
"knowledge": "生成知识提示",
"tot": "Tree of Thoughts",
"art": "Automatic Reasoning and Tool-use",
"ape": "自动提示工程师",
"activeprompt": "Active-Prompt",
"dsp": "方向性刺激提示",

@ -0,0 +1,3 @@
# Automatic Reasoning and Tool-use (ART)
This page needs a translation! Feel free to contribute a translation by clicking the `Edit this page` button on the right side.

@ -0,0 +1,25 @@
# Automatic Reasoning and Tool-use (ART)
import { Callout, FileTree } from 'nextra-theme-docs'
import {Screenshot} from 'components/screenshot'
import ART from '../../img/ART.png'
import ART2 from '../../img/ART2.png'
Combining CoT prompting and tools in an interleaved manner has been shown to be a strong and robust approach to many tasks with LLMs. These approaches typically require hand-crafting task-specific demonstrations and carefully scripted interleaving of model generations with tool use. [Paranjape et al., (2023)](https://arxiv.org/abs/2303.09014) propose a new framework that uses a frozen LLM to automatically generate intermediate reasoning steps as a program.
ART works as follows:
- given a new task, it selects demonstrations of multi-step reasoning and tool use from a task library
- at test time, it pauses generation whenever external tools are called, and integrates their output before resuming generation
ART encourages the model to generalize from demonstrations to decompose a new task and
use tools in appropriate places, in a zero-shot fashion. In addition, ART is extensible as it also enables humans to fix mistakes in the reasoning steps or add new tools by simply updating the task and tool libraries. The process is demonstrated below:
<Screenshot src={ART} alt="ART" />
Image Source: [Paranjape et al., (2023)](https://arxiv.org/abs/2303.09014)
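The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the word-overlap selection, the `[tool] argument` line format, the `[output]` marker, and the `generate_step` callable (a stand-in for a frozen LLM that returns one line at a time) are all assumptions made for this example.

```python
import re
from typing import Callable, Dict, List, Optional

# Assumed line format for a tool call: "[tool_name] argument".
TOOL_CALL = re.compile(r"^\[(\w+)\] (.*)$")


def select_demonstrations(task: str, library: Dict[str, str], k: int = 1) -> List[str]:
    """Pick the k library programs whose task names share the most words with the new task.

    Word overlap is a hypothetical stand-in for the similarity-based selection in the paper.
    """
    task_words = set(task.lower().split())
    ranked = sorted(
        library,
        key=lambda name: len(task_words & set(name.lower().split())),
        reverse=True,
    )
    return [library[name] for name in ranked[:k]]


def run_with_tools(
    generate_step: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    prompt: str,
    max_steps: int = 10,
) -> Optional[str]:
    """Generate line by line, pausing at tool calls to splice in the tool output."""
    transcript = prompt
    for _ in range(max_steps):
        line = generate_step(transcript)
        transcript += "\n" + line
        match = TOOL_CALL.match(line)
        if match is None:
            continue  # plain reasoning step, keep generating
        name, argument = match.groups()
        if name == "answer":
            return argument  # the program has finished
        if name in tools:
            # Pause generation, run the tool, and integrate its output before resuming.
            transcript += "\n[output] " + tools[name](argument)
    return None
```

Adding a new tool in this sketch means adding one entry to the `tools` dictionary, which mirrors the extensibility described above.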
ART substantially improves over few-shot prompting and automatic CoT on unseen tasks in the BigBench and MMLU benchmarks, and exceeds the performance of hand-crafted CoT prompts when human feedback is incorporated.
Below is a table demonstrating ART's performance on BigBench and MMLU tasks:
<Screenshot src={ART2} alt="ART2" />
Image Source: [Paranjape et al., (2023)](https://arxiv.org/abs/2303.09014)

@ -0,0 +1,3 @@
# Automatic Reasoning and Tool-use (ART)
This page needs a translation! Feel free to contribute a translation by clicking the `Edit this page` button on the right side.

@ -0,0 +1,3 @@
# Automatic Reasoning and Tool-use (ART)
This page needs a translation! Feel free to contribute a translation by clicking the `Edit this page` button on the right side.

@ -0,0 +1,3 @@
# Automatic Reasoning and Tool-use (ART)
This page needs a translation! Feel free to contribute a translation by clicking the `Edit this page` button on the right side.

@ -0,0 +1,3 @@
# Automatic Reasoning and Tool-use (ART)
This page needs a translation! Feel free to contribute a translation by clicking the `Edit this page` button on the right side.

@ -0,0 +1,3 @@
# Automatic Reasoning and Tool-use (ART)
This page needs a translation! Feel free to contribute a translation by clicking the `Edit this page` button on the right side.

@ -0,0 +1,3 @@
# Automatic Reasoning and Tool-use (ART)
This page needs a translation! Feel free to contribute a translation by clicking the `Edit this page` button on the right side.

@ -0,0 +1,3 @@
# Automatic Reasoning and Tool-use (ART)
This page needs a translation! Feel free to contribute a translation by clicking the `Edit this page` button on the right side.

@ -0,0 +1,3 @@
# Automatic Reasoning and Tool-use (ART)
This page needs a translation! Feel free to contribute a translation by clicking the `Edit this page` button on the right side.

@ -0,0 +1,3 @@
# Automatic Reasoning and Tool-use (ART)
This page needs a translation! Feel free to contribute a translation by clicking the `Edit this page` button on the right side.

@ -1,6 +1,6 @@
# Self-Consistency
Perhaps one of the more advanced techniques out there for prompt engineering is self-consistency. Proposed by [Wang et al. (2022)](https://arxiv.org/pdf/2203.11171.pdf), self-consistency aims "to replace the naive greedy decoding used in chain-of-thought prompting". The idea is to sample multiple, diverse reasoning paths through few-shot CoT, and use the generations to select the most consistent answer. This helps to boost the performance of CoT prompting on tasks involving arithmetic and commonsense reasoning.
Perhaps one of the more advanced techniques out there for prompt engineering is self-consistency. Proposed by [Wang et al. (2022)](https://arxiv.org/abs/2203.11171), self-consistency aims "to replace the naive greedy decoding used in chain-of-thought prompting". The idea is to sample multiple, diverse reasoning paths through few-shot CoT, and use the generations to select the most consistent answer. This helps to boost the performance of CoT prompting on tasks involving arithmetic and commonsense reasoning.
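The selection step can be sketched as a majority vote over sampled answers. This is a minimal sketch under stated assumptions: `sample_answer` is a hypothetical callable that sends the few-shot CoT prompt to a model with sampling enabled and returns only the final answer extracted from one reasoning path.

```python
from collections import Counter
from typing import Callable, List


def self_consistent_answer(
    sample_answer: Callable[[str], str],
    prompt: str,
    n_samples: int = 5,
) -> str:
    """Sample several reasoning paths and return the most frequent final answer."""
    answers: List[str] = [sample_answer(prompt) for _ in range(n_samples)]
    # most_common(1) returns a list with the single (answer, count) pair that occurs most often.
    return Counter(answers).most_common(1)[0][0]
```

The reasoning paths themselves are discarded here; only the final answers are compared, so two different chains that reach the same answer count as agreement.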
Let's try the following example for arithmetic reasoning:

@ -3,6 +3,7 @@
import {Screenshot} from 'components/screenshot'
import COT from '../../img/cot.png'
import ZEROCOT from '../../img/zero-cot.png'
import AUTOCOT from '../../img/auto-cot.png'
## Chain-of-Thought (CoT) Prompting
@ -89,4 +90,23 @@ Then you bought 5 more apples, so now you had 11 apples.
Finally, you ate 1 apple, so you would remain with 10 apples.
```
It's impressive that this simple prompt is effective at this task. This is particularly useful where you don't have too many examples to use in the prompt.
It's impressive that this simple prompt is effective at this task. This is particularly useful where you don't have too many examples to use in the prompt.
## Automatic Chain-of-Thought (Auto-CoT)
When applying chain-of-thought prompting with demonstrations, the process involves hand-crafting effective and diverse examples. This manual effort could lead to suboptimal solutions. [Zhang et al. (2022)](https://arxiv.org/abs/2210.03493) propose an approach to eliminate manual effort by leveraging LLMs with the "Let's think step by step" prompt to generate reasoning chains for demonstrations one by one. This automatic process can still end up with mistakes in the generated chains. To mitigate the effects of these mistakes, the diversity of demonstrations matters. This work proposes Auto-CoT, which samples questions with diversity and generates reasoning chains to construct the demonstrations.
Auto-CoT consists of two main stages:
- Stage 1): **question clustering**: partition questions of a given dataset into a few clusters
- Stage 2): **demonstration sampling**: select a representative question from each cluster and generate its reasoning chain using Zero-Shot-CoT with simple heuristics
The simple heuristics could be the length of questions (e.g., 60 tokens) and the number of steps in the rationale (e.g., 5 reasoning steps). This encourages the model to use simple and accurate demonstrations.
The process is illustrated below:
<Screenshot src={AUTOCOT} alt="AUTOCOT" />
Image Source: [Zhang et al. (2022)](https://arxiv.org/abs/2210.03493)
Code for Auto-CoT is available [here](https://github.com/amazon-science/auto-cot).
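As a rough illustration of the demonstration sampling stage, the sketch below assumes the questions have already been clustered (Stage 1) and that each cluster lists its most representative questions first. It is not the official implementation: `zero_shot_cot` is a hypothetical callable wrapping an LLM, tokens are approximated by whitespace-separated words, and reasoning steps are approximated by non-empty lines.

```python
from typing import Callable, List

TRIGGER = "Let's think step by step."


def count_steps(rationale: str) -> int:
    """Approximate the number of reasoning steps as the number of non-empty lines."""
    return sum(1 for line in rationale.splitlines() if line.strip())


def build_demonstrations(
    clusters: List[List[str]],
    zero_shot_cot: Callable[[str], str],
    max_question_tokens: int = 60,
    max_steps: int = 5,
) -> List[str]:
    """Keep one demonstration per cluster: the first question that passes both heuristics."""
    demonstrations = []
    for questions in clusters:
        for question in questions:
            # Heuristic 1: skip long questions (word count stands in for token count).
            if len(question.split()) > max_question_tokens:
                continue
            rationale = zero_shot_cot(f"Q: {question}\nA: {TRIGGER}")
            # Heuristic 2: skip rationales with too many reasoning steps.
            if count_steps(rationale) <= max_steps:
                demonstrations.append(f"Q: {question}\nA: {TRIGGER} {rationale}")
                break
    return demonstrations
```

The returned strings can be concatenated and placed in front of a new question to form the final few-shot prompt. A cluster contributes no demonstration if none of its questions passes the heuristics.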

@ -0,0 +1,3 @@
# Tree of Thoughts (ToT)
This page needs a translation! Feel free to contribute a translation by clicking the `Edit this page` button on the right side.

@ -0,0 +1,28 @@
# Tree of Thoughts (ToT)
import { Callout, FileTree } from 'nextra-theme-docs'
import {Screenshot} from 'components/screenshot'
import TOT from '../../img/TOT.png'
import TOT2 from '../../img/TOT2.png'
import TOT3 from '../../img/TOT3.png'
For complex tasks that require exploration or strategic lookahead, traditional or simple prompting techniques fall short. [Yao et al. (2023)](https://arxiv.org/abs/2305.10601) recently proposed Tree of Thoughts (ToT), a framework that generalizes over chain-of-thought prompting and encourages exploration over thoughts that serve as intermediate steps for general problem solving with language models.
ToT maintains a tree of thoughts, where thoughts represent coherent language sequences that serve as intermediate steps toward solving a problem. This approach enables an LM to self-evaluate the progress intermediate thoughts make towards solving a problem through a deliberate reasoning process. The LM's ability to generate and evaluate thoughts is then combined with search algorithms (e.g., breadth-first search and depth-first search) to enable systematic exploration of thoughts with lookahead and backtracking.
The ToT framework is illustrated below:
<Screenshot src={TOT} alt="TOT" />
Image Source: [Yao et al. (2023)](https://arxiv.org/abs/2305.10601)
When using ToT, different tasks require defining the number of candidates and the number of thoughts/steps. For instance, as demonstrated in the paper, Game of 24 is used as a mathematical reasoning task which requires decomposing the thoughts into 3 steps, each involving an intermediate equation. At each step, the best b=5 candidates are kept.
To perform BFS in ToT for the Game of 24 task, the LM is prompted to evaluate each thought candidate as "sure/maybe/impossible" with regard to reaching 24. As stated by the authors, "the aim is to promote correct partial solutions that can be verdicted within few lookahead trials, and eliminate impossible partial solutions based on "too big/small" commonsense, and keep the rest "maybe"". Values are sampled 3 times for each thought. The process is illustrated below:
<Screenshot src={TOT2} alt="TOT2" />
Image Source: [Yao et al. (2023)](https://arxiv.org/abs/2305.10601)
From the results reported in the figure below, ToT substantially outperforms the other prompting methods:
<Screenshot src={TOT3} alt="TOT3" />
Image Source: [Yao et al. (2023)](https://arxiv.org/abs/2305.10601)
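The BFS variant described above can be sketched as follows. This is a simplified illustration rather than the authors' code: `propose` and `evaluate` are hypothetical callables standing in for the LM prompts that generate and judge thoughts, and the numeric weights attached to "sure/maybe/impossible" are assumptions chosen only to make the ranking concrete.

```python
from typing import Callable, List

# Illustrative weights for the three verdicts; the exact values are an assumption.
VALUE_SCORES = {"sure": 1.0, "maybe": 0.5, "impossible": 0.0}


def tot_bfs(
    root: str,
    propose: Callable[[str], List[str]],
    evaluate: Callable[[str], str],
    steps: int = 3,
    breadth: int = 5,
    n_evals: int = 3,
) -> List[str]:
    """Breadth-first search over thoughts, keeping the best `breadth` candidates per step."""
    frontier = [root]
    for _ in range(steps):
        # Expand every kept state into its candidate next thoughts.
        candidates = [child for state in frontier for child in propose(state)]
        if not candidates:
            return []
        # Sample the verdict n_evals times per candidate and sum the scores.
        scored = [
            (sum(VALUE_SCORES[evaluate(candidate)] for _ in range(n_evals)), candidate)
            for candidate in candidates
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        frontier = [candidate for _, candidate in scored[:breadth]]
    return frontier
```

With the defaults `steps=3`, `breadth=5`, and `n_evals=3`, the loop matches the Game of 24 settings quoted above: three thought steps, the best five candidates kept at each step, and three value samples per thought.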

@ -0,0 +1,3 @@
# Tree of Thoughts (ToT)
This page needs a translation! Feel free to contribute a translation by clicking the `Edit this page` button on the right side.

@ -0,0 +1,3 @@
# Tree of Thoughts (ToT)
This page needs a translation! Feel free to contribute a translation by clicking the `Edit this page` button on the right side.

@ -0,0 +1,3 @@
# Tree of Thoughts (ToT)
This page needs a translation! Feel free to contribute a translation by clicking the `Edit this page` button on the right side.

@ -0,0 +1,3 @@
# Tree of Thoughts (ToT)
This page needs a translation! Feel free to contribute a translation by clicking the `Edit this page` button on the right side.

@ -0,0 +1,3 @@
# Tree of Thoughts (ToT)
This page needs a translation! Feel free to contribute a translation by clicking the `Edit this page` button on the right side.

@ -0,0 +1,3 @@
# Tree of Thoughts (ToT)
This page needs a translation! Feel free to contribute a translation by clicking the `Edit this page` button on the right side.

@ -0,0 +1,3 @@
# Tree of Thoughts (ToT)
This page needs a translation! Feel free to contribute a translation by clicking the `Edit this page` button on the right side.

@ -0,0 +1,3 @@
# Tree of Thoughts (ToT)
This page needs a translation! Feel free to contribute a translation by clicking the `Edit this page` button on the right side.

@ -0,0 +1,3 @@
# Tree of Thoughts (ToT)
This page needs a translation! Feel free to contribute a translation by clicking the `Edit this page` button on the right side.

@ -12,6 +12,7 @@
- [EmergentMind](https://www.emergentmind.com)
- [EveryPrompt](https://www.everyprompt.com)
- [Guardrails](https://github.com/ShreyaR/guardrails)
- [Guidance](https://github.com/microsoft/guidance)
- [GPT Index](https://github.com/jerryjliu/gpt_index)
- [GPTTools](https://gpttools.com/comparisontool)
- [hwchase17/adversarial-prompts](https://github.com/hwchase17/adversarial-prompts)
