From 7b3cac938ca440f6fc832e98b7f764d409be9cf8 Mon Sep 17 00:00:00 2001 From: Elvis Saravia Date: Wed, 24 May 2023 11:40:02 -0600 Subject: [PATCH] fix --- pages/models/chatgpt.en.mdx | 24 ++++++++ pages/models/gpt-4.en.mdx | 9 +++ pages/papers.en.mdx | 115 +++++++++++++++++++++++++++++++++++- 3 files changed, 147 insertions(+), 1 deletion(-) diff --git a/pages/models/chatgpt.en.mdx b/pages/models/chatgpt.en.mdx index 760653c..fc429ea 100644 --- a/pages/models/chatgpt.en.mdx +++ b/pages/models/chatgpt.en.mdx @@ -145,6 +145,30 @@ The current recommendation for `gpt-3.5-turbo-0301` is to add instructions in th --- ## References +- [Narrative XL: A Large-scale Dataset For Long-Term Memory Models](https://arxiv.org/abs/2305.13877) (May 2023) +- [Does ChatGPT have Theory of Mind?](https://arxiv.org/abs/2305.14020) (May 2023) +- [ZeroSCROLLS: A Zero-Shot Benchmark for Long Text Understanding](https://arxiv.org/abs/2305.14196) (May 2023) +- [Navigating Prompt Complexity for Zero-Shot Classification: A Study of Large Language Models in Computational Social Science](https://arxiv.org/abs/2305.14310) (May 2023) +- [ChatGPT-EDSS: Empathetic Dialogue Speech Synthesis Trained from ChatGPT-derived Context Word Embeddings](https://arxiv.org/abs/2305.13724) (May 2023) +- [Can LLMs facilitate interpretation of pre-trained language models?](https://arxiv.org/abs/2305.13386) (May 2023) +- [Can ChatGPT Detect Intent? Evaluating Large Language Models for Spoken Language Understanding](https://arxiv.org/abs/2305.13512) (May 2023) +- [LLM-empowered Chatbots for Psychiatrist and Patient Simulation: Application and Evaluation](https://arxiv.org/abs/2305.13614) (May 2023) +- [ChatGPT as your Personal Data Scientist](https://arxiv.org/abs/2305.13657) (May 2023) +- [Are Large Language Models Good Evaluators for Abstractive Summarization?](https://arxiv.org/abs/2305.13091) (May 2023) +- [Can ChatGPT Defend the Truth? Automatic Dialectical Evaluation Elicits LLMs' Deficiencies in Reasoning](https://arxiv.org/abs/2305.13160) (May 2023) +- [Evaluating ChatGPT's Performance for Multilingual and Emoji-based Hate Speech Detection](https://arxiv.org/abs/2305.13276) (May 2023) +- [ChatGPT to Replace Crowdsourcing of Paraphrases for Intent Classification: Higher Diversity and Comparable Model Robustness](https://arxiv.org/abs/2305.12947) (May 2023) +- [Distilling ChatGPT for Explainable Automated Student Answer Assessment](https://arxiv.org/abs/2305.12962) (May 2023) +- [Prompt ChatGPT In MNER: Improved multimodal named entity recognition method based on auxiliary refining knowledge from ChatGPT](https://arxiv.org/abs/2305.12212) (May 2023) +- [ChatGPT Is More Likely to Be Perceived as Male Than Female](https://arxiv.org/abs/2305.12564) (May 2023) +- [Observations on LLMs for Telecom Domain: Capabilities and Limitations](https://arxiv.org/abs/2305.13102) (May 2023) +- [Bits of Grass: Does GPT already know how to write like Whitman?](https://arxiv.org/abs/2305.11064) (May 2023) +- [Are Large Language Models Fit For Guided Reading?](https://arxiv.org/abs/2305.10645) (May 2023) +- [ChatGPT Perpetuates Gender Bias in Machine Translation and Ignores Non-Gendered Pronouns: Findings across Bengali and Five other Low-Resource Languages](https://arxiv.org/abs/2305.10510) (May 2023) +- [BAD: BiAs Detection for Large Language Models in the context of candidate screening](https://arxiv.org/abs/2305.10407) (May 2023) +- [MemoryBank: Enhancing Large Language Models with Long-Term Memory](https://arxiv.org/abs/2305.10250) (May 2023) +- [Knowledge Graph Completion Models are Few-shot Learners: An Empirical Study of Relation Labeling in E-commerce with LLMs](https://arxiv.org/abs/2305.09858) (May 2023) +- [A Preliminary Analysis on the Code Generation Capabilities of GPT-3.5 and Bard AI Models for Java Functions](https://arxiv.org/abs/2305.09402) (May 2023) - [ChatGPT-4 Outperforms Experts and Crowd Workers in Annotating Political Twitter Messages with Zero-Shot Learning](https://arxiv.org/abs/2304.06588) (April 2023) - [ChatGPT Beyond English: Towards a Comprehensive Evaluation of Large Language Models in Multilingual Learning](https://arxiv.org/abs/2304.05613) (April 2023) - [Distinguishing ChatGPT(-3.5, -4)-generated and human-written papers through Japanese stylometric analysis](https://arxiv.org/abs/2304.05534) (April 2023) diff --git a/pages/models/gpt-4.en.mdx b/pages/models/gpt-4.en.mdx index 11bffa0..7fe7f2e 100644 --- a/pages/models/gpt-4.en.mdx +++ b/pages/models/gpt-4.en.mdx @@ -160,6 +160,15 @@ Coming soon! ## References / Papers +- [Goat: Fine-tuned LLaMA Outperforms GPT-4 on Arithmetic Tasks](https://arxiv.org/abs/2305.14201) (May 2023) +- [How Language Model Hallucinations Can Snowball](https://arxiv.org/abs/2305.13534) (May 2023) +- [LLMs for Knowledge Graph Construction and Reasoning: Recent Capabilities and Future Opportunities](https://arxiv.org/abs/2305.13168) (May 2023) +- [GPT-3.5 vs GPT-4: Evaluating ChatGPT's Reasoning Performance in Zero-shot Learning](https://arxiv.org/abs/2305.12477) (May 2023) +- [TheoremQA: A Theorem-driven Question Answering dataset](https://arxiv.org/abs/2305.12524) (May 2023) +- [Experimental results from applying GPT-4 to an unpublished formal language](https://arxiv.org/abs/2305.12196) (May 2023) +- [LogiCoT: Logical Chain-of-Thought Instruction-Tuning Data Collection with GPT-4](https://arxiv.org/abs/2305.12147) (May 2023) +- [Large-Scale Text Analysis Using Generative Language Models: A Case Study in Discovering Public Value Expressions in AI Patents](https://arxiv.org/abs/2305.10383) (May 2023) +- [Can Language Models Solve Graph Problems in Natural Language?]https://arxiv.org/abs/2305.10037) (May 2023) - [chatIPCC: Grounding Conversational AI in Climate Science](https://arxiv.org/abs/2304.05510) (April 2023) - [Galactic ChitChat: Using Large Language Models to Converse with Astronomy Literature](https://arxiv.org/abs/2304.05406) (April 2023) - [Emergent autonomous scientific research capabilities of large language models](https://arxiv.org/abs/2304.05332) (April 2023) diff --git a/pages/papers.en.mdx b/pages/papers.en.mdx index bbc9c75..df1cfea 100644 --- a/pages/papers.en.mdx +++ b/pages/papers.en.mdx @@ -4,6 +4,7 @@ The following are the latest papers (sorted by release date) on prompt engineeri ## Overviews + - [Jailbreaking ChatGPT via Prompt Engineering: An Empirical Study](https://arxiv.org/abs/2305.13860) (May 2023) - [Tool Learning with Foundation Models](https://arxiv.org/abs/2304.08354) (April 2023) - [One Small Step for Generative AI, One Giant Leap for AGI: A Complete Survey on ChatGPT in AIGC Era](https://arxiv.org/abs/2304.06488) (April 2023) - [A Bibliometric Review of Large Language Models Research from 2017 to 2023](https://arxiv.org/abs/2304.02020) (April 2023) @@ -18,7 +19,32 @@ The following are the latest papers (sorted by release date) on prompt engineeri - [Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing](https://arxiv.org/abs/2107.13586) (Jul 2021) ## Approaches - + + - []() (May 2023) + - []() (May 2023) + - [Self-Critique Prompting with Large Language Models for Inductive Instructions](https://arxiv.org/abs/2305.13733) (May 2023) + - [Better Zero-Shot Reasoning with Self-Adaptive Prompting](https://arxiv.org/abs/2305.14106) (May 2023) + - [Hierarchical Prompting Assists Large Language Model on Web Navigation](https://arxiv.org/abs/2305.14257) (May 2023) + - [Interactive Natural Language Processing](https://arxiv.org/abs/2305.13246) (May 2023) + - [Can We Edit Factual Knowledge by In-Context Learning?](https://arxiv.org/abs/2305.12740) (May 2023) + - [In-Context Learning of Large Language Models Explained as Kernel Regression](https://arxiv.org/abs/2305.12766) (May 2023) + - [Meta-in-context learning in large language models](https://arxiv.org/abs/2305.12907) (May 2023) + - [Let's Sample Step by Step: Adaptive-Consistency for Efficient Reasoning with LLMs](https://arxiv.org/abs/2305.11860) (May 2023) + - [Post Hoc Explanations of Language Models Can Improve Language Models](https://arxiv.org/abs/2305.11426) (May 2023) + - [Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt](https://arxiv.org/abs/2305.11186) (May 2023) + - [TreePrompt: Learning to Compose Tree Prompts for Explainable Visual Grounding](https://arxiv.org/abs/2305.11497) (May 2023) + - [TELeR: A General Taxonomy of LLM Prompts for Benchmarking Complex Tasks](https://arxiv.org/abs/2305.11430) (May 2023) + - [Efficient Prompting via Dynamic In-Context Learning](https://arxiv.org/abs/2305.11170) (May 2023) + - [The Web Can Be Your Oyster for Improving Large Language Models](https://arxiv.org/abs/2305.10998) (May 2023) + - [Flatness-Aware Prompt Selection Improves Accuracy and Sample Efficiency](https://arxiv.org/abs/2305.10713) (May 2023) + - [Tree of Thoughts: Deliberate Problem Solving with Large Language Models](https://arxiv.org/abs/2305.10601) (May 2023) + - [ZeroPrompt: Streaming Acoustic Encoders are Zero-Shot Masked LMs](https://arxiv.org/abs/2305.10649) (May 2023) + - [Chain-of-Symbol Prompting Elicits Planning in Large Langauge Models](https://arxiv.org/abs/2305.10276) (May 2023) + - [CooK: Empowering General-Purpose Language Models with Modular and Collaborative Knowledge](https://arxiv.org/abs/2305.09955) (May 2023) + - [What In-Context Learning "Learns" In-Context: Disentangling Task Recognition and Task Learning](https://arxiv.org/abs/2305.09731) (May 2023) + - [Reprompting: Automated Chain-of-Thought Prompt Inference Through Gibbs Sampling](https://arxiv.org/abs/2305.09993) (May 2023) + - [Satisfiability-Aided Language Models Using Declarative Prompting](https://arxiv.org/abs/2305.09656) (May 2023) + - [Pre-Training to Learn in Context](https://arxiv.org/abs/2305.09137) (May 2023) - [Boosted Prompt Ensembles for Large Language Models](https://arxiv.org/abs/2304.05970) (April 2023) - [Global Prompt Cell: A Portable Control Module for Effective Prompt](https://arxiv.org/abs/2304.05642) (April 2023) - [Why think step-by-step? Reasoning emerges from the locality of experience](https://arxiv.org/abs/2304.03843) (April 2023) @@ -130,6 +156,93 @@ The following are the latest papers (sorted by release date) on prompt engineeri ## Applications + - [PromptClass: Weakly-Supervised Text Classification with Prompting Enhanced Noise-Robust Self-Training](https://arxiv.org/abs/2305.13723) (May 2023) + - [Aligning Large Language Models through Synthetic Feedback](https://arxiv.org/abs/2305.13735) (May 2023) + - [Concept-aware Training Improves In-context Learning Ability of Language Models](https://arxiv.org/abs/2305.13775) (May 2023) + - [Enhancing Black-Box Few-Shot Text Classification with Prompt-Based Data Augmentation](https://arxiv.org/abs/2305.13785) (May 2023) + - [Detecting automatically the layout of clinical documents to enhance the performances of downstream natural language processing](https://arxiv.org/abs/2305.13817) (May 2023) + - ["Is the Pope Catholic?" Applying Chain-of-Thought Reasoning to Understanding Conversational Implicatures](https://arxiv.org/abs/2305.13826) (May 2023) + - [Let's Think Frame by Frame: Evaluating Video Chain of Thought with Video Infilling and Prediction](https://arxiv.org/abs/2305.13903) (May 2023) + - [Generating Data for Symbolic Language with Large Language Models](https://arxiv.org/abs/2305.13917) (May 2023) + - [Make a Choice! Knowledge Base Question Answering with In-Context Learning](https://arxiv.org/abs/2305.13972) (May 2023) + - [Improving Language Models via Plug-and-Play Retrieval Feedback](https://arxiv.org/abs/2305.14002) (May 2023) + - [Multi-Granularity Prompts for Topic Shift Detection in Dialogue](https://arxiv.org/abs/2305.14006) (May 2023) + - [The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning](https://arxiv.org/abs/2305.14045) (May 2023) + - [Can Language Models Understand Physical Concepts?](https://arxiv.org/abs/2305.14057) (May 2023) + - [Evaluating Factual Consistency of Summaries with Large Language Models](https://arxiv.org/abs/2305.14069) (May 2023) + - [Dr.ICL: Demonstration-Retrieved In-context Learning](https://arxiv.org/abs/2305.14128) (May 2023) + - [Probing in Context: Toward Building Robust Classifiers via Probing Large Language Models](https://arxiv.org/abs/2305.14171) (May 2023) + - [Skill-Based Few-Shot Selection for In-Context Learning](https://arxiv.org/abs/2305.14210) (May 2023) + - [Exploring Chain-of-Thought Style Prompting for Text-to-SQL](https://arxiv.org/abs/2305.14215) (May 2023) + - [Enhancing Chat Language Models by Scaling High-quality Instructional Conversations](https://arxiv.org/abs/2305.14233) (May 2023) + - [On Learning to Summarize with Large Language Models as References](https://arxiv.org/abs/2305.14239) (May 2023) + - [Learning to Generate Novel Scientific Directions with Contextualized Literature-based Discovery](https://arxiv.org/abs/2305.14259) (May 2023) + - [Active Learning Principles for In-Context Learning with Large Language Models](https://arxiv.org/abs/2305.14264) (May 2023) + - [Two Failures of Self-Consistency in the Multi-Step Reasoning of LLMs](https://arxiv.org/abs/2305.14279) (May 2023) + - [Improving Factuality and Reasoning in Language Models through Multiagent Debate](https://arxiv.org/abs/2305.14325) (May 2023) + - [ChatCoT: Tool-Augmented Chain-of-Thought Reasoning on\\ Chat-based Large Language Models](https://arxiv.org/abs/2305.14323) (May 2023) + - [WikiChat: A Few-Shot LLM-Based Chatbot Grounded with Wikipedia](https://arxiv.org/abs/2305.14292) (May 2023) + - [Query Rewriting for Retrieval-Augmented Large Language Models](https://arxiv.org/abs/2305.14283) (May 2023) + - [Discrete Prompt Optimization via Constrained Generation for Zero-shot Re-ranker](https://arxiv.org/abs/2305.13729) (May 2023) + - [Element-aware Summarization with Large Language Models: Expert-aligned Evaluation and Chain-of-Thought Method](https://arxiv.org/abs/2305.13412) (May 2023) + - [Small Language Models Improve Giants by Rewriting Their Outputs](https://arxiv.org/abs/2305.13514) (May 2023) + - [Prompting and Evaluating Large Language Models for Proactive Dialogues: Clarification, Target-guided, and Non-collaboration](https://arxiv.org/abs/2305.13626) (May 2023) + - [Prompt-Based Monte-Carlo Tree Search for Goal-Oriented Dialogue Policy Planning](https://arxiv.org/abs/2305.13660) (May 2023) + - [Mitigating Language Model Hallucination with Interactive Question-Knowledge Alignment](https://arxiv.org/abs/2305.13669) (May 2023) + - [Making Language Models Better Tool Learners with Execution Feedback](https://arxiv.org/abs/2305.13068) (May 2023) + - [Text-to-SQL Error Correction with Language Models of Code](https://arxiv.org/abs/2305.13073) (May 2023) + - [Decomposed Prompting for Machine Translation Between Related Languages using Large Language Models](https://arxiv.org/abs/2305.13085) (May 2023) + - [SPARSEFIT: Few-shot Prompting with Sparse Fine-tuning for Jointly Generating Predictions and Natural Language Explanations](https://arxiv.org/abs/2305.13235) (May 2023) + - ["According to ..." Prompting Language Models Improves Quoting from Pre-Training Data](https://arxiv.org/abs/2305.13252) (May 2023) + - [Prompt-based methods may underestimate large language models' linguistic generalizations](https://arxiv.org/abs/2305.13264) (May 2023) + - [Chain of Knowledge: A Framework for Grounding Large Language Models with Structured Knowledge Bases](https://arxiv.org/abs/2305.13269) (May 2023) + - [Measuring Inductive Biases of In-Context Learning with Underspecified Demonstrations](https://arxiv.org/abs/2305.13299) (May 2023) + - [Automated Few-shot Classification with Instruction-Finetuned Language Models](https://arxiv.org/abs/2305.12576) (May 2023) + - [Enhancing Few-shot Text-to-SQL Capabilities of Large Language Models: A Study on Prompt Design Strategies](https://arxiv.org/abs/2305.12586) (May 2023) + - [MvP: Multi-view Prompting Improves Aspect Sentiment Tuple Prediction](https://arxiv.org/abs/2305.12627) (May 2023) + - [Learning Interpretable Style Embeddings via Prompting LLMs](https://arxiv.org/abs/2305.12696) (May 2023) + - [Enhancing Small Medical Learners with Privacy-preserving Contextual Prompting](https://arxiv.org/abs/2305.12723) (May 2023) + - [Fact-Checking Complex Claims with Program-Guided Reasoning](https://arxiv.org/abs/2305.12744) (May 2023) + - [A Benchmark on Extremely Weakly Supervised Text Classification: Reconcile Seed Matching and Prompting Approaches](https://arxiv.org/abs/2305.12749) (May 2023) + - [This Prompt is Measuring : Evaluating Bias Evaluation in Language Models](https://arxiv.org/abs/2305.12757) (May 2023) + - [Enhancing Cross-lingual Natural Language Inference by Soft Prompting with Multilingual Verbalizer](https://arxiv.org/abs/2305.12761) (May 2023) + - [Evaluating Prompt-based Question Answering for Object Prediction in the Open Research Knowledge Graph](https://arxiv.org/abs/2305.12900) (May 2023) + - [Explaining How Transformers Use Context to Build Predictions](https://arxiv.org/abs/2305.12535) (May 2023) + - [PiVe: Prompting with Iterative Verification Improving Graph-based Generative Capability of LLMs](https://arxiv.org/abs/2305.12392) (May 2023) + - [PromptNER: A Prompting Method for Few-shot Named Entity Recognition via k Nearest Neighbor Search](https://arxiv.org/abs/2305.12217) (May 2023) + - [Logic-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning](https://arxiv.org/abs/2305.12295) (May 2023) + - [Enhancing Few-shot NER with Prompt Ordering based Data Augmentation](https://arxiv.org/abs/2305.11791) (May 2023) + - [Chain-of-thought prompting for responding to in-depth dialogue questions with LLM](https://arxiv.org/abs/2305.11792) (May 2023) + - [How to Prompt LLMs for Text-to-SQL: A Study in Zero-shot, Single-domain, and Cross-domain Settings](https://arxiv.org/abs/2305.11853) (May 2023) + - [Evaluation of medium-large Language Models at zero-shot closed book generative question answering](https://arxiv.org/abs/2305.11991) (May 2023) + - [Few-Shot Dialogue Summarization via Skeleton-Assisted Prompt Transfer](https://arxiv.org/abs/2305.12077) (May 2023) + - [Can NLP Models Correctly Reason Over Contexts that Break the Common Assumptions?](https://arxiv.org/abs/2305.12096) (May 2023) + - [Reasoning Implicit Sentiment with Chain-of-Thought Prompting](https://arxiv.org/abs/2305.11255) (May 2023) + - [Writing your own book: A method for going from closed to open book QA to improve robustness and performance of smaller LLMs](https://arxiv.org/abs/2305.11334) (May 2023) + - [AutoTrial: Prompting Language Models for Clinical Trial Design](https://arxiv.org/abs/2305.11366) (May 2023) + - [CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing](https://arxiv.org/abs/2305.11738) (May 2023) + - [Controlling the Extraction of Memorized Data from Large Language Models via Prompt-Tuning](https://arxiv.org/abs/2305.11759) (May 2023) + - [Prompting with Pseudo-Code Instructions](https://arxiv.org/abs/2305.11790) (May 2023) + - [TrueTeacher: Learning Factual Consistency Evaluation with Large Language Models](https://arxiv.org/abs/2305.11171) (May 2023) + - [Aligning Instruction Tasks Unlocks Large Language Models as Zero-Shot Relation Extractors](https://arxiv.org/abs/2305.11159) (May 2023) + - [Exploiting Biased Models to De-bias Text: A Gender-Fair Rewriting Model](https://arxiv.org/abs/2305.11140) (May 2023) + - [Learning In-context Learning for Named Entity Recognition](https://arxiv.org/abs/2305.11038) (May 2023) + - [Take a Break in the Middle: Investigating Subgoals towards Hierarchical Script Generation](https://arxiv.org/abs/2305.10907) (May 2023) + - [TEPrompt: Task Enlightenment Prompt Learning for Implicit Discourse Relation Recognition](https://arxiv.org/abs/2305.10866) (May 2023) + - [Large Language Models can be Guided to Evade AI-Generated Text Detection](https://arxiv.org/abs/2305.10847) (May 2023) + - [Temporal Knowledge Graph Forecasting Without Knowledge Using In-Context Learning](https://arxiv.org/abs/2305.10613) (May 2023) + - [Prompting the Hidden Talent of Web-Scale Speech Models for Zero-Shot Task Generalization](https://arxiv.org/abs/2305.11095) (May 2023) + - [Think Outside the Code: Brainstorming Boosts Large Language Models in Code Generation](https://arxiv.org/abs/2305.10679) (May 2023) + - [Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback](https://arxiv.org/abs/2305.10142) (May 2023) + - [ConvXAI: Delivering Heterogeneous AI Explanations via Conversations to Support Human-AI Scientific Writing](https://arxiv.org/abs/2305.09770) (May 2023) + - [StructGPT: A General Framework for Large Language Model to Reason over Structured Data](https://arxiv.org/abs/2305.09645) (May 2023) + - [Towards Expert-Level Medical Question Answering with Large Language Models](https://arxiv.org/abs/2305.09617) (May 2023) + - [Large Language Models are Built-in Autoregressive Search Engines](https://arxiv.org/abs/2305.09612) (May 2023) + - [MsPrompt: Multi-step Prompt Learning for Debiasing Few-shot Event Detection](https://arxiv.org/abs/2305.09335) (May 2023) + - [Exploring the Impact of Layer Normalization for Zero-shot Neural Machine Translation](https://arxiv.org/abs/2305.09312) (May 2023) + - [SGP-TOD: Building Task Bots Effortlessly via Schema-Guided LLM Prompting](https://arxiv.org/abs/2305.09067) (May 2023) + - [Multi-modal Visual Understanding with Prompts for Semantic Information Disentanglement of Image](https://arxiv.org/abs/2305.09333) (May 2023) + - [Soft Prompt Decoding for Multilingual Dense Retrieval](https://arxiv.org/abs/2305.09025) (May 2023) - [PaLM 2 Technical Report](https://ai.google/static/documents/palm2techreport.pdf) (May 2023) - [Are LLMs All You Need for Task-Oriented Dialogue?](https://arxiv.org/abs/2304.06556) (April 2023) - [HiPrompt: Few-Shot Biomedical Knowledge Fusion via Hierarchy-Oriented Prompting](https://arxiv.org/abs/2304.05973) (April 2023)