fix

1 year ago · 7b3cac938c
parent da026ad4e4
commit 7b3cac938c
3 changed files with 147 additions and 1 deletions
--- a/pages/models/chatgpt.en.mdx
+++ b/pages/models/chatgpt.en.mdx
@ -145,6 +145,30 @@ The current recommendation for `gpt-3.5-turbo-0301` is to add instructions in th
 ---
 ## References

+- [Narrative XL: A Large-scale Dataset For Long-Term Memory Models](https://arxiv.org/abs/2305.13877) (May 2023)
+- [Does ChatGPT have Theory of Mind?](https://arxiv.org/abs/2305.14020) (May 2023)
+- [ZeroSCROLLS: A Zero-Shot Benchmark for Long Text Understanding](https://arxiv.org/abs/2305.14196) (May 2023)
+- [Navigating Prompt Complexity for Zero-Shot Classification: A Study of Large Language Models in Computational Social Science](https://arxiv.org/abs/2305.14310) (May 2023)
+- [ChatGPT-EDSS: Empathetic Dialogue Speech Synthesis Trained from ChatGPT-derived Context Word Embeddings](https://arxiv.org/abs/2305.13724) (May 2023)
+- [Can LLMs facilitate interpretation of pre-trained language models?](https://arxiv.org/abs/2305.13386) (May 2023)
+- [Can ChatGPT Detect Intent? Evaluating Large Language Models for Spoken Language Understanding](https://arxiv.org/abs/2305.13512) (May 2023)
+- [LLM-empowered Chatbots for Psychiatrist and Patient Simulation: Application and Evaluation](https://arxiv.org/abs/2305.13614) (May 2023)
+- [ChatGPT as your Personal Data Scientist](https://arxiv.org/abs/2305.13657) (May 2023)
+- [Are Large Language Models Good Evaluators for Abstractive Summarization?](https://arxiv.org/abs/2305.13091) (May 2023)
+- [Can ChatGPT Defend the Truth? Automatic Dialectical Evaluation Elicits LLMs' Deficiencies in Reasoning](https://arxiv.org/abs/2305.13160) (May 2023)
+- [Evaluating ChatGPT's Performance for Multilingual and Emoji-based Hate Speech Detection](https://arxiv.org/abs/2305.13276) (May 2023)
+- [ChatGPT to Replace Crowdsourcing of Paraphrases for Intent Classification: Higher Diversity and Comparable Model Robustness](https://arxiv.org/abs/2305.12947) (May 2023)
+- [Distilling ChatGPT for Explainable Automated Student Answer Assessment](https://arxiv.org/abs/2305.12962) (May 2023)
+- [Prompt ChatGPT In MNER: Improved multimodal named entity recognition method based on auxiliary refining knowledge from ChatGPT](https://arxiv.org/abs/2305.12212) (May 2023)
+- [ChatGPT Is More Likely to Be Perceived as Male Than Female](https://arxiv.org/abs/2305.12564) (May 2023)
+- [Observations on LLMs for Telecom Domain: Capabilities and Limitations](https://arxiv.org/abs/2305.13102) (May 2023)
+- [Bits of Grass: Does GPT already know how to write like Whitman?](https://arxiv.org/abs/2305.11064) (May 2023)
+- [Are Large Language Models Fit For Guided Reading?](https://arxiv.org/abs/2305.10645) (May 2023)
+- [ChatGPT Perpetuates Gender Bias in Machine Translation and Ignores Non-Gendered Pronouns: Findings across Bengali and Five other Low-Resource Languages](https://arxiv.org/abs/2305.10510) (May 2023)
+- [BAD: BiAs Detection for Large Language Models in the context of candidate screening](https://arxiv.org/abs/2305.10407) (May 2023)
+- [MemoryBank: Enhancing Large Language Models with Long-Term Memory](https://arxiv.org/abs/2305.10250) (May 2023)
+- [Knowledge Graph Completion Models are Few-shot Learners: An Empirical Study of Relation Labeling in E-commerce with LLMs](https://arxiv.org/abs/2305.09858) (May 2023)
+- [A Preliminary Analysis on the Code Generation Capabilities of GPT-3.5 and Bard AI Models for Java Functions](https://arxiv.org/abs/2305.09402) (May 2023)
 - [ChatGPT-4 Outperforms Experts and Crowd Workers in Annotating Political Twitter Messages with Zero-Shot Learning](https://arxiv.org/abs/2304.06588) (April 2023)
 - [ChatGPT Beyond English: Towards a Comprehensive Evaluation of Large Language Models in Multilingual Learning](https://arxiv.org/abs/2304.05613) (April 2023)
 - [Distinguishing ChatGPT(-3.5, -4)-generated and human-written papers through Japanese stylometric analysis](https://arxiv.org/abs/2304.05534) (April 2023)
--- a/pages/models/gpt-4.en.mdx
+++ b/pages/models/gpt-4.en.mdx
@ -160,6 +160,15 @@ Coming soon!

 ## References / Papers

+- [Goat: Fine-tuned LLaMA Outperforms GPT-4 on Arithmetic Tasks](https://arxiv.org/abs/2305.14201) (May 2023)
+- [How Language Model Hallucinations Can Snowball](https://arxiv.org/abs/2305.13534) (May 2023)
+- [LLMs for Knowledge Graph Construction and Reasoning: Recent Capabilities and Future Opportunities](https://arxiv.org/abs/2305.13168) (May 2023)
+- [GPT-3.5 vs GPT-4: Evaluating ChatGPT's Reasoning Performance in Zero-shot Learning](https://arxiv.org/abs/2305.12477) (May 2023)
+- [TheoremQA: A Theorem-driven Question Answering dataset](https://arxiv.org/abs/2305.12524) (May 2023)
+- [Experimental results from applying GPT-4 to an unpublished formal language](https://arxiv.org/abs/2305.12196) (May 2023)
+- [LogiCoT: Logical Chain-of-Thought Instruction-Tuning Data Collection with GPT-4](https://arxiv.org/abs/2305.12147) (May 2023)
+- [Large-Scale Text Analysis Using Generative Language Models: A Case Study in Discovering Public Value Expressions in AI Patents](https://arxiv.org/abs/2305.10383) (May 2023)
+- [Can Language Models Solve Graph Problems in Natural Language?]https://arxiv.org/abs/2305.10037) (May 2023)
 - [chatIPCC: Grounding Conversational AI in Climate Science](https://arxiv.org/abs/2304.05510) (April 2023)
 - [Galactic ChitChat: Using Large Language Models to Converse with Astronomy Literature](https://arxiv.org/abs/2304.05406) (April 2023)
 - [Emergent autonomous scientific research capabilities of large language models](https://arxiv.org/abs/2304.05332) (April 2023)
--- a/pages/papers.en.mdx
+++ b/pages/papers.en.mdx
@ -4,6 +4,7 @@ The following are the latest papers (sorted by release date) on prompt engineeri

 ## Overviews

+  - [Jailbreaking ChatGPT via Prompt Engineering: An Empirical Study](https://arxiv.org/abs/2305.13860) (May 2023)
  - [Tool Learning with Foundation Models](https://arxiv.org/abs/2304.08354) (April 2023)
  - [One Small Step for Generative AI, One Giant Leap for AGI: A Complete Survey on ChatGPT in AIGC Era](https://arxiv.org/abs/2304.06488) (April 2023)
  - [A Bibliometric Review of Large Language Models Research from 2017 to 2023](https://arxiv.org/abs/2304.02020) (April 2023)
@ -18,7 +19,32 @@ The following are the latest papers (sorted by release date) on prompt engineeri
  - [Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing](https://arxiv.org/abs/2107.13586) (Jul 2021)

 ## Approaches
-  
+
+  - []() (May 2023)
+  - []() (May 2023)
+  - [Self-Critique Prompting with Large Language Models for Inductive Instructions](https://arxiv.org/abs/2305.13733) (May 2023)
+  - [Better Zero-Shot Reasoning with Self-Adaptive Prompting](https://arxiv.org/abs/2305.14106) (May 2023)
+  - [Hierarchical Prompting Assists Large Language Model on Web Navigation](https://arxiv.org/abs/2305.14257) (May 2023)
+  - [Interactive Natural Language Processing](https://arxiv.org/abs/2305.13246) (May 2023)
+  - [Can We Edit Factual Knowledge by In-Context Learning?](https://arxiv.org/abs/2305.12740) (May 2023)
+  - [In-Context Learning of Large Language Models Explained as Kernel Regression](https://arxiv.org/abs/2305.12766) (May 2023)
+  - [Meta-in-context learning in large language models](https://arxiv.org/abs/2305.12907) (May 2023)
+  - [Let's Sample Step by Step: Adaptive-Consistency for Efficient Reasoning with LLMs](https://arxiv.org/abs/2305.11860) (May 2023)
+  - [Post Hoc Explanations of Language Models Can Improve Language Models](https://arxiv.org/abs/2305.11426) (May 2023)  
+  - [Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt](https://arxiv.org/abs/2305.11186) (May 2023)
+  - [TreePrompt: Learning to Compose Tree Prompts for Explainable Visual Grounding](https://arxiv.org/abs/2305.11497) (May 2023)
+  - [TELeR: A General Taxonomy of LLM Prompts for Benchmarking Complex Tasks](https://arxiv.org/abs/2305.11430) (May 2023)  
+  - [Efficient Prompting via Dynamic In-Context Learning](https://arxiv.org/abs/2305.11170) (May 2023)
+  - [The Web Can Be Your Oyster for Improving Large Language Models](https://arxiv.org/abs/2305.10998) (May 2023)
+  - [Flatness-Aware Prompt Selection Improves Accuracy and Sample Efficiency](https://arxiv.org/abs/2305.10713) (May 2023)  
+  - [Tree of Thoughts: Deliberate Problem Solving with Large Language Models](https://arxiv.org/abs/2305.10601) (May 2023)
+  - [ZeroPrompt: Streaming Acoustic Encoders are Zero-Shot Masked LMs](https://arxiv.org/abs/2305.10649) (May 2023)
+  - [Chain-of-Symbol Prompting Elicits Planning in Large Langauge Models](https://arxiv.org/abs/2305.10276) (May 2023)  
+  - [CooK: Empowering General-Purpose Language Models with Modular and Collaborative Knowledge](https://arxiv.org/abs/2305.09955) (May 2023)
+  - [What In-Context Learning "Learns" In-Context: Disentangling Task Recognition and Task Learning](https://arxiv.org/abs/2305.09731) (May 2023)
+  - [Reprompting: Automated Chain-of-Thought Prompt Inference Through Gibbs Sampling](https://arxiv.org/abs/2305.09993) (May 2023)  
+  - [Satisfiability-Aided Language Models Using Declarative Prompting](https://arxiv.org/abs/2305.09656) (May 2023)
+  - [Pre-Training to Learn in Context](https://arxiv.org/abs/2305.09137) (May 2023)
  - [Boosted Prompt Ensembles for Large Language Models](https://arxiv.org/abs/2304.05970) (April 2023)
  - [Global Prompt Cell: A Portable Control Module for Effective Prompt](https://arxiv.org/abs/2304.05642) (April 2023)
  - [Why think step-by-step? Reasoning emerges from the locality of experience](https://arxiv.org/abs/2304.03843) (April 2023)
@ -130,6 +156,93 @@ The following are the latest papers (sorted by release date) on prompt engineeri

 ## Applications

+  - [PromptClass: Weakly-Supervised Text Classification with Prompting Enhanced Noise-Robust Self-Training](https://arxiv.org/abs/2305.13723) (May 2023)
+  - [Aligning Large Language Models through Synthetic Feedback](https://arxiv.org/abs/2305.13735) (May 2023)
+  - [Concept-aware Training Improves In-context Learning Ability of Language Models](https://arxiv.org/abs/2305.13775) (May 2023)
+  - [Enhancing Black-Box Few-Shot Text Classification with Prompt-Based Data Augmentation](https://arxiv.org/abs/2305.13785) (May 2023)
+  - [Detecting automatically the layout of clinical documents to enhance the performances of downstream natural language processing](https://arxiv.org/abs/2305.13817) (May 2023)
+  - ["Is the Pope Catholic?" Applying Chain-of-Thought Reasoning to Understanding Conversational Implicatures](https://arxiv.org/abs/2305.13826) (May 2023)
+  - [Let's Think Frame by Frame: Evaluating Video Chain of Thought with Video Infilling and Prediction](https://arxiv.org/abs/2305.13903) (May 2023)
+  - [Generating Data for Symbolic Language with Large Language Models](https://arxiv.org/abs/2305.13917) (May 2023)
+  - [Make a Choice! Knowledge Base Question Answering with In-Context Learning](https://arxiv.org/abs/2305.13972) (May 2023)
+  - [Improving Language Models via Plug-and-Play Retrieval Feedback](https://arxiv.org/abs/2305.14002) (May 2023)
+  - [Multi-Granularity Prompts for Topic Shift Detection in Dialogue](https://arxiv.org/abs/2305.14006) (May 2023)
+  - [The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning](https://arxiv.org/abs/2305.14045) (May 2023)
+  - [Can Language Models Understand Physical Concepts?](https://arxiv.org/abs/2305.14057) (May 2023)
+  - [Evaluating Factual Consistency of Summaries with Large Language Models](https://arxiv.org/abs/2305.14069) (May 2023)
+  - [Dr.ICL: Demonstration-Retrieved In-context Learning](https://arxiv.org/abs/2305.14128) (May 2023)
+  - [Probing in Context: Toward Building Robust Classifiers via Probing Large Language Models](https://arxiv.org/abs/2305.14171) (May 2023)
+  - [Skill-Based Few-Shot Selection for In-Context Learning](https://arxiv.org/abs/2305.14210) (May 2023)
+  - [Exploring Chain-of-Thought Style Prompting for Text-to-SQL](https://arxiv.org/abs/2305.14215) (May 2023)
+  - [Enhancing Chat Language Models by Scaling High-quality Instructional Conversations](https://arxiv.org/abs/2305.14233) (May 2023)
+  - [On Learning to Summarize with Large Language Models as References](https://arxiv.org/abs/2305.14239) (May 2023)
+  - [Learning to Generate Novel Scientific Directions with Contextualized Literature-based Discovery](https://arxiv.org/abs/2305.14259) (May 2023)
+  - [Active Learning Principles for In-Context Learning with Large Language Models](https://arxiv.org/abs/2305.14264) (May 2023)
+  - [Two Failures of Self-Consistency in the Multi-Step Reasoning of LLMs](https://arxiv.org/abs/2305.14279) (May 2023)
+  - [Improving Factuality and Reasoning in Language Models through Multiagent Debate](https://arxiv.org/abs/2305.14325) (May 2023)
+  - [ChatCoT: Tool-Augmented Chain-of-Thought Reasoning on\\ Chat-based Large Language Models](https://arxiv.org/abs/2305.14323) (May 2023)
+  - [WikiChat: A Few-Shot LLM-Based Chatbot Grounded with Wikipedia](https://arxiv.org/abs/2305.14292) (May 2023)
+  - [Query Rewriting for Retrieval-Augmented Large Language Models](https://arxiv.org/abs/2305.14283) (May 2023)
+  - [Discrete Prompt Optimization via Constrained Generation for Zero-shot Re-ranker](https://arxiv.org/abs/2305.13729) (May 2023)
+  - [Element-aware Summarization with Large Language Models: Expert-aligned Evaluation and Chain-of-Thought Method](https://arxiv.org/abs/2305.13412) (May 2023)
+  - [Small Language Models Improve Giants by Rewriting Their Outputs](https://arxiv.org/abs/2305.13514) (May 2023)
+  - [Prompting and Evaluating Large Language Models for Proactive Dialogues: Clarification, Target-guided, and Non-collaboration](https://arxiv.org/abs/2305.13626) (May 2023)
+  - [Prompt-Based Monte-Carlo Tree Search for Goal-Oriented Dialogue Policy Planning](https://arxiv.org/abs/2305.13660) (May 2023)
+  - [Mitigating Language Model Hallucination with Interactive Question-Knowledge Alignment](https://arxiv.org/abs/2305.13669) (May 2023)
+  - [Making Language Models Better Tool Learners with Execution Feedback](https://arxiv.org/abs/2305.13068) (May 2023)
+  - [Text-to-SQL Error Correction with Language Models of Code](https://arxiv.org/abs/2305.13073) (May 2023)
+  - [Decomposed Prompting for Machine Translation Between Related Languages using Large Language Models](https://arxiv.org/abs/2305.13085) (May 2023)
+  - [SPARSEFIT: Few-shot Prompting with Sparse Fine-tuning for Jointly Generating Predictions and Natural Language Explanations](https://arxiv.org/abs/2305.13235) (May 2023)
+  - ["According to ..." Prompting Language Models Improves Quoting from Pre-Training Data](https://arxiv.org/abs/2305.13252) (May 2023)
+  - [Prompt-based methods may underestimate large language models' linguistic generalizations](https://arxiv.org/abs/2305.13264) (May 2023)
+  - [Chain of Knowledge: A Framework for Grounding Large Language Models with Structured Knowledge Bases](https://arxiv.org/abs/2305.13269) (May 2023)
+  - [Measuring Inductive Biases of In-Context Learning with Underspecified Demonstrations](https://arxiv.org/abs/2305.13299) (May 2023)
+  - [Automated Few-shot Classification with Instruction-Finetuned Language Models](https://arxiv.org/abs/2305.12576) (May 2023)
+  - [Enhancing Few-shot Text-to-SQL Capabilities of Large Language Models: A Study on Prompt Design Strategies](https://arxiv.org/abs/2305.12586) (May 2023)
+  - [MvP: Multi-view Prompting Improves Aspect Sentiment Tuple Prediction](https://arxiv.org/abs/2305.12627) (May 2023)
+  - [Learning Interpretable Style Embeddings via Prompting LLMs](https://arxiv.org/abs/2305.12696) (May 2023)
+  - [Enhancing Small Medical Learners with Privacy-preserving Contextual Prompting](https://arxiv.org/abs/2305.12723) (May 2023)
+  - [Fact-Checking Complex Claims with Program-Guided Reasoning](https://arxiv.org/abs/2305.12744) (May 2023)
+  - [A Benchmark on Extremely Weakly Supervised Text Classification: Reconcile Seed Matching and Prompting Approaches](https://arxiv.org/abs/2305.12749) (May 2023)
+  - [This Prompt is Measuring <MASK>: Evaluating Bias Evaluation in Language Models](https://arxiv.org/abs/2305.12757) (May 2023)
+  - [Enhancing Cross-lingual Natural Language Inference by Soft Prompting with Multilingual Verbalizer](https://arxiv.org/abs/2305.12761) (May 2023)
+  - [Evaluating Prompt-based Question Answering for Object Prediction in the Open Research Knowledge Graph](https://arxiv.org/abs/2305.12900) (May 2023)
+  - [Explaining How Transformers Use Context to Build Predictions](https://arxiv.org/abs/2305.12535) (May 2023)
+  - [PiVe: Prompting with Iterative Verification Improving Graph-based Generative Capability of LLMs](https://arxiv.org/abs/2305.12392) (May 2023)
+  - [PromptNER: A Prompting Method for Few-shot Named Entity Recognition via k Nearest Neighbor Search](https://arxiv.org/abs/2305.12217) (May 2023)
+  - [Logic-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning](https://arxiv.org/abs/2305.12295) (May 2023)
+  - [Enhancing Few-shot NER with Prompt Ordering based Data Augmentation](https://arxiv.org/abs/2305.11791) (May 2023)
+  - [Chain-of-thought prompting for responding to in-depth dialogue questions with LLM](https://arxiv.org/abs/2305.11792) (May 2023)
+  - [How to Prompt LLMs for Text-to-SQL: A Study in Zero-shot, Single-domain, and Cross-domain Settings](https://arxiv.org/abs/2305.11853) (May 2023)
+  - [Evaluation of medium-large Language Models at zero-shot closed book generative question answering](https://arxiv.org/abs/2305.11991) (May 2023)
+  - [Few-Shot Dialogue Summarization via Skeleton-Assisted Prompt Transfer](https://arxiv.org/abs/2305.12077) (May 2023)
+  - [Can NLP Models Correctly Reason Over Contexts that Break the Common Assumptions?](https://arxiv.org/abs/2305.12096) (May 2023)
+  - [Reasoning Implicit Sentiment with Chain-of-Thought Prompting](https://arxiv.org/abs/2305.11255) (May 2023)
+  - [Writing your own book: A method for going from closed to open book QA to improve robustness and performance of smaller LLMs](https://arxiv.org/abs/2305.11334) (May 2023)
+  - [AutoTrial: Prompting Language Models for Clinical Trial Design](https://arxiv.org/abs/2305.11366) (May 2023)
+  - [CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing](https://arxiv.org/abs/2305.11738) (May 2023)
+  - [Controlling the Extraction of Memorized Data from Large Language Models via Prompt-Tuning](https://arxiv.org/abs/2305.11759) (May 2023)
+  - [Prompting with Pseudo-Code Instructions](https://arxiv.org/abs/2305.11790) (May 2023)
+  - [TrueTeacher: Learning Factual Consistency Evaluation with Large Language Models](https://arxiv.org/abs/2305.11171) (May 2023)
+  - [Aligning Instruction Tasks Unlocks Large Language Models as Zero-Shot Relation Extractors](https://arxiv.org/abs/2305.11159) (May 2023)
+  - [Exploiting Biased Models to De-bias Text: A Gender-Fair Rewriting Model](https://arxiv.org/abs/2305.11140) (May 2023)
+  - [Learning In-context Learning for Named Entity Recognition](https://arxiv.org/abs/2305.11038) (May 2023)  
+  - [Take a Break in the Middle: Investigating Subgoals towards Hierarchical Script Generation](https://arxiv.org/abs/2305.10907) (May 2023)
+  - [TEPrompt: Task Enlightenment Prompt Learning for Implicit Discourse Relation Recognition](https://arxiv.org/abs/2305.10866) (May 2023)
+  - [Large Language Models can be Guided to Evade AI-Generated Text Detection](https://arxiv.org/abs/2305.10847) (May 2023)  
+  - [Temporal Knowledge Graph Forecasting Without Knowledge Using In-Context Learning](https://arxiv.org/abs/2305.10613) (May 2023)
+  - [Prompting the Hidden Talent of Web-Scale Speech Models for Zero-Shot Task Generalization](https://arxiv.org/abs/2305.11095) (May 2023)
+  - [Think Outside the Code: Brainstorming Boosts Large Language Models in Code Generation](https://arxiv.org/abs/2305.10679) (May 2023)  
+  - [Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback](https://arxiv.org/abs/2305.10142) (May 2023)
+  - [ConvXAI: Delivering Heterogeneous AI Explanations via Conversations to Support Human-AI Scientific Writing](https://arxiv.org/abs/2305.09770) (May 2023)
+  - [StructGPT: A General Framework for Large Language Model to Reason over Structured Data](https://arxiv.org/abs/2305.09645) (May 2023)  
+  - [Towards Expert-Level Medical Question Answering with Large Language Models](https://arxiv.org/abs/2305.09617) (May 2023)
+  - [Large Language Models are Built-in Autoregressive Search Engines](https://arxiv.org/abs/2305.09612) (May 2023)
+  - [MsPrompt: Multi-step Prompt Learning for Debiasing Few-shot Event Detection](https://arxiv.org/abs/2305.09335) (May 2023)  
+  - [Exploring the Impact of Layer Normalization for Zero-shot Neural Machine Translation](https://arxiv.org/abs/2305.09312) (May 2023)
+  - [SGP-TOD: Building Task Bots Effortlessly via Schema-Guided LLM Prompting](https://arxiv.org/abs/2305.09067) (May 2023)
+  - [Multi-modal Visual Understanding with Prompts for Semantic Information Disentanglement of Image](https://arxiv.org/abs/2305.09333) (May 2023)
+  - [Soft Prompt Decoding for Multilingual Dense Retrieval](https://arxiv.org/abs/2305.09025) (May 2023)
  - [PaLM 2 Technical Report](https://ai.google/static/documents/palm2techreport.pdf) (May 2023)
  - [Are LLMs All You Need for Task-Oriented Dialogue?](https://arxiv.org/abs/2304.06556) (April 2023)
  - [HiPrompt: Few-Shot Biomedical Knowledge Fusion via Hierarchy-Oriented Prompting](https://arxiv.org/abs/2304.05973) (April 2023)