added new papers

1 year ago · 923f16779d
parent edb3e91c4c
commit 923f16779d
4 changed files with 36 additions and 0 deletions
--- a/pages/models/chatgpt.en.mdx
+++ b/pages/models/chatgpt.en.mdx
@ -145,8 +145,11 @@ The current recommendation for `gpt-3.5-turbo-0301` is to add instructions in th
 ---
 ## References

+- [The Larger They Are, the Harder They Fail: Language Models do not Recognize Identifier Swaps in Python](https://arxiv.org/pdf/2305.15507v1.pdf) (May 2023)
+- [InternGPT: Solving Vision-Centric Tasks by Interacting with ChatGPT Beyond Language](https://arxiv.org/abs/2305.05662v3) (May 2023)
 - [Narrative XL: A Large-scale Dataset For Long-Term Memory Models](https://arxiv.org/abs/2305.13877) (May 2023)
 - [Does ChatGPT have Theory of Mind?](https://arxiv.org/abs/2305.14020) (May 2023)
+- [Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs](https://arxiv.org/abs/2305.03111v2) (May 2023)
 - [ZeroSCROLLS: A Zero-Shot Benchmark for Long Text Understanding](https://arxiv.org/abs/2305.14196) (May 2023)
 - [Navigating Prompt Complexity for Zero-Shot Classification: A Study of Large Language Models in Computational Social Science](https://arxiv.org/abs/2305.14310) (May 2023)
 - [ChatGPT-EDSS: Empathetic Dialogue Speech Synthesis Trained from ChatGPT-derived Context Word Embeddings](https://arxiv.org/abs/2305.13724) (May 2023)
--- a/pages/models/collection.en.mdx
+++ b/pages/models/collection.en.mdx
@ -10,7 +10,10 @@ This section consists of a collection and summary of notable and foundational LL
 | --- | --- | --- | --- | --- |
 | [Falcon LLM](https://falconllm.tii.ae/) | May 2023 | 7, 40 | [Falcon-7B](https://huggingface.co/tiiuae), [Falcon-40B](https://huggingface.co/tiiuae/falcon-40b) | Falcon LLM is a foundational large language model (LLM) with 40 billion parameters trained on one trillion tokens. TII has now released Falcon LLM – a 40B model. |
 | [PaLM 2](https://arxiv.org/abs/2305.10403) | May 2023 | - | - | A Language Model that has better multilingual and reasoning capabilities and is more compute-efficient than its predecessor PaLM. |
+| [Med-PaLM 2](https://arxiv.org/abs/2305.09617v1) | May 2023 | - | - | Towards Expert-Level Medical Question Answering with Large Language Models |
+| [Gorilla](https://arxiv.org/abs/2305.15334v1) | May 2023 | 7 | [Gorilla](https://github.com/ShishirPatil/gorilla) | Gorilla: Large Language Model Connected with Massive APIs | 
 | [RedPajama-INCITE](https://www.together.xyz/blog/redpajama-models-v1) | May 2023 | 3, 7 | [RedPajama-INCITE](https://huggingface.co/togethercomputer) | A family of models including base, instruction-tuned & chat models. |
+| [LIMA](https://arxiv.org/abs/2305.11206v1) | May 2023 | 65 | - |  A 65B parameter LLaMa language model fine-tuned with the standard supervised loss on only 1,000 carefully curated prompts and responses, without any reinforcement learning or human preference modeling. | 
 | [Replit Code](https://huggingface.co/replit) | May 2023 | 3 | [Replit Code](https://huggingface.co/replit) | replit-code-v1-3b model is a 2.7B LLM trained on 20 languages from the Stack Dedup v1.2 dataset. |
 | [h2oGPT](https://github.com/h2oai/h2ogpt) | May 2023 | 12 | [h2oGPT](https://github.com/h2oai/h2ogpt) | h2oGPT is a large language model (LLM) fine-tuning framework and chatbot UI with document(s) question-answer capabilities. | 
 | [CodeGen2](https://arxiv.org/abs/2305.02309) | May 2023 | 1, 3, 7, 16 | [CodeGen2](https://github.com/salesforce/codegen2) | Code models for program synthesis. |
@ -27,6 +30,7 @@ This section consists of a collection and summary of notable and foundational LL
 | [PanGu-Σ](https://arxiv.org/abs/2303.10845v1) | March 2023 | 1085 | - | PanGu-Σ: Towards Trillion Parameter Language Model with Sparse Heterogeneous Computing |
 | [GPT-4](https://arxiv.org/abs/2303.08774v3) | March 2023 | - | - | GPT-4 Technical Report |
 | [LLaMA](https://arxiv.org/abs/2302.13971v1) | Feb 2023 | 7, 13, 33, 65 | [LLaMA](https://github.com/facebookresearch/llama) | LLaMA: Open and Efficient Foundation Language Models |
+| [ChatGPT](https://openai.com/blog/chatgpt) | Nov 2022 | - | - | A model called ChatGPT which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer followup questions, admit its mistakes, challenge incorrect premises, and reject inappropriate requests. | 
 | [Galactica](https://arxiv.org/abs/2211.09085v1) | Nov 2022 | 0.125 - 120 | [Galactica](https://huggingface.co/models?other=galactica) | Galactica: A Large Language Model for Science |
 | [mT0](https://arxiv.org/abs/2211.01786v1) | Nov 2022 | 13 | [mT0-xxl](https://huggingface.co/bigscience/mt0-xxl) | Crosslingual Generalization through Multitask Finetuning |
 | [BLOOM](https://arxiv.org/abs/2211.05100v3) | Nov 2022 | 176 | [BLOOM](https://huggingface.co/bigscience/bloom) | BLOOM: A 176B-Parameter Open-Access Multilingual Language Model |
--- a/pages/models/gpt-4.en.mdx
+++ b/pages/models/gpt-4.en.mdx
@ -160,6 +160,11 @@ Coming soon!

 ## References / Papers

+- [Goat: Fine-tuned LLaMA Outperforms GPT-4 on Arithmetic Tasks](https://arxiv.org/abs/2305.14201v1) (May 2023)
+- [How Language Model Hallucinations Can Snowball](https://arxiv.org/abs/2305.13534v1) (May 2023)
+- [Have LLMs Advanced Enough? A Challenging Problem Solving Benchmark For Large Language Models](https://arxiv.org/abs/2305.15074v1) (May 2023)
+- [GPT4GEO: How a Language Model Sees the World's Geography](https://arxiv.org/abs/2306.00020v1) (May 2023)
+- [SPRING: GPT-4 Out-performs RL Algorithms by Studying Papers and Reasoning](https://arxiv.org/abs/2305.15486v2) (May 2023)
 - [Goat: Fine-tuned LLaMA Outperforms GPT-4 on Arithmetic Tasks](https://arxiv.org/abs/2305.14201) (May 2023)
 - [How Language Model Hallucinations Can Snowball](https://arxiv.org/abs/2305.13534) (May 2023)
 - [LLMs for Knowledge Graph Construction and Reasoning: Recent Capabilities and Future Opportunities](https://arxiv.org/abs/2305.13168) (May 2023)
--- a/pages/papers.en.mdx
+++ b/pages/papers.en.mdx
@ -21,12 +21,15 @@ The following are the latest papers (sorted by release date) on prompt engineeri

 ## Approaches

+  - [PEARL: Prompting Large Language Models to Plan and Execute Actions Over Long Documents](https://arxiv.org/abs/2305.14564v1) (May 2023)
+  - [Reasoning with Language Model is Planning with World Model](https://arxiv.org/abs/2305.14992v1) (May 2023)
  - [Self-Critique Prompting with Large Language Models for Inductive Instructions](https://arxiv.org/abs/2305.13733) (May 2023)
  - [Better Zero-Shot Reasoning with Self-Adaptive Prompting](https://arxiv.org/abs/2305.14106) (May 2023)
  - [Hierarchical Prompting Assists Large Language Model on Web Navigation](https://arxiv.org/abs/2305.14257) (May 2023)
  - [Interactive Natural Language Processing](https://arxiv.org/abs/2305.13246) (May 2023)
  - [Can We Edit Factual Knowledge by In-Context Learning?](https://arxiv.org/abs/2305.12740) (May 2023)
  - [In-Context Learning of Large Language Models Explained as Kernel Regression](https://arxiv.org/abs/2305.12766) (May 2023)
+  - [Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models](https://arxiv.org/abs/2305.04091v3) (May 2023)
  - [Meta-in-context learning in large language models](https://arxiv.org/abs/2305.12907) (May 2023)
  - [Let's Sample Step by Step: Adaptive-Consistency for Efficient Reasoning with LLMs](https://arxiv.org/abs/2305.11860) (May 2023)
  - [Post Hoc Explanations of Language Models Can Improve Language Models](https://arxiv.org/abs/2305.11426) (May 2023)  
@ -155,9 +158,30 @@ The following are the latest papers (sorted by release date) on prompt engineeri

 ## Applications

+  - [Small Language Models Improve Giants by Rewriting Their Outputs](https://arxiv.org/abs/2305.13514v1) (May 2023)
+  - [On the Planning Abilities of Large Language Models -- A Critical Investigation](https://arxiv.org/abs/2305.15771v1) (May 2023)
+  - [PRODIGY: Enabling In-context Learning Over Graphs](https://arxiv.org/abs/2305.12600v1) (May 2023)
+  - [Large Language Models are Few-Shot Health Learners](https://arxiv.org/abs/2305.15525v1) (May 2023)
+  - [Measuring Inductive Biases of In-Context Learning with Underspecified Demonstrations](https://arxiv.org/abs/2305.13299v1) (May 2023)
+  - [Fact-Checking Complex Claims with Program-Guided Reasoning](https://arxiv.org/abs/2305.12744v1) (May 2023)
+  - [Large Language Models as Tool Makers](https://arxiv.org/abs/2305.17126v1) (May 2023)
+  - [Iterative Forward Tuning Boosts In-context Learning in Language Models](https://arxiv.org/abs/2305.13016v2) (May 2023)
+  - [SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks](https://arxiv.org/abs/2305.17390v1) (May 2023)
+  - [Interactive Natural Language Processing](https://arxiv.org/abs/2305.13246v1) (May 2023)
+  - [An automatically discovered chain-of-thought prompt generalizes to novel models and datasets](https://arxiv.org/abs/2305.02897v1) (May 2023)
+  - [Large Language Model Guided Tree-of-Thought](https://arxiv.org/abs/2305.08291v1) (May 2023)
+  - [Active Retrieval Augmented Generation](https://arxiv.org/abs/2305.06983v1) (May 2023) 
+  - [A PhD Student's Perspective on Research in NLP in the Era of Very Large Language Models](https://arxiv.org/abs/2305.12544v1) (May 2023)
+  - [Visual Chain of Thought: Bridging Logical Gaps with Multimodal Infillings](https://arxiv.org/abs/2305.02317v1) (May 2023)
+  - [Mirages: On Anthropomorphism in Dialogue Systems](https://arxiv.org/abs/2305.09800v1) (May 2023)
+  - [Model evaluation for extreme risks](https://arxiv.org/abs/2305.15324v1) (May 2023)
+  - [Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting](https://arxiv.org/abs/2305.04388v1) (May 2023)
+  - [Cognitive Reframing of Negative Thoughts through Human-Language Model Interaction](https://arxiv.org/abs/2305.02466v1) (May 2023)
  - [PromptClass: Weakly-Supervised Text Classification with Prompting Enhanced Noise-Robust Self-Training](https://arxiv.org/abs/2305.13723) (May 2023)
+  - [Augmented Large Language Models with Parametric Knowledge Guiding](https://arxiv.org/abs/2305.04757v2) (May 2023)
  - [Aligning Large Language Models through Synthetic Feedback](https://arxiv.org/abs/2305.13735) (May 2023)
  - [Concept-aware Training Improves In-context Learning Ability of Language Models](https://arxiv.org/abs/2305.13775) (May 2023)
+  - [FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance](https://arxiv.org/abs/2305.05176v1) (May 2023)
  - [Enhancing Black-Box Few-Shot Text Classification with Prompt-Based Data Augmentation](https://arxiv.org/abs/2305.13785) (May 2023)
  - [Detecting automatically the layout of clinical documents to enhance the performances of downstream natural language processing](https://arxiv.org/abs/2305.13817) (May 2023)
  - ["Is the Pope Catholic?" Applying Chain-of-Thought Reasoning to Understanding Conversational Implicatures](https://arxiv.org/abs/2305.13826) (May 2023)