added youtube videos

pull/459/head
Elvis Saravia 1 month ago
parent 3168dbf4b8
commit 215bb7377b

Binary file not shown.


@@ -2,6 +2,7 @@
import {Cards, Card} from 'nextra-theme-docs'
import {CodeIcon} from 'components/icons'
import {Bleed} from 'nextra-theme-docs'
The previous section introduced a basic example of how to prompt LLMs.
@@ -18,6 +19,14 @@ Topics:
---
<Bleed>
<iframe width="100%"
height="415px"
src="https://www.youtube.com/embed/TBhRC4Dath4?si=6nwh0GuYAOv1H6yT" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
allowFullScreen
/>
</Bleed>
## Text Summarization
One of the standard tasks in natural language generation is text summarization. Text summarization can include many different flavors and domains. In fact, one of the most promising applications of language models is the ability to summarize articles and concepts into quick and easy-to-read summaries. Let's try a basic summarization task using prompts.
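As a minimal sketch of such a prompt (the helper name, instruction wording, and sample text below are illustrative, not part of the guide), a summarization request can be composed as a short instruction followed by the passage to compress:

```python
# Illustrative sketch: compose a basic summarization prompt.
# The function name and instruction wording are assumptions, not a fixed API.
def build_summary_prompt(text: str, n_sentences: int = 2) -> str:
    """Ask the model to compress `text` into a fixed number of sentences."""
    return (
        f"Explain the following in {n_sentences} sentence(s):\n\n"
        f"{text}\n\n"
        "Summary:"
    )

prompt = build_summary_prompt(
    "Antibiotics are a type of medication used to treat bacterial infections."
)
print(prompt)
```

The resulting string would then be sent to an LLM API of your choice; the trailing `Summary:` cue nudges the model to continue with the summary rather than restate the instructions.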

@@ -2,6 +2,8 @@
"llm-agents": "LLM Agents",
"rag": "RAG for LLMs",
"llm-reasoning": "LLM Reasoning",
"rag_hallucinations": "RAG Reduces Hallucination",
"synthetic_data": "Synthetic Data",
"thoughtsculpt": "ThoughtSculpt",
"infini-attention": "Infini-Attention",
"guided-cot": "LM-Guided CoT",

@@ -0,0 +1,21 @@
# Reducing Hallucination in Structured Outputs via RAG
import {Bleed} from 'nextra-theme-docs'
<Bleed>
<iframe width="100%"
height="415px"
src="https://www.youtube.com/embed/TUL5guqZejw?si=Doc7lzyAY-SKr21L" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
allowFullScreen
/>
</Bleed>
Researchers at ServiceNow shared a [new paper](https://arxiv.org/abs/2404.08189) where they discuss how to deploy an efficient RAG system for structured output tasks.
!["RAG Hallucination"](../../img/research/structured_outputs.png)
The RAG system combines a small language model with a very small retriever. The authors show that RAG can enable deploying powerful LLM-powered systems in limited-resource settings while mitigating issues like hallucination and increasing the reliability of outputs.
The paper covers a very useful enterprise application: translating natural language requirements into workflows (formatted in JSON). This task alone can unlock a lot of productivity, and there is room for further optimization (e.g., using speculative decoding, or emitting YAML instead of JSON).
The paper provides some great insights and practical tips on how to effectively develop RAG systems for the real world.
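To make the grounding idea concrete, here is a toy sketch (not from the paper; the catalog, retriever, and validation logic are all hypothetical) of how a retrieved set of valid workflow steps can be used to reject hallucinated steps in the model's JSON output:

```python
import json

# Hypothetical catalog of valid workflow steps; in a real system this
# would come from the enterprise platform, not a hard-coded dict.
CATALOG = {
    "send_email": "Send an email notification",
    "create_ticket": "Create an incident ticket",
    "assign_user": "Assign a user to a record",
}

def retrieve(query: str, k: int = 2) -> list[str]:
    """Toy retriever: rank catalog entries by word overlap with the query."""
    q = set(query.lower().split())
    scored = sorted(
        CATALOG,
        key=lambda name: -len(q & set(CATALOG[name].lower().split())),
    )
    return scored[:k]

def validate_workflow(raw_json: str, allowed: list[str]) -> list[str]:
    """Parse the model's JSON output and drop steps the retriever never surfaced."""
    steps = json.loads(raw_json)["steps"]
    return [s for s in steps if s in allowed]

allowed = retrieve("send an email and create an incident ticket")
generated = '{"steps": ["send_email", "create_ticket", "delete_database"]}'
clean = validate_workflow(generated, allowed)
print(clean)  # the hallucinated "delete_database" step is filtered out
```

The point of the sketch is only the shape of the pipeline: retrieval constrains the space of valid structured outputs, which is one way hallucinated fields get caught before they reach production.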

@@ -0,0 +1,21 @@
# Best Practices and Lessons Learned on Synthetic Data for Language Models
import {Bleed} from 'nextra-theme-docs'
<Bleed>
<iframe width="100%"
height="415px"
src="https://www.youtube.com/embed/YnlArBZJHY8?si=ZH3hFzwixUopxU5Z" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
allowFullScreen
/>
</Bleed>
This [paper](https://arxiv.org/abs/2404.07503), published by Google DeepMind and other collaborators, provides an overview of best practices and lessons learned on synthetic data for language models.
It focuses on synthetic data and covers applications, challenges, and future directions. This is an important paper given the significant advancements we are seeing from the use of synthetic data in the field of AI.
We know that the more high-quality data we give these models, the better the performance. Generating synthetic data is not hard; ensuring its quality is the real challenge.
The paper also discusses important topics when working with synthetic data such as ensuring quality, factuality, fidelity, unbiasedness, trustworthiness, privacy, and more.
There are a lot of great references mentioned in the related work section as well.
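As a toy illustration of the quality point above (this filter is an assumption for illustration, not a method from the paper), even a minimal pipeline typically deduplicates and drops degenerate samples before synthetic text is used for training:

```python
# Toy quality gate for synthetic text: drop near-duplicates and
# degenerate (too-short) samples. Thresholds are illustrative.
def filter_synthetic(samples: list[str], min_words: int = 5) -> list[str]:
    """Keep samples that are long enough and not duplicates after normalization."""
    seen, kept = set(), []
    for s in samples:
        key = " ".join(s.lower().split())  # normalize case and whitespace
        if len(key.split()) >= min_words and key not in seen:
            seen.add(key)
            kept.append(s)
    return kept

data = [
    "The cat sat on the mat today.",
    "the cat  sat on the mat today.",  # duplicate after normalization
    "Too short.",
]
print(filter_synthetic(data))  # only the first sample survives
```

Real pipelines go much further (factuality checks, bias audits, semantic dedup), but the structure is the same: generation is cheap, and the value is concentrated in the filtering stage.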