From 8685d53adcdd0310e76349ecb4e2b87f980c4673 Mon Sep 17 00:00:00 2001
From: Harrison Chase <hw.chase.17@gmail.com>
Date: Sat, 18 Mar 2023 11:12:18 -0700
Subject: [PATCH] querying tabular data (#1758)

---
 docs/index.rst            |  3 +++
 docs/use_cases/tabular.md | 31 +++++++++++++++++++++++++++++++
 2 files changed, 34 insertions(+)
 create mode 100644 docs/use_cases/tabular.md

diff --git a/docs/index.rst b/docs/index.rst
index 3b716f5a..8b8c8be7 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -97,6 +97,8 @@ The above modules can be used in a variety of ways. LangChain also provides guid
 
 - `Summarization <./use_cases/summarization.html>`_: Summarizing longer documents into shorter, more condensed chunks of information. A type of Data Augmented Generation.
 
+- `Querying Tabular Data <./use_cases/tabular.html>`_: If you want to understand how to use LLMs to query data that is stored in a tabular format (CSVs, SQL, dataframes, etc.), you should read this page.
+
 - `Evaluation <./use_cases/evaluation.html>`_: Generative models are notoriously hard to evaluate with traditional metrics. One new way of evaluating them is using language models themselves to do the evaluation. LangChain provides some prompts/chains for assisting in this.
 
 - `Generate similar examples <./use_cases/generate_examples.html>`_: Generating similar examples to a given input. This is a common use case for many applications, and LangChain provides some prompts/chains for assisting in this.
@@ -117,6 +119,7 @@ The above modules can be used in a variety of ways. LangChain also provides guid
    ./use_cases/combine_docs.md
    ./use_cases/question_answering.md
    ./use_cases/summarization.md
+   ./use_cases/tabular.md
    ./use_cases/evaluation.rst
    ./use_cases/model_laboratory.ipynb
 
diff --git a/docs/use_cases/tabular.md b/docs/use_cases/tabular.md
new file mode 100644
index 00000000..c4dd0dd2
--- /dev/null
+++ b/docs/use_cases/tabular.md
@@ -0,0 +1,31 @@
+# Querying Tabular Data
+
+A lot of data is stored in tabular form, whether in CSV files, Excel sheets, or SQL tables.
+This page covers the resources available in LangChain for working with data in this format.
+
+## Document Loading
+If you have text data stored in a tabular format, you may want to load the data into a Document and then index it as you would
+other text/unstructured data. To do this, use a document loader such as the [CSVLoader](../modules/document_loaders/examples/csv.ipynb),
+then [create an index](../modules/indexes.rst) over that data and [query it that way](../modules/indexes/chain_examples/vector_db_qa.ipynb).
+
+## Querying
+If your tabular data is mostly numeric, or you have a large amount of data and don't want to index it, start
+by looking at the chains and agents we provide for working with this kind of data.
+
+### Chains
+
+If you are just getting started and have relatively small or simple tabular data, start with chains.
+A chain is a sequence of predetermined steps, which makes it a good starting point: it gives you more control and makes it
+easier to understand what is happening.
+
+- [SQL Database Chain](../modules/chains/examples/sqlite.ipynb)
+
+### Agents
+
+Agents are more complex and involve multiple queries to the LLM to work out what to do.
+The downside of agents is that you have less control. The upside is that they are more powerful,
+which allows you to use them on larger databases and more complex schemas.
+
+- [SQL Agent](../modules/agents/agent_toolkits/sql_database.ipynb)
+- [Pandas Agent](../modules/agents/agent_toolkits/pandas.ipynb)
+- [CSV Agent](../modules/agents/agent_toolkits/csv.ipynb)