langchain

mirror of https://github.com/hwchase17/langchain synced 2024-11-10 01:10:59 +00:00

Author	SHA1	Message	Date
Tim Asp	72ef69d1ba	Add new iFixit document loader (#1333 ) iFixit is a Wikipedia-like site that has a huge amount of open content on how to fix things, questions and answers for common troubleshooting, and "things"-related content that is more technical in nature. All content is licensed under CC-BY-NC-SA 3.0. Adding docs from iFixit as context for user questions like "I dropped my phone in water, what do I do?" or "My macbook pro is making a whining noise, what's wrong with it?" can yield significantly better responses than context-free responses from LLMs.	2023-02-27 20:40:20 -08:00
Matt Robinson	1aa41b5741	feat: document loader for image files (#1330 ) ### Summary Adds a document loader for image files such as `.jpg` and `.png` files. ### Testing Run the following using the example document from the [`unstructured` repo](https://github.com/Unstructured-IO/unstructured/tree/main/example-docs). ```python from langchain.document_loaders.image import UnstructuredImageLoader loader = UnstructuredImageLoader("layout-parser-paper-fast.jpg") loader.load() ```	2023-02-27 14:43:32 -08:00
Eugene Yurtsev	c14cff60d0	Documentation: Minor typo fixes (#1327 ) Fixing a few minor typos in the documentation (and likely introducing other ones in the process).	2023-02-27 14:40:43 -08:00
Harrison Chase	f61858163d	bump version to 0.0.95 (#1324 )	2023-02-27 07:45:54 -08:00
Harrison Chase	0824d65a5c	Harrison/indexing pipeline (#1317 )	2023-02-27 00:31:36 -08:00
Akshay	a0bf856c70	Update agent_vectorstore.ipynb (#1318 ) nitpicking but just thought i'd add this typo which I found when going through the How-to 😄 (unless it was intentional) also, it's amazing that you added ReAct to LangChain!	2023-02-26 23:22:35 -08:00
Harrison Chase	166cda2cc6	Harrison/deeplake (#1316 ) Co-authored-by: Davit Buniatyan <d@activeloop.ai>	2023-02-26 22:35:04 -08:00
Harrison Chase	aaad6cc954	Harrison/atlas db (#1315 ) Co-authored-by: Brandon Duderstadt <brandonduderstadt@gmail.com>	2023-02-26 22:11:38 -08:00
Marc Puig	3989c793fd	Making it possible to use "certainty" as a parameter for the weaviate similarity_search (#1218 ) Checking if weaviate similarity_search kwargs contains "certainty" and use it accordingly. The minimal level of certainty must be a float, and it is computed by normalized distance.	2023-02-26 17:55:28 -08:00
Harrison Chase	81abcae91a	Harrison/banana fix (#1311 ) Co-authored-by: Erik Dunteman <44653944+erik-dunteman@users.noreply.github.com>	2023-02-26 17:53:57 -08:00
Casey A. Fitzpatrick	648b3b3909	Fix use case sentence for bash util doc (#1295 ) Thanks for all your hard work! I noticed a small typo in the bash util doc so here's a quick update. Additionally, my formatter caught some spacing in the `.md` as well. Happy to revert that if it's an issue. The main change is just ``` - A common use case this is for letting it interact with your local file system. + A common use case for this is letting the LLM interact with your local file system. ``` ## Testing `make docs_build` succeeds locally and the changes show as expected ✌️	2023-02-26 17:41:03 -08:00
Ingo Kleiber	fd9975dad7	add CoNLL-U document loader (#1297 ) I've added a simple [CoNLL-U](https://universaldependencies.org/format.html) document loader. CoNLL-U is a common format for NLP tasks and is used, for example, in the Universal Dependencies treebank corpora. The loader reads a single file in standard CoNLL-U format and returns a document.	2023-02-26 17:27:00 -08:00
Harrison Chase	d29f74114e	copy paste loader (#1302 )	2023-02-26 17:26:37 -08:00
Harrison Chase	ce441edd9c	improve docs (#1309 )	2023-02-26 11:25:16 -08:00
Harrison Chase	6f30d68581	add example of using agent with vectorstores (#1285 )	2023-02-25 13:27:24 -08:00
Matt Robinson	2f15c11b87	feat: document loader for MS Word documents (#1282 ) ### Summary Adds a document loader for MS Word Documents. Works with both `.docx` and `.doc` files as long as the user has installed `unstructured>=0.4.11`. ### Testing The following workflow tests the loader for both `.doc` and `.docx` files using example docs from the `unstructured` repo. #### `.docx` ```python from langchain.document_loaders import UnstructuredWordDocumentLoader filename = "../unstructured/example-docs/fake.docx" loader = UnstructuredWordDocumentLoader(filename) loader.load() ``` #### `.doc` ```python from langchain.document_loaders import UnstructuredWordDocumentLoader filename = "../unstructured/example-docs/fake.doc" loader = UnstructuredWordDocumentLoader(filename) loader.load() ```	2023-02-24 08:26:19 -08:00
Harrison Chase	96db6ed073	cleanup (#1274 )	2023-02-24 07:38:24 -08:00
Harrison Chase	42167a1e24	Harrison/fb loader (#1277 ) Co-authored-by: Vairo Di Pasquale <vairo.dp@gmail.com>	2023-02-24 07:22:48 -08:00
Klein Tahiraj	8a0751dadd	adding .ipynb loader and documentation Fixes #1248 (#1252 ) `NotebookLoader.load()` loads the `.ipynb` notebook file into a `Document` object. Parameters: * `include_outputs` (bool): whether to include cell outputs in the resulting document (default is False). * `max_output_length` (int): the maximum number of characters to include from each cell output (default is 10). * `remove_newline` (bool): whether to remove newline characters from the cell sources and outputs (default is False). * `traceback` (bool): whether to include full traceback (default is False).	2023-02-24 07:10:35 -08:00
Enrico Shippole	9becdeaadf	Add Writer, Banana, Modal, StochasticAI (#1270 ) Add LLM wrappers and examples for Banana, Writer, Modal, Stochastic AI Added rigid json format for Banana and Modal	2023-02-24 06:58:58 -08:00
Matt Robinson	10e73a3723	docs: remove nltk download steps (#1253 ) ### Summary Updates the docs to remove the `nltk` download steps from `unstructured`. As of `unstructured` `0.4.14`, this is handled automatically in the relevant modules within `unstructured`.	2023-02-23 12:34:44 -08:00
Justin Torre	5bc6dc076e	added caching and properties docs (#1255 )	2023-02-23 11:03:04 -08:00
Iskren Ivov Chernev	8e3cd3e0dd	Add DeepInfra LLM support (#1232 ) DeepInfra is an Inference-as-a-Service provider. Add a simple wrapper using HTTPS requests.	2023-02-23 07:37:15 -08:00
Dmitri Melikyan	b7765a95a0	docs: add Graphsignal ecosystem page (#1228 ) Adds a Graphsignal ecosystem page	2023-02-23 07:33:00 -08:00
Harrison Chase	6085fe18d4	add ifttt tool (#1244 )	2023-02-22 22:29:43 -08:00
Harrison Chase	71709ad5d5	Update key_concepts.md (#1209 ) (#1237 ) Link for easier navigation (it's not immediately clear where to find more info on SimpleSequentialChain (3 clicks away) --------- Co-authored-by: Larry Fisherman <l4rryfisherman@protonmail.com>	2023-02-22 13:30:53 -08:00
Dennis Antela Martinez	53c67e04d4	add aleph alpha llm (#1207 ) Integrate Aleph Alpha's client into Langchain to provide access to the luminous models - more info on latest benchmarks here: https://www.aleph-alpha.com/luminous-performance-benchmarks	2023-02-22 10:37:36 -08:00
Ikko Eltociear Ashimine	334b553260	Update petals.md (#1225 ) Huggingface -> Hugging Face	2023-02-22 10:34:16 -08:00
Sason	cc7d2e5621	Correct typo in "Question Answering" How-To Guide (#1221 )	2023-02-21 17:02:58 -08:00
Matt Robinson	3d5f56a8a1	docs: add quotes to `unstructured[local-inference]` install instructions (#1208 ) ### Summary Corrects the install instruction for local inference to `pip install "unstructured[local-inference]"`	2023-02-21 08:06:43 -08:00
Harrison Chase	047231840d	add docs for chroma persistence (#1202 )	2023-02-20 23:04:17 -08:00
Harrison Chase	5bdb8dd6fe	Harrison/unstructured io (#1200 )	2023-02-20 22:54:49 -08:00
Harrison Chase	d90a287d8f	Harrison/updating docs (#1196 )	2023-02-20 22:54:26 -08:00
Dennis Antela Martinez	23243ae69c	add gitbook document loader (#1180 ) Added a GitBook document loader. It lets you either (1) fetch text from any single GitBook page, or (2) fetch all relative paths and return their respective content in Documents. I've modified the `scrape` method in the `WebBaseLoader` to accept custom web paths if given, but I'm happy to remove it and move that logic into the `GitbookLoader` itself.	2023-02-20 20:05:04 -08:00
Naveen Tatikonda	0118706fd6	Add Support for OpenSearch Vector database (#1191 ) ### Description This PR adds a wrapper which adds support for the OpenSearch vector database. Using the opensearch-py client, we ingest the embeddings of the given text into an OpenSearch cluster using the Bulk API. We can perform the `similarity_search` on the index using the 3 popular search methods of the OpenSearch k-NN plugin: - `Approximate k-NN Search` uses approximate nearest neighbor (ANN) algorithms from the [nmslib](https://github.com/nmslib/nmslib), [faiss](https://github.com/facebookresearch/faiss), and [Lucene](https://lucene.apache.org/) libraries to power k-NN search. - `Script Scoring` extends OpenSearch’s script scoring functionality to execute a brute force, exact k-NN search. - `Painless Scripting` adds the distance functions as painless extensions that can be used in more complex combinations. It also supports brute force, exact k-NN search like Script Scoring. ### Issues Resolved https://github.com/hwchase17/langchain/issues/1054 --------- Signed-off-by: Naveen Tatikonda <navtat@amazon.com>	2023-02-20 18:39:34 -08:00
Harrison Chase	926c121b98	Harrison/text splitter docs (#1188 )	2023-02-20 15:14:03 -08:00
Harrison Chase	91446a5e9b	clean up text splitting docs (#1184 )	2023-02-20 11:24:31 -08:00
Harrison Chase	5a954efdd7	update gallery with slack bot (#1177 )	2023-02-20 08:21:00 -08:00
blob42	9962bda70b	searx_search: docs updates (#1175 ) - fix notebook formatting, remove empty cells and add scrolling for long text --------- Co-authored-by: blob42 <spike@w530>	2023-02-20 06:46:44 -08:00
Harrison Chase	4f3fbd7267	improve docs for indexes (#1146 )	2023-02-19 23:14:50 -08:00
Harrison Chase	28781a6213	Harrison/markdown splitter (#1169 ) Co-authored-by: Michael Chen <flamingdescent@gmail.com> Co-authored-by: Michael Chen <michaelchen@stripe.com>	2023-02-19 21:31:58 -08:00
Nan Wang	e8f224fd3a	docs: add missing links to toc (#1163 ) add missing links to toc --------- Signed-off-by: Nan Wang <nan.wang@jina.ai>	2023-02-19 21:15:11 -08:00
Nick	afe884fb96	AI21 documentation incorrectly titled Cohere (#1167 )	2023-02-19 21:14:59 -08:00
Harrison Chase	955c89fccb	pass in prompts to vectordbqa (#1158 )	2023-02-19 20:47:17 -08:00
Harrison Chase	65cc81c479	directory loader improvements (#1162 )	2023-02-19 20:47:08 -08:00
Harrison Chase	9d6d8f85da	Harrison/self hosted runhouse (#1154 ) Co-authored-by: Donny Greenberg <dongreenberg2@gmail.com> Co-authored-by: John Dagdelen <jdagdelen@users.noreply.github.com> Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net> Co-authored-by: Andrew White <white.d.andrew@gmail.com> Co-authored-by: Peng Qu <82029664+pengqu123@users.noreply.github.com> Co-authored-by: Matt Robinson <mthw.wm.robinson@gmail.com> Co-authored-by: jeff <tangj1122@gmail.com> Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MacBook-Pro.local> Co-authored-by: zanderchase <zander@unfold.ag> Co-authored-by: Charles Frye <cfrye59@gmail.com> Co-authored-by: zanderchase <zanderchase@gmail.com> Co-authored-by: Shahriar Tajbakhsh <sh.tajbakhsh@gmail.com> Co-authored-by: Stefan Keselj <skeselj@princeton.edu> Co-authored-by: Francisco Ingham <fpingham@gmail.com> Co-authored-by: Dhruv Anand <105786647+dhruv-anand-aintech@users.noreply.github.com> Co-authored-by: cragwolfe <cragcw@gmail.com> Co-authored-by: Anton Troynikov <atroyn@users.noreply.github.com> Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com> Co-authored-by: Oliver Klingefjord <oliver@klingefjord.com> Co-authored-by: blob42 <contact@blob42.xyz> Co-authored-by: blob42 <spike@w530> Co-authored-by: Enrico Shippole <henryshippole@gmail.com> Co-authored-by: Ibis Prevedello <ibiscp@gmail.com> Co-authored-by: jped <jonathanped@gmail.com> Co-authored-by: Justin Torre <justintorre75@gmail.com> Co-authored-by: Ivan Vendrov <ivan@anthropic.com> Co-authored-by: Sasmitha Manathunga <70096033+mmz-001@users.noreply.github.com> Co-authored-by: Ankush Gola <9536492+agola11@users.noreply.github.com> Co-authored-by: Matt Robinson <mrobinson@unstructuredai.io> Co-authored-by: Jeff Huber <jeffchuber@gmail.com> Co-authored-by: Akshay <64036106+akshayvkt@users.noreply.github.com> Co-authored-by: Andrew Huang <jhuang16888@gmail.com> Co-authored-by: rogerserper <124558887+rogerserper@users.noreply.github.com> Co-authored-by: seanaedmiston <seane999@gmail.com> Co-authored-by: Hasegawa Yuya <52068175+Hase-U@users.noreply.github.com> Co-authored-by: Ivan Vendrov <ivendrov@gmail.com> Co-authored-by: Chen Wu (吴尘) <henrychenwu@cmu.edu> Co-authored-by: Dennis Antela Martinez <dennis.antela@gmail.com> Co-authored-by: Maxime Vidal <max.vidal@hotmail.fr> Co-authored-by: Rishabh Raizada <110235735+rishabh-ti@users.noreply.github.com>	2023-02-19 09:53:45 -08:00
CG80499	af8f5c1a49	Added constitutional chain. (#1147 ) - Added self-critique constitutional chain based on this [paper](https://www.anthropic.com/constitutional.pdf).	2023-02-18 19:31:51 -08:00
Harrison Chase	a83ba44efa	Harrison/ver0089 (#1144 )	2023-02-18 14:25:37 -08:00
Ankush Gola	7b5e160d28	Make Tools own model, add ToolKit Concept (#1095 ) Follow-up of @hinthornw's PR: - Migrate the Tool abstraction to a separate file (`BaseTool`). - `Tool` implementation of `BaseTool` takes in function and coroutine to more easily maintain backwards compatibility - Add a Toolkit abstraction that can own the generation of tools around a shared concept or state --------- Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Francisco Ingham <fpingham@gmail.com> Co-authored-by: Dhruv Anand <105786647+dhruv-anand-aintech@users.noreply.github.com> Co-authored-by: cragwolfe <cragcw@gmail.com> Co-authored-by: Anton Troynikov <atroyn@users.noreply.github.com> Co-authored-by: Oliver Klingefjord <oliver@klingefjord.com> Co-authored-by: William Fu-Hinthorn <whinthorn@Williams-MBP-3.attlocal.net> Co-authored-by: Bruno Bornsztein <bruno.bornsztein@gmail.com>	2023-02-18 13:40:43 -08:00
Harrison Chase	45b5640fe5	fix sql (#1141 )	2023-02-18 11:49:08 -08:00
Sam Hogan	85c1449a96	Fix typo in HyDE docs (#1142 )	2023-02-18 11:48:46 -08:00
Harrison Chase	fb3c73d194	add srt loader (#1140 )	2023-02-18 10:58:39 -08:00
Harrison Chase	483821ea3b	fix docs (#1133 )	2023-02-18 08:13:54 -08:00
Harrison Chase	d5f3dfa1e1	Harrison/hn loader (#1130 ) Co-authored-by: William X <william.y.xuan@gmail.com>	2023-02-17 15:15:02 -08:00
Harrison Chase	511d41114f	return source documents for chat vector db chain (#1128 )	2023-02-17 13:40:52 -08:00
Matt Robinson	b956070f08	docs: add an unstructured section to the ecosystem page (#1125 ) ### Summary Adds an Unstructured section to the ecosystem page.	2023-02-17 13:02:23 -08:00
Francisco Ingham	3462130e2d	Modify number of types of chains (#1089 ) Changed number of types of chains to make it consistent with the rest of the docs	2023-02-16 07:06:30 -08:00
Harrison Chase	7745505482	chat qa with sources (#1084 )	2023-02-16 00:29:47 -08:00
Harrison Chase	badeeb37b0	fix stuff count (#1083 )	2023-02-15 23:57:13 -08:00
Harrison Chase	971458c5de	docs for batch size (#1082 )	2023-02-15 23:53:56 -08:00
Harrison Chase	5e10e19bfe	Harrison/align table (#1081 ) Co-authored-by: Francisco Ingham <fpingham@gmail.com>	2023-02-15 23:53:37 -08:00
Harrison Chase	c60954d0f8	Harrison/telegram loader (#1080 ) Co-authored-by: Maxime Vidal <max.vidal@hotmail.fr>	2023-02-15 23:24:32 -08:00
Dennis Antela Martinez	a1c296bc3c	docs: increase width (#1049 ) This addresses #948. I set the documentation max width to 2560px, but it can be adjusted.	2023-02-15 23:07:01 -08:00
Harrison Chase	19c2797bed	add anthropic example (#1041 ) Co-authored-by: Ivan Vendrov <ivendrov@gmail.com> Co-authored-by: Sasmitha Manathunga <70096033+mmz-001@users.noreply.github.com>	2023-02-15 23:04:28 -08:00
blob42	3ecdea8be4	SearxNG meta search api helper (#854 ) This is a work-in-progress PR to track my progress. ## TODO: - [x] Get results using the specified searx host - [x] Prioritize returning an `answer` or results otherwise - [ ] expose the field `infobox` when available - [ ] expose `score` of result to help agent's decision - [ ] expose the `suggestions` field to agents so they could try new queries if no results are found with the original query ? - [ ] Dynamic tool description for agents ? - Searx offers many engines and a search syntax that agents can take advantage of. It would be nice to generate a dynamic Tool description so that it can be used many times as a tool but for different purposes. - [x] Limit number of results - [ ] Implement paging - [x] Mirror the usage of the Google Search tool - [x] easy selection of search engines - [x] Documentation - [ ] update HowTo guide notebook on Search Tools - [ ] Handle async - [ ] Tests ### Add examples / documentation on possible uses with - [ ] getting factual answers with `!wiki` option and `infoboxes` - [ ] getting `suggestions` - [ ] getting `corrections` --------- Co-authored-by: blob42 <spike@w530> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-02-15 23:03:57 -08:00
seanaedmiston	f0a258555b	Support similarity search by vector (in FAISS) (#961 ) Alternate implementation to PR #960 Again - only FAISS is implemented. If accepted can add this to other vectorstores or leave as NotImplemented? Suggestions welcome...	2023-02-15 22:50:00 -08:00
Jonathan Pedoeem	05ad399abe	Update PromptLayerOpenAI LLM to include support for ASYNC API (#1066 ) This PR updates `PromptLayerOpenAI` to support requests using the [Async API](https://langchain.readthedocs.io/en/latest/modules/llms/async_llm.html). It also updates the documentation on the Async API to let users know that `PromptLayerOpenAI` supports it. `PromptLayerOpenAI` now redefines `_agenerate` in a similar way to how it redefines `_generate`.	2023-02-15 22:48:09 -08:00
Harrison Chase	98186ef180	Harrison/evernote nb (#1078 ) Co-authored-by: Akshay <64036106+akshayvkt@users.noreply.github.com>	2023-02-15 22:47:30 -08:00
rogerserper	e46cd3b7db	Google Search API integration with serper.dev (wrapper, tests, docs, … (#909 ) Adds Google Search integration with [Serper](https://serper.dev) a low-cost alternative to SerpAPI (10x cheaper + generous free tier). Includes documentation, tests and examples. Hopefully I am not missing anything. Developers can sign up for a free account at [serper.dev](https://serper.dev) and obtain an api key. ## Usage ```python from langchain.utilities import GoogleSerperAPIWrapper from langchain.llms.openai import OpenAI from langchain.agents import initialize_agent, Tool import os os.environ["SERPER_API_KEY"] = "" os.environ['OPENAI_API_KEY'] = "" llm = OpenAI(temperature=0) search = GoogleSerperAPIWrapper() tools = [ Tool( name="Intermediate Answer", func=search.run ) ] self_ask_with_search = initialize_agent(tools, llm, agent="self-ask-with-search", verbose=True) self_ask_with_search.run("What is the hometown of the reigning men's U.S. Open champion?") ``` ### Output ``` Entering new AgentExecutor chain... Yes. Follow up: Who is the reigning men's U.S. Open champion? Intermediate answer: Current champions Carlos Alcaraz, 2022 men's singles champion. Follow up: Where is Carlos Alcaraz from? Intermediate answer: El Palmar, Spain So the final answer is: El Palmar, Spain > Finished chain. 'El Palmar, Spain' ```	2023-02-15 22:47:17 -08:00
Jonathan Pedoeem	05df480376	Update `PromptLayerOpenAI` LLM usage instructions in documentation (#1053 ) This PR updates the usage instructions for PromptLayerOpenAI in Langchain's documentation. The updated instructions provide more detail and conform better to the style of other LLM integration documentation pages. No code changes were made in this PR, only improvements to the documentation. This update will make it easier for users to understand how to use `PromptLayerOpenAI`	2023-02-15 22:37:48 -08:00
Ankush Gola	d8ac274fc2	add to async chain notebook (#1056 )	2023-02-14 18:20:38 -08:00
Ankush Gola	caa8e4742e	Enable streaming for OpenAI LLM (#986 ) * Support a callback `on_llm_new_token` that users can implement when `OpenAI.streaming` is set to `True`	2023-02-14 15:06:14 -08:00
Sasmitha Manathunga	c67c5383fd	docs: fix typo in notebook (#1046 )	2023-02-14 07:06:08 -08:00
Harrison Chase	88bebb4caa	Harrison/llm integrations (#1039 ) Co-authored-by: jped <jonathanped@gmail.com> Co-authored-by: Justin Torre <justintorre75@gmail.com> Co-authored-by: Ivan Vendrov <ivan@anthropic.com>	2023-02-13 22:06:25 -08:00
Harrison Chase	ec727bf166	Align table info (#999 ) (#1034 ) Currently the chain gets the column names and types on one side and the example rows on the other. It is easier for the LLM to read the table information if the column names and examples are shown together, so that it can easily understand which columns the examples refer to. For an example of this, please refer to the changes in the `sqlite.ipynb` notebook. Also replaced `eval` with `ast.literal_eval` when interpreting the results from the sample row query, since it is better practice. --------- Co-authored-by: Francisco Ingham <> --------- Co-authored-by: Francisco Ingham <fpingham@gmail.com>	2023-02-13 21:48:41 -08:00
Enrico Shippole	f30dcc6359	Add GooseAI, CerebriumAI, Petals, ForefrontAI (#981 ) Add GooseAI, CerebriumAI, Petals, ForefrontAI	2023-02-13 21:20:19 -08:00
Harrison Chase	6a31a59400	add links (#1027 )	2023-02-13 16:33:30 -08:00
Harrison Chase	7fb33fca47	chroma docs (#1012 )	2023-02-12 23:02:01 -08:00
Harrison Chase	0c553d2064	Harrion/kg (#1016 ) Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>	2023-02-12 23:01:26 -08:00
cragwolfe	05d8969c79	Unstructured example notebook: add a pdf, related deps (#1011 ) Updates the Unstructured example notebook with a PDF example. Includes additional dependencies for PDF processing (and images, etc).	2023-02-12 14:56:48 -08:00
Dhruv Anand	03e5794978	typo fix on chat vector db docs (#1007 ) simple typo fix: because --> between	2023-02-12 12:09:21 -08:00
Harrison Chase	0998577dfe	Harrison/unstructured structured (#1004 )	2023-02-12 07:36:11 -08:00
Harrison Chase	bbb06ca4cf	pdfminer (#1003 )	2023-02-12 07:29:26 -08:00
Francisco Ingham	0b6aa6a024	Added initial capital letter to bullet points that had it missing (#1000 ) Co-authored-by: Francisco Ingham <>	2023-02-11 20:31:34 -08:00
Harrison Chase	10e7297306	Harrison/fake llm (#990 ) Co-authored-by: Stefan Keselj <skeselj@princeton.edu> Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>	2023-02-11 15:12:35 -08:00
Harrison Chase	e51fad1488	Harrison/0083 (#996 ) Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>	2023-02-11 08:29:28 -08:00
Harrison Chase	2e96704d59	Harrison/airbyte (#989 ) Co-authored-by: zanderchase <zanderchase@gmail.com> Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MacBook-Pro.local>	2023-02-10 18:08:00 -08:00
Charles Frye	e9799d6821	improves huggingface_hub example (#988 ) The provided example uses the default `max_length` of `20` tokens, which leads to the example generation getting cut off. 20 tokens is way too short to show CoT reasoning, so I boosted it to `64`. Without knowing HF's API well, it can be hard to figure out just where those `model_kwargs` come from, and `max_length` is a super critical one.	2023-02-10 17:56:15 -08:00
zanderchase	c2d1d903fa	Zander/online pdf loader (#984 )	2023-02-10 15:42:30 -08:00
Harrison Chase	055a53c27f	add texts example (#985 ) Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MacBook-Pro.local>	2023-02-10 12:32:44 -08:00
jeff	6ab432d62e	docs: update spelling typos (#982 ) Wonder why "with" is spelled "wiht" so many times by human	2023-02-10 11:37:59 -08:00
Matt Robinson	07a407d89a	feat: adds `UnstructuredURLLoader` for loading data from urls (#979 ) ### Summary Adds a `UnstructuredURLLoader` that supports loading data from a list of URLs. ### Testing ```python from langchain.document_loaders import UnstructuredURLLoader urls = [ "https://www.understandingwar.org/backgrounder/russian-offensive-campaign-assessment-february-8-2023", "https://www.understandingwar.org/backgrounder/russian-offensive-campaign-assessment-february-9-2023" ] loader = UnstructuredURLLoader(urls=urls) raw_documents = loader.load() ```	2023-02-10 10:18:38 -08:00
Harrison Chase	c64f98e2bb	Harrison/format agent instructions (#973 ) Co-authored-by: Andrew White <white.d.andrew@gmail.com> Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net> Co-authored-by: Peng Qu <82029664+pengqu123@users.noreply.github.com>	2023-02-10 10:07:26 -08:00
Harrison Chase	5469d898a9	Harrison/everynote (#974 ) Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>	2023-02-10 08:02:35 -08:00
Harrison Chase	3d639d1539	update lint (#975 ) Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>	2023-02-10 08:01:13 -08:00
Harrison Chase	01fa2d8117	Harrison/youtube fixes (#955 ) Co-authored-by: Ji <jizhang.work@gmail.com> Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>	2023-02-09 08:12:22 -08:00
zanderchase	8e126bc9bd	adding webpage loading logic (#942 )	2023-02-09 07:52:50 -08:00
Harrison Chase	c71027e725	add docs for steamship deployment (#949 ) Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>	2023-02-08 16:01:19 -08:00
Harrison Chase	3e1901e1aa	gutenberg books (#946 ) Co-authored-by: zanderchase <zander@unfold.ag> Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>	2023-02-08 12:00:47 -08:00
jeff	6a4f602156	docs: fix spelling typo (#934 )	2023-02-08 11:13:35 -08:00

363 Commits
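
Commit ec727bf166 above mentions swapping `eval` for `ast.literal_eval` when interpreting sample-row query results. The following is only a minimal sketch of why that swap is usually considered safer; the `raw_rows` string and the `is_safe_literal` helper are hypothetical illustrations and are not taken from the langchain code.

```python
import ast

# Hypothetical string form of a sample-row query result (an assumption for
# illustration; not copied from the commit).
raw_rows = "[(1, 'AC/DC', 'Rock'), (2, 'Accept', 'Metal')]"

# ast.literal_eval only accepts Python literals (strings, numbers, tuples,
# lists, dicts, sets, booleans, None), so it cannot call functions.
rows = ast.literal_eval(raw_rows)


def is_safe_literal(text: str) -> bool:
    """Return True if `text` parses as a plain Python literal."""
    try:
        ast.literal_eval(text)
        return True
    except (ValueError, SyntaxError):
        return False


print(rows[0])
print(is_safe_literal(raw_rows))
print(is_safe_literal("__import__('os').getcwd()"))
```

With `eval`, the last string would be executed as code; `ast.literal_eval` rejects it because a function call is not a literal.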