langchain

mirror of https://github.com/hwchase17/langchain synced 2024-10-29 17:07:25 +00:00

Author	SHA1	Message	Date
Harrison Chase	cc423f40f1	Harrison/youtube loader (#1545 ) Co-authored-by: Julian Wustl <57504258+Julianwustl@users.noreply.github.com>	2023-03-08 20:53:27 -08:00
Harrison Chase	b053f831cd	Harrison/contributing (#1542 ) Co-authored-by: Saurav Maheshkar <sauravvmaheshkar@gmail.com>	2023-03-08 20:53:16 -08:00
Harrison Chase	523ad8d2e2	Harrison/chat history formatter1 (#1538 ) Co-authored-by: Youssef A. Abukwaik <yousseb@users.noreply.github.com>	2023-03-08 20:46:37 -08:00
Graham Neubig	31303d0b11	Added other evaluation metrics for data-augmented QA (#1521 ) This PR adds additional evaluation metrics for data-augmented QA, resulting in a report like this at the end of the notebook: ![Screen Shot 2023-03-08 at 8 53 23 AM](https://user-images.githubusercontent.com/398875/223731199-8eb8e77f-5ff3-40a2-a23e-f3bede623344.png) The score calculation is based on the [Critique](https://docs.inspiredco.ai/critique/) toolkit, an API-based toolkit (like OpenAI) that has minimal dependencies, so it should be easy for people to run if they choose. The code could further be simplified by actually adding a chain that calls Critique directly, but that probably should be saved for another PR if necessary. Any comments or change requests are welcome!	2023-03-08 20:41:03 -08:00
gidler	494c9d341a	[DOCS] Assorted wording, punctuation, and consistency revisions (#1443 ) Contributing some small fixes I noticed while reading through the documentation. Thank you for creating and maintaining this project!	2023-03-08 20:16:09 -08:00
Harrison Chase	519f0187b6	Harrison/gdrive pdf (#1433 ) Co-authored-by: LM <93918064+LuisMalhadas@users.noreply.github.com> Co-authored-by: Luis Malhadas <luis@sia.so>	2023-03-08 20:15:36 -08:00
Florian Leuerer	64c6435545	Added client_settings support for chromadb vecstore (#1528 ) # Problem The ChromaDB vecstore only supported local connection. There was no way to use a chromadb server. # Fix Added `client_settings` as Chroma attribute. # Usage ``` from chromadb.config import Settings from langchain.vectorstores import Chroma chroma_settings = Settings(chroma_api_impl="rest", chroma_server_host="localhost", chroma_server_http_port="80") docsearch = Chroma.from_documents(chunks, embeddings, metadatas=metadatas, client_settings=chroma_settings, collection_name=COLLECTION_NAME) ```	2023-03-08 17:42:09 -08:00
Harrison Chase	7eba828e1b	Harrison/update regex (#1534 ) Co-authored-by: Luis <57528712+LuisLechugaRuiz@users.noreply.github.com>	2023-03-08 17:41:17 -08:00
Harrison Chase	2a7215bc3b	Harrison/prompt issues (#1537 )	2023-03-08 16:56:10 -08:00
Alpri Else	784d24a1d5	Support S3 Object keys with `/` in `S3FileLoader` (#1517 ) Resolves https://github.com/hwchase17/langchain/issues/1510 ### Problem When loading S3 Objects with `/` in the object key (eg. `folder/some-document.txt`) using `S3FileLoader`, the objects are downloaded into a temporary directory and saved as a file. This errors out when the parent directory does not exist within the temporary directory. See https://github.com/hwchase17/langchain/issues/1510#issuecomment-1459583696 on how to reproduce this bug ### What this pr does Creates parent directories based on object key. This also works with deeply nested keys: `folder/subfolder/some-document.txt`	2023-03-08 16:17:26 -08:00
Harrison Chase	aba58e9e2e	Harrison/bumpver104 (#1525 )	2023-03-08 09:46:02 -08:00
Harrison Chase	c4a557bdd4	add concept of prompt collection (#1507 )	2023-03-08 08:31:29 -08:00
Ivan	97e3666e0d	changed requests.run to requests.get (#1485 ) This pull request proposes an update to the Lightweight wrapper library's documentation. The current documentation provides an example of how to use the library's requests.run method, as follows: requests.run("https://www.google.com"). However, this example does not work for the 0.0.102 version of the library. Testing: The changes have been tested locally to ensure they are working as intended. Thank you for considering this pull request.	2023-03-07 21:10:23 -08:00
Harrison Chase	7ade419a0e	allow passing of messages into prompt template (#1505 )	2023-03-07 21:10:12 -08:00
Harrison Chase	a4a2d79087	Harrison/rtd loader (#1513 ) Co-authored-by: Youssef A. Abukwaik <yousseb@users.noreply.github.com>	2023-03-07 21:09:54 -08:00
Harrison Chase	8f21605d71	add return source docs (#1515 )	2023-03-07 21:09:36 -08:00
Harrison Chase	064741db58	Harrison/fix text splitter (#1511 ) Co-authored-by: ajaysolanky <ajsolanky@gmail.com> Co-authored-by: Ajay Solanky <ajaysolanky@saw-l14668307kd.myfiosgateway.com>	2023-03-07 15:42:28 -08:00
Tom Dyson	e3354404ad	Fix link to Pinecone notebook (#1492 )	2023-03-07 15:24:03 -08:00
Harrison Chase	3610ef2830	add fake embeddings class (#1503 )	2023-03-07 15:23:46 -08:00
Ankush Gola	27104d4921	fix `ChatOpenAI.agenerate` (#1504 )	2023-03-07 15:22:05 -08:00
Harrison Chase	4f41e20f09	memory docs (#1501 )	2023-03-07 11:02:46 -08:00
Harrison Chase	d0062c7a9a	bump version to 103 (#1498 )	2023-03-07 10:08:01 -08:00
Harrison Chase	8e6f599822	change to baselanguagemodel (#1496 )	2023-03-07 09:29:59 -08:00
Harrison Chase	f276bfad8e	Harrison/chat memory (#1495 )	2023-03-07 09:02:40 -08:00
Harrison Chase	7bec461782	Harrison/memory refactor (#1478 ) moves memory to own module, factors out common stuff	2023-03-07 07:59:37 -08:00
kahkeng	df6865cd52	Allow no token limit for ChatGPT API (#1481 ) The endpoint default is inf if we don't specify max_tokens, so unlike regular completion API, we don't need to calculate this based on the prompt.	2023-03-06 13:18:55 -08:00
Harrison Chase	312c319d8b	bump version to 102 (#1471 )	2023-03-06 10:50:44 -08:00
Harrison Chase	0e21463f07	(rfc) chat models (#1424 ) Co-authored-by: Ankush Gola <ankush.gola@gmail.com>	2023-03-06 08:34:24 -08:00
Juanky Soriano	dec3750875	Change method to calculate number of tokens for OpenAIChat (#1457 ) Solves https://github.com/hwchase17/langchain/issues/1412 Currently `OpenAIChat` inherits the way it calculates the number of tokens, `get_num_token`, from `BaseLLM`. On the other hand, `OpenAI` inherits from `BaseOpenAI`. `BaseOpenAI` and `BaseLLM` use different methodologies for doing this. The first relies on `tiktoken` while the second relies on `GPT2TokenizerFast`. The motivation of this PR is: 1. Bring consistency to the way the number of tokens is calculated (`get_num_token`) across the `OpenAI` family, regardless of `Chat` vs `non Chat` scenarios. 2. Give preference to the `tiktoken` method as it's serverless friendly. It doesn't require downloading models, a step that can be incompatible with `readonly` filesystems.	2023-03-06 07:20:25 -08:00
Tim Asp	763f879536	fix always verbose on summarization checker (#1440 )	2023-03-05 07:10:08 -08:00
Harrison Chase	56b850648f	cr (#1436 )	2023-03-04 08:38:56 -08:00
Harrison Chase	63a5614d23	Harrison/simple memory (#1435 ) Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>	2023-03-04 08:15:52 -08:00
Harrison Chase	a1b9dfc099	Harrison/similarity search chroma (#1434 ) Co-authored-by: shibuiwilliam <shibuiyusuke@gmail.com>	2023-03-04 08:10:15 -08:00
Peng Qu	68ce68f290	Fix an unusual issue that occurs when using OpenAIChat for llm_math (#1410 ) Fix an issue that occurs when using OpenAIChat for llm_math, following the code style of the "Final Answer:" handling in Mrkl. I found the issue when trying OpenAIChat for llm_math with a question in Chinese: the model generates output in a format like "\n\nQuestion: What is the square of 29?\nAnswer: 841", translating the question first and then answering. Below is my snapshot: <img width="945" alt="snapshot" src="https://user-images.githubusercontent.com/82029664/222642193-10ecca77-db7b-4759-bc46-32a8f8ddc48f.png">	2023-03-04 07:56:07 -08:00
Ikko Eltociear Ashimine	b8a7828d1f	Update huggingface_datasets.ipynb (#1417 ) HuggingFace -> Hugging Face	2023-03-04 00:22:31 -08:00
Kentaro Tanaka	6a4ee07e4f	Fix type hint of 'vectorstore_cls' arg in `SemanticSimilarityExampleSelector` (#1427 ) Hello! Thank you for the amazing library you've created! While following the tutorial at [the link(`Using an example selector`)](https://langchain.readthedocs.io/en/latest/modules/prompts/examples/few_shot_examples.html#using-an-example-selector), I noticed that passing Chroma as an argument to from_examples results in a type hint error. Error message(mypy): ``` Argument 3 to "from_examples" of "SemanticSimilarityExampleSelector" has incompatible type "Type[Chroma]"; expected "VectorStore" [arg-type]mypy(error) ``` This pull request fixes the type hint and allows the VectorStore class to be specified as an argument.	2023-03-04 00:20:18 -08:00
Tim Asp	23231d65a9	Add PyMuPDF PDF loader (#1426 ) Different PDF libraries have different strengths and weaknesses. PyMuPDF does a good job of extracting as much content as possible from the doc, regardless of the source quality, and it is extremely fast (especially compared to Unstructured). https://pymupdf.readthedocs.io/en/latest/index.html	2023-03-03 20:59:28 -08:00
blob42	3d54b05863	searx: add install instructions, update doc and notebooks (#1420 ) - Added instructions on setting up self hosted searx - Add notebook example with agent - Use `localhost:8888` as example url to stay consistent since public instances are not really usable. Co-authored-by: blob42 <spike@w530>	2023-03-03 20:57:50 -08:00
Tim Asp	bca0935d90	[docs] fix minor import error (#1425 )	2023-03-03 16:10:07 -08:00
Jon Luo	882f7964fb	fix sql misinterpretation of % in query (#1408 ) % is being misinterpreted by sqlalchemy as parameter passing, so any `LIKE 'asdf%'` will result in a value error with mysql, mariadb, and maybe some others. This is one way to fix it - the alternative is to simply double up %, like `LIKE 'asdf%%'` but this seemed cleaner in terms of output. Fixes #1383	2023-03-02 16:03:16 -08:00
JonLuca De Caro	443992c4d5	[Docs] Add missing word from prompt docs (#1406 ) The prompt in the first example of the quickstart guide was missing `for `	2023-03-02 16:02:54 -08:00
Eugene Yurtsev	a83a371069	Minor documentation update in initialize_agent (#1397 ) Updating documentation in initialize_agent. One thing that could benefit from further clarification is the breakdown of responsibilities between an AgentExecutor and an Agent. The documentation for an AgentExecutor does not clarify that. From the class attributes, it appears that the executor has access to the tools, while the agent is only aware of the tool names. Anyway, additional clarification would be beneficial on the AgentExecutor class.	2023-03-02 11:46:35 -08:00
Nuno Campos	499e76b199	Allow the regular openai class to be used for ChatGPT models (#1393 ) Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-03-02 09:04:18 -08:00
Kacper Łukawski	8947797250	Return Cohere embeddings as lists of floats (#1394 ) This PR fixes the types returned by Cohere embeddings. Currently, Cohere client returns instances of `cohere.embeddings.Embeddings`. Since the transport layer relies on JSON, some numbers might be represented as ints, not floats, which happens quite often. While that doesn't seem to be an issue, it breaks some pydantic models if they require strict floats.	2023-03-02 09:02:10 -08:00
Jason Gill	1989e7d4c2	Update examples to prevent confusing missing _type warning (#1391 ) The YAML and JSON examples of prompt serialization now give a strange `No '_type' key found, defaulting to 'prompt'` message when you try to run them yourself or copy the format of the files. The reason for this harmless warning is that the _type key was not in the config files, which means they are parsed as a standard prompt. This could be confusing to new users (like it was confusing to me after upgrading from 0.0.85 to 0.0.86+ for my few_shot prompts that needed a _type added to the example_prompt config), so this update includes the _type key just for clarity. Obviously this is not critical as the warning is harmless, but it could be confusing to track down or be interpreted as an error by a new user, so this update should resolve that.	2023-03-02 07:39:57 -08:00
Harrison Chase	dda5259f68	bump version to 0.0.99 (#1390 )	2023-03-02 07:25:59 -08:00
Kacper Łukawski	f032609f8d	Add `recursive` parameter to `DirectoryLoader` (#1389 ) This PR allows loading a directory recursively.	2023-03-02 07:06:26 -08:00
Kacper Łukawski	9ac442624c	Add Qdrant named arguments (#1386 ) This PR: - Increases `qdrant-client` version to 1.0.4 - Introduces custom content and metadata keys (as requested in #1087) - Moves all the `QdrantClient` parameters into the method parameters to simplify code completion	2023-03-02 07:05:14 -08:00
Francisco Ingham	34abcd31b9	remove limit clause from prompt for compatibility with ms sql server (#1385 ) For reference see: `8a35811556` Co-authored-by: Francisco Ingham <>	2023-03-02 07:02:42 -08:00
Ankush Gola	fe30be6fba	add async and streaming support to `OpenAIChat` (#1378 ) title says it all	2023-03-01 21:55:43 -08:00
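
Commit 784d24a1d5 above describes creating parent directories from the object key before saving a downloaded S3 object into a temporary directory. The following is a minimal sketch of that idea using only the standard library. It is not the actual `S3FileLoader` code: the helper name `local_path_for_key` is hypothetical, and no S3 download is performed (a plain file write stands in for it).

```python
import os
import tempfile


def local_path_for_key(base_dir: str, key: str) -> str:
    """Return a path under base_dir for an S3-style key, creating parent directories.

    Hypothetical helper for illustration; not part of langchain.
    """
    file_path = os.path.join(base_dir, key)
    # A key such as "folder/some-document.txt" implies a parent directory that
    # does not exist yet inside a fresh temporary directory, so create it first.
    # exist_ok=True also covers keys without "/" (the parent is base_dir itself).
    os.makedirs(os.path.dirname(file_path), exist_ok=True)
    return file_path


with tempfile.TemporaryDirectory() as temp_dir:
    # Deeply nested keys work the same way as single-level ones.
    path = local_path_for_key(temp_dir, "folder/subfolder/some-document.txt")
    # Stand-in for the download step: writing succeeds because the parents exist.
    with open(path, "w") as f:
        f.write("example contents")
    print(os.path.isfile(path))
```

Without the `os.makedirs` call, opening the nested path for writing would raise `FileNotFoundError`, which matches the failure mode the commit message describes.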


964 Commits