mirror of
https://github.com/hwchase17/langchain
synced 2024-10-29 17:07:25 +00:00
803d0d9656
- Description: this PR adds the ability to configure boto3 in the S3 loaders. Any named argument you add is used to create the Boto3 session. This is useful when the AWS credentials can't be passed as environment variables or read from the credentials file.
- Issue: N/A
- Dependencies: N/A
- Tag maintainer: ?
- Twitter handle: cbornet_

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
122 lines
2.9 KiB
Plaintext
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "66a7777e",
   "metadata": {},
   "source": [
    "# AWS S3 File\n",
    "\n",
    ">[Amazon Simple Storage Service (Amazon S3)](https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-folders.html) is an object storage service.\n",
    "\n",
    ">[AWS S3 Buckets](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html)\n",
    "\n",
    "This covers how to load document objects from an `AWS S3 File` object."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "9ec8a3b3",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.document_loaders import S3FileLoader"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "43128d8d",
   "metadata": {},
   "outputs": [],
   "source": [
    "#!pip install boto3"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "35d6809a",
   "metadata": {},
   "outputs": [],
   "source": [
    "loader = S3FileLoader(\"testing-hwc\", \"fake.docx\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "efd6be84",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[Document(page_content='Lorem ipsum dolor sit amet.', lookup_str='', metadata={'source': 's3://testing-hwc/fake.docx'}, lookup_index=0)]"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "loader.load()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "93689594",
   "metadata": {},
   "source": [
    "## Configuring the AWS Boto3 client\n",
    "You can configure the AWS [Boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html) client by passing\n",
    "named arguments when creating the `S3FileLoader`.\n",
    "This is useful, for instance, when AWS credentials can't be set as environment variables.\n",
    "See the [list of parameters](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html#boto3.session.Session) that can be configured."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "outputs": [],
   "source": [
    "loader = S3FileLoader(\"testing-hwc\", \"fake.docx\", aws_access_key_id=\"xxxx\", aws_secret_access_key=\"yyyy\")"
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "outputs": [],
   "source": [
    "loader.load()"
   ],
   "metadata": {}
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
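
The "Configuring the AWS Boto3 client" cell in the notebook above passes credentials as named arguments. A minimal sketch of how those arguments might be assembled is given here. The helper `build_session_kwargs` is hypothetical and not part of LangChain or boto3; it only uses keyword names that `boto3.session.Session` documents (`aws_access_key_id`, `aws_secret_access_key`, `aws_session_token`, `region_name`). The loader call is left as a comment because it assumes `langchain` and `boto3` are installed and that the bucket exists.

```python
from typing import Dict, Optional


def build_session_kwargs(
    aws_access_key_id: Optional[str] = None,
    aws_secret_access_key: Optional[str] = None,
    aws_session_token: Optional[str] = None,
    region_name: Optional[str] = None,
) -> Dict[str, str]:
    """Hypothetical helper: keep only the Boto3 session arguments that were set.

    Unset values are dropped so that boto3 can still fall back to its
    default lookup (environment variables, credentials file) for them.
    """
    candidates = {
        "aws_access_key_id": aws_access_key_id,
        "aws_secret_access_key": aws_secret_access_key,
        "aws_session_token": aws_session_token,
        "region_name": region_name,
    }
    return {name: value for name, value in candidates.items() if value is not None}


# Placeholder credentials, as in the notebook; never hard-code real secrets.
session_kwargs = build_session_kwargs(
    aws_access_key_id="xxxx",
    aws_secret_access_key="yyyy",
)

# Assumed usage, mirroring the notebook cell (requires langchain and boto3):
# from langchain.document_loaders import S3FileLoader
# loader = S3FileLoader("testing-hwc", "fake.docx", **session_kwargs)
# docs = loader.load()
```

Dropping unset values rather than passing `None` keeps the call equivalent to the plain `S3FileLoader("testing-hwc", "fake.docx")` form when no overrides are supplied.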