Make the docs more explicit about when to create an UnstructuredClient
instance (to customize the client) versus passing SDK parameters
directly to the UnstructuredLoader.
Also bump the unstructured-client dependency, as discussed
[here](https://github.com/langchain-ai/langchain/discussions/25328#discussioncomment-10350949).
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
"Below is an example showing how you can customize some features of the client and use your own `requests.Session()`, pass in an alternative `server_url`, or customize the `RetryConfig` object for more control over how failed requests are handled.\n",
"Note that the example below may not use the latest version of the UnstructuredClient and there could be breaking changes in future releases. For the latest examples, refer to the [Unstructured Python SDK](https://docs.unstructured.io/api-reference/api-services/sdk-python) docs."
"If you want to customize the client, you will have to pass an `UnstructuredClient` instance to the `UnstructuredLoader`. Below is an example showing how you can customize features of the client such as using your own `requests.Session()`, passing an alternative `server_url`, and customizing the `RetryConfig` object. For more information about customizing the client or what additional parameters the sdk client accepts, refer to the [Unstructured Python SDK](https://docs.unstructured.io/api-reference/api-services/sdk-python) docs and the client section of the [API Parameters](https://docs.unstructured.io/api-reference/api-services/api-parameters) docs. Note that all API Parameters should be passed to the `UnstructuredLoader`."
]
},
{
"cell_type": "markdown",
"id": "ebb69c85",
"metadata": {},
"source": [
"<div class=\"alert alert-block alert-warning\"><b>Warning:</b> The example below may not use the latest version of the UnstructuredClient and there could be breaking changes in future releases. For the latest examples, refer to the <a href=\"https://docs.unstructured.io/api-reference/api-services/sdk-python\">Unstructured Python SDK</a> docs.</div>"
]
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 10,
"id": "58e55264",
"metadata": {},
"outputs": [
@@ -394,13 +399,15 @@
"text": [
"INFO: Preparing to split document for partition.\n",
"INFO: Concurrency level set to 5\n",
"INFO: Splitting pages 1 to 16 (16 total)\n",
"INFO: Determined optimal split size of 4 pages.\n",
"INFO: Partitioning 4 files with 4 page(s) each.\n",
"INFO: Partitioning set #1 (pages 1-4).\n",
"INFO: Partitioning set #2 (pages 5-8).\n",
"INFO: Partitioning set #3 (pages 9-12).\n",
"INFO: Partitioning set #4 (pages 13-16).\n",
"INFO: Splitting pages 1 to 10 (10 total)\n",
"INFO: Determined optimal split size of 2 pages.\n",
"INFO: Partitioning 5 files with 2 page(s) each.\n",
"INFO: Partitioning set #1 (pages 1-2).\n",
"INFO: Partitioning set #2 (pages 3-4).\n",
"INFO: Partitioning set #3 (pages 5-6).\n",
"INFO: Partitioning set #4 (pages 7-8).\n",
"INFO: Partitioning set #5 (pages 9-10).\n",
"INFO: HTTP Request: POST https://api.unstructuredapp.io/general/v0/general \"HTTP/1.1 200 OK\"\n",
"INFO: HTTP Request: POST https://api.unstructuredapp.io/general/v0/general \"HTTP/1.1 200 OK\"\n",
"INFO: HTTP Request: POST https://api.unstructuredapp.io/general/v0/general \"HTTP/1.1 200 OK\"\n",
"INFO: HTTP Request: POST https://api.unstructuredapp.io/general/v0/general \"HTTP/1.1 200 OK\"\n",
@@ -408,6 +415,7 @@
"INFO: Successfully partitioned set #2, elements added to the final result.\n",
"INFO: Successfully partitioned set #3, elements added to the final result.\n",
"INFO: Successfully partitioned set #4, elements added to the final result.\n",
"INFO: Successfully partitioned set #5, elements added to the final result.\n",
"INFO: Successfully partitioned the document.\n"
]
},
@@ -429,8 +437,8 @@
" api_key_auth=os.getenv(\n",
" \"UNSTRUCTURED_API_KEY\"\n",
" ), # Note: the client API param is \"api_key_auth\" instead of \"api_key\"\n",