langchain/docs/extras/integrations/document_loaders/dropbox.ipynb

150 lines
79 KiB
Plaintext
Raw Normal View History

{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Dropbox\n",
"\n",
"[Drobpox](https://en.wikipedia.org/wiki/Dropbox) is a file hosting service that brings everything-traditional files, cloud content, and web shortcuts together in one place.\n",
"\n",
"This notebook covers how to load documents from *Dropbox*. In addition to common files such as text and PDF files, it also supports *Dropbox Paper* files.\n",
"\n",
"## Prerequisites\n",
"\n",
"1. Create a Dropbox app.\n",
"2. Give the app these scope permissions: `files.metadata.read` and `files.content.read`.\n",
"3. Generate access token: https://www.dropbox.com/developers/apps/create.\n",
"4. `pip install dropbox` (requires `pip install unstructured` for PDF filetype).\n",
"\n",
"## Intructions\n",
"\n",
"`DropboxLoader`` requires you to create a Dropbox App and generate an access token. This can be done from https://www.dropbox.com/developers/apps/create. You also need to have the Dropbox Python SDK installed (pip install dropbox).\n",
"\n",
"DropboxLoader can load data from a list of Dropbox file paths or a single Dropbox folder path. Both paths should be relative to the root directory of the Dropbox account linked to the access token."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: dropbox in /Users/rbarragan/.local/share/virtualenvs/langchain-kv0dsrF5/lib/python3.11/site-packages (11.36.2)\n",
"Requirement already satisfied: requests>=2.16.2 in /Users/rbarragan/.local/share/virtualenvs/langchain-kv0dsrF5/lib/python3.11/site-packages (from dropbox) (2.31.0)\n",
"Requirement already satisfied: six>=1.12.0 in /Users/rbarragan/.local/share/virtualenvs/langchain-kv0dsrF5/lib/python3.11/site-packages (from dropbox) (1.16.0)\n",
"Requirement already satisfied: stone>=2 in /Users/rbarragan/.local/share/virtualenvs/langchain-kv0dsrF5/lib/python3.11/site-packages (from dropbox) (3.3.1)\n",
"Requirement already satisfied: charset-normalizer<4,>=2 in /Users/rbarragan/.local/share/virtualenvs/langchain-kv0dsrF5/lib/python3.11/site-packages (from requests>=2.16.2->dropbox) (3.2.0)\n",
"Requirement already satisfied: idna<4,>=2.5 in /Users/rbarragan/.local/share/virtualenvs/langchain-kv0dsrF5/lib/python3.11/site-packages (from requests>=2.16.2->dropbox) (3.4)\n",
"Requirement already satisfied: urllib3<3,>=1.21.1 in /Users/rbarragan/.local/share/virtualenvs/langchain-kv0dsrF5/lib/python3.11/site-packages (from requests>=2.16.2->dropbox) (2.0.4)\n",
"Requirement already satisfied: certifi>=2017.4.17 in /Users/rbarragan/.local/share/virtualenvs/langchain-kv0dsrF5/lib/python3.11/site-packages (from requests>=2.16.2->dropbox) (2023.7.22)\n",
"Requirement already satisfied: ply>=3.4 in /Users/rbarragan/.local/share/virtualenvs/langchain-kv0dsrF5/lib/python3.11/site-packages (from stone>=2->dropbox) (3.11)\n",
"Note: you may need to restart the kernel to use updated packages.\n"
]
}
],
"source": [
"pip install dropbox"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import DropboxLoader"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"# Generate access token: https://www.dropbox.com/developers/apps/create.\n",
"dropbox_access_token = \"<DROPBOX_ACCESS_TOKEN>\"\n",
"# Dropbox root folder\n",
"dropbox_folder_path = \"\""
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"loader = DropboxLoader(\n",
" dropbox_access_token=dropbox_access_token,\n",
" dropbox_folder_path=dropbox_folder_path,\n",
" recursive=False\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"File /JHSfLKn0.jpeg could not be decoded as text. Skipping.\n",
"File /A REPORT ON WILES CAMBRIDGE LECTURES.pdf could not be decoded as text. Skipping.\n"
]
}
],
"source": [
"documents = loader.load()"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"page_content='# 🎉 Getting Started with Dropbox Paper\\nDropbox Paper is great for capturing ideas and gathering quick feedback from your team. You can use words, images, code, or media from other apps, or go ahead and connect your calendar and add to-dos for projects.\\n\\n*Explore and edit this doc to play with some of these features. This doc is all yours. No one will see your edits unless you share this doc.*\\n\\n\\n# The basics\\n\\n**Selecting text** activates the formatting toolbar, where you can apply basic formatting, create lists, and add comments.\\n\\n[ ] Create to-do lists\\n- Bulleted lists\\n1. Numbered lists\\n\\n**Starting a new line** activates the insert toolbar, where you can add media from other apps, links to Dropbox files, photos, and more.\\n\\n![](https://paper-attachments.dropbox.com/s_72143DBFDAF4C9DE702BB246920BC47FE7E1FA76AC23CC699374430D94E96DD2_1523574441249_paper-insert.png)\\n\\n\\n\\n**Add emojis** to your doc or comment by typing `**:**` ****and choosing a character. \\n\\n# 👍 👎 👏 ✅ ❌ ❤️ ⭐ 💡 📌\\n\\n\\n# Images\\n\\n**Selecting images** activates the image toolbar, where you can align images left, center, right or expand them to full width.\\n\\n![](https://paper-attachments.dropbox.com/s_72143DBFDAF4C9DE702BB246920BC47FE7E1FA76AC23CC699374430D94E96DD2_1523473869783_Hot_Sauce.jpg)\\n\\n\\nPaste images or gifs right next to each other and they\\'ll organize automatically. Click on an image twice to start full-screen gallery view.\\n\\n\\n![](https://paper-attachments.dropbox.com/s_72143DBFDAF4C9DE702BB246920BC47FE7E1FA76AC23CC699374430D94E96DD2_1523564536543_Clock_Melt.png)\\n![](https://paper-attachments.dropbox.com/s_72143DBFDAF4C9DE702BB246920BC47FE7E1FA76AC23CC699374430D94E96DD2_1523564528339_Boom_Box_Melt.png)\\n![](https://paper-attachments.dropbox.com/s_72143DBFDAF4C9DE702BB246920BC47FE7E1FA76AC23CC699374430D94E96DD2_1523564549819_Soccerball_Melt.png)\\n\\n![You can add captions too](https://paper-attachments.dropbox.com/s_72143DBFDAF4C9DE702BB246920BC47FE7E1FA76AC23CC699374430D94E96DD2_1523564518899_Cacti_Melt.png)\\n![What a strange, melting toaster!](https://paper-attachments.dropbox.com/s_72143DBFDAF4C9DE702BB246920BC47FE7E1FA76AC23CC699374430D94E96DD2_1523564508553_Toaster_Melt.png)\\n\\n\\n \\n\\n\\n# Form meets function\\n\\nYou and your team can create the way you want, with what you want. Dropbox Paper adapts to the way your team captures ideas.\\n\\n**Add media from apps** like YouTube and Vimeo, or add audio from Spotify and SoundCloud. Files from Google Drive and Dropbox update automatically. Start a new line and choose add media, or drop in a link to try it out.\\n\\n\\n![](https://paper-attachments.dropbox.com/s_72143DBFDAF4C9DE702BB246920BC47FE7E1FA76AC23CC699374430D94E96DD2_1523575138939_paper-embed.png)\\n\\n\\n\\n## YouTube\\nhttps://www.youtube.com/watch?v=fmsq1uKOa08&\\n\\n\\n[https://youtu.be/fmsq1uKOa08](https://youtu.be/fmsq1uKOa08)\\n\\n\\n\\n## SoundCloud\\nhttps://w.soundcloud.com/player/?url=https%3A%2F%2Fsoundcloud.com%2Ftycho%2Fspoon-inside-out-tycho-version&autoplay=false\\n\\n\\n[https://soundcloud.com/tycho/spoon-inside-out-tycho-version](https://soundcloud.com/tycho/spoon-inside-out-tycho-version) \\n\\n\\n\\n## Dropbox files\\nhttps://www.dropbox.com/s/bgi58tkovntch5e/Wireframe%20render.pdf?dl=0\\n\\n\\n\\n\\n## Code\\n\\n**Write code** in Dropbox Paper with automatic language detection and syntax highlighting. Start a new line and type three backticks (```).\\n\\n\\n public class HelloWorld { \\n public static void main(String[] args) { \\n System.out.println(\"Hello, World\");\\n }\\n }\\n\\n\\n\\n## Tables\\n\\n**Create a table** with the menu that shows up on the right when you start a new line.\\n\\n| To insert a row or column, hover over a dividing line and click the + | ⭐ |\\n| ------------------------------------------------------------------------------------------------------- | ----- |\\n| To delete, select rows/columns and click the tr
"page_content='# 🥂 Toast to Droplets\\n❓ **Rationale:** Reflection, especially writing, is the key to deep learning! Lets take a few minutes to reflect on your first day at Dropbox individually, and then one lucky person will have the chance to share their toast.\\n\\n✍ **How to fill out this template:**\\n\\n- Option 1: You can sign in and then click “Create doc” to make a copy of this template. Fill in the blanks!\\n- Option 2: If you dont know your personal Dropbox login quickly, you can copy and paste this text into another word processing tool and start typing! \\n\\n\\n\\n## To my Droplet class:\\n\\nI feel so happy and excited to be making a toast to our newest Droplet class at Dropbox Basecamp.\\n\\nAt the beginning of our first day, I felt a bit underwhelmed with all information, and now, at the end of our first day at Dropbox, I feel I know enough for me to ramp up, but still a lot to learn**.**\\n\\nI cant wait to explore every drl, but especially drl/(App Center)/benefits/allowance. I heard its so informative!\\n\\nDesigning an enlightened way of working is important, and to me, it means **a lot since I love what I do and I can help people around the globe**.\\n\\nI am excited to work with my team and flex my **technical and social** skills in my role as a **Software Engineer**.\\n\\nAs a Droplet, I pledge to:\\n\\n\\n1. Be worthy of trust by **working always with values and integrity**.\\n\\n\\n1. Keep my customers first by **caring about their happiness and the value that we provide as a company**.\\n\\n\\n1. Own it, keep it simple, and especially make work human by **providing value to people****.**\\n\\nCongrats, Droplets!\\n\\n' metadata={'source': 'dropbox:///_ Toast to Droplets.paper', 'title': '_ Toast to Droplets.paper'}\n",
"page_content='APPEARED IN BULLETIN OF THE AMERICAN MATHEMATICAL SOCIETY Volume 31, Number 1, July 1994, Pages 15-38\\n\\nA REPORT ON WILES CAMBRIDGE LECTURES\\n\\n4 9 9 1\\n\\nK. RUBIN AND A. SILVERBERG\\n\\nl u J\\n\\nAbstract. In lectures at the Newton Institute in June of 1993, Andrew Wiles announced a proof of a large part of the Taniyama-Shimura Conjecture and, as a consequence, Fermats Last Theorem. This report for nonexperts dis- cusses the mathematics involved in Wiles lectures, including the necessary background and the mathematical history.\\n\\n1\\n\\n] T N . h t a m\\n\\nIntroduction\\n\\nOn June 23, 1993, Andrew Wiles wrote on a blackboard, before an audience at the Newton Institute in Cambridge, England, that if p is a prime number, u, v, and w are rational numbers, and up + vp + wp = 0, then uvw = 0. In other words, he announced that he could prove Fermats Last Theorem. His announce- ment came at the end of his series of three talks entitled “Modular forms, elliptic curves, and Galois representations” at the week-long workshop on “p-adic Galois representations, Iwasawa theory, and the Tamagawa numbers of motives”.\\n\\n[\\n\\n1 v 0 2 2 7 0 4 9 / h t a m : v i X r a\\n\\nIn the margin of his copy of the works of Diophantus, next to a problem on\\n\\nPythagorean triples, Pierre de Fermat (16011665) wrote:\\n\\nCubum autem in duos cubos, aut quadratoquadratum in duos quadrato- quadratos, et generaliter nullam in infinitum ultra quadratum potestatem in duos ejusdem nominis fas est dividere : cujus rei demonstrationem mirabilem sane detexi. Hanc marginis exiguitas non caperet.\\n\\n(It is impossible to separate a cube into two cubes, or a fourth power into two fourth powers, or in general, any power higher than the second into two like powers. I have discovered a truly marvelous proof of this, which this margin is too narrow to contain.)\\n\\nWe restate Fermats conjecture as follows.\\n\\nFermats Last Theorem. If n > 2, then an +bn = cn has no solutions in nonzero integers a, b, and c.\\n\\nA proof by Fermat has never been found, and the problem has remained open, inspiring many generations of mathematicians. Much of modern number theory has been built on attempts to prove Fermats Last Theorem. For details on the\\n\\nReceived by the editors November 29, 1993. 1991 Mathematics Subject Classification. Primary 11G05; Secondary 11D41, 11G18. The authors thank the National Science Foundation for financial support.\\n\\nc(cid:13)1994 American Mathematical Society 0273-0979/94 $1.00 + $.25 per page\\n\\n1\\n\\n2\\n\\nK. RUBIN AND A. SILVERBERG\\n\\nhistory of Fermats Last Theorem (last because it is the last of Fermats questions to be answered) see [5], [6], and [26].\\n\\nWhat Andrew Wiles announced in Cambridge was that he could prove “many” elliptic curves are modular, sufficiently many to imply Fermats Last Theorem. In this paper we will explain Wiles work on elliptic curves and its connection with 1 we introduce elliptic curves and modularity, and Fermats Last Theorem. give the connection between Fermats Last Theorem and the Taniyama-Shimura Conjecture on the modularity of elliptic curves. In 2 we describe how Wiles re- duces the proof of the Taniyama-Shimura Conjecture to what we call the Modular Lifting Conjecture (which can be viewed as a weak form of the Taniyama-Shimura Conjecture), by using a theorem of Langlands and Tunnell. In 4 we show § how the Semistable Modular Lifting Conjecture is related to a conjecture of Mazur on deformations of Galois representations (Conjecture 4.2), and in 5 we describe Wiles method of attack on this conjecture. In order to make this survey as acces- sible as possible to nonspecialists, the more technical details are postponed as long as possible, some of them to the appendices.\\n\\nIn\\n\\n§\\n\\n§\\n\\n3 and §\\n\\n§\\n\\nMuch of this report is based on Wiles lectures in Cambridge. The authors apol- ogize for any errors we may have introduced. We also apologize to those whose mathematical contribut
"page_content='This is text file' metadata={'source': 'dropbox:///test.txt', 'title': 'test.txt'}\n"
]
}
],
"source": [
"for document in documents:\n",
" print(document)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "langchain",
"language": "python",
"name": "langchain"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}