Fixed PermissionError on Windows (#6170)

Fixed a PermissionError that occurred on Windows when BasePDFLoader
downloaded PDF files over HTTP.

When downloading PDF files over HTTP, BasePDFLoader wrote the content to a
NamedTemporaryFile.
On **Windows**, a file created this way cannot be opened a second time by
name while it is still open, so reading it back by path raised
PermissionError. [Python
Doc](https://docs.python.org/3.9/library/tempfile.html#tempfile.NamedTemporaryFile)
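The platform difference can be checked with a small probe. This is a minimal sketch and not part of the commit; the helper name `can_reopen_by_name` and the placeholder bytes are illustrative assumptions.

```python
import tempfile


def can_reopen_by_name() -> bool:
    """Return True if a still-open NamedTemporaryFile can be opened again by name."""
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"%PDF-1.4 placeholder bytes")
        tmp.flush()
        try:
            # Second open by path while the first handle is still open.
            with open(tmp.name, "rb") as again:
                again.read()
            return True
        except PermissionError:
            # The failure mode this commit addresses on Windows.
            return False


reopen_ok = can_reopen_by_name()
```

Per the linked documentation, the second open is expected to succeed on Unix and to fail on Windows.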

So, we created a **temporary directory** with TemporaryDirectory and
placed the downloaded file there.
The temporary directory is deleted in the destructor (`__del__`).
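The same pattern can be tried outside the loader. This is a minimal standalone sketch, assuming the bytes are already in memory; the helper name `save_to_temp_dir` and the placeholder content are hypothetical, and the explicit `cleanup()` call stands in for what the loader does in `__del__`.

```python
import tempfile
from pathlib import Path


def save_to_temp_dir(content: bytes, filename: str = "tmp.pdf"):
    """Write content to a file inside a fresh temporary directory.

    Returns the TemporaryDirectory object (the caller must keep it alive and
    clean it up) together with the file path as a string.
    """
    temp_dir = tempfile.TemporaryDirectory()
    temp_pdf = Path(temp_dir.name) / filename
    with open(temp_pdf, mode="wb") as f:
        f.write(content)
    # The file handle is closed here, so the path can be reopened on any platform.
    return temp_dir, str(temp_pdf)


temp_dir, file_path = save_to_temp_dir(b"%PDF-1.4 placeholder bytes")
with open(file_path, "rb") as f:
    data = f.read()

# Removes the directory and the file inside it.
temp_dir.cleanup()
```

Because the file is an ordinary closed file rather than an open NamedTemporaryFile handle, reopening it by path does not depend on platform-specific sharing rules.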

Fixes #2698

#### Who can review?

Tag maintainers/contributors who might be interested:

  - @eyurtsev
  - @hwchase17
MIDORIBIN 2023-06-19 08:39:57 +09:00 committed by GitHub
parent 4fc7939848
commit 5be465bd86

@@ -62,15 +62,17 @@ class BasePDFLoader(BaseLoader, ABC):
                 )
             self.web_path = self.file_path
-            self.temp_file = tempfile.NamedTemporaryFile()
-            self.temp_file.write(r.content)
-            self.file_path = self.temp_file.name
+            self.temp_dir = tempfile.TemporaryDirectory()
+            temp_pdf = Path(self.temp_dir.name) / "tmp.pdf"
+            with open(temp_pdf, mode="wb") as f:
+                f.write(r.content)
+            self.file_path = str(temp_pdf)
         elif not os.path.isfile(self.file_path):
             raise ValueError("File path %s is not a valid file or url" % self.file_path)
     def __del__(self) -> None:
-        if hasattr(self, "temp_file"):
-            self.temp_file.close()
+        if hasattr(self, "temp_dir"):
+            self.temp_dir.cleanup()
     @staticmethod
     def _is_valid_url(url: str) -> bool: