fix: bypass hf cache retrieval bug

If `hf_hub_download` is only ever called with a specific commit hash as `revision`, the `refs` folder is never created, even though the data is cached under `snapshots` and `blobs`. Subsequent calls to `try_to_load_from_cache` then return `None` even though the requested file is in the cache.

Example:

```python
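from huggingface_hub import hf_hub_download, try_to_load_from_cache

# `repo`, `commit_hash`, `filepath`, and `token` are placeholders defined elsewhere
# download a file pinned to a specific commit hash
hf_hub_download(repo_id=repo, revision=commit_hash, filename=filepath, token=token)
# returns None even though the file is now in the cache
try_to_load_from_cache(repo_id=repo, revision=commit_hash, filename=filepath)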
```

https://github.com/huggingface/huggingface_hub/pull/1306
Bryce 2023-01-23 20:16:47 -08:00 committed by Bryce Drennan
parent 3a7c861f0a
commit 9212f0227c


@@ -346,6 +346,12 @@ def huggingface_cached_path(url):
dest_path = hf_hub_download(
repo_id=repo, revision=commit_hash, filename=filepath, token=token
)
# make a refs folder so caching works
# work-around for
# https://github.com/huggingface/huggingface_hub/pull/1306
# https://github.com/brycedrennan/imaginAIry/issues/171
refs_url = dest_path[: dest_path.index("/snapshots/")] + "/refs/"
os.makedirs(refs_url, exist_ok=True)
return dest_path
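
The work-around in the hunk above can also be sketched as a standalone helper. This is only an illustration and not part of the commit: the name `ensure_refs_dir` is hypothetical, and it assumes the path returned by `hf_hub_download` contains a `snapshots` directory component, as the diff does. It splits the path with `pathlib` instead of searching for the literal string `"/snapshots/"`, which may behave more predictably on platforms that use a different path separator.

```python
import tempfile
from pathlib import Path


def ensure_refs_dir(cached_file_path: str) -> Path:
    """Create the `refs` directory next to `snapshots` for a cached file.

    Hypothetical helper: assumes `cached_file_path` contains a
    `snapshots` directory component.
    """
    parts = Path(cached_file_path).parts
    if "snapshots" not in parts:
        raise ValueError(f"no 'snapshots' component in {cached_file_path!r}")
    # The repo cache folder is everything before the first `snapshots` component.
    repo_cache_dir = Path(*parts[: parts.index("snapshots")])
    refs_dir = repo_cache_dir / "refs"
    # exist_ok=True makes repeated calls harmless.
    refs_dir.mkdir(parents=True, exist_ok=True)
    return refs_dir


if __name__ == "__main__":
    # Simulate a cache layout in a temporary directory (no network access).
    with tempfile.TemporaryDirectory() as cache_root:
        fake_file = Path(
            cache_root, "models--org--name", "snapshots", "abc123", "config.json"
        )
        fake_file.parent.mkdir(parents=True)
        fake_file.write_text("{}")
        print(ensure_refs_dir(str(fake_file)).is_dir())
```

The demo uses a made-up directory layout (`models--org--name`, `abc123`) purely to exercise the helper; in the commit itself the input would be the `dest_path` returned by `hf_hub_download`.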