37 Commits (558ecd84a675ca3cf7ceb6f744401b0bd9e1be8b)

Author SHA1 Message Date
ManishMadan2882 9000838aab (feat:vectors): calc, add token in db 3 months ago
Siddhant Rai 53e86205ad fix: added more headers from default 4 months ago
Siddhant Rai aa670efe3a fix: connection aborted in WebBaseLoader 4 months ago
Alex 8d7a134cb4 lint: ruff 5 months ago
Siddhant Rai e01071426f feat: field to pass number of posts as a parameter 5 months ago
Siddhant Rai eed1bfbe50 feat: fields to handle reddit loader + minor changes 5 months ago
Siddhant Rai 60cfea1126 feat: added reddit loader 6 months ago
Alex 4a701cb993 Merge branch 'main' into feature/remote-loads 6 months ago
Pavel 54d187a0ad Fixing ingestion metadata grouping 6 months ago
Pavel c8d8a8d0b5 Fixing ingestion metadata grouping 6 months ago
Alex 0cb3d12d94 Refactor loader classes to accept inputs directly 7 months ago
Alex 2e14dec12d Merge pull request #849 from arc53/main: Sync 7 months ago
Anton Larin 9e04b7796a application folder related changes: optimize content of requirements.txt; upgrade libs; fix imports 7 months ago
Anton Larin e8099c4db5 script folder related changes: optimize content of requirements.txt; upgrade libs; fix imports 7 months ago
Exterminator11 f3540aac0f Changed import 11 months ago
Exterminator11 889ce984a9 Made changes 11 months ago
Pavel 381a2740ee change input 11 months ago
Pavel 024674eef3 List check 11 months ago
Pavel b7d88b4c0f fix wrong link 11 months ago
Pavel 719ca63ec1 fixes 11 months ago
Pavel 2cfb416fd0 Desc loader 11 months ago
Pavel 50f07f9ef5 limit crawler 11 months ago
Pavel c517bdd2e1 Crawler + sitemap 11 months ago
Pavel 658867cb46 No crawler, no sitemap 11 months ago
Alex 8f2ad38503 tests 11 months ago
John Bampton 32ea0213f7 Remove unneeded duplicate words 11 months ago
John Bampton 2c6ab18e41 Fix spelling 11 months ago
Alex 347cfe253f elastic2 11 months ago
Alex 783e7f6939 working es 11 months ago
Anton Larin 98a97f34f5 fix packaging and imports and introduce tests with pytest. still issues with celery worker. 1 year ago
Anton Larin bed25b317c Fix min_tokens logic for grouping documents: documents with (length >= min_tokens) should not be grouped into one document for indexing 1 year ago
Alex a64a30c088 fix 1 year ago
Alex dac76a867f fix tokens for header 1 year ago
Anton Larin 962becb9a5 Linting: validate python formatting on every build with Ruff; fix lint warnings 1 year ago
Anton Larin 168648e789 Proper PEP8 formatting 1 year ago
Alex 8e477c9d16 update worker 1 year ago
Alex 1d2162705d uploads backend first 2 years ago