mirror of https://github.com/WikiTeam/wikiteam synced 2024-11-12 07:12:41 +00:00
Commit Graph

687 Commits

Author SHA1 Message Date
Marek Šuppa
e370257aeb tests: Updated Index endpoint for WikiPapers
* Updated the index endpoint for WikiPapers on Referata, which was previously (http://wikipapers.referata.com/w/index.php) and now resolves to (http://wikipapers.referata.com/index.php).
2014-12-08 06:49:03 +01:00
Marek Šuppa
7b9ca8aa6b tests: Updated API endpoint for WikiPapers
* Updated API endpoint for WikiPapers on Referata. It used to be (http://wikipapers.referata.com/w/api.php); it now resolves to (http://wikipapers.referata.com/api.php). This was breaking the tests.
2014-12-08 06:37:29 +01:00
Federico Leva
e26711afc9 Merge branch 'master' of github.com:WikiTeam/wikiteam 2014-12-05 15:01:32 +01:00
Federico Leva
ed2d87418c Update with some wikis done in the last batch 2014-12-05 15:00:43 +01:00
Emilio J. Rodríguez-Posada
43cda4ec01 excluding wiki-site.com farm too 2014-12-03 11:39:53 +01:00
Emilio J. Rodríguez-Posada
7463b16b36 Merge branch 'master' of https://github.com/WikiTeam/wikiteam 2014-11-27 20:12:50 +01:00
Emilio J. Rodríguez-Posada
9681fdfd14 linking to GitHub 2014-11-27 20:12:27 +01:00
Marek Šuppa
b003cf94e2 tests: Disable broken Wiki
* Disabled http://wiki.greenmuseum.org/ since it's broken and was breaking the tests `'Unknown' != 'PhpWiki'`
2014-11-26 23:33:44 +01:00
Emilio J. Rodríguez-Posada
8d4def5885 improving duplicate filter, removing www. www1., etc; excluding editthis.info 2014-11-26 17:32:08 +01:00
Emilio J. Rodríguez-Posada
9ca67fa4d3 not archived wikis script 2014-11-26 16:34:14 +01:00
Federico Leva
8cf4d4e6ea Add 30k domains from another crawler
11011 were found alive by checkalive.py (though there could be more
if one checks more subdomains and subdirectories), some thousands
more by checklive.pl (but mostly or all false positives).

Of the alive ones, about 6245 were new to WikiApiary!
https://wikiapiary.com/wiki/Category:Oct_2014_Import
2014-11-01 22:23:25 +01:00
Federico Leva
7e0071ae7f Add some UseModWiki-looking domains 2014-11-01 22:03:01 +01:00
nemobis
6b11cef9dc A few thousands more doku.php URLs from own scraping 2014-10-29 19:02:06 +01:00
nemobis
0624d0303b Merge pull request #198 from Southparkfan/patch-1
Update list of Orain wikis
2014-10-08 19:45:49 +02:00
Southparkfan
8ca9eb8757 Update date of Orain wikilist 2014-10-08 19:11:05 +02:00
Southparkfan
2e2fe9b818 Update list of Orain wikis 2014-10-08 19:10:27 +02:00
Marek Šuppa
8c44cff165 readme: Small wording fixes
* Small fixes in `Download Wikimedia dumps` section.
2014-10-04 12:02:48 +02:00
nemobis
6f74781e78 Merge pull request #197 from mrshu/mrshu/autopep8fied-wikiadownloader
wikiadownloader: Autopep8fied
2014-10-03 23:27:28 +02:00
mr.Shu
f022b02e47 wikiadownloader: Autopep8fied
* Made the source look a bit better, though this script might not be
  used anymore.

Signed-off-by: mr.Shu <mr@shu.io>
2014-10-02 23:06:42 +02:00
nemobis
b3ef165529 Merge pull request #194 from mrshu/mrshu/dumpgenerator-pep8fied
dumpgenerator: AutoPEP8-fied
2014-10-01 23:56:36 +02:00
mr.Shu
04446a40a5 dumpgenerator: AutoPEP8-fied
* Used autopep8 to make sure the code looks nice and is actually PEP8
  compliant.

Signed-off-by: mr.Shu <mr@shu.io>
2014-10-01 22:26:56 +02:00
nemobis
23a60fa850 MediaWiki CamelCase 2014-10-01 08:27:10 +02:00
nemobis
31112b3a80 checkalive.py: more checks before accessing stuff 2014-09-29 13:26:26 +02:00
nemobis
225c3eb478 A thousand more doku.php URLs from search 2014-09-29 09:12:33 +02:00
nemobis
e0f8e36bf4 Merge pull request #190 from PiRSquared17/api-allpages-disabled
Fallback to getPageTitlesScraper() if API allpages disabled
2014-09-28 16:34:24 +02:00
nemobis
a7e1b13304 Merge pull request #193 from mrshu/mrshu/readme-fix-wording
readme: Fix wording
2014-09-28 16:32:56 +02:00
nemobis
3fc7dcb5de Add some more doku.php URLs 2014-09-26 23:55:57 +02:00
mr.Shu
54c373e9a0 readme: Fix wording
* Made a few wording changes to make the README.md more clear.

Signed-off-by: mr.Shu <mr@shu.io>
2014-09-25 18:55:00 +02:00
Marek Šuppa
40d863fb99 README: update wording
* Updated wording to make the README more clear.
2014-09-25 18:54:36 +02:00
Emilio J. Rodríguez-Posada
87ce2d4540 Merge pull request #192 from mrshu/mrshu/add-travis-image
update: Add TravisCI image to README
2014-09-25 16:05:42 +02:00
mr.Shu
7b0b54b6e5 update: Add TravisCI image to README
* Added a TravisCI image to the Developers section, showing whether the
  tests are passing.

Signed-off-by: mr.Shu <mr@shu.io>
2014-09-25 11:59:47 +02:00
Emilio J. Rodríguez-Posada
5c8e316e67 Merge pull request #189 from PiRSquared17/get-wiki-engine
Improve getWikiEngine()
2014-09-25 11:58:26 +02:00
Emilio J. Rodríguez-Posada
086415bc00 Merge pull request #191 from mrshu/mrshu/setup-travis
tests: Add .travis.yml and Travis CI
2014-09-25 11:55:02 +02:00
mr.Shu
14c62c6587 tests: Add .travis.yml and Travis CI
* Added .travis.yml to enable Travis CI

Signed-off-by: mr.Shu <mr@shu.io>
2014-09-23 23:12:40 +02:00
PiRSquared17
757019521a Fallback to scraper if API allpages disabled 2014-09-23 15:53:51 -04:00
PiRSquared17
4b3c862a58 Comment debugging print, fix test 2014-09-23 15:10:06 -04:00
PiRSquared17
7a1db0525b Add more wiki engines to getWikiEngine 2014-09-23 15:04:36 -04:00
nemobis
40c406cd00 Merge pull request #188 from PiRSquared17/wikiengine-lists
Add subdirectories to listsofwikis for different wiki engines
2014-09-23 07:19:43 +02:00
PiRSquared17
56c2177106 Add (incomplete) list of dokuwikis 2014-09-22 23:56:53 -04:00
PiRSquared17
03ddde3702 Move wiki lists to mediawiki subdirectory 2014-09-22 23:23:40 -04:00
Emilio J. Rodríguez-Posada
43a105335b Merge pull request #185 from PiRSquared17/fix-tests
Relax delay() test by 10 ms, add test for allpages
2014-09-21 13:06:11 +02:00
PiRSquared17
d7e43f92c7 Relax delay() test by 10 ms, add test for allpages 2014-09-20 12:14:43 -04:00
nemobis
f52051f8ae Merge pull request #184 from PiRSquared17/fix-tests
Fix tox.ini and clean up/update tests, avoid a loop to make tests pass
2014-09-20 17:54:16 +02:00
PiRSquared17
b4818d2985 Avoid infinite loop in getImageNamesScraper 2014-09-20 11:41:57 -04:00
PiRSquared17
f2b7716e72 Fix tox.ini and clean up/update tests 2014-09-20 11:36:24 -04:00
nemobis
8a9b50b51d Merge pull request #183 from PiRSquared17/patch-7
Retry on ConnectionError in getXMLPageCore
2014-09-19 21:08:52 +02:00
nemobis
9828cbec3c Add PiRSquared17 to credits 2014-09-19 09:46:17 +02:00
nemobis
19c48d3dd0 Merge pull request #180 from PiRSquared17/patch-2
Get as much information from siteinfo as possible
2014-09-19 09:43:07 +02:00
nemobis
d8360393da Merge pull request #182 from PiRSquared17/patch-6
AllPages API fix for old MediaWiki versions
2014-09-19 09:07:31 +02:00
Pi R. Squared
f7187b7048 Retry on ConnectionError in getXMLPageCore
Previously it just gave a fatal error.
2014-09-18 20:21:01 -04:00