348 Commits (master)
 

Author SHA1 Message Date
René Wagner b30a30afe9 add link to source in geminispace 3 days ago
René Wagner fa2db540f6 more meta data for index cleanup 3 days ago
René Wagner b484a4dadc avoid crash when normalized_url is not set 6 days ago
René Wagner f928815d49 use cronjob for automated start 6 days ago
René Wagner 6eedbd4190 some cleanup 1 month ago
René Wagner 25cc314490 fix broken link to source code 1 month ago
René Wagner e8d4164718 do not add every single domain to the statistics file 1 month ago
René Wagner 7fa7a7d0fa news 2021-08-18 2 months ago
René Wagner 0f03e4fb66 some minor changes 2 months ago
René Wagner faad84dfd5 ensure that scheme is given when searching for backlinks 2 months ago
René Wagner e0fae1c4d6 update 2021-08-07 2 months ago
René Wagner 518e95dc99 ensure that seed-requests use absolute URIs 2 months ago
René Wagner e40712ca9d more excludes 2 months ago
René Wagner ffbf174790 implemented deletion of outdated data 3 months ago
René Wagner cbe22de43a small fixes and doc adjustments 3 months ago
René Wagner b8eb04a224 remove obsolete code 3 months ago
Hannu Hartikainen f6bd88672e support prioritized robots.txt user-agents 3 months ago
René Wagner 1ce3f6f92b more excludes and less logging 3 months ago
René Wagner e47d78ce30 treat schemeless links as non-gemini links 3 months ago
René Wagner 6b5a9f7b4c remove pikkulog separation 3 months ago
René Wagner 73883455c2 minor code cleanup in db_model 3 months ago
René Wagner 6e524d1be9 update to some templates 3 months ago
René Wagner 39c6540bc6 remove Search model 3 months ago
René Wagner e1b3ac8ab4 enable 'newest-hosts' and 'newest-pages' sites again 3 months ago
René Wagner 7ce66303c3 remove raw data from excluded capsules 3 months ago
René Wagner 87d92bbfb3 index text files up to 5 MB 3 months ago
René Wagner 80e589b1d4 commit search index only when indexing is complete 3 months ago
René Wagner cddbb82dfd store document id in whoosh index 3 months ago
René Wagner a9e9cf27d5 some tweaks to indexing 3 months ago
René Wagner 87ef15df2e restructure crawl data 3 months ago
René Wagner b5bf01a445 remove Crawl table, all info is stored in page table now 3 months ago
René Wagner 9efd819e3e don't persist robots.txt over multiple crawls 3 months ago
René Wagner d4093761e1 improve indexing speed via optimized backlinks query 3 months ago
René Wagner 123895e2f0 again a new exclude 3 months ago
René Wagner 86365f71ae move gusmobile to new home 3 months ago
René Wagner e141576663 update 2021-07-04 & more excludes 4 months ago
René Wagner a85534a5bf additional filter 4 months ago
René Wagner 6a18d99fc1 update 2021-06-26 4 months ago
René Wagner c5bfdafcf5 exclude godocs.io 4 months ago
René Wagner 05c5bd7b5d error handling on page crawl save 4 months ago
René Wagner acd728e7c4 update 2021-06-04 5 months ago
René Wagner d3b1dd8e77 more exception handling on link update 5 months ago
René Wagner 3f7c0f84f9 fix wrong embedding of excludes 5 months ago
René Wagner 8b004af54d unify capitalisation of charset in statistics 5 months ago
René Wagner 5c9e5267cf move exclude definition to own file 5 months ago
René Wagner 14c3997724 news 2021-05-25 5 months ago
René Wagner e0fba80405 some exception handling and updated service files 5 months ago
René Wagner 52d2b4c86d fix last wrong exception in crawl 5 months ago
René Wagner 9b6ef8a0e2 fix wrong exception handling in crawl 5 months ago
René Wagner 06c0258323 update 2021-05-12 5 months ago