
gsi specific updates 2021-02-26

master
René Wagner 8 months ago
parent
commit
e691231ec8
  1. 2
      serve/templates/documentation/indexing.gmi
  2. 5
      serve/templates/news.gmi

2
serve/templates/documentation/indexing.gmi

@@ -24,7 +24,7 @@ GUS currently tends to update its index a few times per month. The last updated
To control crawling of your site, you can use a robots.txt file. Place it in your capsule's root directory such that a request for "robots.txt" will fetch it. It should be returned with a mimetype of `text/plain`.
-GUS obeys User-agent of "gus" and "*".
+GUS obeys User-agent of "indexer" and "*".
### How can I recognize GUS requests?

5
serve/templates/news.gmi

@@ -3,6 +3,11 @@
## News
+### 2021-02-26
+I've made some adjustments to how GUS/geminispace.info uses robots.txt.
+Previously we tried to honor the settings for the *, indexer and gus user-agents. That didn't work out well with the available Python libraries for robots.txt parsing, and GUS ended up crawling files it wasn't intended to.
+We now only use the settings for * and indexer, with no special handling for GUS anymore. All indexers unite. ;)
### 2021-02-02
The first fully unattended index update happened last night.
There are still some rough edges to be cleaned up, but we are on the way to having up-to-date search results without manual intervention.
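
To illustrate the 2021-02-26 change, here is a minimal sketch of how a crawler might honor only the "*" and "indexer" user-agent groups using Python's standard urllib.robotparser module. This is not taken from the GUS code base: the can_index helper, the sample robots.txt and the paths are hypothetical, and the rule that a path is skipped if either group disallows it is an assumption about the intended behavior.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt of a capsule, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: indexer
Disallow: /logs/
"""


def can_index(robots_txt: str, path: str, agents=("indexer", "*")) -> bool:
    """Return True only if every listed user-agent may fetch the path.

    Assumption: the crawler takes the union of restrictions from the
    "indexer" and "*" groups. urllib.robotparser on its own lets a
    specific group shadow the "*" group, so each agent is queried
    separately and the results are combined.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return all(parser.can_fetch(agent, path) for agent in agents)


if __name__ == "__main__":
    for path in ("/index.gmi", "/private/notes.gmi", "/logs/access.gmi"):
        verdict = "index" if can_index(ROBOTS_TXT, path) else "skip"
        print(f"{path}: {verdict}")
```

Querying each user-agent separately matters here: asking the parser only about "indexer" would allow /private/notes.gmi, because the specific group takes precedence over the "*" group inside the library.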
