source of geminispace.info - the search provider for gemini space
René Wagner ca95ee48b9 news 2021-05-16 1 day ago

README.md

Gemini Universal Search (GUS)

Dependencies

  1. Install Python (>3.5) and Poetry
  2. Run: poetry install

Making an initial index

Make sure you have a few gemini URLs for testing that are well sandboxed, so the crawl does not end up indexing large parts of gemini space.

  1. Create a "seed-requests.txt" file with your test gemini URLs
  2. Run: poetry run crawl -d
  3. Run: poetry run build_index -d
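
A minimal sketch of step 1, assuming "seed-requests.txt" holds one gemini URL per line and lives in the directory the crawl is started from. Both points are assumptions, and gemini://example.org/ is only a placeholder host:

```shell
# Write a small seed file, one URL per line (assumed format).
# Replace the placeholder URLs with capsules you are allowed to crawl.
cat > seed-requests.txt <<'EOF'
gemini://localhost/
gemini://example.org/
EOF
```

Keeping the list short and limited to hosts you control is what keeps the test crawl sandboxed.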

This creates an index.new directory. Rename it to index.
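
The rename can be done by hand with mv index.new index. A slightly more defensive sketch is shown below; the function name promote_index and the index.old backup directory are illustrative and not part of GUS:

```shell
# Promote a freshly built index.new to index inside the given directory
# (defaults to the current directory). An existing index is kept as
# index.old so it can be restored by hand.
promote_index() {
  dir="${1:-.}"
  if [ ! -d "$dir/index.new" ]; then
    echo "no index.new in $dir" >&2
    return 1
  fi
  rm -rf "$dir/index.old"
  if [ -d "$dir/index" ]; then
    mv "$dir/index" "$dir/index.old"
  fi
  mv "$dir/index.new" "$dir/index"
}
```

Calling promote_index without arguments operates on the current directory. It refuses to do anything when no index.new exists, so a failed build does not remove the index currently in use.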

Running the frontend

  1. Run: poetry run serve
  2. Navigate your gemini client to: "gemini://localhost/"

Running the frontend in production with systemd

  1. Update infra/gus.service to match your needs (directory, user)
  2. Copy infra/gus.service to /etc/systemd/system/
  3. Run: systemctl enable gus, then systemctl start gus
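
Steps 2 and 3 could be scripted roughly as follows. This is a sketch: the function name install_gus_service and the RUN variable are illustrative, and systemctl daemon-reload is added as a precaution so systemd re-reads its unit files after the copy. By default the function only prints the commands it would run.

```shell
# Install and start the gus unit. RUN is a command prefix: it defaults to
# "echo" (dry run, commands are only printed); set RUN=sudo to execute them.
install_gus_service() {
  run="${RUN:-echo}"
  $run cp infra/gus.service /etc/systemd/system/ || return 1
  $run systemctl daemon-reload || return 1
  $run systemctl enable gus || return 1
  $run systemctl start gus
}
```

Run install_gus_service from the repository root to review the commands, then RUN=sudo install_gus_service to apply them once infra/gus.service has been edited.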

Running the crawl to update the index

  1. Run: poetry run crawl
  2. Run: poetry run build_index
  3. Restart the frontend

Running the crawl & indexer in production with systemd

  1. Update infra/gus-crawl.service and infra/gus-index.service to match your needs (directory, user)
  2. Copy both files to /etc/systemd/system/
  3. Set up a cron job for root with the following entry: 0 9 */3 * * systemctl start gus-crawl --no-block
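
For reference, the schedule fields of that cron entry read as shown below. This is a sketch of how the line might look in root's crontab (edited with crontab -e as root); the comment lines are added here for illustration only.

```
# min  hour  day-of-month  month  day-of-week  command
# 0    9     */3           *      *            09:00 on days 1, 4, 7, ... 31 of each month
0 9 */3 * * systemctl start gus-crawl --no-block
```

Because the day-of-month step restarts at 1 every month, two runs can fall on consecutive days at a month boundary (for example day 31 followed by day 1). The --no-block option makes systemctl return immediately instead of waiting for the crawl to finish.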

Running the test suite

Run: poetry run pytest