source of geminispace.info - the search provider for gemini space
# Gemini Universal Search (GUS)
## Dependencies
1. Install python and poetry
2. Run: "poetry install"
## Making an initial index
Make sure you have some gemini URLs for testing that are well sandboxed,
so the crawl does not index huge parts of gemini space.
1. Create a "seed-requests.txt" file with your test gemini URLs
2. Run: "poetry run crawl -d"
3. Run: "poetry run build_index -d"
You will now have an `index.new` directory; rename it to `index`.
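
The rename step can also be scripted. The following is a minimal sketch, assuming both directories sit directly under the working directory; `promote_index` is a hypothetical helper, not part of GUS, and it deletes any existing `index` directory before renaming:

```python
import shutil
from pathlib import Path


def promote_index(base_dir=".", new_name="index.new", live_name="index"):
    """Replace the live index directory with the freshly built one.

    Hypothetical helper, not part of GUS. Assumes both directories
    live directly under base_dir.
    """
    base = Path(base_dir)
    new_index = base / new_name
    live_index = base / live_name

    if not new_index.is_dir():
        raise FileNotFoundError(f"no freshly built index at {new_index}")

    # Remove the previous index, if any, so the rename cannot collide.
    if live_index.exists():
        shutil.rmtree(live_index)

    new_index.rename(live_index)
    return live_index
```

Calling `promote_index()` from the repository root would then perform the rename described above. The old index is removed rather than kept as a backup, which is a design choice of this sketch only.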
## Running the frontend
1. Run: "poetry run serve"
2. Navigate your gemini client to: "gemini://localhost/"
## Updating the index
1. Run: "poetry run crawl"
2. Run: "poetry run build_index"
3. Restart the frontend
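
The first two update steps can be chained so that a failed crawl never leads to an index build. This is a rough sketch, not part of GUS: `update_index` and `UPDATE_COMMANDS` are hypothetical names, and the frontend restart is left out because it depends on how the frontend is supervised.

```python
import subprocess

# Commands taken from the update steps, in the order they must run.
UPDATE_COMMANDS = [
    ["poetry", "run", "crawl"],
    ["poetry", "run", "build_index"],
]


def update_index(run=subprocess.run):
    """Run the crawl and the index build in order, stopping at the first failure.

    `run` is injectable so the sequence can be exercised without poetry.
    """
    for command in UPDATE_COMMANDS:
        # check=True raises CalledProcessError on a non-zero exit status,
        # so later commands are skipped once one of them fails.
        run(command, check=True)
```

Calling `update_index()` from the repository root runs both commands; the frontend still has to be restarted afterwards.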
## Running test suite
Run: "poetry run python -m pytest"
## Roadmap / TODOs
- TODO: improve crawl and build_index automation
- TODO: get crawl to run on a schedule with systemd
- TODO: add functionality to create a mock index
- TODO: exclude raw-text blocks from indexed content
- TODO: strip control characters from logged output like URLs
- TODO: fix bug in calculation of backlinks (iirc the bug is visible on gemini.circumlunar.space)
- TODO: refactor manual exclusion logic to be regex-based instead of prefix-based, which would allow more nuanced exclusion rules
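
To illustrate the last item, here is a small sketch of how regex-based exclusion could differ from prefix-based exclusion. All names and rules here are hypothetical examples and do not reflect the actual exclusion list or code in GUS:

```python
import re

# Hypothetical example rules; the real exclusion list lives in the GUS source.
EXCLUDED_PREFIXES = ["gemini://example.org/cgi-bin/"]

EXCLUDED_PATTERNS = [
    # Equivalent of the prefix rule above.
    re.compile(r"^gemini://example\.org/cgi-bin/"),
    # Something a prefix cannot express: archives under /git/ on any host.
    re.compile(r"^gemini://[^/]+/git/.*\.(?:tar\.gz|zip)$"),
]


def is_excluded_by_prefix(url):
    """Prefix-based check: the URL must start with a listed prefix."""
    return any(url.startswith(prefix) for prefix in EXCLUDED_PREFIXES)


def is_excluded_by_regex(url):
    """Regex-based check: the URL must match any listed pattern."""
    return any(pattern.search(url) for pattern in EXCLUDED_PATTERNS)
```

With these example rules, a URL such as `gemini://example.com/git/repo/archive.tar.gz` is caught only by the regex check, while plain pages in the same repository path remain indexable.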