index: clearance of old data #24

Closed
opened 7 months ago by René Wagner · 0 comments
Owner

Old data is never removed from the data store/index once it has been added.

data to be cleaned

  • pages which have had no successful crawl for 1 month (last_crawl_success_at older than 1 month)
    delete from page where last_crawl_success_at < (datetime.utcnow() - 1 month) and last_crawl_at >= last_crawl_success_at
  • pages that were excluded from the crawl
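
A minimal sketch of how both cleanup rules might look in runnable form. Only the `page` table and the `last_crawl_at` / `last_crawl_success_at` columns come from this issue. The rest is assumed for illustration: a SQLite store, timestamps kept as ISO-8601 UTC text in one consistent format, a `url` column, exclusions expressed as URL prefixes, "1 month" approximated as 30 days, and the hypothetical helper name `purge_old_pages`.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Assumption: "1 month" is approximated as 30 days.
STALE_AFTER = timedelta(days=30)


def purge_old_pages(conn, excluded_prefixes, now=None):
    """Delete stale and excluded pages; return (stale_count, excluded_count).

    Hypothetical helper: assumes timestamps are stored as ISO-8601 UTC text
    in a single consistent format, so string comparison matches time order.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = (now - STALE_AFTER).isoformat()

    # Rule 1: last successful crawl is older than the cutoff, and the latest
    # crawl attempt is not older than that last success.
    # Rows with a NULL last_crawl_success_at are not matched by this query.
    stale = conn.execute(
        "DELETE FROM page "
        "WHERE last_crawl_success_at < ? "
        "AND last_crawl_at >= last_crawl_success_at",
        (cutoff,),
    ).rowcount

    # Rule 2: pages that are excluded from the crawl.
    # Assumption: an exclusion is a literal URL prefix; substr() avoids
    # having to escape LIKE wildcards in the prefix.
    excluded = 0
    for prefix in excluded_prefixes:
        excluded += conn.execute(
            "DELETE FROM page WHERE substr(url, 1, ?) = ?",
            (len(prefix), prefix),
        ).rowcount

    conn.commit()
    return stale, excluded
```

Passing `now` explicitly keeps the cutoff deterministic, which makes the cleanup easy to exercise against an in-memory database before running it on the real index.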
René Wagner added the question label 7 months ago
René Wagner added enhancement and removed question labels 6 months ago
René Wagner added a new dependency 5 months ago
René Wagner changed title from clearance of old data to index: clearance of old data 5 months ago
René Wagner self-assigned this 5 months ago
René Wagner closed this issue 4 months ago
No Milestone
No Assignees
1 Participants