crawl: breaks due to NoneType exception #34

Closed
opened 2 months ago by René Wagner · 0 comments
Owner
Oct 11 04:36:19 poetry[10361]: Traceback (most recent call last):
Oct 11 04:36:19 poetry[10361]:   File "<string>", line 1, in <module>
Oct 11 04:36:19 poetry[10361]:   File "/home/gus/gus/crawl.py", line 572, in main
Oct 11 04:36:19 poetry[10361]:     run_crawl(args.should_run_destructive, seed_urls=args.seed_urls)
Oct 11 04:36:19 poetry[10361]:   File "/home/gus/gus/crawl.py", line 561, in run_crawl
Oct 11 04:36:19 poetry[10361]:     crawl_page(resource, 0, should_check_if_expired=False)
Oct 11 04:36:19 poetry[10361]:   File "/home/gus/gus/crawl.py", line 491, in crawl_page
Oct 11 04:36:19 poetry[10361]:     crawl_page(
Oct 11 04:36:19 poetry[10361]:   File "/home/gus/gus/crawl.py", line 491, in crawl_page
Oct 11 04:36:19 poetry[10361]:     crawl_page(
Oct 11 04:36:19 poetry[10361]:   File "/home/gus/gus/crawl.py", line 450, in crawl_page
Oct 11 04:36:19 poetry[10361]:     index_links(gr, [redirect_resource])
Oct 11 04:36:19 poetry[10361]:   File "/home/gus/gus/crawl.py", line 272, in index_links
Oct 11 04:36:19 poetry[10361]:     if should_skip(cr):
Oct 11 04:36:19 poetry[10361]:   File "/home/gus/gus/crawl.py", line 248, in should_skip
Oct 11 04:36:19 poetry[10361]:     if resource.normalized_url.startswith(excluded_prefix):
Oct 11 04:36:19 poetry[10361]: AttributeError: 'NoneType' object has no attribute 'startswith'
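
The traceback shows that `resource.normalized_url` was `None` when `should_skip` called `startswith` on it, and that this happened on the redirect path (`index_links(gr, [redirect_resource])`). One way to avoid the crash is to guard against `None` before the prefix check. This is a minimal sketch, not the actual fix applied in `crawl.py`: the `Resource` class, `EXCLUDED_PREFIXES`, the `excluded_prefixes` parameter, and the decision to skip resources without a normalized URL are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Resource:
    """Hypothetical stand-in for the crawler's resource type.

    Only the attribute named in the traceback is modelled.
    """
    normalized_url: Optional[str]


# Hypothetical exclusion list; the real prefixes are not shown in this issue.
EXCLUDED_PREFIXES = ("gemini://example.org/private/",)


def should_skip(resource: Resource,
                excluded_prefixes: Sequence[str] = EXCLUDED_PREFIXES) -> bool:
    url = resource.normalized_url
    if url is None:
        # Assumption: a resource whose URL could not be normalized
        # (e.g. a malformed redirect target) is skipped instead of
        # raising AttributeError.
        return True
    return any(url.startswith(prefix) for prefix in excluded_prefixes)
```

With this guard, a redirect target that fails normalization is skipped and the crawl continues. Whether skipping is the right behaviour, or whether the resource should be logged or rejected earlier in `crawl_page`, depends on code not shown here.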
René Wagner added the bug label 2 months ago
René Wagner added the due date 2021-10-15 2 months ago
René Wagner self-assigned this 2 months ago
René Wagner closed this issue 2 months ago