crawl: sqlite database locked #19

Closed
opened 9 months ago by René Wagner · 4 comments
Owner

I've seen this error twice; the crawl stops at this point.

Mar 20 22:30:16 v2202102141844144675 poetry[87329]: Traceback (most recent call last):
Mar 20 22:30:16 v2202102141844144675 poetry[87329]:   File "/home/gus/.cache/pypoetry/virtualenvs/gus-BXommJzs-py3.7/lib/python3.7/site-packages/peewee.py", line 3129, in execute_sql
Mar 20 22:30:16 v2202102141844144675 poetry[87329]:     cursor.execute(sql, params or ())
Mar 20 22:30:16 v2202102141844144675 poetry[87329]: sqlite3.OperationalError: database is locked
Mar 20 22:30:16 v2202102141844144675 poetry[87329]: During handling of the above exception, another exception occurred:
Mar 20 22:30:16 v2202102141844144675 poetry[87329]: Traceback (most recent call last):
Mar 20 22:30:16 v2202102141844144675 poetry[87329]:   File "<string>", line 1, in <module>
Mar 20 22:30:16 v2202102141844144675 poetry[87329]:   File "/home/gus/gus/crawl.py", line 870, in main
Mar 20 22:30:16 v2202102141844144675 poetry[87329]:     run_crawl(args.should_run_destructive, seed_urls=args.seed_urls)
Mar 20 22:30:16 v2202102141844144675 poetry[87329]:   File "/home/gus/gus/crawl.py", line 855, in run_crawl
Mar 20 22:30:16 v2202102141844144675 poetry[87329]:     crawl_page(resource, 0, should_check_if_expired=False)
Mar 20 22:30:16 v2202102141844144675 poetry[87329]:   File "/home/gus/gus/crawl.py", line 663, in crawl_page
Mar 20 22:30:16 v2202102141844144675 poetry[87329]:     page, is_different = index_content(gr, response)
Mar 20 22:30:16 v2202102141844144675 poetry[87329]:   File "/home/gus/gus/crawl.py", line 398, in index_content
Mar 20 22:30:16 v2202102141844144675 poetry[87329]:     page.save()
Mar 20 22:30:16 v2202102141844144675 poetry[87329]:   File "/home/gus/.cache/pypoetry/virtualenvs/gus-BXommJzs-py3.7/lib/python3.7/site-packages/peewee.py", line 6532, in save
Mar 20 22:30:16 v2202102141844144675 poetry[87329]:     rows = self.update(**field_dict).where(self._pk_expr()).execute()
Mar 20 22:30:16 v2202102141844144675 poetry[87329]:   File "/home/gus/.cache/pypoetry/virtualenvs/gus-BXommJzs-py3.7/lib/python3.7/site-packages/peewee.py", line 1898, in inner
Mar 20 22:30:16 v2202102141844144675 poetry[87329]:     return method(self, database, *args, **kwargs)
Mar 20 22:30:16 v2202102141844144675 poetry[87329]:   File "/home/gus/.cache/pypoetry/virtualenvs/gus-BXommJzs-py3.7/lib/python3.7/site-packages/peewee.py", line 1969, in execute
Mar 20 22:30:16 v2202102141844144675 poetry[87329]:     return self._execute(database)
Mar 20 22:30:16 v2202102141844144675 poetry[87329]:   File "/home/gus/.cache/pypoetry/virtualenvs/gus-BXommJzs-py3.7/lib/python3.7/site-packages/peewee.py", line 2465, in _execute
Mar 20 22:30:16 v2202102141844144675 poetry[87329]:     cursor = database.execute(self)
Mar 20 22:30:16 v2202102141844144675 poetry[87329]:   File "/home/gus/.cache/pypoetry/virtualenvs/gus-BXommJzs-py3.7/lib/python3.7/site-packages/peewee.py", line 3142, in execute
Mar 20 22:30:16 v2202102141844144675 poetry[87329]:     return self.execute_sql(sql, params, commit=commit)
Mar 20 22:30:16 v2202102141844144675 poetry[87329]:   File "/home/gus/.cache/pypoetry/virtualenvs/gus-BXommJzs-py3.7/lib/python3.7/site-packages/peewee.py", line 3136, in execute_sql
Mar 20 22:30:16 v2202102141844144675 poetry[87329]:     self.commit()
Mar 20 22:30:16 v2202102141844144675 poetry[87329]:   File "/home/gus/.cache/pypoetry/virtualenvs/gus-BXommJzs-py3.7/lib/python3.7/site-packages/peewee.py", line 2902, in __exit__
Mar 20 22:30:16 v2202102141844144675 poetry[87329]:     reraise(new_type, new_type(exc_value, *exc_args), traceback)
Mar 20 22:30:16 v2202102141844144675 poetry[87329]:   File "/home/gus/.cache/pypoetry/virtualenvs/gus-BXommJzs-py3.7/lib/python3.7/site-packages/peewee.py", line 185, in reraise
Mar 20 22:30:16 v2202102141844144675 poetry[87329]:     raise value.with_traceback(tb)
Mar 20 22:30:16 v2202102141844144675 poetry[87329]:   File "/home/gus/.cache/pypoetry/virtualenvs/gus-BXommJzs-py3.7/lib/python3.7/site-packages/peewee.py", line 3129, in execute_sql
Mar 20 22:30:16 v2202102141844144675 poetry[87329]:     cursor.execute(sql, params or ())
Mar 20 22:30:16 v2202102141844144675 poetry[87329]: peewee.OperationalError: database is locked
René Wagner added the bug label 9 months ago
Poster
Owner

This may be a race condition between the crawler trying to save a page and a search request which stores the search term in the same sqlite db.
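To illustrate the suspected race, here is a minimal sketch using only the standard `sqlite3` module rather than peewee. The table names and the `reproduce_lock` helper are made up for the demo and are not taken from the GUS code. One connection stands in for the search request and holds a write transaction; the other stands in for the crawler and tries to save a page at the same time.

```python
import os
import sqlite3
import tempfile


def reproduce_lock(db_path):
    """Return the error text seen by a writer while another connection holds the write lock."""
    # timeout=0 makes the blocked writer fail immediately instead of waiting;
    # isolation_level=None lets us issue BEGIN/COMMIT ourselves.
    search = sqlite3.connect(db_path, timeout=0, isolation_level=None)
    crawler = sqlite3.connect(db_path, timeout=0, isolation_level=None)
    try:
        # Hypothetical schema, only for this demo.
        search.execute("CREATE TABLE IF NOT EXISTS search (id INTEGER PRIMARY KEY, query TEXT)")
        search.execute("CREATE TABLE IF NOT EXISTS page (id INTEGER PRIMARY KEY, url TEXT)")

        # The "search request" opens a write transaction and keeps it open.
        search.execute("BEGIN IMMEDIATE")
        search.execute("INSERT INTO search (query) VALUES (?)", ("hello",))
        try:
            # The "crawler" tries to save a page while the write lock is held.
            crawler.execute("INSERT INTO page (url) VALUES (?)", ("gemini://example.org/",))
        except sqlite3.OperationalError as exc:
            return str(exc)
        finally:
            search.execute("COMMIT")
        return None
    finally:
        search.close()
        crawler.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "gus.sqlite")
    print(reproduce_lock(path))
```

The second writer fails with the same message as in the traceback above. Whether this is really what happens during the crawl is still only a guess.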

Poster
Owner

This happened in two consecutive crawls.

René Wagner self-assigned this 8 months ago
Poster
Owner

First attempt: the search log in gus.sqlite is disabled for the moment.
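For comparison, a general alternative to disabling the second writer is to give the blocked connection a busy timeout so it waits for the lock instead of failing. The sketch below again uses plain `sqlite3` with hypothetical table and function names. It is not taken from the GUS code and is not a description of what the eventual fix changed.

```python
import os
import sqlite3
import tempfile
import threading
import time


def hold_write_lock(db_path, locked, hold_seconds):
    """Simulate a search request that keeps a write transaction open briefly."""
    conn = sqlite3.connect(db_path, isolation_level=None)
    try:
        conn.execute("BEGIN IMMEDIATE")
        conn.execute("INSERT INTO search (query) VALUES (?)", ("hello",))
        locked.set()  # tell the main thread the write lock is now held
        time.sleep(hold_seconds)
        conn.execute("COMMIT")
    finally:
        conn.close()


def save_page_with_busy_timeout(db_path, timeout_seconds=5.0, hold_seconds=0.2):
    """Save one page while another writer holds the lock; return the page count."""
    # Hypothetical schema, only for this demo.
    setup = sqlite3.connect(db_path)
    setup.execute("CREATE TABLE IF NOT EXISTS search (id INTEGER PRIMARY KEY, query TEXT)")
    setup.execute("CREATE TABLE IF NOT EXISTS page (id INTEGER PRIMARY KEY, url TEXT)")
    setup.commit()
    setup.close()

    locked = threading.Event()
    holder = threading.Thread(target=hold_write_lock, args=(db_path, locked, hold_seconds))
    holder.start()
    locked.wait(timeout=5)

    # timeout > 0 makes SQLite retry for up to that many seconds before
    # raising "database is locked".
    crawler = sqlite3.connect(db_path, timeout=timeout_seconds, isolation_level=None)
    try:
        crawler.execute("INSERT INTO page (url) VALUES (?)", ("gemini://example.org/",))
        return crawler.execute("SELECT COUNT(*) FROM page").fetchone()[0]
    finally:
        crawler.close()
        holder.join()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "gus.sqlite")
    print(save_page_with_busy_timeout(path))
```

This only helps if the other writer releases the lock within the timeout. A long-running write transaction would still end in the same error.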

Poster
Owner

Fixed in 1266d9a93b.

René Wagner closed this issue 7 months ago