Gmail connector fails on some dates #5467

@BoldizsarP

Description

Hi Team! I am using the Gmail connector and ran into an issue.

My specs:

  • Locally Hosted Onyx with GPU, docker compose
  • Version 1.7.0
  • 2x Xeon 5687, 160GB ram, NVIDIA 1080
  • OS: Linux, Ubuntu 24.
  • The only change from the dev-gpu compose template is that I use a generic Vespa (because of the CPU architecture)

Stacktrace

Traceback (most recent call last):
  File "/app/onyx/background/indexing/run_docfetching.py", line 1115, in connector_document_extraction
    for document_batch, failure, next_checkpoint in connector_runner.run(
  File "/app/onyx/connectors/connector_runner.py", line 186, in run
    for document_batch in self.connector.poll_source(
  File "/app/onyx/connectors/gmail/connector.py", line 401, in poll_source
    raise e
  File "/app/onyx/connectors/gmail/connector.py", line 397, in poll_source
    yield from self._fetch_threads(start, end)
  File "/app/onyx/connectors/gmail/connector.py", line 329, in _fetch_threads
    doc = thread_to_document(full_thread, user_email)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/onyx/connectors/gmail/connector.py", line 189, in thread_to_document
    updated_at_datetime = time_str_to_utc(updated_at)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/onyx/connectors/cross_connector_utils/miscellaneous_utils.py", line 37, in time_str_to_utc
    dt = parse(datetime_str)
         ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/dateutil/parser/_parser.py", line 1368, in parse
    return DEFAULTPARSER.parse(timestr, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/dateutil/parser/_parser.py", line 643, in parse
    raise ParserError("Unknown string format: %s", timestr)
dateutil.parser._parser.ParserError: Unknown string format: Mon, 12 May 2014 13:39:52 "GMT"

The issue happened during the initial indexing of Gmail.
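
The failing value in the traceback is `Mon, 12 May 2014 13:39:52 "GMT"`, where the timezone name is wrapped in double quotes. One possible workaround is to strip quote characters before parsing. The sketch below is only an illustration, not the project's actual fix: `parse_mail_date` is a hypothetical helper, and it uses the standard library's `email.utils.parsedate_to_datetime` instead of `dateutil` so that it runs without extra dependencies. It assumes quotes carry no meaning in a date header and that a value without a zone can be treated as UTC.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_mail_date(raw: str) -> datetime:
    """Parse an RFC 2822-style date string, tolerating stray double quotes.

    Hypothetical helper for illustration; not part of Onyx or dateutil.
    """
    # Assumption: quote characters are noise in a date header,
    # so '"GMT"' is reduced to 'GMT' before parsing.
    cleaned = raw.replace('"', "").strip()
    dt = parsedate_to_datetime(cleaned)
    if dt.tzinfo is None:
        # Assumption: a value without a usable zone is treated as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    # Normalize every result to UTC.
    return dt.astimezone(timezone.utc)


if __name__ == "__main__":
    # The exact string reported in the traceback above.
    print(parse_mail_date('Mon, 12 May 2014 13:39:52 "GMT"'))
```

The same cleaning step could presumably be applied to the string before it reaches `time_str_to_utc`, but whether that is the right place for it depends on how the maintainers want to handle malformed headers.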
