Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

datetime: pure Python implementation of fromisoformat() handles times with trailing spaces inconsistently with the C extension #130959

Open
mgorny opened this issue Mar 7, 2025 · 3 comments
Labels
extension-modules C modules in the Modules dir stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@mgorny
Copy link
Contributor

mgorny commented Mar 7, 2025

Bug report

Bug description:

Discovered primarily because PyPy uses the Python version of datetime from the stdlib, and django.utils.dateparse.parse_datetime() supports wider range of values than the C version of datetime.datetime.fromisoformat(), and falls back to their own parser.

>>> datetime.datetime.fromisoformat("2012-04-23T10:20:30.400")
datetime.datetime(2012, 4, 23, 10, 20, 30, 400000)
>>> datetime.datetime.fromisoformat("2012-04-23T10:20:30.400 ")
datetime.datetime(2012, 4, 23, 10, 20, 30, 40000)
>>> datetime.datetime.fromisoformat("2012-04-23T10:20:30.400  ")
datetime.datetime(2012, 4, 23, 10, 20, 30, 4000)
>>> datetime.datetime.fromisoformat("2012-04-23T10:20:30.400   ")
datetime.datetime(2012, 4, 23, 10, 20, 30, 400)
>>> datetime.datetime.fromisoformat("2012-04-23T10:20:30.400    ")
Traceback (most recent call last):
  File "<python-input-9>", line 1, in <module>
    datetime.datetime.fromisoformat("2012-04-23T10:20:30.400    ")
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mgorny/git/cpython/Lib/_pydatetime.py", line 1949, in fromisoformat
    raise ValueError(
        f'Invalid isoformat string: {date_string!r}') from None
ValueError: Invalid isoformat string: '2012-04-23T10:20:30.400    '
>>> datetime.datetime.fromisoformat("2012-04-23T10:20:30.400 +02:30")
datetime.datetime(2012, 4, 23, 10, 20, 30, 40000, tzinfo=datetime.timezone(datetime.timedelta(seconds=9000)))

For comparison, the C extension rejects all variants containing spaces, causing Django to use its own parser.

PyPy bug report: pypy/pypy#5240

I'm going to try preparing a patch.

CPython versions tested on:

3.11, 3.13, CPython main branch

Operating systems tested on:

Linux

Linked PRs

@mgorny mgorny added the type-bug An unexpected behavior, bug, or error label Mar 7, 2025
@mgorny mgorny changed the title datetime: pure Python implementation of fromisoformat() mishandles times with trailing spaces datetime: pure Python implementation of fromisoformat() handles times with trailing spaces inconsistently Mar 7, 2025
@mgorny mgorny changed the title datetime: pure Python implementation of fromisoformat() handles times with trailing spaces inconsistently datetime: pure Python implementation of fromisoformat() handles times with trailing spaces inconsistently with the C extension Mar 7, 2025
@mgorny
Copy link
Contributor Author

mgorny commented Mar 7, 2025

Hmm, another difference is that when giving 6 digits of microseconds, C implementation accepts them but Python implementation does not:

>>> datetime.datetime.fromisoformat("2021-12-20T05:05:05.600000 +02:30")
datetime.datetime(2021, 12, 20, 5, 5, 5, 600000, tzinfo=datetime.timezone(datetime.timedelta(seconds=9000)))
>>> _pydatetime.datetime.fromisoformat("2021-12-20T05:05:05.600000 +02:30")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mgorny/miniforge3/lib/python3.12/_pydatetime.py", line 1874, in fromisoformat
    raise ValueError(
ValueError: Invalid isoformat string: '2021-12-20T05:05:05.600000 +02:30'

mgorny added a commit to mgorny/cpython that referenced this issue Mar 7, 2025
…isoformat()`

Fix the pure Python implementation of `fromisoformat()` to reject any
non-digit characters, including whitespace, in the fractional part
of time specification.  This makes the behavior consistent with the C
implementation, and prevents incorrect parsing of these fractions
(e.g. `.400 ` would be misinterpreted as `.04`).
@ZeroIntensity ZeroIntensity added stdlib Python modules in the Lib dir extension-modules C modules in the Modules dir labels Mar 8, 2025
@pganssle
Copy link
Member

pganssle commented Mar 8, 2025

Hmm, another difference is that when giving 6 digits of microseconds, C implementation accepts them but Python implementation does not:

I think the spec doesn't allow for whitespace, right? If so, we should do whatever the spec says. It's slightly more likely to break things if we make changes to the C implementation than the Python implementation (since the vast majority of people are using the C implementation).

I think we can fix the issue with the Python implementation and backport the changes to 3.12+, since right now it just gives the wrong answer, and it's better to start raising exceptions here. If we reject the version with a space from the C implementation, I think we would want that to target only 3.14+, because that will take something that currently parses and gives the "correct" answer and start raising exceptions. We don't have to go as far as deprecating the behavior, since the documented scope of the function says that this should be rejected, but I don't think on net it serves users to make that kind of change in a bugfix version.

@mgorny
Copy link
Contributor Author

mgorny commented Mar 8, 2025

I think the spec doesn't allow for whitespace, right?

Yes, the standard doesn't permit whitespace, and it's not listed in "exceptions" in the documentation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
extension-modules C modules in the Modules dir stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
Projects
Development

No branches or pull requests

3 participants