问题:如何解析ISO 8601格式的日期?

我需要将RFC 3339字符串解析"2008-09-03T20:56:35.450686Z"为Python的datetime类型。

我已经strptime在Python标准库中找到了,但这不是很方便。

做这个的最好方式是什么?

I need to parse RFC 3339 strings like "2008-09-03T20:56:35.450686Z" into Python’s datetime type.

I have found strptime in the Python standard library, but it is not very convenient.

What is the best way to do this?


回答 0

Python-dateutil包可以解析不仅RFC 3339日期时间字符串像在的问题,还包括其他ISO 8601的日期和时间字符串不符合RFC 3339(如那些没有UTC偏移,或那些代表仅一个日期)。

>>> import dateutil.parser
>>> dateutil.parser.isoparse('2008-09-03T20:56:35.450686Z') # RFC 3339 format
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686, tzinfo=tzutc())
>>> dateutil.parser.isoparse('2008-09-03T20:56:35.450686') # ISO 8601 extended format
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686)
>>> dateutil.parser.isoparse('20080903T205635.450686') # ISO 8601 basic format
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686)
>>> dateutil.parser.isoparse('20080903') # ISO 8601 basic format, date only
datetime.datetime(2008, 9, 3, 0, 0)

请注意,这dateutil.parser.isoparse可能比更严格的方法更严格dateutil.parser.parse,但是它们两者都是相当宽容的,并且会尝试解释您传入的字符串。如果要消除任何误读的可能性,则需要使用比这两种方法都更严格的方法功能。

Pypi名称是,不是dateutil(感谢code3monk3y):

pip install python-dateutil

如果您使用的是Python 3.7,请查看有关的答案datetime.datetime.fromisoformat

The python-dateutil package can parse not only RFC 3339 datetime strings like the one in the question, but also other ISO 8601 date and time strings that don’t comply with RFC 3339 (such as ones with no UTC offset, or ones that represent only a date).

>>> import dateutil.parser
>>> dateutil.parser.isoparse('2008-09-03T20:56:35.450686Z') # RFC 3339 format
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686, tzinfo=tzutc())
>>> dateutil.parser.isoparse('2008-09-03T20:56:35.450686') # ISO 8601 extended format
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686)
>>> dateutil.parser.isoparse('20080903T205635.450686') # ISO 8601 basic format
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686)
>>> dateutil.parser.isoparse('20080903') # ISO 8601 basic format, date only
datetime.datetime(2008, 9, 3, 0, 0)

Note that dateutil.parser.isoparse is presumably stricter than the more hacky dateutil.parser.parse, but both of them are quite forgiving and will attempt to interpret the string that you pass in. If you want to eliminate the possibility of any misreads, you need to use something stricter than either of these functions.

The Pypi name is , not dateutil (thanks code3monk3y):

pip install python-dateutil

If you’re using Python 3.7, have a look at this answer about datetime.datetime.fromisoformat.


回答 1

Python 3.7+中的新功能


datetime标准库中引入了一个功能反转datetime.isoformat()

classmethod datetime.fromisoformat(date_string)

以和发出的格式之一返回datetime对应于的。date_stringdate.isoformat()datetime.isoformat()

具体来说,此函数支持以下格式的字符串:

YYYY-MM-DD[*HH[:MM[:SS[.mmm[mmm]]]][+HH:MM[:SS[.ffffff]]]]

在哪里*可以匹配任何单个字符。

注意:这不支持解析任意ISO 8601字符串-只能用作的反操作datetime.isoformat()

使用示例:

from datetime import datetime

date = datetime.fromisoformat('2017-01-01T12:30:59.000000')

New in Python 3.7+


The datetime standard library introduced a function for inverting datetime.isoformat().

classmethod datetime.fromisoformat(date_string):

Return a datetime corresponding to a date_string in one of the formats emitted by date.isoformat() and datetime.isoformat().

Specifically, this function supports strings in the format(s):

YYYY-MM-DD[*HH[:MM[:SS[.mmm[mmm]]]][+HH:MM[:SS[.ffffff]]]]

where * can match any single character.

Caution: This does not support parsing arbitrary ISO 8601 strings – it is only intended as the inverse operation of datetime.isoformat().

Example of use:

from datetime import datetime

date = datetime.fromisoformat('2017-01-01T12:30:59.000000')

回答 2

请注意,在Python 2.6+和Py3K中,%f字符捕获微秒。

>>> datetime.datetime.strptime("2008-09-03T20:56:35.450686Z", "%Y-%m-%dT%H:%M:%S.%fZ")

在这里查看问题

Note in Python 2.6+ and Py3K, the %f character catches microseconds.

>>> datetime.datetime.strptime("2008-09-03T20:56:35.450686Z", "%Y-%m-%dT%H:%M:%S.%fZ")

See issue here


回答 3

这里有几个答案 建议使用解析时区的RFC 3339或ISO 8601日期时间,就像问题中展示的那样: datetime.datetime.strptime

2008-09-03T20:56:35.450686Z

这是一个坏主意。

假设您要支持完整的RFC 3339格式,包括对非零的UTC偏移量的支持,那么这些答案所建议的代码将不起作用。事实上,它不能工作,因为解析RFC 3339语法使用strptime是不可能的。Python的datetime模块使用的格式字符串无法描述RFC 3339语法。

问题是UTC偏移量。在RFC 3339互联网日期/时间格式要求每个日期时间包括UTC偏移,并且这些偏移可以是Z(以下简称“祖鲁时间”),或在+HH:MM-HH:MM格式,如+05:00-10:30

因此,这些都是有效的RFC 3339日期时间:

  • 2008-09-03T20:56:35.450686Z
  • 2008-09-03T20:56:35.450686+05:00
  • 2008-09-03T20:56:35.450686-10:30

可惜的是,所使用的格式字符串通过strptimestrftime没有指令,对应于RFC 3339格式的UTC偏移。可以在https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior中找到它们支持的指令的完整列表,并且列表中唯一包含的UTC偏移量指令是%z

%z

UTC偏移量,格式为+ HHMM或-HHMM(如果对象是天真对象,则为空字符串)。

例如:(空),+ 0000,-0400,+ 1030

这与RFC 3339偏移量的格式不匹配,实际上,如果我们尝试%z在格式字符串中使用并解析RFC 3339日期,则将失败:

>>> from datetime import datetime
>>> datetime.strptime("2008-09-03T20:56:35.450686Z", "%Y-%m-%dT%H:%M:%S.%f%z")
Traceback (most recent call last):
  File "", line 1, in 
  File "/usr/lib/python3.4/_strptime.py", line 500, in _strptime_datetime
    tt, fraction = _strptime(data_string, format)
  File "/usr/lib/python3.4/_strptime.py", line 337, in _strptime
    (data_string, format))
ValueError: time data '2008-09-03T20:56:35.450686Z' does not match format '%Y-%m-%dT%H:%M:%S.%f%z'
>>> datetime.strptime("2008-09-03T20:56:35.450686+05:00", "%Y-%m-%dT%H:%M:%S.%f%z")
Traceback (most recent call last):
  File "", line 1, in 
  File "/usr/lib/python3.4/_strptime.py", line 500, in _strptime_datetime
    tt, fraction = _strptime(data_string, format)
  File "/usr/lib/python3.4/_strptime.py", line 337, in _strptime
    (data_string, format))
ValueError: time data '2008-09-03T20:56:35.450686+05:00' does not match format '%Y-%m-%dT%H:%M:%S.%f%z'

(实际上,以上内容就是您在Python 3中看到的。在Python 2中,我们失败的原因更为简单,这是因为strptime%z在Python 2根本没有实现该指令。)

推荐使用以下strptime所有方法的多个答案都可以通过Z在其格式字符串中包含一个字面量来解决此问题,该字面量与Z问题质询者的示例datetime字符串中的匹配(并丢弃它,从而生成datetime没有时区的对象):

>>> datetime.strptime("2008-09-03T20:56:35.450686Z", "%Y-%m-%dT%H:%M:%S.%fZ")
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686)

由于这会丢弃原始datetime字符串中包含的时区信息,因此我们是否应该甚至将此结果都视为正确还值得怀疑。但更重要的是,由于此方法涉及将特定的UTC偏移量硬编码到格式字符串中,因此它将在尝试解析具有不同UTC偏移量的任何RFC 3339日期时间时将阻塞:

>>> datetime.strptime("2008-09-03T20:56:35.450686+05:00", "%Y-%m-%dT%H:%M:%S.%fZ")
Traceback (most recent call last):
  File "", line 1, in 
  File "/usr/lib/python3.4/_strptime.py", line 500, in _strptime_datetime
    tt, fraction = _strptime(data_string, format)
  File "/usr/lib/python3.4/_strptime.py", line 337, in _strptime
    (data_string, format))
ValueError: time data '2008-09-03T20:56:35.450686+05:00' does not match format '%Y-%m-%dT%H:%M:%S.%fZ'

除非您确定只需要在Zulu时间中支持RFC 3339日期时间,而不是具有其他时区偏移量的日期时间,请不要使用strptime。请改用此处答案中描述的许多其他方法之一。

Several answers here suggest using datetime.datetime.strptime to parse RFC 3339 or ISO 8601 datetimes with timezones, like the one exhibited in the question:

2008-09-03T20:56:35.450686Z

This is a bad idea.

Assuming that you want to support the full RFC 3339 format, including support for UTC offsets other than zero, then the code these answers suggest does not work. Indeed, it cannot work, because parsing RFC 3339 syntax using strptime is impossible. The format strings used by Python’s datetime module are incapable of describing RFC 3339 syntax.

The problem is UTC offsets. The RFC 3339 Internet Date/Time Format requires that every date-time includes a UTC offset, and that those offsets can either be Z (short for “Zulu time”) or in +HH:MM or -HH:MM format, like +05:00 or -10:30.

Consequently, these are all valid RFC 3339 datetimes:

  • 2008-09-03T20:56:35.450686Z
  • 2008-09-03T20:56:35.450686+05:00
  • 2008-09-03T20:56:35.450686-10:30

Alas, the format strings used by strptime and strftime have no directive that corresponds to UTC offsets in RFC 3339 format. A complete list of the directives they support can be found at https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior, and the only UTC offset directive included in the list is %z:

%z

UTC offset in the form +HHMM or -HHMM (empty string if the the object is naive).

Example: (empty), +0000, -0400, +1030

This doesn’t match the format of an RFC 3339 offset, and indeed if we try to use %z in the format string and parse an RFC 3339 date, we’ll fail:

>>> from datetime import datetime
>>> datetime.strptime("2008-09-03T20:56:35.450686Z", "%Y-%m-%dT%H:%M:%S.%f%z")
Traceback (most recent call last):
  File "", line 1, in 
  File "/usr/lib/python3.4/_strptime.py", line 500, in _strptime_datetime
    tt, fraction = _strptime(data_string, format)
  File "/usr/lib/python3.4/_strptime.py", line 337, in _strptime
    (data_string, format))
ValueError: time data '2008-09-03T20:56:35.450686Z' does not match format '%Y-%m-%dT%H:%M:%S.%f%z'
>>> datetime.strptime("2008-09-03T20:56:35.450686+05:00", "%Y-%m-%dT%H:%M:%S.%f%z")
Traceback (most recent call last):
  File "", line 1, in 
  File "/usr/lib/python3.4/_strptime.py", line 500, in _strptime_datetime
    tt, fraction = _strptime(data_string, format)
  File "/usr/lib/python3.4/_strptime.py", line 337, in _strptime
    (data_string, format))
ValueError: time data '2008-09-03T20:56:35.450686+05:00' does not match format '%Y-%m-%dT%H:%M:%S.%f%z'

(Actually, the above is just what you’ll see in Python 3. In Python 2 we’ll fail for an even simpler reason, which is that strptime does not implement the %z directive at all in Python 2.)

The multiple answers here that recommend strptime all work around this by including a literal Z in their format string, which matches the Z from the question asker’s example datetime string (and discards it, producing a datetime object without a timezone):

>>> datetime.strptime("2008-09-03T20:56:35.450686Z", "%Y-%m-%dT%H:%M:%S.%fZ")
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686)

Since this discards timezone information that was included in the original datetime string, it’s questionable whether we should regard even this result as correct. But more importantly, because this approach involves hard-coding a particular UTC offset into the format string, it will choke the moment it tries to parse any RFC 3339 datetime with a different UTC offset:

>>> datetime.strptime("2008-09-03T20:56:35.450686+05:00", "%Y-%m-%dT%H:%M:%S.%fZ")
Traceback (most recent call last):
  File "", line 1, in 
  File "/usr/lib/python3.4/_strptime.py", line 500, in _strptime_datetime
    tt, fraction = _strptime(data_string, format)
  File "/usr/lib/python3.4/_strptime.py", line 337, in _strptime
    (data_string, format))
ValueError: time data '2008-09-03T20:56:35.450686+05:00' does not match format '%Y-%m-%dT%H:%M:%S.%fZ'

Unless you’re certain that you only need to support RFC 3339 datetimes in Zulu time, and not ones with other timezone offsets, don’t use strptime. Use one of the many other approaches described in answers here instead.


回答 4

尝试使用iso8601模块;它正是这样做的。

python.org Wiki 上的WorkingWithTime页面上提到了其他几个选项。

Try the iso8601 module; it does exactly this.

There are several other options mentioned on the WorkingWithTime page on the python.org wiki.


回答 5

导入时间,日期时间
s =“ 2008-09-03T20:56:35.450686Z”
d = datetime.datetime(* map(int,re.split('[^ \ d]',s)[:-1]))
import re,datetime
s="2008-09-03T20:56:35.450686Z"
d=datetime.datetime(*map(int, re.split('[^\d]', s)[:-1]))

回答 6

您得到的确切错误是什么?像下面吗?

>>> datetime.datetime.strptime("2008-08-12T12:20:30.656234Z", "%Y-%m-%dT%H:%M:%S.Z")
ValueError: time data did not match format:  data=2008-08-12T12:20:30.656234Z  fmt=%Y-%m-%dT%H:%M:%S.Z

如果是,则可以在“。”上分割输入字符串,然后将微秒添加到您获得的日期时间。

尝试这个:

>>> def gt(dt_str):
        dt, _, us= dt_str.partition(".")
        dt= datetime.datetime.strptime(dt, "%Y-%m-%dT%H:%M:%S")
        us= int(us.rstrip("Z"), 10)
        return dt + datetime.timedelta(microseconds=us)

>>> gt("2008-08-12T12:20:30.656234Z")
datetime.datetime(2008, 8, 12, 12, 20, 30, 656234)

What is the exact error you get? Is it like the following?

>>> datetime.datetime.strptime("2008-08-12T12:20:30.656234Z", "%Y-%m-%dT%H:%M:%S.Z")
ValueError: time data did not match format:  data=2008-08-12T12:20:30.656234Z  fmt=%Y-%m-%dT%H:%M:%S.Z

If yes, you can split your input string on “.”, and then add the microseconds to the datetime you got.

Try this:

>>> def gt(dt_str):
        dt, _, us= dt_str.partition(".")
        dt= datetime.datetime.strptime(dt, "%Y-%m-%dT%H:%M:%S")
        us= int(us.rstrip("Z"), 10)
        return dt + datetime.timedelta(microseconds=us)

>>> gt("2008-08-12T12:20:30.656234Z")
datetime.datetime(2008, 8, 12, 12, 20, 30, 656234)

回答 7

从Python 3.7开始,strptime在UTC偏移量()中支持冒号分隔符。因此,您可以使用:

import datetime
datetime.datetime.strptime('2018-01-31T09:24:31.488670+00:00', '%Y-%m-%dT%H:%M:%S.%f%z')

编辑:

正如Martijn所指出的那样,如果您使用isoformat()创建了datetime对象,则只需使用datetime.fromisoformat()

Starting from Python 3.7, strptime supports colon delimiters in UTC offsets (source). So you can then use:

import datetime
datetime.datetime.strptime('2018-01-31T09:24:31.488670+00:00', '%Y-%m-%dT%H:%M:%S.%f%z')

EDIT:

As pointed out by Martijn, if you created the datetime object using isoformat(), you can simply use datetime.fromisoformat()


回答 8

如今,Arrow还可以用作第三方解决方案:

>>> import arrow
>>> date = arrow.get("2008-09-03T20:56:35.450686Z")
>>> date.datetime
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686, tzinfo=tzutc())

In these days, Arrow also can be used as a third-party solution:

>>> import arrow
>>> date = arrow.get("2008-09-03T20:56:35.450686Z")
>>> date.datetime
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686, tzinfo=tzutc())

回答 9

只需使用python-dateutil模块:

>>> import dateutil.parser as dp
>>> t = '1984-06-02T19:05:00.000Z'
>>> parsed_t = dp.parse(t)
>>> print(parsed_t)
datetime.datetime(1984, 6, 2, 19, 5, tzinfo=tzutc())

文献资料

Just use the python-dateutil module:

>>> import dateutil.parser as dp
>>> t = '1984-06-02T19:05:00.000Z'
>>> parsed_t = dp.parse(t)
>>> print(parsed_t)
datetime.datetime(1984, 6, 2, 19, 5, tzinfo=tzutc())

Documentation


回答 10

如果您不想使用dateutil,可以尝试以下功能:

def from_utc(utcTime,fmt="%Y-%m-%dT%H:%M:%S.%fZ"):
    """
    Convert UTC time string to time.struct_time
    """
    # change datetime.datetime to time, return time.struct_time type
    return datetime.datetime.strptime(utcTime, fmt)

测试:

from_utc("2007-03-04T21:08:12.123Z")

结果:

datetime.datetime(2007, 3, 4, 21, 8, 12, 123000)

If you don’t want to use dateutil, you can try this function:

def from_utc(utcTime,fmt="%Y-%m-%dT%H:%M:%S.%fZ"):
    """
    Convert UTC time string to time.struct_time
    """
    # change datetime.datetime to time, return time.struct_time type
    return datetime.datetime.strptime(utcTime, fmt)

Test:

from_utc("2007-03-04T21:08:12.123Z")

Result:

datetime.datetime(2007, 3, 4, 21, 8, 12, 123000)

回答 11

如果使用Django,它将提供dateparse模块,该模块接受一堆类似于ISO格式的格式,包括时区。

如果您不使用Django,并且不想使用此处提到的其他库之一,则可以将dateparse的Django源代码修改为您的项目。

If you are working with Django, it provides the dateparse module that accepts a bunch of formats similar to ISO format, including the time zone.

If you are not using Django and you don’t want to use one of the other libraries mentioned here, you could probably adapt the Django source code for dateparse to your project.


回答 12

我发现ciso8601是解析ISO 8601时间戳的最快方法。顾名思义,它是用C实现的。

import ciso8601
ciso8601.parse_datetime('2014-01-09T21:48:00.921000+05:30')

GitHub库自述相对于其他答案中列出的所有其他库显示了它们的> 10倍加速。

我的个人项目涉及很多ISO 8601解析。能够切换通话并加快10倍速度真是太好了。:)

编辑:我从此成为ciso8601的维护者。现在比以往更快!

I have found ciso8601 to be the fastest way to parse ISO 8601 timestamps. As the name suggests, it is implemented in C.

import ciso8601
ciso8601.parse_datetime('2014-01-09T21:48:00.921000+05:30')

The GitHub Repo README shows their >10x speedup versus all of the other libraries listed in the other answers.

My personal project involved a lot of ISO 8601 parsing. It was nice to be able to just switch the call and go 10x faster. :)

Edit: I have since become a maintainer of ciso8601. It’s now faster than ever!


回答 13

这适用于从Python 3.2开始的stdlib(假设所有时间戳均为UTC):

from datetime import datetime, timezone, timedelta
datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%fZ").replace(
    tzinfo=timezone(timedelta(0)))

例如,

>>> datetime.utcnow().replace(tzinfo=timezone(timedelta(0)))
... datetime.datetime(2015, 3, 11, 6, 2, 47, 879129, tzinfo=datetime.timezone.utc)

This works for stdlib on Python 3.2 onwards (assuming all the timestamps are UTC):

from datetime import datetime, timezone, timedelta
datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%fZ").replace(
    tzinfo=timezone(timedelta(0)))

For example,

>>> datetime.utcnow().replace(tzinfo=timezone(timedelta(0)))
... datetime.datetime(2015, 3, 11, 6, 2, 47, 879129, tzinfo=datetime.timezone.utc)

回答 14

我是iso8601 utils的作者。可以在GitHubPyPI 上找到它。这是解析示例的方法:

>>> from iso8601utils import parsers
>>> parsers.datetime('2008-09-03T20:56:35.450686Z')
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686)

I’m the author of iso8601 utils. It can be found on GitHub or on PyPI. Here’s how you can parse your example:

>>> from iso8601utils import parsers
>>> parsers.datetime('2008-09-03T20:56:35.450686Z')
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686)

回答 15

datetime.datetime在不安装第三方模块的情况下,在所有受支持的Python版本中将类似于ISO 8601的日期字符串转换为UNIX时间戳或对象的一种直接方法是使用SQLite日期解析器

#!/usr/bin/env python
from __future__ import with_statement, division, print_function
import sqlite3
import datetime

testtimes = [
    "2016-08-25T16:01:26.123456Z",
    "2016-08-25T16:01:29",
]
db = sqlite3.connect(":memory:")
c = db.cursor()
for timestring in testtimes:
    c.execute("SELECT strftime('%s', ?)", (timestring,))
    converted = c.fetchone()[0]
    print("%s is %s after epoch" % (timestring, converted))
    dt = datetime.datetime.fromtimestamp(int(converted))
    print("datetime is %s" % dt)

输出:

2016-08-25T16:01:26.123456Z is 1472140886 after epoch
datetime is 2016-08-25 12:01:26
2016-08-25T16:01:29 is 1472140889 after epoch
datetime is 2016-08-25 12:01:29

One straightforward way to convert an ISO 8601-like date string to a UNIX timestamp or datetime.datetime object in all supported Python versions without installing third-party modules is to use the date parser of SQLite.

#!/usr/bin/env python
from __future__ import with_statement, division, print_function
import sqlite3
import datetime

testtimes = [
    "2016-08-25T16:01:26.123456Z",
    "2016-08-25T16:01:29",
]
db = sqlite3.connect(":memory:")
c = db.cursor()
for timestring in testtimes:
    c.execute("SELECT strftime('%s', ?)", (timestring,))
    converted = c.fetchone()[0]
    print("%s is %s after epoch" % (timestring, converted))
    dt = datetime.datetime.fromtimestamp(int(converted))
    print("datetime is %s" % dt)

Output:

2016-08-25T16:01:26.123456Z is 1472140886 after epoch
datetime is 2016-08-25 12:01:26
2016-08-25T16:01:29 is 1472140889 after epoch
datetime is 2016-08-25 12:01:29

回答 16

我已经为ISO 8601标准编写了一个解析器,并将其放在GitHub上:https : //github.com/boxed/iso8601。此实现支持规范中的所有内容,但持续时间,间隔,周期性间隔和日期不在Python datetime模块支持的日期范围内。

测试包括在内!:P

I’ve coded up a parser for the ISO 8601 standard and put it on GitHub: https://github.com/boxed/iso8601. This implementation supports everything in the specification except for durations, intervals, periodic intervals, and dates outside the supported date range of Python’s datetime module.

Tests are included! :P


回答 17

Django的parse_datetime()函数支持带有UTC偏移量的日期:

parse_datetime('2016-08-09T15:12:03.65478Z') =
datetime.datetime(2016, 8, 9, 15, 12, 3, 654780, tzinfo=<UTC>)

因此,它可用于解析整个项目中字段中的ISO 8601日期:

from django.utils import formats
from django.forms.fields import DateTimeField
from django.utils.dateparse import parse_datetime

class DateTimeFieldFixed(DateTimeField):
    def strptime(self, value, format):
        if format == 'iso-8601':
            return parse_datetime(value)
        return super().strptime(value, format)

DateTimeField.strptime = DateTimeFieldFixed.strptime
formats.ISO_INPUT_FORMATS['DATETIME_INPUT_FORMATS'].insert(0, 'iso-8601')

Django’s parse_datetime() function supports dates with UTC offsets:

parse_datetime('2016-08-09T15:12:03.65478Z') =
datetime.datetime(2016, 8, 9, 15, 12, 3, 654780, tzinfo=<UTC>)

So it could be used for parsing ISO 8601 dates in fields within entire project:

from django.utils import formats
from django.forms.fields import DateTimeField
from django.utils.dateparse import parse_datetime

class DateTimeFieldFixed(DateTimeField):
    def strptime(self, value, format):
        if format == 'iso-8601':
            return parse_datetime(value)
        return super().strptime(value, format)

DateTimeField.strptime = DateTimeFieldFixed.strptime
formats.ISO_INPUT_FORMATS['DATETIME_INPUT_FORMATS'].insert(0, 'iso-8601')

回答 18


如果您只想使用带有Z后缀的UTC的基本情况,例如2016-06-29T19:36:29.3453Z

datetime.datetime.strptime(timestamp.translate(None, ':-'), "%Y%m%dT%H%M%S.%fZ")


如果您想处理时区偏移 2016-06-29T19:36:29.3453-0400,请2008-09-03T20:56:35.450686+05:00使用以下方法。这些将所有变体转换成没有变量定界符的东西,例如 20080903T205635.450686+0500使其更一致/更容易解析。

import re
# this regex removes all colons and all 
# dashes EXCEPT for the dash indicating + or - utc offset for the timezone
conformed_timestamp = re.sub(r"[:]|([-](?!((\d{2}[:]\d{2})|(\d{4}))$))", '', timestamp)
datetime.datetime.strptime(conformed_timestamp, "%Y%m%dT%H%M%S.%f%z" )


如果您的系统不支持%zstrptime指令(您看到类似的信息ValueError: 'z' is a bad directive in format '%Y%m%dT%H%M%S.%f%z'),那么您需要手动将时间与Z(UTC)相抵消。注意%z在python版本<3中可能无法在您的系统上运行,因为它取决于c库支持,该支持因系统/ python构建类型(即Jython,Cython等)而异。

import re
import datetime

# this regex removes all colons and all 
# dashes EXCEPT for the dash indicating + or - utc offset for the timezone
conformed_timestamp = re.sub(r"[:]|([-](?!((\d{2}[:]\d{2})|(\d{4}))$))", '', timestamp)

# split on the offset to remove it. use a capture group to keep the delimiter
split_timestamp = re.split(r"[+|-]",conformed_timestamp)
main_timestamp = split_timestamp[0]
if len(split_timestamp) == 3:
    sign = split_timestamp[1]
    offset = split_timestamp[2]
else:
    sign = None
    offset = None

# generate the datetime object without the offset at UTC time
output_datetime = datetime.datetime.strptime(main_timestamp +"Z", "%Y%m%dT%H%M%S.%fZ" )
if offset:
    # create timedelta based on offset
    offset_delta = datetime.timedelta(hours=int(sign+offset[:-2]), minutes=int(sign+offset[-2:]))
    # offset datetime with timedelta
    output_datetime = output_datetime + offset_delta


If you just want a basic case that work for UTC with the Z suffix like 2016-06-29T19:36:29.3453Z:
datetime.datetime.strptime(timestamp.translate(None, ':-'), "%Y%m%dT%H%M%S.%fZ")


If you want to handle timezone offsets like 2016-06-29T19:36:29.3453-0400 or 2008-09-03T20:56:35.450686+05:00 use the following. These will convert all variations into something without variable delimiters like 20080903T205635.450686+0500 making it more consistent/easier to parse.
import re
# this regex removes all colons and all 
# dashes EXCEPT for the dash indicating + or - utc offset for the timezone
conformed_timestamp = re.sub(r"[:]|([-](?!((\d{2}[:]\d{2})|(\d{4}))$))", '', timestamp)
datetime.datetime.strptime(conformed_timestamp, "%Y%m%dT%H%M%S.%f%z" )


If your system does not support the %z strptime directive (you see something like ValueError: 'z' is a bad directive in format '%Y%m%dT%H%M%S.%f%z') then you need to manually offset the time from Z (UTC). Note %z may not work on your system in python versions < 3 as it depended on the c library support which varies across system/python build type (i.e. Jython, Cython, etc.).
import re
import datetime

# this regex removes all colons and all 
# dashes EXCEPT for the dash indicating + or - utc offset for the timezone
conformed_timestamp = re.sub(r"[:]|([-](?!((\d{2}[:]\d{2})|(\d{4}))$))", '', timestamp)

# split on the offset to remove it. use a capture group to keep the delimiter
split_timestamp = re.split(r"[+|-]",conformed_timestamp)
main_timestamp = split_timestamp[0]
if len(split_timestamp) == 3:
    sign = split_timestamp[1]
    offset = split_timestamp[2]
else:
    sign = None
    offset = None

# generate the datetime object without the offset at UTC time
output_datetime = datetime.datetime.strptime(main_timestamp +"Z", "%Y%m%dT%H%M%S.%fZ" )
if offset:
    # create timedelta based on offset
    offset_delta = datetime.timedelta(hours=int(sign+offset[:-2]), minutes=int(sign+offset[-2:]))
    # offset datetime with timedelta
    output_datetime = output_datetime + offset_delta

回答 19

对于适用于2.X标准库的内容,请尝试:

calendar.timegm(time.strptime(date.split(".")[0]+"UTC", "%Y-%m-%dT%H:%M:%S%Z"))

calendar.timegm是time.mktime缺少的gm版本。

For something that works with the 2.X standard library try:

calendar.timegm(time.strptime(date.split(".")[0]+"UTC", "%Y-%m-%dT%H:%M:%S%Z"))

calendar.timegm is the missing gm version of time.mktime.


回答 20

如果解析无效的日期字符串,则python-dateutil将引发异常,因此您可能想捕获该异常。

from dateutil import parser
ds = '2012-60-31'
try:
  dt = parser.parse(ds)
except ValueError, e:
  print '"%s" is an invalid date' % ds

The python-dateutil will throw an exception if parsing invalid date strings, so you may want to catch the exception.

from dateutil import parser
ds = '2012-60-31'
try:
  dt = parser.parse(ds)
except ValueError, e:
  print '"%s" is an invalid date' % ds

回答 21

如今,流行的“请求:HTTP for Humans™”软件包的作者发表了《Maya:Datetimes for Humans™》

>>> import maya
>>> str = '2008-09-03T20:56:35.450686Z'
>>> maya.MayaDT.from_rfc3339(str).datetime()
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686, tzinfo=<UTC>)

Nowadays there’s Maya: Datetimes for Humans™, from the author of the popular Requests: HTTP for Humans™ package:

>>> import maya
>>> str = '2008-09-03T20:56:35.450686Z'
>>> maya.MayaDT.from_rfc3339(str).datetime()
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686, tzinfo=<UTC>)

回答 22

对ISO-8601使用专门的解析器的另一种方法是使用dateutil解析器的isoparse函数:

from dateutil import parser

date = parser.isoparse("2008-09-03T20:56:35.450686+01:00")
print(date)

输出:

2008-09-03 20:56:35.450686+01:00

标准Python函数datetime.fromisoformat文档中也提到了此函数

第三方软件包dateutil中提供了功能更全的ISO 8601解析器dateutil.parser.isoparse。

An another way is to use specialized parser for ISO-8601 is to use isoparse function of dateutil parser:

from dateutil import parser

date = parser.isoparse("2008-09-03T20:56:35.450686+01:00")
print(date)

Output:

2008-09-03 20:56:35.450686+01:00

This function is also mentioned in the documentation for the standard Python function datetime.fromisoformat:

A more full-featured ISO 8601 parser, dateutil.parser.isoparse is available in the third-party package dateutil.


回答 23

多亏了马克·阿默里(Mark Amery)的出色回答,我设计了函数来说明所有可能的日期时间ISO格式:

class FixedOffset(tzinfo):
    """Fixed offset in minutes: `time = utc_time + utc_offset`."""
    def __init__(self, offset):
        self.__offset = timedelta(minutes=offset)
        hours, minutes = divmod(offset, 60)
        #NOTE: the last part is to remind about deprecated POSIX GMT+h timezones
        #  that have the opposite sign in the name;
        #  the corresponding numeric value is not used e.g., no minutes
        self.__name = '<%+03d%02d>%+d' % (hours, minutes, -hours)
    def utcoffset(self, dt=None):
        return self.__offset
    def tzname(self, dt=None):
        return self.__name
    def dst(self, dt=None):
        return timedelta(0)
    def __repr__(self):
        return 'FixedOffset(%d)' % (self.utcoffset().total_seconds() / 60)
    def __getinitargs__(self):
        return (self.__offset.total_seconds()/60,)

def parse_isoformat_datetime(isodatetime):
    try:
        return datetime.strptime(isodatetime, '%Y-%m-%dT%H:%M:%S.%f')
    except ValueError:
        pass
    try:
        return datetime.strptime(isodatetime, '%Y-%m-%dT%H:%M:%S')
    except ValueError:
        pass
    pat = r'(.*?[+-]\d{2}):(\d{2})'
    temp = re.sub(pat, r'\1\2', isodatetime)
    naive_date_str = temp[:-5]
    offset_str = temp[-5:]
    naive_dt = datetime.strptime(naive_date_str, '%Y-%m-%dT%H:%M:%S.%f')
    offset = int(offset_str[-4:-2])*60 + int(offset_str[-2:])
    if offset_str[0] == "-":
        offset = -offset
    return naive_dt.replace(tzinfo=FixedOffset(offset))

Thanks to great Mark Amery’s answer I devised function to account for all possible ISO formats of datetime:

class FixedOffset(tzinfo):
    """Fixed offset in minutes: `time = utc_time + utc_offset`."""
    def __init__(self, offset):
        self.__offset = timedelta(minutes=offset)
        hours, minutes = divmod(offset, 60)
        #NOTE: the last part is to remind about deprecated POSIX GMT+h timezones
        #  that have the opposite sign in the name;
        #  the corresponding numeric value is not used e.g., no minutes
        self.__name = '<%+03d%02d>%+d' % (hours, minutes, -hours)
    def utcoffset(self, dt=None):
        return self.__offset
    def tzname(self, dt=None):
        return self.__name
    def dst(self, dt=None):
        return timedelta(0)
    def __repr__(self):
        return 'FixedOffset(%d)' % (self.utcoffset().total_seconds() / 60)
    def __getinitargs__(self):
        return (self.__offset.total_seconds()/60,)

def parse_isoformat_datetime(isodatetime):
    try:
        return datetime.strptime(isodatetime, '%Y-%m-%dT%H:%M:%S.%f')
    except ValueError:
        pass
    try:
        return datetime.strptime(isodatetime, '%Y-%m-%dT%H:%M:%S')
    except ValueError:
        pass
    pat = r'(.*?[+-]\d{2}):(\d{2})'
    temp = re.sub(pat, r'\1\2', isodatetime)
    naive_date_str = temp[:-5]
    offset_str = temp[-5:]
    naive_dt = datetime.strptime(naive_date_str, '%Y-%m-%dT%H:%M:%S.%f')
    offset = int(offset_str[-4:-2])*60 + int(offset_str[-2:])
    if offset_str[0] == "-":
        offset = -offset
    return naive_dt.replace(tzinfo=FixedOffset(offset))

回答 24

def parseISO8601DateTime(datetimeStr):
    import time
    from datetime import datetime, timedelta

    def log_date_string(when):
        gmt = time.gmtime(when)
        if time.daylight and gmt[8]:
            tz = time.altzone
        else:
            tz = time.timezone
        if tz > 0:
            neg = 1
        else:
            neg = 0
            tz = -tz
        h, rem = divmod(tz, 3600)
        m, rem = divmod(rem, 60)
        if neg:
            offset = '-%02d%02d' % (h, m)
        else:
            offset = '+%02d%02d' % (h, m)

        return time.strftime('%d/%b/%Y:%H:%M:%S ', gmt) + offset

    dt = datetime.strptime(datetimeStr, '%Y-%m-%dT%H:%M:%S.%fZ')
    timestamp = dt.timestamp()
    return dt + timedelta(hours=dt.hour-time.gmtime(timestamp).tm_hour)

请注意,如果字符串不以结尾Z,我们应该使用进行解析%z

def parseISO8601DateTime(datetimeStr):
    import time
    from datetime import datetime, timedelta

    def log_date_string(when):
        gmt = time.gmtime(when)
        if time.daylight and gmt[8]:
            tz = time.altzone
        else:
            tz = time.timezone
        if tz > 0:
            neg = 1
        else:
            neg = 0
            tz = -tz
        h, rem = divmod(tz, 3600)
        m, rem = divmod(rem, 60)
        if neg:
            offset = '-%02d%02d' % (h, m)
        else:
            offset = '+%02d%02d' % (h, m)

        return time.strftime('%d/%b/%Y:%H:%M:%S ', gmt) + offset

    dt = datetime.strptime(datetimeStr, '%Y-%m-%dT%H:%M:%S.%fZ')
    timestamp = dt.timestamp()
    return dt + timedelta(hours=dt.hour-time.gmtime(timestamp).tm_hour)

Note that we should look if the string doesn’t ends with Z, we could parse using %z.


回答 25

最初我尝试使用:

from operator import neg, pos
from time import strptime, mktime
from datetime import datetime, tzinfo, timedelta

class MyUTCOffsetTimezone(tzinfo):
    @staticmethod
    def with_offset(offset_no_signal, signal):  # type: (str, str) -> MyUTCOffsetTimezone
        return MyUTCOffsetTimezone((pos if signal == '+' else neg)(
            (datetime.strptime(offset_no_signal, '%H:%M') - datetime(1900, 1, 1))
          .total_seconds()))

    def __init__(self, offset, name=None):
        self.offset = timedelta(seconds=offset)
        self.name = name or self.__class__.__name__

    def utcoffset(self, dt):
        return self.offset

    def tzname(self, dt):
        return self.name

    def dst(self, dt):
        return timedelta(0)


def to_datetime_tz(dt):  # type: (str) -> datetime
    fmt = '%Y-%m-%dT%H:%M:%S.%f'
    if dt[-6] in frozenset(('+', '-')):
        dt, sign, offset = strptime(dt[:-6], fmt), dt[-6], dt[-5:]
        return datetime.fromtimestamp(mktime(dt),
                                      tz=MyUTCOffsetTimezone.with_offset(offset, sign))
    elif dt[-1] == 'Z':
        return datetime.strptime(dt, fmt + 'Z')
    return datetime.strptime(dt, fmt)

但这不适用于负时区。但是我在Python 3.7.3中工作得很好:

from datetime import datetime


def to_datetime_tz(dt):  # type: (str) -> datetime
    fmt = '%Y-%m-%dT%H:%M:%S.%f'
    if dt[-6] in frozenset(('+', '-')):
        return datetime.strptime(dt, fmt + '%z')
    elif dt[-1] == 'Z':
        return datetime.strptime(dt, fmt + 'Z')
    return datetime.strptime(dt, fmt)

在某些测试中,请注意输出仅相差微秒。在我的机器上达到6位精度,但是YMMV:

for dt_in, dt_out in (
        ('2019-03-11T08:00:00.000Z', '2019-03-11T08:00:00'),
        ('2019-03-11T08:00:00.000+11:00', '2019-03-11T08:00:00+11:00'),
        ('2019-03-11T08:00:00.000-11:00', '2019-03-11T08:00:00-11:00')
    ):
    isoformat = to_datetime_tz(dt_in).isoformat()
    assert isoformat == dt_out, '{} != {}'.format(isoformat, dt_out)

Initially I tried with:

from operator import neg, pos
from time import strptime, mktime
from datetime import datetime, tzinfo, timedelta

class MyUTCOffsetTimezone(tzinfo):
    @staticmethod
    def with_offset(offset_no_signal, signal):  # type: (str, str) -> MyUTCOffsetTimezone
        return MyUTCOffsetTimezone((pos if signal == '+' else neg)(
            (datetime.strptime(offset_no_signal, '%H:%M') - datetime(1900, 1, 1))
          .total_seconds()))

    def __init__(self, offset, name=None):
        self.offset = timedelta(seconds=offset)
        self.name = name or self.__class__.__name__

    def utcoffset(self, dt):
        return self.offset

    def tzname(self, dt):
        return self.name

    def dst(self, dt):
        return timedelta(0)


def to_datetime_tz(dt):  # type: (str) -> datetime
    fmt = '%Y-%m-%dT%H:%M:%S.%f'
    if dt[-6] in frozenset(('+', '-')):
        dt, sign, offset = strptime(dt[:-6], fmt), dt[-6], dt[-5:]
        return datetime.fromtimestamp(mktime(dt),
                                      tz=MyUTCOffsetTimezone.with_offset(offset, sign))
    elif dt[-1] == 'Z':
        return datetime.strptime(dt, fmt + 'Z')
    return datetime.strptime(dt, fmt)

But that didn’t work on negative timezones. This however I got working fine, in Python 3.7.3:

from datetime import datetime


def to_datetime_tz(dt):  # type: (str) -> datetime
    fmt = '%Y-%m-%dT%H:%M:%S.%f'
    if dt[-6] in frozenset(('+', '-')):
        return datetime.strptime(dt, fmt + '%z')
    elif dt[-1] == 'Z':
        return datetime.strptime(dt, fmt + 'Z')
    return datetime.strptime(dt, fmt)

Some tests, note that the out only differs by precision of microseconds. Got to 6 digits of precision on my machine, but YMMV:

for dt_in, dt_out in (
        ('2019-03-11T08:00:00.000Z', '2019-03-11T08:00:00'),
        ('2019-03-11T08:00:00.000+11:00', '2019-03-11T08:00:00+11:00'),
        ('2019-03-11T08:00:00.000-11:00', '2019-03-11T08:00:00-11:00')
    ):
    isoformat = to_datetime_tz(dt_in).isoformat()
    assert isoformat == dt_out, '{} != {}'.format(isoformat, dt_out)

声明:本站所有文章,如无特殊说明或标注,均为本站原创发布。任何个人或组织,在未征得本站同意时,禁止复制、盗用、采集、发布本站内容到任何网站、书籍等各类媒体平台。如若本站内容侵犯了原著者的合法权益,可联系我们进行处理。