问题:将字符串转换为日期时间

我有大量的日期时间列表,例如字符串:

Jun 1 2005  1:33PM
Aug 28 1999 12:00AM

我将把它们推回到数据库中正确的日期时间字段中,因此我需要将它们魔术化为实际的日期时间对象。

这是通过Django的ORM进行的,因此我无法使用SQL进行插入时的转换。

I’ve got a huge list of date-times like this as strings:

Jun 1 2005  1:33PM
Aug 28 1999 12:00AM

I’m going to be shoving these back into proper datetime fields in a database so I need to magic them into real datetime objects.

This is going through Django’s ORM so I can’t use SQL to do the conversion on insert.


回答 0

datetime.strptime是将字符串解析为日期时间的主要例程。它可以处理各种格式,格式由您为其指定的格式字符串确定:

from datetime import datetime

datetime_object = datetime.strptime('Jun 1 2005  1:33PM', '%b %d %Y %I:%M%p')

生成的datetime对象是时区未使用的。

链接:

笔记:

  • strptime =“字符串解析时间”
  • strftime =“字符串格式时间”
  • 今天大声发音,您将在6个月内无需再次搜索。

datetime.strptime is the main routine for parsing strings into datetimes. It can handle all sorts of formats, with the format determined by a format string you give it:

from datetime import datetime

datetime_object = datetime.strptime('Jun 1 2005  1:33PM', '%b %d %Y %I:%M%p')

The resulting datetime object is timezone-naive.

Links:

Notes:

  • strptime = “string parse time”
  • strftime = “string format time”
  • Pronounce it out loud today & you won’t have to search for it again in 6 months.

回答 1

使用第三方dateutil库:

from dateutil import parser
parser.parse("Aug 28 1999 12:00AM")  # datetime.datetime(1999, 8, 28, 0, 0)

它可以处理大多数日期格式,包括您需要解析的格式。它比strptime大多数时候都可以猜测正确的格式要方便得多。

这对于编写测试非常有用,在测试中,可读性比性能更重要。

您可以使用以下方法安装它:

pip install python-dateutil

Use the third party dateutil library:

from dateutil import parser
parser.parse("Aug 28 1999 12:00AM")  # datetime.datetime(1999, 8, 28, 0, 0)

It can handle most date formats, including the one you need to parse. It’s more convenient than strptime as it can guess the correct format most of the time.

It’s very useful for writing tests, where readability is more important than performance.

You can install it with:

pip install python-dateutil

回答 2

时间模块中strptime。它与strftime相反。

$ python
>>> import time
>>> my_time = time.strptime('Jun 1 2005  1:33PM', '%b %d %Y %I:%M%p')
time.struct_time(tm_year=2005, tm_mon=6, tm_mday=1,
                 tm_hour=13, tm_min=33, tm_sec=0,
                 tm_wday=2, tm_yday=152, tm_isdst=-1)

timestamp = time.mktime(my_time)
# convert time object to datetime
from datetime import datetime
my_datetime = datetime.fromtimestamp(timestamp)
# convert time object to date
from datetime import date
my_date = date.fromtimestamp(timestamp)

Check out strptime in the time module. It is the inverse of strftime.

$ python
>>> import time
>>> my_time = time.strptime('Jun 1 2005  1:33PM', '%b %d %Y %I:%M%p')
time.struct_time(tm_year=2005, tm_mon=6, tm_mday=1,
                 tm_hour=13, tm_min=33, tm_sec=0,
                 tm_wday=2, tm_yday=152, tm_isdst=-1)

timestamp = time.mktime(my_time)
# convert time object to datetime
from datetime import datetime
my_datetime = datetime.fromtimestamp(timestamp)
# convert time object to date
from datetime import date
my_date = date.fromtimestamp(timestamp)

回答 3

我整理了一个可以转换一些真正简洁的表达式的项目。查看时间字符串

以下是一些示例:

pip install timestring
>>> import timestring
>>> timestring.Date('monday, aug 15th 2015 at 8:40 pm')
<timestring.Date 2015-08-15 20:40:00 4491909392>
>>> timestring.Date('monday, aug 15th 2015 at 8:40 pm').date
datetime.datetime(2015, 8, 15, 20, 40)
>>> timestring.Range('next week')
<timestring.Range From 03/10/14 00:00:00 to 03/03/14 00:00:00 4496004880>
>>> (timestring.Range('next week').start.date, timestring.Range('next week').end.date)
(datetime.datetime(2014, 3, 10, 0, 0), datetime.datetime(2014, 3, 14, 0, 0))

I have put together a project that can convert some really neat expressions. Check out timestring.

Here are some examples below:

pip install timestring
>>> import timestring
>>> timestring.Date('monday, aug 15th 2015 at 8:40 pm')
<timestring.Date 2015-08-15 20:40:00 4491909392>
>>> timestring.Date('monday, aug 15th 2015 at 8:40 pm').date
datetime.datetime(2015, 8, 15, 20, 40)
>>> timestring.Range('next week')
<timestring.Range From 03/10/14 00:00:00 to 03/03/14 00:00:00 4496004880>
>>> (timestring.Range('next week').start.date, timestring.Range('next week').end.date)
(datetime.datetime(2014, 3, 10, 0, 0), datetime.datetime(2014, 3, 14, 0, 0))

回答 4

记住这一点,您无需再次对日期时间转换感到困惑。

日期时间对象的字符串= strptime

datetime对象为其他格式= strftime

Jun 1 2005 1:33PM

等于

%b %d %Y %I:%M%p

%b月作为语言环境的缩写名称(六月)

%d月中的一天,以零填充的十进制数字(1)

%Y以世纪为十进制数字的年份(2015)

%I小时(12小时制),为零填充的十进制数字(01)

%M分钟,为零填充的十进制数字(33)

等同于AM或PM(PM)的%p语言环境

所以你需要strptime即转换string

>>> dates = []
>>> dates.append('Jun 1 2005  1:33PM')
>>> dates.append('Aug 28 1999 12:00AM')
>>> from datetime import datetime
>>> for d in dates:
...     date = datetime.strptime(d, '%b %d %Y %I:%M%p')
...     print type(date)
...     print date
... 

输出量

<type 'datetime.datetime'>
2005-06-01 13:33:00
<type 'datetime.datetime'>
1999-08-28 00:00:00

如果日期格式不同,可以使用panda或dateutil.parse怎么办?

>>> import dateutil
>>> dates = []
>>> dates.append('12 1 2017')
>>> dates.append('1 1 2017')
>>> dates.append('1 12 2017')
>>> dates.append('June 1 2017 1:30:00AM')
>>> [parser.parse(x) for x in dates]

输出

[datetime.datetime(2017, 12, 1, 0, 0), datetime.datetime(2017, 1, 1, 0, 0), datetime.datetime(2017, 1, 12, 0, 0), datetime.datetime(2017, 6, 1, 1, 30)]

Remember this and you didn’t need to get confused in datetime conversion again.

String to datetime object = strptime

datetime object to other formats = strftime

Jun 1 2005 1:33PM

is equals to

%b %d %Y %I:%M%p

%b Month as locale’s abbreviated name(Jun)

%d Day of the month as a zero-padded decimal number(1)

%Y Year with century as a decimal number(2015)

%I Hour (12-hour clock) as a zero-padded decimal number(01)

%M Minute as a zero-padded decimal number(33)

%p Locale’s equivalent of either AM or PM(PM)

so you need strptime i-e converting string to

>>> dates = []
>>> dates.append('Jun 1 2005  1:33PM')
>>> dates.append('Aug 28 1999 12:00AM')
>>> from datetime import datetime
>>> for d in dates:
...     date = datetime.strptime(d, '%b %d %Y %I:%M%p')
...     print type(date)
...     print date
... 

Output

<type 'datetime.datetime'>
2005-06-01 13:33:00
<type 'datetime.datetime'>
1999-08-28 00:00:00

What if you have different format of dates you can use panda or dateutil.parse

>>> import dateutil
>>> dates = []
>>> dates.append('12 1 2017')
>>> dates.append('1 1 2017')
>>> dates.append('1 12 2017')
>>> dates.append('June 1 2017 1:30:00AM')
>>> [parser.parse(x) for x in dates]

OutPut

[datetime.datetime(2017, 12, 1, 0, 0), datetime.datetime(2017, 1, 1, 0, 0), datetime.datetime(2017, 1, 12, 0, 0), datetime.datetime(2017, 6, 1, 1, 30)]

回答 5

在Python> = 3.7.0中,

转换YYYY-MM-DD字符串DateTime对象datetime.fromisoformat都可以使用。

>>> from datetime import datetime

>>> date_string = "2012-12-12 10:10:10"
>>> print (datetime.fromisoformat(date_string))
>>> 2012-12-12 10:10:10

In Python >= 3.7.0,

to convert YYYY-MM-DD string to datetime object, datetime.fromisoformat could be used.

>>> from datetime import datetime

>>> date_string = "2012-12-12 10:10:10"
>>> print (datetime.fromisoformat(date_string))
>>> 2012-12-12 10:10:10

回答 6

许多时间戳都有一个隐含的时区。为了确保您的代码在每个时区都能工作,您应该在内部使用UTC,并在每次异物进入系统时都附加一个时区。

Python 3.2+:

>>> datetime.datetime.strptime(
...     "March 5, 2014, 20:13:50", "%B %d, %Y, %H:%M:%S"
... ).replace(tzinfo=datetime.timezone(datetime.timedelta(hours=-3)))

Many timestamps have an implied timezone. To ensure that your code will work in every timezone, you should use UTC internally and attach a timezone each time a foreign object enters the system.

Python 3.2+:

>>> datetime.datetime.strptime(
...     "March 5, 2014, 20:13:50", "%B %d, %Y, %H:%M:%S"
... ).replace(tzinfo=datetime.timezone(datetime.timedelta(hours=-3)))

回答 7

这是两个使用Pandas将格式为字符串的日期转换为datetime.date对象的解决方案。

import pandas as pd

dates = ['2015-12-25', '2015-12-26']

# 1) Use a list comprehension.
>>> [d.date() for d in pd.to_datetime(dates)]
[datetime.date(2015, 12, 25), datetime.date(2015, 12, 26)]

# 2) Convert the dates to a DatetimeIndex and extract the python dates.
>>> pd.DatetimeIndex(dates).date.tolist()
[datetime.date(2015, 12, 25), datetime.date(2015, 12, 26)]

时机

dates = pd.DatetimeIndex(start='2000-1-1', end='2010-1-1', freq='d').date.tolist()

>>> %timeit [d.date() for d in pd.to_datetime(dates)]
# 100 loops, best of 3: 3.11 ms per loop

>>> %timeit pd.DatetimeIndex(dates).date.tolist()
# 100 loops, best of 3: 6.85 ms per loop

这是如何转换OP的原始日期时间示例:

datetimes = ['Jun 1 2005  1:33PM', 'Aug 28 1999 12:00AM']

>>> pd.to_datetime(datetimes).to_pydatetime().tolist()
[datetime.datetime(2005, 6, 1, 13, 33), 
 datetime.datetime(1999, 8, 28, 0, 0)]

使用可以从字符串转换为Pandas Timestamps有很多选项to_datetime,因此请检查文档如果需要任何特殊。

同样,时间戳除了具有许多可访问的属性和方法外,.date

Here are two solutions using Pandas to convert dates formatted as strings into datetime.date objects.

import pandas as pd

dates = ['2015-12-25', '2015-12-26']

# 1) Use a list comprehension.
>>> [d.date() for d in pd.to_datetime(dates)]
[datetime.date(2015, 12, 25), datetime.date(2015, 12, 26)]

# 2) Convert the dates to a DatetimeIndex and extract the python dates.
>>> pd.DatetimeIndex(dates).date.tolist()
[datetime.date(2015, 12, 25), datetime.date(2015, 12, 26)]

Timings

dates = pd.DatetimeIndex(start='2000-1-1', end='2010-1-1', freq='d').date.tolist()

>>> %timeit [d.date() for d in pd.to_datetime(dates)]
# 100 loops, best of 3: 3.11 ms per loop

>>> %timeit pd.DatetimeIndex(dates).date.tolist()
# 100 loops, best of 3: 6.85 ms per loop

And here is how to convert the OP’s original date-time examples:

datetimes = ['Jun 1 2005  1:33PM', 'Aug 28 1999 12:00AM']

>>> pd.to_datetime(datetimes).to_pydatetime().tolist()
[datetime.datetime(2005, 6, 1, 13, 33), 
 datetime.datetime(1999, 8, 28, 0, 0)]

There are many options for converting from the strings to Pandas Timestamps using to_datetime, so check the docs if you need anything special.

Likewise, Timestamps have many properties and methods that can be accessed in addition to .date


回答 8

我个人喜欢使用parser模块的解决方案,这是该问题的第二个答案,而且很漂亮,因为您不必构造任何字符串文字即可使其工作。但是,缺点是它比接受的答案慢90%strptime

from dateutil import parser
from datetime import datetime
import timeit

def dt():
    dt = parser.parse("Jun 1 2005  1:33PM")
def strptime():
    datetime_object = datetime.strptime('Jun 1 2005  1:33PM', '%b %d %Y %I:%M%p')

print(timeit.timeit(stmt=dt, number=10**5))
print(timeit.timeit(stmt=strptime, number=10**5))
>10.70296801342902
>1.3627995655316933

只要你是不是做这个一百万一遍又一遍的时间,我还是觉得parser方法是更方便,会自动处理大部分的时间格式。

I personally like the solution using the parser module, which is the second Answer to this question and is beautiful, as you don’t have to construct any string literals to get it working. BUT, one downside is that it is 90% slower than the accepted answer with strptime.

from dateutil import parser
from datetime import datetime
import timeit

def dt():
    dt = parser.parse("Jun 1 2005  1:33PM")
def strptime():
    datetime_object = datetime.strptime('Jun 1 2005  1:33PM', '%b %d %Y %I:%M%p')

print(timeit.timeit(stmt=dt, number=10**5))
print(timeit.timeit(stmt=strptime, number=10**5))
>10.70296801342902
>1.3627995655316933

As long as you are not doing this a million times over and over again, I still think the parser method is more convenient and will handle most of the time formats automatically.


回答 9

这里没有提到但有用的东西:在一天中添加一个后缀。我解耦了后缀逻辑,以便您可以将其用于任何您喜欢的数字,而不仅仅是日期。

import time

def num_suffix(n):
    '''
    Returns the suffix for any given int
    '''
    suf = ('th','st', 'nd', 'rd')
    n = abs(n) # wise guy
    tens = int(str(n)[-2:])
    units = n % 10
    if tens > 10 and tens < 20:
        return suf[0] # teens with 'th'
    elif units <= 3:
        return suf[units]
    else:
        return suf[0] # 'th'

def day_suffix(t):
    '''
    Returns the suffix of the given struct_time day
    '''
    return num_suffix(t.tm_mday)

# Examples
print num_suffix(123)
print num_suffix(3431)
print num_suffix(1234)
print ''
print day_suffix(time.strptime("1 Dec 00", "%d %b %y"))
print day_suffix(time.strptime("2 Nov 01", "%d %b %y"))
print day_suffix(time.strptime("3 Oct 02", "%d %b %y"))
print day_suffix(time.strptime("4 Sep 03", "%d %b %y"))
print day_suffix(time.strptime("13 Nov 90", "%d %b %y"))
print day_suffix(time.strptime("14 Oct 10", "%d %b %y"))​​​​​​​

Something that isn’t mentioned here and is useful: adding a suffix to the day. I decoupled the suffix logic so you can use it for any number you like, not just dates.

import time

def num_suffix(n):
    '''
    Returns the suffix for any given int
    '''
    suf = ('th','st', 'nd', 'rd')
    n = abs(n) # wise guy
    tens = int(str(n)[-2:])
    units = n % 10
    if tens > 10 and tens < 20:
        return suf[0] # teens with 'th'
    elif units <= 3:
        return suf[units]
    else:
        return suf[0] # 'th'

def day_suffix(t):
    '''
    Returns the suffix of the given struct_time day
    '''
    return num_suffix(t.tm_mday)

# Examples
print num_suffix(123)
print num_suffix(3431)
print num_suffix(1234)
print ''
print day_suffix(time.strptime("1 Dec 00", "%d %b %y"))
print day_suffix(time.strptime("2 Nov 01", "%d %b %y"))
print day_suffix(time.strptime("3 Oct 02", "%d %b %y"))
print day_suffix(time.strptime("4 Sep 03", "%d %b %y"))
print day_suffix(time.strptime("13 Nov 90", "%d %b %y"))
print day_suffix(time.strptime("14 Oct 10", "%d %b %y"))​​​​​​​

回答 10

In [34]: import datetime

In [35]: _now = datetime.datetime.now()

In [36]: _now
Out[36]: datetime.datetime(2016, 1, 19, 9, 47, 0, 432000)

In [37]: print _now
2016-01-19 09:47:00.432000

In [38]: _parsed = datetime.datetime.strptime(str(_now),"%Y-%m-%d %H:%M:%S.%f")

In [39]: _parsed
Out[39]: datetime.datetime(2016, 1, 19, 9, 47, 0, 432000)

In [40]: assert _now == _parsed
In [34]: import datetime

In [35]: _now = datetime.datetime.now()

In [36]: _now
Out[36]: datetime.datetime(2016, 1, 19, 9, 47, 0, 432000)

In [37]: print _now
2016-01-19 09:47:00.432000

In [38]: _parsed = datetime.datetime.strptime(str(_now),"%Y-%m-%d %H:%M:%S.%f")

In [39]: _parsed
Out[39]: datetime.datetime(2016, 1, 19, 9, 47, 0, 432000)

In [40]: assert _now == _parsed

回答 11

Django时区感知日期时间对象示例。

import datetime
from django.utils.timezone import get_current_timezone
tz = get_current_timezone()

format = '%b %d %Y %I:%M%p'
date_object = datetime.datetime.strptime('Jun 1 2005  1:33PM', format)
date_obj = tz.localize(date_object)

具备USE_TZ = True以下条件时,此转换对Django和Python非常重要:

RuntimeWarning: DateTimeField MyModel.created received a naive datetime (2016-03-04 00:00:00) while time zone support is active.

Django Timezone aware datetime object example.

import datetime
from django.utils.timezone import get_current_timezone
tz = get_current_timezone()

format = '%b %d %Y %I:%M%p'
date_object = datetime.datetime.strptime('Jun 1 2005  1:33PM', format)
date_obj = tz.localize(date_object)

This conversion is very important for Django and Python when you have USE_TZ = True:

RuntimeWarning: DateTimeField MyModel.created received a naive datetime (2016-03-04 00:00:00) while time zone support is active.

回答 12

创建一个小的实用程序函数,例如:

def date(datestr="", format="%Y-%m-%d"):
    from datetime import datetime
    if not datestr:
        return datetime.today().date()
    return datetime.strptime(datestr, format).date()

这足够通用:

  • 如果您不传递任何参数,它将返回今天的日期。
  • 有一种默认的日期格式可以覆盖。
  • 您可以轻松地对其进行修改以返回日期时间。

Create a small utility function like:

def date(datestr="", format="%Y-%m-%d"):
    from datetime import datetime
    if not datestr:
        return datetime.today().date()
    return datetime.strptime(datestr, format).date()

This is versatile enough:

  • If you don’t pass any arguments it will return today’s date.
  • There’s a date format as default that you can override.
  • You can easily modify it to return a datetime.

回答 13

它将有助于将字符串转换为日期时间以及时区

def convert_string_to_time(date_string, timezone):
    from datetime import datetime
    import pytz
    date_time_obj = datetime.strptime(date_string[:26], '%Y-%m-%d %H:%M:%S.%f')
    date_time_obj_timezone = pytz.timezone(timezone).localize(date_time_obj)

    return date_time_obj_timezone

date = '2018-08-14 13:09:24.543953+00:00'
TIME_ZONE = 'UTC'
date_time_obj_timezone = convert_string_to_time(date, TIME_ZONE)

It would do the helpful for converting string to datetime and also with time zone

def convert_string_to_time(date_string, timezone):
    from datetime import datetime
    import pytz
    date_time_obj = datetime.strptime(date_string[:26], '%Y-%m-%d %H:%M:%S.%f')
    date_time_obj_timezone = pytz.timezone(timezone).localize(date_time_obj)

    return date_time_obj_timezone

date = '2018-08-14 13:09:24.543953+00:00'
TIME_ZONE = 'UTC'
date_time_obj_timezone = convert_string_to_time(date, TIME_ZONE)

回答 14

arrow提供了许多有用的日期和时间功能。这段代码提供了对该问题的答案,并表明箭头还能够轻松格式化日期并显示其他语言环境的信息。

>>> import arrow
>>> dateStrings = [ 'Jun 1  2005 1:33PM', 'Aug 28 1999 12:00AM' ]
>>> for dateString in dateStrings:
...     dateString
...     arrow.get(dateString.replace('  ',' '), 'MMM D YYYY H:mmA').datetime
...     arrow.get(dateString.replace('  ',' '), 'MMM D YYYY H:mmA').format('ddd, Do MMM YYYY HH:mm')
...     arrow.get(dateString.replace('  ',' '), 'MMM D YYYY H:mmA').humanize(locale='de')
...
'Jun 1  2005 1:33PM'
datetime.datetime(2005, 6, 1, 13, 33, tzinfo=tzutc())
'Wed, 1st Jun 2005 13:33'
'vor 11 Jahren'
'Aug 28 1999 12:00AM'
datetime.datetime(1999, 8, 28, 0, 0, tzinfo=tzutc())
'Sat, 28th Aug 1999 00:00'
'vor 17 Jahren'

有关更多信息,请参见http://arrow.readthedocs.io/en/latest/

arrow offers many useful functions for dates and times. This bit of code provides an answer to the question and shows that arrow is also capable of formatting dates easily and displaying information for other locales.

>>> import arrow
>>> dateStrings = [ 'Jun 1  2005 1:33PM', 'Aug 28 1999 12:00AM' ]
>>> for dateString in dateStrings:
...     dateString
...     arrow.get(dateString.replace('  ',' '), 'MMM D YYYY H:mmA').datetime
...     arrow.get(dateString.replace('  ',' '), 'MMM D YYYY H:mmA').format('ddd, Do MMM YYYY HH:mm')
...     arrow.get(dateString.replace('  ',' '), 'MMM D YYYY H:mmA').humanize(locale='de')
...
'Jun 1  2005 1:33PM'
datetime.datetime(2005, 6, 1, 13, 33, tzinfo=tzutc())
'Wed, 1st Jun 2005 13:33'
'vor 11 Jahren'
'Aug 28 1999 12:00AM'
datetime.datetime(1999, 8, 28, 0, 0, tzinfo=tzutc())
'Sat, 28th Aug 1999 00:00'
'vor 17 Jahren'

See http://arrow.readthedocs.io/en/latest/ for more.


回答 15

您可以使用easy_date使其变得容易:

import date_converter
converted_date = date_converter.string_to_datetime('Jun 1 2005  1:33PM', '%b %d %Y %I:%M%p')

You can use easy_date to make it easy:

import date_converter
converted_date = date_converter.string_to_datetime('Jun 1 2005  1:33PM', '%b %d %Y %I:%M%p')

回答 16

如果只需要日期格式,则可以通过传递各个字段来手动将其转换,例如:

>>> import datetime
>>> date = datetime.date(int('2017'),int('12'),int('21'))
>>> date
datetime.date(2017, 12, 21)
>>> type(date)
<type 'datetime.date'>

您可以传递拆分的字符串值以将其转换为日期类型,例如:

selected_month_rec = '2017-09-01'
date_formate = datetime.date(int(selected_month_rec.split('-')[0]),int(selected_month_rec.split('-')[1]),int(selected_month_rec.split('-')[2]))

您将获得日期格式的结果值。

If you want only date format then you can manually convert it by passing your individual fields like:

>>> import datetime
>>> date = datetime.date(int('2017'),int('12'),int('21'))
>>> date
datetime.date(2017, 12, 21)
>>> type(date)
<type 'datetime.date'>

You can pass your split string values to convert it into date type like:

selected_month_rec = '2017-09-01'
date_formate = datetime.date(int(selected_month_rec.split('-')[0]),int(selected_month_rec.split('-')[1]),int(selected_month_rec.split('-')[2]))

You will get the resulting value in date format.


回答 17

您也可以退房 dateparser

dateparser 提供的模块可轻松解析几乎任何网页上常见的字符串格式的本地化日期。

安装:

$ pip install dateparser

我认为,这是解析日期的最简单方法。

最直接的方法是使用dateparser.parse功能,该功能包装了模块中的大多数功能。

样例代码:

import dateparser

t1 = 'Jun 1 2005  1:33PM'
t2 = 'Aug 28 1999 12:00AM'

dt1 = dateparser.parse(t1)
dt2 = dateparser.parse(t2)

print(dt1)
print(dt2)

输出:

2005-06-01 13:33:00
1999-08-28 00:00:00

You can also check out dateparser

dateparser provides modules to easily parse localized dates in almost any string formats commonly found on web pages.

Install:

$ pip install dateparser

This is, I think, the easiest way you can parse dates.

The most straightforward way is to use the dateparser.parse function, that wraps around most of the functionality in the module.

Sample Code:

import dateparser

t1 = 'Jun 1 2005  1:33PM'
t2 = 'Aug 28 1999 12:00AM'

dt1 = dateparser.parse(t1)
dt2 = dateparser.parse(t2)

print(dt1)
print(dt2)

Output:

2005-06-01 13:33:00
1999-08-28 00:00:00

回答 18

我的回答

在现实世界的数据中,这是一个实际的问题:多种,不匹配,不完整,不一致以及多语言/区域日期格式,通常在一个数据集中自由地混合使用。生产代码失败是不可能的,更不用说像狐狸一样的异常快乐了。

我们需要尝试…捕获多种日期时间格式fmt1,fmt2,…,fmtn,并strptime()为所有不匹配的对象抑制/处理(来自的)异常(尤其是避免使用yukky n缩进的try梯形图) ..catch子句)。从我的解决方案

def try_strptime(s, fmts=['%d-%b-%y','%m/%d/%Y']):
    for fmt in fmts:
        try:
            return datetime.strptime(s, fmt)
        except:
            continue

    return None # or reraise the ValueError if no format matched, if you prefer

See my answer.

In real-world data this is a real problem: multiple, mismatched, incomplete, inconsistent and multilanguage/region date formats, often mixed freely in one dataset. It’s not ok for production code to fail, let alone go exception-happy like a fox.

We need to try…catch multiple datetime formats fmt1,fmt2,…,fmtn and suppress/handle the exceptions (from strptime()) for all those that mismatch (and in particular, avoid needing a yukky n-deep indented ladder of try..catch clauses). From my solution

def try_strptime(s, fmts=['%d-%b-%y','%m/%d/%Y']):
    for fmt in fmts:
        try:
            return datetime.strptime(s, fmt)
        except:
            continue

    return None # or reraise the ValueError if no format matched, if you prefer

回答 19

emp = pd.read_csv("C:\\py\\programs\\pandas_2\\pandas\\employees.csv")
emp.info()

它显示“开始日期时间”列和“上次登录时间”在数据框中均为“对象=字符串”

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 8 columns):
First Name           933 non-null object
Gender               855 non-null object
Start Date           1000 non-null object

Last Login Time      1000 non-null object
Salary               1000 non-null int64
Bonus %              1000 non-null float64
Senior Management    933 non-null object
Team                 957 non-null object
dtypes: float64(1), int64(1), object(6)
memory usage: 62.6+ KB

通过使用parse_dates选项,read_csv您可以将字符串datetime转换为pandas datetime格式。

emp = pd.read_csv("C:\\py\\programs\\pandas_2\\pandas\\employees.csv", parse_dates=["Start Date", "Last Login Time"])
emp.info()


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 8 columns):
First Name           933 non-null object
Gender               855 non-null object
Start Date           1000 non-null datetime64[ns]
Last Login Time      1000 non-null datetime64[ns]
Salary               1000 non-null int64
Bonus %              1000 non-null float64
Senior Management    933 non-null object
Team                 957 non-null object
dtypes: datetime64[ns](2), float64(1), int64(1), object(4)
memory usage: 62.6+ KB
emp = pd.read_csv("C:\\py\\programs\\pandas_2\\pandas\\employees.csv")
emp.info()

it shows “Start Date Time” Column and “Last Login Time” both are “object = strings” in data-frame

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 8 columns):
First Name           933 non-null object
Gender               855 non-null object
Start Date           1000 non-null object

Last Login Time      1000 non-null object
Salary               1000 non-null int64
Bonus %              1000 non-null float64
Senior Management    933 non-null object
Team                 957 non-null object
dtypes: float64(1), int64(1), object(6)
memory usage: 62.6+ KB

By using parse_dates option in read_csv mention you can convert your string datetime into pandas datetime format.

emp = pd.read_csv("C:\\py\\programs\\pandas_2\\pandas\\employees.csv", parse_dates=["Start Date", "Last Login Time"])
emp.info()


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 8 columns):
First Name           933 non-null object
Gender               855 non-null object
Start Date           1000 non-null datetime64[ns]
Last Login Time      1000 non-null datetime64[ns]
Salary               1000 non-null int64
Bonus %              1000 non-null float64
Senior Management    933 non-null object
Team                 957 non-null object
dtypes: datetime64[ns](2), float64(1), int64(1), object(4)
memory usage: 62.6+ KB

声明:本站所有文章,如无特殊说明或标注,均为本站原创发布。任何个人或组织,在未征得本站同意时,禁止复制、盗用、采集、发布本站内容到任何网站、书籍等各类媒体平台。如若本站内容侵犯了原著者的合法权益,可联系我们进行处理。