标签归档:date

SQLAlchemy默认DateTime

问题:SQLAlchemy默认DateTime

这是我的声明性模型:

import datetime
from sqlalchemy import Column, Integer, DateTime
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Test(Base):
    __tablename__ = 'test'

    id = Column(Integer, primary_key=True)
    created_date = DateTime(default=datetime.datetime.utcnow)

但是,当我尝试导入此模块时,出现此错误:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "orm/models2.py", line 37, in <module>
    class Test(Base):
  File "orm/models2.py", line 41, in Test
    created_date = sqlalchemy.DateTime(default=datetime.datetime.utcnow)
TypeError: __init__() got an unexpected keyword argument 'default'

如果使用整数类型,则可以设置默认值。这是怎么回事?

This is my declarative model:

import datetime
from sqlalchemy import Column, Integer, DateTime
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Test(Base):
    __tablename__ = 'test'

    id = Column(Integer, primary_key=True)
    created_date = DateTime(default=datetime.datetime.utcnow)

However, when I try to import this module, I get this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "orm/models2.py", line 37, in <module>
    class Test(Base):
  File "orm/models2.py", line 41, in Test
    created_date = sqlalchemy.DateTime(default=datetime.datetime.utcnow)
TypeError: __init__() got an unexpected keyword argument 'default'

If I use an Integer type, I can set a default value. What’s going on?


回答 0

DateTime没有默认键作为输入。默认键应该是该Column功能的输入。试试这个:

import datetime
from sqlalchemy import Column, Integer, DateTime
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Test(Base):
    __tablename__ = 'test'

    id = Column(Integer, primary_key=True)
    created_date = Column(DateTime, default=datetime.datetime.utcnow)

DateTime doesn’t have a default key as an input. The default key should be an input to the Column function. Try this:

import datetime
from sqlalchemy import Column, Integer, DateTime
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Test(Base):
    __tablename__ = 'test'

    id = Column(Integer, primary_key=True)
    created_date = Column(DateTime, default=datetime.datetime.utcnow)

回答 1

计算数据库中的时间戳,而不是客户端中的时间戳

为了理智,您可能希望datetimes由数据库服务器而不是应用程序服务器来计算所有数据。计算应用程序中的时间戳可能会导致问题,因为网络等待时间是可变的,客户端会经历略微不同的时钟漂移,并且不同的编程语言有时会略有不同地计算时间。

SQLAlchemy允许您通过传递func.now()func.current_timestamp()(它们是彼此的别名)来执行此操作,该命令告诉DB计算时间戳本身。

使用SQLALchemy的 server_default

另外,对于已经告诉数据库计算值的默认值,通常最好使用server_default代替default。这告诉SQLAlchemy将默认值作为CREATE TABLE语句的一部分传递。

例如,如果您针对该表编写了一个临时脚本,则使用server_default意味着您无需担心手动向脚本添加时间戳调用-数据库将自动对其进行设置。

了解SQLAlchemy的onupdate/server_onupdate

SQLAlchemy还支持,onupdate以便每当更新该行时,它都会插入一个新的时间戳。再一次,最好告诉数据库来计算时间戳本身:

from sqlalchemy.sql import func

time_created = Column(DateTime(timezone=True), server_default=func.now())
time_updated = Column(DateTime(timezone=True), onupdate=func.now())

有一个server_onupdate参数,但与不同server_default,它实际上未在服务器端设置任何参数。它只是告诉SQLalchemy更新发生时(也许您在列上创建了触发器),数据库将更改列,因此SQLAlchemy将要求返回值,以便它可以更新相应的对象。

另一个潜在的陷阱:

您可能会惊讶地发现,如果在单个事务中进行大量更改,则它们都具有相同的时间戳。这是因为SQL标准指定CURRENT_TIMESTAMP根据事务的开始返回值。

PostgreSQL提供了非SQL标准statement_timestamp()clock_timestamp()并且在事务中更改。此处的文档:https : //www.postgresql.org/docs/current/static/functions-datetime.html#FUNCTIONS-DATETIME-CURRENT

UTC时间戳

如果要使用UTC时间戳,请func.utcnow()SQLAlchemy文档中提供的实现存根。但是,您需要自己提供适当的特定于驱动程序的功能。

Calculate timestamps within your DB, not your client

For sanity, you probably want to have all datetimes calculated by your DB server, rather than the application server. Calculating the timestamp in the application can lead to problems because network latency is variable, clients experience slightly different clock drift, and different programming languages occasionally calculate time slightly differently.

SQLAlchemy allows you to do this by passing func.now() or func.current_timestamp() (they are aliases of each other) which tells the DB to calculate the timestamp itself.

Use SQLALchemy’s server_default

Additionally, for a default where you’re already telling the DB to calculate the value, it’s generally better to use server_default instead of default. This tells SQLAlchemy to pass the default value as part of the CREATE TABLE statement.

For example, if you write an ad hoc script against this table, using server_default means you won’t need to worry about manually adding a timestamp call to your script–the database will set it automatically.

Understanding SQLAlchemy’s onupdate/server_onupdate

SQLAlchemy also supports onupdate so that anytime the row is updated it inserts a new timestamp. Again, best to tell the DB to calculate the timestamp itself:

from sqlalchemy.sql import func

time_created = Column(DateTime(timezone=True), server_default=func.now())
time_updated = Column(DateTime(timezone=True), onupdate=func.now())

There is a server_onupdate parameter, but unlike server_default, it doesn’t actually set anything serverside. It just tells SQLalchemy that your database will change the column when an update happens (perhaps you created a trigger on the column ), so SQLAlchemy will ask for the return value so it can update the corresponding object.

One other potential gotcha:

You might be surprised to notice that if you make a bunch of changes within a single transaction, they all have the same timestamp. That’s because the SQL standard specifies that CURRENT_TIMESTAMP returns values based on the start of the transaction.

PostgreSQL provides the non-SQL-standard statement_timestamp() and clock_timestamp() which do change within a transaction. Docs here: https://www.postgresql.org/docs/current/static/functions-datetime.html#FUNCTIONS-DATETIME-CURRENT

UTC timestamp

If you want to use UTC timestamps, a stub of implementation for func.utcnow() is provided in SQLAlchemy documentation. You need to provide appropriate driver-specific functions on your own though.


回答 2

您还可以默认使用sqlalchemy内置函数 DateTime

from sqlalchemy.sql import func

DT = Column(DateTime(timezone=True), default=func.now())

You can also use sqlalchemy builtin function for default DateTime

from sqlalchemy.sql import func

DT = Column(DateTime(timezone=True), default=func.now())

回答 3

您可能想要使用,onupdate=datetime.now以便UPDATE也可以更改该last_updated字段。

SQLAlchemy对于python执行的函数有两个默认值。

  • default 设置一次INSERT的值
  • onupdate还将值设置为UPDATE 上的可调用结果。

You likely want to use onupdate=datetime.now so that UPDATEs also change the last_updated field.

SQLAlchemy has two defaults for python executed functions.

  • default sets the value on INSERT, only once
  • onupdate sets the value to the callable result on UPDATE as well.

回答 4

default关键字参数应被给予Column对象。

例:

Column(u'timestamp', TIMESTAMP(timezone=True), primary_key=False, nullable=False, default=time_now),

默认值可以是可调用的,在这里我定义如下。

from pytz import timezone
from datetime import datetime

UTC = timezone('UTC')

def time_now():
    return datetime.now(UTC)

The default keyword parameter should be given to the Column object.

Example:

Column(u'timestamp', TIMESTAMP(timezone=True), primary_key=False, nullable=False, default=time_now),

The default value can be a callable, which here I defined like the following.

from pytz import timezone
from datetime import datetime

UTC = timezone('UTC')

def time_now():
    return datetime.now(UTC)

回答 5

根据PostgreSQL文档,https://www.postgresql.org/docs/9.6/static/functions-datetime.html

now, CURRENT_TIMESTAMP, LOCALTIMESTAMP return the time of transaction.

这被认为是一个功能:目的是允许单个事务具有“当前”时间的一致概念,以便同一事务内的多个修改具有相同的时间戳。

如果您不希望事务时间戳记,则可能要使用statement_timestampclock_timestamp

statement_timestamp()

返回当前语句的开始时间(更具体地说,是从客户端收到最新命令消息的时间)。statement_timestamp

clock_timestamp()

返回实际的当前时间,因此,即使在单个SQL命令中,其值也会更改。

As per PostgreSQL documentation, https://www.postgresql.org/docs/9.6/static/functions-datetime.html

now, CURRENT_TIMESTAMP, LOCALTIMESTAMP return the time of transaction.

This is considered a feature: the intent is to allow a single transaction to have a consistent notion of the “current” time, so that multiple modifications within the same transaction bear the same time stamp.

You might want to use statement_timestamp or clock_timestamp if you don’t want transaction timestamp.

statement_timestamp()

returns the start time of the current statement (more specifically, the time of receipt of the latest command message from the client). statement_timestamp

clock_timestamp()

returns the actual current time, and therefore its value changes even within a single SQL command.


Python datetime-在使用strptime获取日,月,年之后设置固定的小时和分钟

问题:Python datetime-在使用strptime获取日,月,年之后设置固定的小时和分钟

我已经成功地将26 Sep 2012格式转换为26-09-2012使用:

datetime.strptime(request.POST['sample_date'],'%d %b %Y')

但是,我不知道如何将上述时间设置为11:59。有谁知道如何做到这一点?

请注意,这可以是将来的日期,也可以是任何随机的日期,而不仅仅是当前日期。

I’ve successfully converted something of 26 Sep 2012 format to 26-09-2012 using:

datetime.strptime(request.POST['sample_date'],'%d %b %Y')

However, I don’t know how to set the hour and minute of something like the above to 11:59. Does anyone know how to do this?

Note, this can be a future date or any random one, not just the current date.


回答 0

用途datetime.replace

from datetime import datetime
date = datetime.strptime('26 Sep 2012', '%d %b %Y')
newdate = date.replace(hour=11, minute=59)

Use datetime.replace:

from datetime import datetime
date = datetime.strptime('26 Sep 2012', '%d %b %Y')
newdate = date.replace(hour=11, minute=59)

回答 1

datetime.replace()将提供最佳选择。此外,它还提供了替换日,年和月的工具。

假设我们有一个datetime对象,日期表示为: "2017-05-04"

>>> from datetime import datetime
>>> date = datetime.strptime('2017-05-04',"%Y-%m-%d")
>>> print(date)
2017-05-04 00:00:00
>>> date = date.replace(minute=59, hour=23, second=59, year=2018, month=6, day=1)
>>> print(date)
2018-06-01 23:59:59

datetime.replace() will provide the best options. Also, it provides facility for replacing day, year, and month.

Suppose we have a datetime object and date is represented as: "2017-05-04"

>>> from datetime import datetime
>>> date = datetime.strptime('2017-05-04',"%Y-%m-%d")
>>> print(date)
2017-05-04 00:00:00
>>> date = date.replace(minute=59, hour=23, second=59, year=2018, month=6, day=1)
>>> print(date)
2018-06-01 23:59:59

Python中两个日期之间的差异

问题:Python中两个日期之间的差异

我有两个不同的日期,我想知道它们之间的天数差异。日期格式为YYYY-MM-DD。

我有一个可以在日期中添加或减去给定数字的功能:

def addonDays(a, x):
   ret = time.strftime("%Y-%m-%d",time.localtime(time.mktime(time.strptime(a,"%Y-%m-%d"))+x*3600*24+3600))      
   return ret

其中A是日期,x是我要添加的天数。结果是另一个日期。

我需要一个可以给出两个日期的函数,结果将是一个以天为单位的日期差的整数。

I have two different dates and I want to know the difference in days between them. The format of the date is YYYY-MM-DD.

I have a function that can ADD or SUBTRACT a given number to a date:

def addonDays(a, x):
   ret = time.strftime("%Y-%m-%d",time.localtime(time.mktime(time.strptime(a,"%Y-%m-%d"))+x*3600*24+3600))      
   return ret

where A is the date and x the number of days I want to add. And the result is another date.

I need a function where I can give two dates and the result would be an int with date difference in days.


回答 0

使用-得到两者之间的区别datetime对象,并采取days会员。

from datetime import datetime

def days_between(d1, d2):
    d1 = datetime.strptime(d1, "%Y-%m-%d")
    d2 = datetime.strptime(d2, "%Y-%m-%d")
    return abs((d2 - d1).days)

Use - to get the difference between two datetime objects and take the days member.

from datetime import datetime

def days_between(d1, d2):
    d1 = datetime.strptime(d1, "%Y-%m-%d")
    d2 = datetime.strptime(d2, "%Y-%m-%d")
    return abs((d2 - d1).days)

回答 1

另一个简短的解决方案:

from datetime import date

def diff_dates(date1, date2):
    return abs(date2-date1).days

def main():
    d1 = date(2013,1,1)
    d2 = date(2013,9,13)
    result1 = diff_dates(d2, d1)
    print '{} days between {} and {}'.format(result1, d1, d2)
    print ("Happy programmer's day!")

main()

Another short solution:

from datetime import date

def diff_dates(date1, date2):
    return abs(date2-date1).days

def main():
    d1 = date(2013,1,1)
    d2 = date(2013,9,13)
    result1 = diff_dates(d2, d1)
    print '{} days between {} and {}'.format(result1, d1, d2)
    print ("Happy programmer's day!")

main()

回答 2

我尝试了上面larsmans发布的代码,但是有两个问题:

1)原样的代码将引发mauguerra提到的错误2)如果将代码更改为以下内容:

...
    d1 = d1.strftime("%Y-%m-%d")
    d2 = d2.strftime("%Y-%m-%d")
    return abs((d2 - d1).days)

这会将您的datetime对象转换为字符串,但是有两件事

1)尝试执行d2-d1将失败,因为您无法在字符串上使用减号运算符,并且2)如果阅读了上述答案的第一行,则想在两个datetime对象上使用-运算符,但是将它们转换为字符串

我发现您实际上只需要以下内容:

import datetime

end_date = datetime.datetime.utcnow()
start_date = end_date - datetime.timedelta(days=8)
difference_in_days = abs((end_date - start_date).days)

print difference_in_days

I tried the code posted by larsmans above but, there are a couple of problems:

1) The code as is will throw the error as mentioned by mauguerra 2) If you change the code to the following:

...
    d1 = d1.strftime("%Y-%m-%d")
    d2 = d2.strftime("%Y-%m-%d")
    return abs((d2 - d1).days)

This will convert your datetime objects to strings but, two things

1) Trying to do d2 – d1 will fail as you cannot use the minus operator on strings and 2) If you read the first line of the above answer it stated, you want to use the – operator on two datetime objects but, you just converted them to strings

What I found is that you literally only need the following:

import datetime

end_date = datetime.datetime.utcnow()
start_date = end_date - datetime.timedelta(days=8)
difference_in_days = abs((end_date - start_date).days)

print difference_in_days

回答 3

试试这个:

data=pd.read_csv('C:\Users\Desktop\Data Exploration.csv')
data.head(5)
first=data['1st Gift']
last=data['Last Gift']
maxi=data['Largest Gift']
l_1=np.mean(first)-3*np.std(first)
u_1=np.mean(first)+3*np.std(first)


m=np.abs(data['1st Gift']-np.mean(data['1st Gift']))>3*np.std(data['1st Gift'])
pd.value_counts(m)
l=first[m]
data.loc[:,'1st Gift'][m==True]=np.mean(data['1st Gift'])+3*np.std(data['1st Gift'])
data['1st Gift'].head()




m=np.abs(data['Last Gift']-np.mean(data['Last Gift']))>3*np.std(data['Last Gift'])
pd.value_counts(m)
l=last[m]
data.loc[:,'Last Gift'][m==True]=np.mean(data['Last Gift'])+3*np.std(data['Last Gift'])
data['Last Gift'].head()

Try this:

data=pd.read_csv('C:\Users\Desktop\Data Exploration.csv')
data.head(5)
first=data['1st Gift']
last=data['Last Gift']
maxi=data['Largest Gift']
l_1=np.mean(first)-3*np.std(first)
u_1=np.mean(first)+3*np.std(first)


m=np.abs(data['1st Gift']-np.mean(data['1st Gift']))>3*np.std(data['1st Gift'])
pd.value_counts(m)
l=first[m]
data.loc[:,'1st Gift'][m==True]=np.mean(data['1st Gift'])+3*np.std(data['1st Gift'])
data['1st Gift'].head()




m=np.abs(data['Last Gift']-np.mean(data['Last Gift']))>3*np.std(data['Last Gift'])
pd.value_counts(m)
l=last[m]
data.loc[:,'Last Gift'][m==True]=np.mean(data['Last Gift'])+3*np.std(data['Last Gift'])
data['Last Gift'].head()

回答 4

pd.date_range(’2019-01-01’,’2019-02-01’)。shape [0]

pd.date_range(‘2019-01-01’, ‘2019-02-01’).shape[0]


上个月的python日期

问题:上个月的python日期

我正在尝试使用python获取上个月的日期。这是我尝试过的:

str( time.strftime('%Y') ) + str( int(time.strftime('%m'))-1 )

但是,这种方法很糟糕,原因有两个:首先,它返回2012年2月的20122(而不是201202),其次它将返回0而不是1月的12。

我已经用bash解决了这个麻烦

echo $(date -d"3 month ago" "+%G%m%d")

我认为,如果bash为此目的提供了一种内置方式,那么功能更强大的python应该比强迫编写自己的脚本来实现此目标更好。我当然可以做类似的事情:

if int(time.strftime('%m')) == 1:
    return '12'
else:
    if int(time.strftime('%m')) < 10:
        return '0'+str(time.strftime('%m')-1)
    else:
        return str(time.strftime('%m') -1)

我没有测试过此代码,也不想使用它(除非我找不到其他方法:/)

谢谢你的帮助!

I am trying to get the date of the previous month with python. Here is what i’ve tried:

str( time.strftime('%Y') ) + str( int(time.strftime('%m'))-1 )

However, this way is bad for 2 reasons: First it returns 20122 for the February of 2012 (instead of 201202) and secondly it will return 0 instead of 12 on January.

I have solved this trouble in bash with

echo $(date -d"3 month ago" "+%G%m%d")

I think that if bash has a built-in way for this purpose, then python, much more equipped, should provide something better than forcing writing one’s own script to achieve this goal. Of course i could do something like:

if int(time.strftime('%m')) == 1:
    return '12'
else:
    if int(time.strftime('%m')) < 10:
        return '0'+str(time.strftime('%m')-1)
    else:
        return str(time.strftime('%m') -1)

I have not tested this code and i don’t want to use it anyway (unless I can’t find any other way:/)

Thanks for your help!


回答 0

datetime和datetime.timedelta类是您的朋友。

  1. 找到今天。
  2. 用它来查找本月的第一天。
  3. 使用timedelta备份一天,直到上个月的最后一天。
  4. 打印您要查找的YYYYMM字符串。

像这样:

 import datetime
 today = datetime.date.today()
 first = today.replace(day=1)
 lastMonth = first - datetime.timedelta(days=1)
 print(lastMonth.strftime("%Y%m"))

201202 打印。

datetime and the datetime.timedelta classes are your friend.

  1. find today.
  2. use that to find the first day of this month.
  3. use timedelta to backup a single day, to the last day of the previous month.
  4. print the YYYYMM string you’re looking for.

Like this:

 import datetime
 today = datetime.date.today()
 first = today.replace(day=1)
 lastMonth = first - datetime.timedelta(days=1)
 print(lastMonth.strftime("%Y%m"))

201202 is printed.


回答 1

您应该使用dateutil。这样,您就可以使用relativedelta,它是timedelta的改进版本。

>>> import datetime 
>>> import dateutil.relativedelta
>>> now = datetime.datetime.now()
>>> print now
2012-03-15 12:33:04.281248
>>> print now + dateutil.relativedelta.relativedelta(months=-1)
2012-02-15 12:33:04.281248

You should use dateutil. With that, you can use relativedelta, it’s an improved version of timedelta.

>>> import datetime 
>>> import dateutil.relativedelta
>>> now = datetime.datetime.now()
>>> print now
2012-03-15 12:33:04.281248
>>> print now + dateutil.relativedelta.relativedelta(months=-1)
2012-02-15 12:33:04.281248

回答 2

from datetime import date, timedelta

first_day_of_current_month = date.today().replace(day=1)
last_day_of_previous_month = first_day_of_current_month - timedelta(days=1)

print "Previous month:", last_day_of_previous_month.month

要么:

from datetime import date, timedelta

prev = date.today().replace(day=1) - timedelta(days=1)
print prev.month
from datetime import date, timedelta

first_day_of_current_month = date.today().replace(day=1)
last_day_of_previous_month = first_day_of_current_month - timedelta(days=1)

print "Previous month:", last_day_of_previous_month.month

Or:

from datetime import date, timedelta

prev = date.today().replace(day=1) - timedelta(days=1)
print prev.month

回答 3

bgporter的答案为基础

def prev_month_range(when = None): 
    """Return (previous month's start date, previous month's end date)."""
    if not when:
        # Default to today.
        when = datetime.datetime.today()
    # Find previous month: https://stackoverflow.com/a/9725093/564514
    # Find today.
    first = datetime.date(day=1, month=when.month, year=when.year)
    # Use that to find the first day of this month.
    prev_month_end = first - datetime.timedelta(days=1)
    prev_month_start = datetime.date(day=1, month= prev_month_end.month, year= prev_month_end.year)
    # Return previous month's start and end dates in YY-MM-DD format.
    return (prev_month_start.strftime('%Y-%m-%d'), prev_month_end.strftime('%Y-%m-%d'))

Building on bgporter’s answer.

def prev_month_range(when = None): 
    """Return (previous month's start date, previous month's end date)."""
    if not when:
        # Default to today.
        when = datetime.datetime.today()
    # Find previous month: https://stackoverflow.com/a/9725093/564514
    # Find today.
    first = datetime.date(day=1, month=when.month, year=when.year)
    # Use that to find the first day of this month.
    prev_month_end = first - datetime.timedelta(days=1)
    prev_month_start = datetime.date(day=1, month= prev_month_end.month, year= prev_month_end.year)
    # Return previous month's start and end dates in YY-MM-DD format.
    return (prev_month_start.strftime('%Y-%m-%d'), prev_month_end.strftime('%Y-%m-%d'))

回答 4

它非常容易和简单。做这个

from dateutil.relativedelta import relativedelta
from datetime import datetime

today_date = datetime.today()
print "todays date time: %s" %today_date

one_month_ago = today_date - relativedelta(months=1)
print "one month ago date time: %s" % one_month_ago
print "one month ago date: %s" % one_month_ago.date()

输出如下:$ python2.7 main.py

todays date time: 2016-09-06 02:13:01.937121
one month ago date time: 2016-08-06 02:13:01.937121
one month ago date: 2016-08-06

Its very easy and simple. Do this

from dateutil.relativedelta import relativedelta
from datetime import datetime

today_date = datetime.today()
print "todays date time: %s" %today_date

one_month_ago = today_date - relativedelta(months=1)
print "one month ago date time: %s" % one_month_ago
print "one month ago date: %s" % one_month_ago.date()

Here is the output: $python2.7 main.py

todays date time: 2016-09-06 02:13:01.937121
one month ago date time: 2016-08-06 02:13:01.937121
one month ago date: 2016-08-06

回答 5

对于到达这里并希望获得上个月的第一天和最后一天的人:

from datetime import date, timedelta

last_day_of_prev_month = date.today().replace(day=1) - timedelta(days=1)

start_day_of_prev_month = date.today().replace(day=1) - timedelta(days=last_day_of_prev_month.day)

# For printing results
print("First day of prev month:", start_day_of_prev_month)
print("Last day of prev month:", last_day_of_prev_month)

输出:

First day of prev month: 2019-02-01
Last day of prev month: 2019-02-28

For someone who got here and looking to get both the first and last day of the previous month:

from datetime import date, timedelta

last_day_of_prev_month = date.today().replace(day=1) - timedelta(days=1)

start_day_of_prev_month = date.today().replace(day=1) - timedelta(days=last_day_of_prev_month.day)

# For printing results
print("First day of prev month:", start_day_of_prev_month)
print("Last day of prev month:", last_day_of_prev_month)

Output:

First day of prev month: 2019-02-01
Last day of prev month: 2019-02-28

回答 6

def prev_month(date=datetime.datetime.today()):
    if date.month == 1:
        return date.replace(month=12,year=date.year-1)
    else:
        try:
            return date.replace(month=date.month-1)
        except ValueError:
            return prev_month(date=date.replace(day=date.day-1))
def prev_month(date=datetime.datetime.today()):
    if date.month == 1:
        return date.replace(month=12,year=date.year-1)
    else:
        try:
            return date.replace(month=date.month-1)
        except ValueError:
            return prev_month(date=date.replace(day=date.day-1))

回答 7

只是为了好玩,一个使用divmod的纯数学答案。由于相乘,效率很低,也可以对月份数进行简单检查(如果等于12,则增加年份等)

year = today.year
month = today.month

nm = list(divmod(year * 12 + month + 1, 12))
if nm[1] == 0:
    nm[1] = 12
    nm[0] -= 1
pm = list(divmod(year * 12 + month - 1, 12))
if pm[1] == 0:
    pm[1] = 12
    pm[0] -= 1

next_month = nm
previous_month = pm

Just for fun, a pure math answer using divmod. Pretty inneficient because of the multiplication, could do just as well a simple check on the number of month (if equal to 12, increase year, etc)

year = today.year
month = today.month

nm = list(divmod(year * 12 + month + 1, 12))
if nm[1] == 0:
    nm[1] = 12
    nm[0] -= 1
pm = list(divmod(year * 12 + month - 1, 12))
if pm[1] == 0:
    pm[1] = 12
    pm[0] -= 1

next_month = nm
previous_month = pm

回答 8

使用Pendulum非常完整的库,我们有了subtract方法(而不是“ subStract”):

import pendulum
today = pendulum.datetime.today()  # 2020, january
lastmonth = today.subtract(months=1)
lastmonth.strftime('%Y%m')
# '201912'

我们看到它可以应对跳跃的岁月。

反向等效为add

https://pendulum.eustace.io/docs/#addition-and-subtraction

With the Pendulum very complete library, we have the subtract method (and not “subStract”):

import pendulum
today = pendulum.datetime.today()  # 2020, january
lastmonth = today.subtract(months=1)
lastmonth.strftime('%Y%m')
# '201912'

We see that it handles jumping years.

The reverse equivalent is add.

https://pendulum.eustace.io/docs/#addition-and-subtraction


回答 9

以@JF Sebastian的注释为基础,您可以将replace()函数链接起来以返回一个“月”。由于一个月不是固定的时间段,因此此解决方案尝试返回到上个月的同一日期,这当然不能在所有月份都有效。在这种情况下,此算法默认为上个月的最后一天。

from datetime import datetime, timedelta

d = datetime(2012, 3, 31) # A problem date as an example

# last day of last month
one_month_ago = (d.replace(day=1) - timedelta(days=1))
try:
    # try to go back to same day last month
    one_month_ago = one_month_ago.replace(day=d.day)
except ValueError:
    pass
print("one_month_ago: {0}".format(one_month_ago))

输出:

one_month_ago: 2012-02-29 00:00:00

Building off the comment of @J.F. Sebastian, you can chain the replace() function to go back one “month”. Since a month is not a constant time period, this solution tries to go back to the same date the previous month, which of course does not work for all months. In such a case, this algorithm defaults to the last day of the prior month.

from datetime import datetime, timedelta

d = datetime(2012, 3, 31) # A problem date as an example

# last day of last month
one_month_ago = (d.replace(day=1) - timedelta(days=1))
try:
    # try to go back to same day last month
    one_month_ago = one_month_ago.replace(day=d.day)
except ValueError:
    pass
print("one_month_ago: {0}".format(one_month_ago))

Output:

one_month_ago: 2012-02-29 00:00:00

回答 10

如果要在LINUX / UNIX环境中查看EXE类型文件中的ASCII字母,请尝试“ od -c’filename’| more”

您可能会得到很多无法识别的项目,但它们都会全部显示出来,并且将显示HEX表示形式,并且ASCII等效字符(如果适用)将跟随十六进制代码行。在您知道的已编译代码上尝试一下。您可能会在其中识别出一些东西。

If you want to look at the ASCII letters in a EXE type file in a LINUX/UNIX Environment, try “od -c ‘filename’ |more”

You will likely get a lot of unrecognizable items, but they will all be presented, and the HEX representations will be displayed, and the ASCII equivalent characters (if appropriate) will follow the line of hex codes. Try it on a compiled piece of code that you know. You might see things in it you recognize.


回答 11

有一个高级库dateparser可以确定给定自然语言的过去日期,并返回相应的Python datetime对象

from dateparser import parse
parse('4 months ago')

There is a high level library dateparser that can determine the past date given natural language, and return the corresponding Python datetime object

from dateparser import parse
parse('4 months ago')

将缺失的日期添加到熊猫数据框

问题:将缺失的日期添加到熊猫数据框

我的数据可以在给定日期包含多个事件,也可以在一个日期包含否事件。我接受这些事件,按日期计数并绘制它们。但是,当我绘制它们时,我的两个系列并不总是匹配。

idx = pd.date_range(df['simpleDate'].min(), df['simpleDate'].max())
s = df.groupby(['simpleDate']).size()

在上面的代码中,idx变为30个日期范围。2013/09/01至2013/09/30但是S可能只有25或26天,因为在给定日期没有事件发生。然后,当我尝试绘制时,由于大小不匹配,我得到一个AssertionError:

fig, ax = plt.subplots()    
ax.bar(idx.to_pydatetime(), s, color='green')

解决这个问题的正确方法是什么?我是否要从IDX中删除没有值的日期,或者(我希望这样做)是将序列中缺少的日期添加为0(我希望这样做)?我希望有30天的完整图表(值为0)。如果这种方法正确,那么有关如何开始使用的任何建议?我需要某种动态reindex功能吗?

这是Sdf.groupby(['simpleDate']).size() )的代码段,请注意没有输入04和05。

09-02-2013     2
09-03-2013    10
09-06-2013     5
09-07-2013     1

My data can have multiple events on a given date or NO events on a date. I take these events, get a count by date and plot them. However, when I plot them, my two series don’t always match.

idx = pd.date_range(df['simpleDate'].min(), df['simpleDate'].max())
s = df.groupby(['simpleDate']).size()

In the above code idx becomes a range of say 30 dates. 09-01-2013 to 09-30-2013 However S may only have 25 or 26 days because no events happened for a given date. I then get an AssertionError as the sizes dont match when I try to plot:

fig, ax = plt.subplots()    
ax.bar(idx.to_pydatetime(), s, color='green')

What’s the proper way to tackle this? Do I want to remove dates with no values from IDX or (which I’d rather do) is add to the series the missing date with a count of 0. I’d rather have a full graph of 30 days with 0 values. If this approach is right, any suggestions on how to get started? Do I need some sort of dynamic reindex function?

Here’s a snippet of S ( df.groupby(['simpleDate']).size() ), notice no entries for 04 and 05.

09-02-2013     2
09-03-2013    10
09-06-2013     5
09-07-2013     1

回答 0

您可以使用Series.reindex

import pandas as pd

idx = pd.date_range('09-01-2013', '09-30-2013')

s = pd.Series({'09-02-2013': 2,
               '09-03-2013': 10,
               '09-06-2013': 5,
               '09-07-2013': 1})
s.index = pd.DatetimeIndex(s.index)

s = s.reindex(idx, fill_value=0)
print(s)

Yield

2013-09-01     0
2013-09-02     2
2013-09-03    10
2013-09-04     0
2013-09-05     0
2013-09-06     5
2013-09-07     1
2013-09-08     0
...

You could use Series.reindex:

import pandas as pd

idx = pd.date_range('09-01-2013', '09-30-2013')

s = pd.Series({'09-02-2013': 2,
               '09-03-2013': 10,
               '09-06-2013': 5,
               '09-07-2013': 1})
s.index = pd.DatetimeIndex(s.index)

s = s.reindex(idx, fill_value=0)
print(s)

yields

2013-09-01     0
2013-09-02     2
2013-09-03    10
2013-09-04     0
2013-09-05     0
2013-09-06     5
2013-09-07     1
2013-09-08     0
...

回答 1

使用更快的解决方法.asfreq()。这不需要创建新索引即可在中调用.reindex()

# "broken" (staggered) dates
dates = pd.Index([pd.Timestamp('2012-05-01'), 
                  pd.Timestamp('2012-05-04'), 
                  pd.Timestamp('2012-05-06')])
s = pd.Series([1, 2, 3], dates)

print(s.asfreq('D'))
2012-05-01    1.0
2012-05-02    NaN
2012-05-03    NaN
2012-05-04    2.0
2012-05-05    NaN
2012-05-06    3.0
Freq: D, dtype: float64

A quicker workaround is to use .asfreq(). This doesn’t require creation of a new index to call within .reindex().

# "broken" (staggered) dates
dates = pd.Index([pd.Timestamp('2012-05-01'), 
                  pd.Timestamp('2012-05-04'), 
                  pd.Timestamp('2012-05-06')])
s = pd.Series([1, 2, 3], dates)

print(s.asfreq('D'))
2012-05-01    1.0
2012-05-02    NaN
2012-05-03    NaN
2012-05-04    2.0
2012-05-05    NaN
2012-05-06    3.0
Freq: D, dtype: float64

回答 2

一个问题是,reindex如果存在重复值,该操作将失败。假设我们正在处理带时间戳的数据,我们希望按日期将其编入索引:

df = pd.DataFrame({
    'timestamps': pd.to_datetime(
        ['2016-11-15 1:00','2016-11-16 2:00','2016-11-16 3:00','2016-11-18 4:00']),
    'values':['a','b','c','d']})
df.index = pd.DatetimeIndex(df['timestamps']).floor('D')
df

Yield

            timestamps             values
2016-11-15  "2016-11-15 01:00:00"  a
2016-11-16  "2016-11-16 02:00:00"  b
2016-11-16  "2016-11-16 03:00:00"  c
2016-11-18  "2016-11-18 04:00:00"  d

由于2016-11-16日期重复,尝试重新编制索引:

all_days = pd.date_range(df.index.min(), df.index.max(), freq='D')
df.reindex(all_days)

失败与:

...
ValueError: cannot reindex from a duplicate axis

(这表示索引重复,而不是索引本身是重复项)

相反,我们可以使用.loc查找范围内所有日期的条目:

df.loc[all_days]

Yield

            timestamps             values
2016-11-15  "2016-11-15 01:00:00"  a
2016-11-16  "2016-11-16 02:00:00"  b
2016-11-16  "2016-11-16 03:00:00"  c
2016-11-17  NaN                    NaN
2016-11-18  "2016-11-18 04:00:00"  d

fillna 如果需要,可用于色谱柱系列以填充空白。

One issue is that reindex will fail if there are duplicate values. Say we’re working with timestamped data, which we want to index by date:

df = pd.DataFrame({
    'timestamps': pd.to_datetime(
        ['2016-11-15 1:00','2016-11-16 2:00','2016-11-16 3:00','2016-11-18 4:00']),
    'values':['a','b','c','d']})
df.index = pd.DatetimeIndex(df['timestamps']).floor('D')
df

yields

            timestamps             values
2016-11-15  "2016-11-15 01:00:00"  a
2016-11-16  "2016-11-16 02:00:00"  b
2016-11-16  "2016-11-16 03:00:00"  c
2016-11-18  "2016-11-18 04:00:00"  d

Due to the duplicate 2016-11-16 date, an attempt to reindex:

all_days = pd.date_range(df.index.min(), df.index.max(), freq='D')
df.reindex(all_days)

fails with:

...
ValueError: cannot reindex from a duplicate axis

(by this it means the index has duplicates, not that it is itself a dup)

Instead, we can use .loc to look up entries for all dates in range:

df.loc[all_days]

yields

            timestamps             values
2016-11-15  "2016-11-15 01:00:00"  a
2016-11-16  "2016-11-16 02:00:00"  b
2016-11-16  "2016-11-16 03:00:00"  c
2016-11-17  NaN                    NaN
2016-11-18  "2016-11-18 04:00:00"  d

fillna can be used on the column series to fill blanks if needed.


回答 3

另一种方法是resample,除了缺少日期外,还可以处理重复的日期。例如:

df.resample('D').mean()

resample是一个类似的延迟操作,groupby因此您需要执行另一个操作。在这种情况下mean工作得很好,但你也可以使用许多其他的熊猫方法,如maxsum等。

这是原始数据,但带有“ 2013-09-03”的附加条目:

             val
date           
2013-09-02     2
2013-09-03    10
2013-09-03    20    <- duplicate date added to OP's data
2013-09-06     5
2013-09-07     1

结果如下:

             val
date            
2013-09-02   2.0
2013-09-03  15.0    <- mean of original values for 2013-09-03
2013-09-04   NaN    <- NaN b/c date not present in orig
2013-09-05   NaN    <- NaN b/c date not present in orig
2013-09-06   5.0
2013-09-07   1.0

我将遗漏的日期保留为NaN以便清楚地说明其工作原理,但是您可以fillna(0)根据OP的要求添加以零代替NaN的方法,也可以interpolate()根据相邻行使用类似非零值的填充方法。

An alternative approach is resample, which can handle duplicate dates in addition to missing dates. For example:

df.resample('D').mean()

resample is a deferred operation like groupby so you need to follow it with another operation. In this case mean works well, but you can also use many other pandas methods like max, sum, etc.

Here is the original data, but with an extra entry for ‘2013-09-03’:

             val
date           
2013-09-02     2
2013-09-03    10
2013-09-03    20    <- duplicate date added to OP's data
2013-09-06     5
2013-09-07     1

And here are the results:

             val
date            
2013-09-02   2.0
2013-09-03  15.0    <- mean of original values for 2013-09-03
2013-09-04   NaN    <- NaN b/c date not present in orig
2013-09-05   NaN    <- NaN b/c date not present in orig
2013-09-06   5.0
2013-09-07   1.0

I left the missing dates as NaNs to make it clear how this works, but you can add fillna(0) to replace NaNs with zeroes as requested by the OP or alternatively use something like interpolate() to fill with non-zero values based on the neighboring rows.


回答 4

这是一种将缺失的日期填充到数据框中的好方法,您可以选择fill_valuedays_back填充和date_order排序对数据框进行排序的顺序():

def fill_in_missing_dates(df, date_col_name = 'date',date_order = 'asc', fill_value = 0, days_back = 30):

    df.set_index(date_col_name,drop=True,inplace=True)
    df.index = pd.DatetimeIndex(df.index)
    d = datetime.now().date()
    d2 = d - timedelta(days = days_back)
    idx = pd.date_range(d2, d, freq = "D")
    df = df.reindex(idx,fill_value=fill_value)
    df[date_col_name] = pd.DatetimeIndex(df.index)

    return df

Here’s a nice method to fill in missing dates into a dataframe, with your choice of fill_value, days_back to fill in, and sort order (date_order) by which to sort the dataframe:

def fill_in_missing_dates(df, date_col_name = 'date',date_order = 'asc', fill_value = 0, days_back = 30):

    df.set_index(date_col_name,drop=True,inplace=True)
    df.index = pd.DatetimeIndex(df.index)
    d = datetime.now().date()
    d2 = d - timedelta(days = days_back)
    idx = pd.date_range(d2, d, freq = "D")
    df = df.reindex(idx,fill_value=fill_value)
    df[date_col_name] = pd.DatetimeIndex(df.index)

    return df

最干净,最Pythonic的方式来获取明天的约会?

问题:最干净,最Pythonic的方式来获取明天的约会?

获取明天约会的最干净,最Python方式是什么?必须有比在一天中增加一天,在月末处理天等等的更好的方法。

What is the cleanest and most Pythonic way to get tomorrow’s date? There must be a better way than to add one to the day, handle days at the end of the month, etc.


回答 0

datetime.date.today() + datetime.timedelta(days=1) 应该可以

datetime.date.today() + datetime.timedelta(days=1) should do the trick


回答 1

timedelta 可以处理增加的天,秒,微秒,毫秒,分钟,小时或星期。

>>> import datetime
>>> today = datetime.date.today()
>>> today
datetime.date(2009, 10, 1)
>>> today + datetime.timedelta(days=1)
datetime.date(2009, 10, 2)
>>> datetime.date(2009,10,31) + datetime.timedelta(hours=24)
datetime.date(2009, 11, 1)

如评论中所问,leap日没有问题:

>>> datetime.date(2004, 2, 28) + datetime.timedelta(days=1)
datetime.date(2004, 2, 29)
>>> datetime.date(2004, 2, 28) + datetime.timedelta(days=2)
datetime.date(2004, 3, 1)
>>> datetime.date(2005, 2, 28) + datetime.timedelta(days=1)
datetime.date(2005, 3, 1)

timedelta can handle adding days, seconds, microseconds, milliseconds, minutes, hours, or weeks.

>>> import datetime
>>> today = datetime.date.today()
>>> today
datetime.date(2009, 10, 1)
>>> today + datetime.timedelta(days=1)
datetime.date(2009, 10, 2)
>>> datetime.date(2009,10,31) + datetime.timedelta(hours=24)
datetime.date(2009, 11, 1)

As asked in a comment, leap days pose no problem:

>>> datetime.date(2004, 2, 28) + datetime.timedelta(days=1)
datetime.date(2004, 2, 29)
>>> datetime.date(2004, 2, 28) + datetime.timedelta(days=2)
datetime.date(2004, 3, 1)
>>> datetime.date(2005, 2, 28) + datetime.timedelta(days=1)
datetime.date(2005, 3, 1)

回答 2

没有处理的闰秒寿:

>>> from datetime import datetime, timedelta
>>> dt = datetime(2008,12,31,23,59,59)
>>> str(dt)
'2008-12-31 23:59:59'
>>> # leap second was added at the end of 2008, 
>>> # adding one second should create a datetime
>>> # of '2008-12-31 23:59:60'
>>> str(dt+timedelta(0,1))
'2009-01-01 00:00:00'
>>> str(dt+timedelta(0,2))
'2009-01-01 00:00:01'

该死的

编辑-@Mark:文档说“是”,但是代码说“不是很多”:

>>> time.strptime("2008-12-31 23:59:60","%Y-%m-%d %H:%M:%S")
(2008, 12, 31, 23, 59, 60, 2, 366, -1)
>>> time.mktime(time.strptime("2008-12-31 23:59:60","%Y-%m-%d %H:%M:%S"))
1230789600.0
>>> time.gmtime(time.mktime(time.strptime("2008-12-31 23:59:60","%Y-%m-%d %H:%M:%S")))
(2009, 1, 1, 6, 0, 0, 3, 1, 0)
>>> time.localtime(time.mktime(time.strptime("2008-12-31 23:59:60","%Y-%m-%d %H:%M:%S")))
(2009, 1, 1, 0, 0, 0, 3, 1, 0)

我认为gmtime或localtime将采用mktime返回的值,并以60秒为单位返回给我原始的元组。该测试表明这些leap秒会逐渐消失…

>>> a = time.mktime(time.strptime("2008-12-31 23:59:60","%Y-%m-%d %H:%M:%S"))
>>> b = time.mktime(time.strptime("2009-01-01 00:00:00","%Y-%m-%d %H:%M:%S"))
>>> a,b
(1230789600.0, 1230789600.0)
>>> b-a
0.0

No handling of leap seconds tho:

>>> from datetime import datetime, timedelta
>>> dt = datetime(2008,12,31,23,59,59)
>>> str(dt)
'2008-12-31 23:59:59'
>>> # leap second was added at the end of 2008, 
>>> # adding one second should create a datetime
>>> # of '2008-12-31 23:59:60'
>>> str(dt+timedelta(0,1))
'2009-01-01 00:00:00'
>>> str(dt+timedelta(0,2))
'2009-01-01 00:00:01'

darn.

EDIT – @Mark: The docs say “yes”, but the code says “not so much”:

>>> time.strptime("2008-12-31 23:59:60","%Y-%m-%d %H:%M:%S")
(2008, 12, 31, 23, 59, 60, 2, 366, -1)
>>> time.mktime(time.strptime("2008-12-31 23:59:60","%Y-%m-%d %H:%M:%S"))
1230789600.0
>>> time.gmtime(time.mktime(time.strptime("2008-12-31 23:59:60","%Y-%m-%d %H:%M:%S")))
(2009, 1, 1, 6, 0, 0, 3, 1, 0)
>>> time.localtime(time.mktime(time.strptime("2008-12-31 23:59:60","%Y-%m-%d %H:%M:%S")))
(2009, 1, 1, 0, 0, 0, 3, 1, 0)

I would think that gmtime or localtime would take the value returned by mktime and given me back the original tuple, with 60 as the number of seconds. And this test shows that these leap seconds can just fade away…

>>> a = time.mktime(time.strptime("2008-12-31 23:59:60","%Y-%m-%d %H:%M:%S"))
>>> b = time.mktime(time.strptime("2009-01-01 00:00:00","%Y-%m-%d %H:%M:%S"))
>>> a,b
(1230789600.0, 1230789600.0)
>>> b-a
0.0

回答 3

即使是基本time模块也可以处理此问题:

import time
time.localtime(time.time() + 24*3600)

Even the basic time module can handle this:

import time
time.localtime(time.time() + 24*3600)

是否有Python函数可确定日期在一年的哪个季度?

问题:是否有Python函数可确定日期在一年的哪个季度?

当然,我可以自己写这个,但是在我重新发明轮子之前,已经有一个已经做到这一点的功能了吗?

Sure I could write this myself, but before I go reinventing the wheel is there a function that already does this?


回答 0

给定一个实例xdatetime.date(x.month-1)//3会给你的四分之一(0第一季度,1第二季度,等等-加1,如果你需要从1计数,而不是;-)。


最初有两个答案,乘以赞成票,甚至最初被接受(目前都已删除),但有很多错误- -1除法前不做除法,而是除以4而不是3。从.month1到12,可以很容易地检查一下自己的公式是什么对:

for m in range(1, 13):
  print m//4 + 1,
print

给出1 1 1 2 2 2 2 3 3 3 3 4-两个四个月的季度和一个月的季度(深度)。

for m in range(1, 13):
  print (m-1)//3 + 1,
print

1 1 1 2 2 2 3 3 3 4 4 4-现在看起来这对您来说不是很可取吗?-)

我认为,这证明了这个问题是正确的;-)。

我认为datetime模块不一定应该具有所有可能有用的日历功能,但是我确实知道我在工作中维护了一个(经过测试;-)datetools模块来使用我的(以及其他人)项目,该模块几乎没有执行所有这些日历计算的函数-有些复杂,有些简单,但没有理由一遍又一遍地进行工作(甚至是简单的工作),也没有理由冒这样的错误;-)。

Given an instance x of datetime.date, (x.month-1)//3 will give you the quarter (0 for first quarter, 1 for second quarter, etc — add 1 if you need to count from 1 instead;-).


Originally two answers, multiply upvoted and even originally accepted (both currently deleted), were buggy — not doing the -1 before the division, and dividing by 4 instead of 3. Since .month goes 1 to 12, it’s easy to check for yourself what formula is right:

for m in range(1, 13):
  print m//4 + 1,
print

gives 1 1 1 2 2 2 2 3 3 3 3 4 — two four-month quarters and a single-month one (eep).

for m in range(1, 13):
  print (m-1)//3 + 1,
print

gives 1 1 1 2 2 2 3 3 3 4 4 4 — now doesn’t this look vastly preferable to you?-)

This proves that the question is well warranted, I think;-).

I don’t think the datetime module should necessarily have every possible useful calendric function, but I do know I maintain a (well-tested;-) datetools module for the use of my (and others’) projects at work, which has many little functions to perform all of these calendric computations — some are complex, some simple, but there’s no reason to do the work over and over (even simple work) or risk bugs in such computations;-).


回答 1

如果您已经在使用pandas,则非常简单。

import datetime as dt
import pandas as pd

quarter = pd.Timestamp(dt.date(2016, 2, 29)).quarter
assert quarter == 1

如果date数据框中有一列,则可以轻松创建一个新quarter列:

df['quarter'] = df['date'].dt.quarter

IF you are already using pandas, it’s quite simple.

import datetime as dt
import pandas as pd

quarter = pd.Timestamp(dt.date(2016, 2, 29)).quarter
assert quarter == 1

If you have a date column in a dataframe, you can easily create a new quarter column:

df['quarter'] = df['date'].dt.quarter

回答 2

我会建议另一个可以说更清洁的解决方案。如果X是一个datetime.datetime.now()实例,则四分之一是:

import math
Q=math.ceil(X.month/3.)

必须从数学模块中导入ceil,因为它无法直接访问。

I would suggest another arguably cleaner solution. If X is a datetime.datetime.now() instance, then the quarter is:

import math
Q=math.ceil(X.month/3.)

ceil has to be imported from math module as it can’t be accessed directly.


回答 3

对于试图获取会计年度季度(可能与日历年度有所不同)的任何人,我都编写了Python模块来执行此操作。

安装很简单。赶紧跑:

$ pip install fiscalyear

没有依赖关系,并且fiscalyear对Python 2和3都适用。

它基本上是内置的datetime模块的包装器,因此,datetime您已经熟悉的任何命令都可以使用。这是一个演示:

>>> from fiscalyear import *
>>> a = FiscalDate.today()
>>> a
FiscalDate(2017, 5, 6)
>>> a.fiscal_year
2017
>>> a.quarter
3
>>> b = FiscalYear(2017)
>>> b.start
FiscalDateTime(2016, 10, 1, 0, 0)
>>> b.end
FiscalDateTime(2017, 9, 30, 23, 59, 59)
>>> b.q3
FiscalQuarter(2017, 3)
>>> b.q3.start
FiscalDateTime(2017, 4, 1, 0, 0)
>>> b.q3.end
FiscalDateTime(2017, 6, 30, 23, 59, 59)

fiscalyear托管在GitHubPyPI上。可以在“ 阅读文档”中找到文档。如果您正在寻找它当前没有的任何功能,请告诉我!

For anyone trying to get the quarter of the fiscal year, which may differ from the calendar year, I wrote a Python module to do just this.

Installation is simple. Just run:

$ pip install fiscalyear

There are no dependencies, and fiscalyear should work for both Python 2 and 3.

It’s basically a wrapper around the built-in datetime module, so any datetime commands you are already familiar with will work. Here’s a demo:

>>> from fiscalyear import *
>>> a = FiscalDate.today()
>>> a
FiscalDate(2017, 5, 6)
>>> a.fiscal_year
2017
>>> a.quarter
3
>>> b = FiscalYear(2017)
>>> b.start
FiscalDateTime(2016, 10, 1, 0, 0)
>>> b.end
FiscalDateTime(2017, 9, 30, 23, 59, 59)
>>> b.q3
FiscalQuarter(2017, 3)
>>> b.q3.start
FiscalDateTime(2017, 4, 1, 0, 0)
>>> b.q3.end
FiscalDateTime(2017, 6, 30, 23, 59, 59)

fiscalyear is hosted on GitHub and PyPI. Documentation can be found at Read the Docs. If you’re looking for any features that it doesn’t currently have, let me know!


回答 4

这是一个获取datetime.datetime对象并为每个季度返回唯一字符串的函数示例:

from datetime import datetime, timedelta

def get_quarter(d):
    return "Q%d_%d" % (math.ceil(d.month/3), d.year)

d = datetime.now()
print(d.strftime("%Y-%m-%d"), get_q(d))

d2 = d - timedelta(90)
print(d2.strftime("%Y-%m-%d"), get_q(d2))

d3 = d - timedelta(180 + 365)
print(d3.strftime("%Y-%m-%d"), get_q(d3))

输出为:

2019-02-14 Q1_2019
2018-11-16 Q4_2018
2017-08-18 Q3_2017

Here is an example of a function that gets a datetime.datetime object and returns a unique string for each quarter:

from datetime import datetime, timedelta

def get_quarter(d):
    return "Q%d_%d" % (math.ceil(d.month/3), d.year)

d = datetime.now()
print(d.strftime("%Y-%m-%d"), get_q(d))

d2 = d - timedelta(90)
print(d2.strftime("%Y-%m-%d"), get_q(d2))

d3 = d - timedelta(180 + 365)
print(d3.strftime("%Y-%m-%d"), get_q(d3))

And the output is:

2019-02-14 Q1_2019
2018-11-16 Q4_2018
2017-08-18 Q3_2017

回答 5

如果m是月份号…

import math
math.ceil(float(m) / 3)

if m is the month number…

import math
math.ceil(float(m) / 3)

回答 6

此方法适用于任何映射:

month2quarter = {
        1:1,2:1,3:1,
        4:2,5:2,6:2,
        7:3,8:3,9:3,
        10:4,11:4,12:4,
    }.get

我们刚刚生成了一个函数 int->int

month2quarter(9) # returns 3

这种方法也是万无一失的

month2quarter(-1) # returns None
month2quarter('July') # returns None

This method works for any mapping:

month2quarter = {
        1:1,2:1,3:1,
        4:2,5:2,6:2,
        7:3,8:3,9:3,
        10:4,11:4,12:4,
    }.get

We have just generated a function int->int

month2quarter(9) # returns 3

This method is also fool-proof

month2quarter(-1) # returns None
month2quarter('July') # returns None

回答 7

对于那些正在使用熊猫查询会计年度季度数据的人,

import datetime
import pandas as pd
today_date = datetime.date.today()
quarter = pd.PeriodIndex(today_date, freq='Q-MAR').strftime('Q%q')

参考: 大熊猫时期指数

For those, who are looking for financial year quarter data, using pandas,

import datetime
import pandas as pd
today_date = datetime.date.today()
quarter = pd.PeriodIndex(today_date, freq='Q-MAR').strftime('Q%q')

reference: pandas period index


回答 8

这是一个老问题,但仍然值得讨论。

这是我的解决方案,使用了出色的dateutil模块。

  from dateutil import rrule,relativedelta

   year = this_date.year
   quarters = rrule.rrule(rrule.MONTHLY,
                      bymonth=(1,4,7,10),
                      bysetpos=-1,
                      dtstart=datetime.datetime(year,1,1),
                      count=8)

   first_day = quarters.before(this_date)
   last_day =  (quarters.after(this_date)
                -relativedelta.relativedelta(days=1)

因此first_day,该季度的第一天也是该季度last_day的最后一天(通过找到下一个季度的第一天减去一天来计算)。

This is an old question but still worthy of discussion.

Here is my solution, using the excellent dateutil module.

  from dateutil import rrule,relativedelta

   year = this_date.year
   quarters = rrule.rrule(rrule.MONTHLY,
                      bymonth=(1,4,7,10),
                      bysetpos=-1,
                      dtstart=datetime.datetime(year,1,1),
                      count=8)

   first_day = quarters.before(this_date)
   last_day =  (quarters.after(this_date)
                -relativedelta.relativedelta(days=1)

So first_day is the first day of the quarter, and last_day is the last day of the quarter (calculated by finding the first day of the next quarter, minus one day).


回答 9

这非常简单,可以在python3中使用:

from datetime import datetime

# Get current date-time.
now = datetime.now()

# Determine which quarter of the year is now. Returns q1, q2, q3 or q4.
quarter_of_the_year = 'q'+str((now.month-1)//3+1)

This is very simple and works in python3:

from datetime import datetime

# Get current date-time.
now = datetime.now()

# Determine which quarter of the year is now. Returns q1, q2, q3 or q4.
quarter_of_the_year = 'q'+str((now.month-1)//3+1)

回答 10

嗯,所以计算可能会出错,这是一个更好的版本(仅出于此目的)

first, second, third, fourth=1,2,3,4# you can make strings if you wish :)

quarterMap = {}
quarterMap.update(dict(zip((1,2,3),(first,)*3)))
quarterMap.update(dict(zip((4,5,6),(second,)*3)))
quarterMap.update(dict(zip((7,8,9),(third,)*3)))
quarterMap.update(dict(zip((10,11,12),(fourth,)*3)))

print quarterMap[6]

hmmm so calculations can go wrong, here is a better version (just for the sake of it)

first, second, third, fourth=1,2,3,4# you can make strings if you wish :)

quarterMap = {}
quarterMap.update(dict(zip((1,2,3),(first,)*3)))
quarterMap.update(dict(zip((4,5,6),(second,)*3)))
quarterMap.update(dict(zip((7,8,9),(third,)*3)))
quarterMap.update(dict(zip((10,11,12),(fourth,)*3)))

print quarterMap[6]

回答 11

这是一个冗长但易读的解决方案,适用于datetime和date实例

def get_quarter(date):
    for months, quarter in [
        ([1, 2, 3], 1),
        ([4, 5, 6], 2),
        ([7, 8, 9], 3),
        ([10, 11, 12], 4)
    ]:
        if date.month in months:
            return quarter

Here is a verbose, but also readable solution that will work for datetime and date instances

def get_quarter(date):
    for months, quarter in [
        ([1, 2, 3], 1),
        ([4, 5, 6], 2),
        ([7, 8, 9], 3),
        ([10, 11, 12], 4)
    ]:
        if date.month in months:
            return quarter

回答 12

使用字典,您可以通过

def get_quarter(month):
    quarter_dictionary = {
        "Q1" : [1,2,3],
        "Q2" : [4,5,6],
        "Q3" : [7,8,9],
        "Q4" : [10,11,12]
    }

    for key,values in quarter_dictionary.items():
        for value in values:
            if value == month:
                return key

print(get_quarter(3))

using dictionaries, you can pull this off by

def get_quarter(month):
    quarter_dictionary = {
        "Q1" : [1,2,3],
        "Q2" : [4,5,6],
        "Q3" : [7,8,9],
        "Q4" : [10,11,12]
    }

    for key,values in quarter_dictionary.items():
        for value in values:
            if value == month:
                return key

print(get_quarter(3))

如何将熊猫数据框中的日期转换为“日期”数据类型?

问题:如何将熊猫数据框中的日期转换为“日期”数据类型?

我有一个熊猫数据框,其中一列包含格式为日期的字符串 YYYY-MM-DD

例如 '2013-10-28'

目前该dtype列的是object

如何将列值转换为Pandas日期格式?

I have a Pandas data frame, one of the column contains date strings in the format YYYY-MM-DD

For e.g. '2013-10-28'

At the moment the dtype of the column is object.

How do I convert the column values to Pandas date format?


回答 0

使用类型

In [31]: df
Out[31]: 
   a        time
0  1  2013-01-01
1  2  2013-01-02
2  3  2013-01-03

In [32]: df['time'] = df['time'].astype('datetime64[ns]')

In [33]: df
Out[33]: 
   a                time
0  1 2013-01-01 00:00:00
1  2 2013-01-02 00:00:00
2  3 2013-01-03 00:00:00

Use astype

In [31]: df
Out[31]: 
   a        time
0  1  2013-01-01
1  2  2013-01-02
2  3  2013-01-03

In [32]: df['time'] = df['time'].astype('datetime64[ns]')

In [33]: df
Out[33]: 
   a                time
0  1 2013-01-01 00:00:00
1  2 2013-01-02 00:00:00
2  3 2013-01-03 00:00:00

回答 1

基本上等同于@waitingkuo,但我将to_datetime在这里使用(它看起来更干净一些,并提供了一些其他功能,例如dayfirst):

In [11]: df
Out[11]:
   a        time
0  1  2013-01-01
1  2  2013-01-02
2  3  2013-01-03

In [12]: pd.to_datetime(df['time'])
Out[12]:
0   2013-01-01 00:00:00
1   2013-01-02 00:00:00
2   2013-01-03 00:00:00
Name: time, dtype: datetime64[ns]

In [13]: df['time'] = pd.to_datetime(df['time'])

In [14]: df
Out[14]:
   a                time
0  1 2013-01-01 00:00:00
1  2 2013-01-02 00:00:00
2  3 2013-01-03 00:00:00

处理ValueError小号
如果碰上的情况下做

df['time'] = pd.to_datetime(df['time'])

抛出一个

ValueError: Unknown string format

这意味着您具有无效(不可强制)的值。如果可以将它们转换为pd.NaT,可以在其中添加一个errors='coerce'参数to_datetime

df['time'] = pd.to_datetime(df['time'], errors='coerce')

Essentially equivalent to @waitingkuo, but I would use to_datetime here (it seems a little cleaner, and offers some additional functionality e.g. dayfirst):

In [11]: df
Out[11]:
   a        time
0  1  2013-01-01
1  2  2013-01-02
2  3  2013-01-03

In [12]: pd.to_datetime(df['time'])
Out[12]:
0   2013-01-01 00:00:00
1   2013-01-02 00:00:00
2   2013-01-03 00:00:00
Name: time, dtype: datetime64[ns]

In [13]: df['time'] = pd.to_datetime(df['time'])

In [14]: df
Out[14]:
   a                time
0  1 2013-01-01 00:00:00
1  2 2013-01-02 00:00:00
2  3 2013-01-03 00:00:00

Handling ValueErrors
If you run into a situation where doing

df['time'] = pd.to_datetime(df['time'])

Throws a

ValueError: Unknown string format

That means you have invalid (non-coercible) values. If you are okay with having them converted to pd.NaT, you can add an errors='coerce' argument to to_datetime:

df['time'] = pd.to_datetime(df['time'], errors='coerce')

回答 2

我想象大量数据从CSV文件输入到Pandas中,在这种情况下,您可以简单地在初始CSV读取期间转换日期:

dfcsv = pd.read_csv('xyz.csv', parse_dates=[0])其中0表示日期所在的列。如果希望日期成为索引,
也可以, index_col=0在其中添加。

参见https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html

I imagine a lot of data comes into Pandas from CSV files, in which case you can simply convert the date during the initial CSV read:

dfcsv = pd.read_csv('xyz.csv', parse_dates=[0]) where the 0 refers to the column the date is in.
You could also add , index_col=0 in there if you want the date to be your index.

See https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html


回答 3

现在你可以做 df['column'].dt.date

请注意,对于日期时间对象,如果您没有看到它们均为00:00:00的小时数,则说明它不是熊猫。那是iPython笔记本,试图使事情看起来更漂亮。

Now you can do df['column'].dt.date

Note that for datetime objects, if you don’t see the hour when they’re all 00:00:00, that’s not pandas. That’s iPython notebook trying to make things look pretty.


回答 4

执行此操作的另一种方法,如果您有多个要转换为日期时间的列,则此方法效果很好。

cols = ['date1','date2']
df[cols] = df[cols].apply(pd.to_datetime)

Another way to do this and this works well if you have multiple columns to convert to datetime.

cols = ['date1','date2']
df[cols] = df[cols].apply(pd.to_datetime)

回答 5

如果要获取DATE而不是DATETIME格式:

df["id_date"] = pd.to_datetime(df["id_date"]).dt.date

If you want to get the DATE and not DATETIME format:

df["id_date"] = pd.to_datetime(df["id_date"]).dt.date

回答 6

在某些情况下,可能需要将日期转换为其他频率。在这种情况下,我建议按日期设置索引。

#set an index by dates
df.set_index(['time'], drop=True, inplace=True)

此后,您可以更轻松地转换为最需要的日期格式类型。在下面,我依次转换为多种日期格式,最终以每个月初的一组每日日期结束。

#Convert to daily dates
df.index = pd.DatetimeIndex(data=df.index)

#Convert to monthly dates
df.index = df.index.to_period(freq='M')

#Convert to strings
df.index = df.index.strftime('%Y-%m')

#Convert to daily dates
df.index = pd.DatetimeIndex(data=df.index)

为了简洁起见,我没有显示在上面的每一行之后都运行以下代码:

print(df.index)
print(df.index.dtype)
print(type(df.index))

这给了我以下输出:

Index(['2013-01-01', '2013-01-02', '2013-01-03'], dtype='object', name='time')
object
<class 'pandas.core.indexes.base.Index'>

DatetimeIndex(['2013-01-01', '2013-01-02', '2013-01-03'], dtype='datetime64[ns]', name='time', freq=None)
datetime64[ns]
<class 'pandas.core.indexes.datetimes.DatetimeIndex'>

PeriodIndex(['2013-01', '2013-01', '2013-01'], dtype='period[M]', name='time', freq='M')
period[M]
<class 'pandas.core.indexes.period.PeriodIndex'>

Index(['2013-01', '2013-01', '2013-01'], dtype='object')
object
<class 'pandas.core.indexes.base.Index'>

DatetimeIndex(['2013-01-01', '2013-01-01', '2013-01-01'], dtype='datetime64[ns]', freq=None)
datetime64[ns]
<class 'pandas.core.indexes.datetimes.DatetimeIndex'>

It may be the case that dates need to be converted to a different frequency. In this case, I would suggest setting an index by dates.

#set an index by dates
df.set_index(['time'], drop=True, inplace=True)

After this, you can more easily convert to the type of date format you will need most. Below, I sequentially convert to a number of date formats, ultimately ending up with a set of daily dates at the beginning of the month.

#Convert to daily dates
df.index = pd.DatetimeIndex(data=df.index)

#Convert to monthly dates
df.index = df.index.to_period(freq='M')

#Convert to strings
df.index = df.index.strftime('%Y-%m')

#Convert to daily dates
df.index = pd.DatetimeIndex(data=df.index)

For brevity, I don’t show that I run the following code after each line above:

print(df.index)
print(df.index.dtype)
print(type(df.index))

This gives me the following output:

Index(['2013-01-01', '2013-01-02', '2013-01-03'], dtype='object', name='time')
object
<class 'pandas.core.indexes.base.Index'>

DatetimeIndex(['2013-01-01', '2013-01-02', '2013-01-03'], dtype='datetime64[ns]', name='time', freq=None)
datetime64[ns]
<class 'pandas.core.indexes.datetimes.DatetimeIndex'>

PeriodIndex(['2013-01', '2013-01', '2013-01'], dtype='period[M]', name='time', freq='M')
period[M]
<class 'pandas.core.indexes.period.PeriodIndex'>

Index(['2013-01', '2013-01', '2013-01'], dtype='object')
object
<class 'pandas.core.indexes.base.Index'>

DatetimeIndex(['2013-01-01', '2013-01-01', '2013-01-01'], dtype='datetime64[ns]', freq=None)
datetime64[ns]
<class 'pandas.core.indexes.datetimes.DatetimeIndex'>

回答 7

尝试使用pd.to_datetime函数将行之一转换为时间戳,然后使用.map将公式映射到整个列

Try to convert one of the rows into timestamp using the pd.to_datetime function and then use .map to map the formular to the entire column


回答 8

 #   Column          Non-Null Count   Dtype         
---  ------          --------------   -----         
 0   startDay        110526 non-null  object
 1   endDay          110526 non-null  object

import pandas as pd

df['startDay'] = pd.to_datetime(df.startDay)

df['endDay'] = pd.to_datetime(df.endDay)

 #   Column          Non-Null Count   Dtype         
---  ------          --------------   -----         
 0   startDay        110526 non-null  datetime64[ns]
 1   endDay          110526 non-null  datetime64[ns]
 #   Column          Non-Null Count   Dtype         
---  ------          --------------   -----         
 0   startDay        110526 non-null  object
 1   endDay          110526 non-null  object

import pandas as pd

df['startDay'] = pd.to_datetime(df.startDay)

df['endDay'] = pd.to_datetime(df.endDay)

 #   Column          Non-Null Count   Dtype         
---  ------          --------------   -----         
 0   startDay        110526 non-null  datetime64[ns]
 1   endDay          110526 non-null  datetime64[ns]

回答 9

为了完整起见,可能不是最直接的另一种选择,有点类似于@SSS提出的选择,但使用datetime库是:

import datetime
df["Date"] = df["Date"].apply(lambda x: datetime.datetime.strptime(x, '%Y-%d-%m').date())

For the sake of completeness, another option, which might not be the most straightforward one, a bit similar to the one proposed by @SSS, but using rather the datetime library is:

import datetime
df["Date"] = df["Date"].apply(lambda x: datetime.datetime.strptime(x, '%Y-%d-%m').date())

Python中生日的年龄

问题:Python中生日的年龄

如何从今天的日期和人的出生日期中找到python的年龄?生日是来自Django模型中的DateField的。

How can I find an age in python from today’s date and a persons birthdate? The birthdate is a from a DateField in a Django model.


回答 0

考虑到int(True)为1且int(False)为0,可以轻松得多:

from datetime import date

def calculate_age(born):
    today = date.today()
    return today.year - born.year - ((today.month, today.day) < (born.month, born.day))

That can be done much simpler considering that int(True) is 1 and int(False) is 0:

from datetime import date

def calculate_age(born):
    today = date.today()
    return today.year - born.year - ((today.month, today.day) < (born.month, born.day))

回答 1

from datetime import date

def calculate_age(born):
    today = date.today()
    try: 
        birthday = born.replace(year=today.year)
    except ValueError: # raised when birth date is February 29 and the current year is not a leap year
        birthday = born.replace(year=today.year, month=born.month+1, day=1)
    if birthday > today:
        return today.year - born.year - 1
    else:
        return today.year - born.year

更新:使用Danny的解决方案更好

from datetime import date

def calculate_age(born):
    today = date.today()
    try: 
        birthday = born.replace(year=today.year)
    except ValueError: # raised when birth date is February 29 and the current year is not a leap year
        birthday = born.replace(year=today.year, month=born.month+1, day=1)
    if birthday > today:
        return today.year - born.year - 1
    else:
        return today.year - born.year

Update: Use Danny’s solution, it’s better


回答 2

from datetime import date

days_in_year = 365.2425    
age = int((date.today() - birth_date).days / days_in_year)

在Python 3中,您可以对进行除法datetime.timedelta

from datetime import date, timedelta

age = (date.today() - birth_date) // timedelta(days=365.2425)
from datetime import date

days_in_year = 365.2425    
age = int((date.today() - birth_date).days / days_in_year)

In Python 3, you could perform division on datetime.timedelta:

from datetime import date, timedelta

age = (date.today() - birth_date) // timedelta(days=365.2425)

回答 3

如@ [Tomasz Zielinski]和@Williams的建议,python-dateutil只能执行5行。

from dateutil.relativedelta import *
from datetime import date
today = date.today()
dob = date(1982, 7, 5)
age = relativedelta(today, dob)

>>relativedelta(years=+33, months=+11, days=+16)`

As suggested by @[Tomasz Zielinski] and @Williams python-dateutil can do it just 5 lines.

from dateutil.relativedelta import *
from datetime import date
today = date.today()
dob = date(1982, 7, 5)
age = relativedelta(today, dob)

>>relativedelta(years=+33, months=+11, days=+16)`

回答 4

最简单的方法是使用 python-dateutil

import datetime

import dateutil

def birthday(date):
    # Get the current date
    now = datetime.datetime.utcnow()
    now = now.date()

    # Get the difference between the current date and the birthday
    age = dateutil.relativedelta.relativedelta(now, date)
    age = age.years

    return age

The simplest way is using python-dateutil

import datetime

import dateutil

def birthday(date):
    # Get the current date
    now = datetime.datetime.utcnow()
    now = now.date()

    # Get the difference between the current date and the birthday
    age = dateutil.relativedelta.relativedelta(now, date)
    age = age.years

    return age

回答 5

from datetime import date

def age(birth_date):
    today = date.today()
    y = today.year - birth_date.year
    if today.month < birth_date.month or today.month == birth_date.month and today.day < birth_date.day:
        y -= 1
    return y
from datetime import date

def age(birth_date):
    today = date.today()
    y = today.year - birth_date.year
    if today.month < birth_date.month or today.month == birth_date.month and today.day < birth_date.day:
        y -= 1
    return y

回答 6

不幸的是,您不能仅仅使用timedelata,因为它使用的最大单位是天,leap年将使您的计算无效。因此,让我们找到年数,如果最后一年未满,则将其调整为一:

from datetime import date
birth_date = date(1980, 5, 26)
years = date.today().year - birth_date.year
if (datetime.now() - birth_date.replace(year=datetime.now().year)).days >= 0:
    age = years
else:
    age = years - 1

更新:

当2月29日生效时,此解决方案确实会导致异常。这是正确的检查:

from datetime import date
birth_date = date(1980, 5, 26)
today = date.today()
years = today.year - birth_date.year
if all((x >= y) for x,y in zip(today.timetuple(), birth_date.timetuple()):
   age = years
else:
   age = years - 1

Upd2:

多次调用来now()提高绩效很可笑,这在所有情况下都没有关系,但在非常特殊的情况下。使用变量的真正原因是数据不一致的风险。

Unfortunately, you cannot just use timedelata as the largest unit it uses is day and leap years will render you calculations invalid. Therefore, let’s find number of years then adjust by one if the last year isn’t full:

from datetime import date
birth_date = date(1980, 5, 26)
years = date.today().year - birth_date.year
if (datetime.now() - birth_date.replace(year=datetime.now().year)).days >= 0:
    age = years
else:
    age = years - 1

Upd:

This solution really causes an exception when Feb, 29 comes into play. Here’s correct check:

from datetime import date
birth_date = date(1980, 5, 26)
today = date.today()
years = today.year - birth_date.year
if all((x >= y) for x,y in zip(today.timetuple(), birth_date.timetuple()):
   age = years
else:
   age = years - 1

Upd2:

Calling multiple calls to now() a performance hit is ridiculous, it does not matter in all but extremely special cases. The real reason to use a variable is the risk of data incosistency.


回答 7

在这种情况下,典型的陷阱是如何处理2月29日出生的人。例如:您必须年满18岁才能投票,开车,买酒等…如果您出生于2004-02-29,那么您被允许做此类事情的第一天是2022-02 -28还是2022-03-01?AFAICT主要是第一个,但有些杀人狂可能会说后者。

以下代码可满足当日出生人口的0.068%(大约)的需求:

def age_in_years(from_date, to_date, leap_day_anniversary_Feb28=True):
    age = to_date.year - from_date.year
    try:
        anniversary = from_date.replace(year=to_date.year)
    except ValueError:
        assert from_date.day == 29 and from_date.month == 2
        if leap_day_anniversary_Feb28:
            anniversary = datetime.date(to_date.year, 2, 28)
        else:
            anniversary = datetime.date(to_date.year, 3, 1)
    if to_date < anniversary:
        age -= 1
    return age

if __name__ == "__main__":
    import datetime

    tests = """

    2004  2 28 2010  2 27  5 1
    2004  2 28 2010  2 28  6 1
    2004  2 28 2010  3  1  6 1

    2004  2 29 2010  2 27  5 1
    2004  2 29 2010  2 28  6 1
    2004  2 29 2010  3  1  6 1

    2004  2 29 2012  2 27  7 1
    2004  2 29 2012  2 28  7 1
    2004  2 29 2012  2 29  8 1
    2004  2 29 2012  3  1  8 1

    2004  2 28 2010  2 27  5 0
    2004  2 28 2010  2 28  6 0
    2004  2 28 2010  3  1  6 0

    2004  2 29 2010  2 27  5 0
    2004  2 29 2010  2 28  5 0
    2004  2 29 2010  3  1  6 0

    2004  2 29 2012  2 27  7 0
    2004  2 29 2012  2 28  7 0
    2004  2 29 2012  2 29  8 0
    2004  2 29 2012  3  1  8 0

    """

    for line in tests.splitlines():
        nums = [int(x) for x in line.split()]
        if not nums:
            print
            continue
        datea = datetime.date(*nums[0:3])
        dateb = datetime.date(*nums[3:6])
        expected, anniv = nums[6:8]
        age = age_in_years(datea, dateb, anniv)
        print datea, dateb, anniv, age, expected, age == expected

这是输出:

2004-02-28 2010-02-27 1 5 5 True
2004-02-28 2010-02-28 1 6 6 True
2004-02-28 2010-03-01 1 6 6 True

2004-02-29 2010-02-27 1 5 5 True
2004-02-29 2010-02-28 1 6 6 True
2004-02-29 2010-03-01 1 6 6 True

2004-02-29 2012-02-27 1 7 7 True
2004-02-29 2012-02-28 1 7 7 True
2004-02-29 2012-02-29 1 8 8 True
2004-02-29 2012-03-01 1 8 8 True

2004-02-28 2010-02-27 0 5 5 True
2004-02-28 2010-02-28 0 6 6 True
2004-02-28 2010-03-01 0 6 6 True

2004-02-29 2010-02-27 0 5 5 True
2004-02-29 2010-02-28 0 5 5 True
2004-02-29 2010-03-01 0 6 6 True

2004-02-29 2012-02-27 0 7 7 True
2004-02-29 2012-02-28 0 7 7 True
2004-02-29 2012-02-29 0 8 8 True
2004-02-29 2012-03-01 0 8 8 True

The classic gotcha in this scenario is what to do with people born on the 29th day of February. Example: you need to be aged 18 to vote, drive a car, buy alcohol, etc … if you are born on 2004-02-29, what is the first day that you are permitted to do such things: 2022-02-28, or 2022-03-01? AFAICT, mostly the first, but a few killjoys might say the latter.

Here’s code that caters for the 0.068% (approx) of the population born on that day:

def age_in_years(from_date, to_date, leap_day_anniversary_Feb28=True):
    age = to_date.year - from_date.year
    try:
        anniversary = from_date.replace(year=to_date.year)
    except ValueError:
        assert from_date.day == 29 and from_date.month == 2
        if leap_day_anniversary_Feb28:
            anniversary = datetime.date(to_date.year, 2, 28)
        else:
            anniversary = datetime.date(to_date.year, 3, 1)
    if to_date < anniversary:
        age -= 1
    return age

if __name__ == "__main__":
    import datetime

    tests = """

    2004  2 28 2010  2 27  5 1
    2004  2 28 2010  2 28  6 1
    2004  2 28 2010  3  1  6 1

    2004  2 29 2010  2 27  5 1
    2004  2 29 2010  2 28  6 1
    2004  2 29 2010  3  1  6 1

    2004  2 29 2012  2 27  7 1
    2004  2 29 2012  2 28  7 1
    2004  2 29 2012  2 29  8 1
    2004  2 29 2012  3  1  8 1

    2004  2 28 2010  2 27  5 0
    2004  2 28 2010  2 28  6 0
    2004  2 28 2010  3  1  6 0

    2004  2 29 2010  2 27  5 0
    2004  2 29 2010  2 28  5 0
    2004  2 29 2010  3  1  6 0

    2004  2 29 2012  2 27  7 0
    2004  2 29 2012  2 28  7 0
    2004  2 29 2012  2 29  8 0
    2004  2 29 2012  3  1  8 0

    """

    for line in tests.splitlines():
        nums = [int(x) for x in line.split()]
        if not nums:
            print
            continue
        datea = datetime.date(*nums[0:3])
        dateb = datetime.date(*nums[3:6])
        expected, anniv = nums[6:8]
        age = age_in_years(datea, dateb, anniv)
        print datea, dateb, anniv, age, expected, age == expected

Here’s the output:

2004-02-28 2010-02-27 1 5 5 True
2004-02-28 2010-02-28 1 6 6 True
2004-02-28 2010-03-01 1 6 6 True

2004-02-29 2010-02-27 1 5 5 True
2004-02-29 2010-02-28 1 6 6 True
2004-02-29 2010-03-01 1 6 6 True

2004-02-29 2012-02-27 1 7 7 True
2004-02-29 2012-02-28 1 7 7 True
2004-02-29 2012-02-29 1 8 8 True
2004-02-29 2012-03-01 1 8 8 True

2004-02-28 2010-02-27 0 5 5 True
2004-02-28 2010-02-28 0 6 6 True
2004-02-28 2010-03-01 0 6 6 True

2004-02-29 2010-02-27 0 5 5 True
2004-02-29 2010-02-28 0 5 5 True
2004-02-29 2010-03-01 0 6 6 True

2004-02-29 2012-02-27 0 7 7 True
2004-02-29 2012-02-28 0 7 7 True
2004-02-29 2012-02-29 0 8 8 True
2004-02-29 2012-03-01 0 8 8 True

回答 8

如果您希望使用django模板在页面中打印此内容,那么以下内容可能就足够了:

{{ birth_date|timesince }}

If you’re looking to print this in a page using django templates, then the following might be enough:

{{ birth_date|timesince }}

回答 9

这是找到一个人的年龄(数月,数月或数天)的解决方案。

假设某人的出生日期是2012-01-17T00:00:00 因此,他在2013-01-16T00:00:00的年龄将是11个月

或如果他出生于2012-12-17T00:00:00,则他在2013-01-12T00:00:00的年龄为26天

或如果他出生于2000-02-29T00:00:00,则他在2012-02-29T00:00:00的年龄将为12岁

您将需要导入datetime

这是代码:

def get_person_age(date_birth, date_today):

"""
At top level there are three possibilities : Age can be in days or months or years.
For age to be in years there are two cases: Year difference is one or Year difference is more than 1
For age to be in months there are two cases: Year difference is 0 or 1
For age to be in days there are 4 possibilities: Year difference is 1(20-dec-2012 - 2-jan-2013),
                                                 Year difference is 0, Months difference is 0 or 1
"""
years_diff = date_today.year - date_birth.year
months_diff = date_today.month - date_birth.month
days_diff = date_today.day - date_birth.day
age_in_days = (date_today - date_birth).days

age = years_diff
age_string = str(age) + " years"

# age can be in months or days.
if years_diff == 0:
    if months_diff == 0:
        age = age_in_days
        age_string = str(age) + " days"
    elif months_diff == 1:
        if days_diff < 0:
            age = age_in_days
            age_string = str(age) + " days"
        else:
            age = months_diff
            age_string = str(age) + " months"
    else:
        if days_diff < 0:
            age = months_diff - 1
        else:
            age = months_diff
        age_string = str(age) + " months"
# age can be in years, months or days.
elif years_diff == 1:
    if months_diff < 0:
        age = months_diff + 12
        age_string = str(age) + " months" 
        if age == 1:
            if days_diff < 0:
                age = age_in_days
                age_string = str(age) + " days" 
        elif days_diff < 0:
            age = age-1
            age_string = str(age) + " months"
    elif months_diff == 0:
        if days_diff < 0:
            age = 11
            age_string = str(age) + " months"
        else:
            age = 1
            age_string = str(age) + " years"
    else:
        age = 1
        age_string = str(age) + " years"
# The age is guaranteed to be in years.
else:
    if months_diff < 0:
        age = years_diff - 1
    elif months_diff == 0:
        if days_diff < 0:
            age = years_diff - 1
        else:
            age = years_diff
    else:
        age = years_diff
    age_string = str(age) + " years"

if age == 1:
    age_string = age_string.replace("years", "year").replace("months", "month").replace("days", "day")

return age_string

以上代码中使用的一些额外功能是:

def get_todays_date():
    """
    This function returns todays date in proper date object format
    """
    return datetime.now()

def get_date_format(str_date):
"""
This function converts string into date type object
"""
str_date = str_date.split("T")[0]
return datetime.strptime(str_date, "%Y-%m-%d")

现在,我们必须使用类似2000-02-29T00:00:00的字符串来填充get_date_format()

它将转换为日期类型对象,该对象将被馈送到get_person_age(date_birth,date_today)

函数get_person_age(date_birth,date_today)将以字符串格式返回age。

Here is a solution to find age of a person as either years or months or days.

Lets say a person’s date of birth is 2012-01-17T00:00:00 Therefore, his age on 2013-01-16T00:00:00 will be 11 months

or if he is born on 2012-12-17T00:00:00, his age on 2013-01-12T00:00:00 will be 26 days

or if he is born on 2000-02-29T00:00:00, his age on 2012-02-29T00:00:00 will be 12 years

You will need to import datetime.

Here is the code:

def get_person_age(date_birth, date_today):

"""
At top level there are three possibilities : Age can be in days or months or years.
For age to be in years there are two cases: Year difference is one or Year difference is more than 1
For age to be in months there are two cases: Year difference is 0 or 1
For age to be in days there are 4 possibilities: Year difference is 1(20-dec-2012 - 2-jan-2013),
                                                 Year difference is 0, Months difference is 0 or 1
"""
years_diff = date_today.year - date_birth.year
months_diff = date_today.month - date_birth.month
days_diff = date_today.day - date_birth.day
age_in_days = (date_today - date_birth).days

age = years_diff
age_string = str(age) + " years"

# age can be in months or days.
if years_diff == 0:
    if months_diff == 0:
        age = age_in_days
        age_string = str(age) + " days"
    elif months_diff == 1:
        if days_diff < 0:
            age = age_in_days
            age_string = str(age) + " days"
        else:
            age = months_diff
            age_string = str(age) + " months"
    else:
        if days_diff < 0:
            age = months_diff - 1
        else:
            age = months_diff
        age_string = str(age) + " months"
# age can be in years, months or days.
elif years_diff == 1:
    if months_diff < 0:
        age = months_diff + 12
        age_string = str(age) + " months" 
        if age == 1:
            if days_diff < 0:
                age = age_in_days
                age_string = str(age) + " days" 
        elif days_diff < 0:
            age = age-1
            age_string = str(age) + " months"
    elif months_diff == 0:
        if days_diff < 0:
            age = 11
            age_string = str(age) + " months"
        else:
            age = 1
            age_string = str(age) + " years"
    else:
        age = 1
        age_string = str(age) + " years"
# The age is guaranteed to be in years.
else:
    if months_diff < 0:
        age = years_diff - 1
    elif months_diff == 0:
        if days_diff < 0:
            age = years_diff - 1
        else:
            age = years_diff
    else:
        age = years_diff
    age_string = str(age) + " years"

if age == 1:
    age_string = age_string.replace("years", "year").replace("months", "month").replace("days", "day")

return age_string

Some extra functions used in the above codes are:

def get_todays_date():
    """
    This function returns todays date in proper date object format
    """
    return datetime.now()

And

def get_date_format(str_date):
"""
This function converts string into date type object
"""
str_date = str_date.split("T")[0]
return datetime.strptime(str_date, "%Y-%m-%d")

Now, we have to feed get_date_format() with the strings like 2000-02-29T00:00:00

It will convert it into the date type object which is to be fed to get_person_age(date_birth, date_today).

The function get_person_age(date_birth, date_today) will return age in string format.


回答 10

扩展了Danny的解决方案,但提供了多种报告年轻人群年龄的方法(请注意,今天是datetime.date(2015,7,17)):

def calculate_age(born):
    '''
        Converts a date of birth (dob) datetime object to years, always rounding down.
        When the age is 80 years or more, just report that the age is 80 years or more.
        When the age is less than 12 years, rounds down to the nearest half year.
        When the age is less than 2 years, reports age in months, rounded down.
        When the age is less than 6 months, reports the age in weeks, rounded down.
        When the age is less than 2 weeks, reports the age in days.
    '''
    today = datetime.date.today()
    age_in_years = today.year - born.year - ((today.month, today.day) < (born.month, born.day))
    months = (today.month - born.month - (today.day < born.day)) %12
    age = today - born
    age_in_days = age.days
    if age_in_years >= 80:
        return 80, 'years or older'
    if age_in_years >= 12:
        return age_in_years, 'years'
    elif age_in_years >= 2:
        half = 'and a half ' if months > 6 else ''
        return age_in_years, '%syears'%half
    elif months >= 6:
        return months, 'months'
    elif age_in_days >= 14:
        return age_in_days/7, 'weeks'
    else:
        return age_in_days, 'days'

样例代码:

print '%d %s' %calculate_age(datetime.date(1933,6,12)) # >=80 years
print '%d %s' %calculate_age(datetime.date(1963,6,12)) # >=12 years
print '%d %s' %calculate_age(datetime.date(2010,6,19)) # >=2 years
print '%d %s' %calculate_age(datetime.date(2010,11,19)) # >=2 years with half
print '%d %s' %calculate_age(datetime.date(2014,11,19)) # >=6 months
print '%d %s' %calculate_age(datetime.date(2015,6,4)) # >=2 weeks
print '%d %s' %calculate_age(datetime.date(2015,7,11)) # days old

80 years or older
52 years
5 years
4 and a half years
7 months
6 weeks
7 days

Expanding on Danny’s Solution, but with all sorts of ways to report ages for younger folk (note, today is datetime.date(2015,7,17)):

def calculate_age(born):
    '''
        Converts a date of birth (dob) datetime object to years, always rounding down.
        When the age is 80 years or more, just report that the age is 80 years or more.
        When the age is less than 12 years, rounds down to the nearest half year.
        When the age is less than 2 years, reports age in months, rounded down.
        When the age is less than 6 months, reports the age in weeks, rounded down.
        When the age is less than 2 weeks, reports the age in days.
    '''
    today = datetime.date.today()
    age_in_years = today.year - born.year - ((today.month, today.day) < (born.month, born.day))
    months = (today.month - born.month - (today.day < born.day)) %12
    age = today - born
    age_in_days = age.days
    if age_in_years >= 80:
        return 80, 'years or older'
    if age_in_years >= 12:
        return age_in_years, 'years'
    elif age_in_years >= 2:
        half = 'and a half ' if months > 6 else ''
        return age_in_years, '%syears'%half
    elif months >= 6:
        return months, 'months'
    elif age_in_days >= 14:
        return age_in_days/7, 'weeks'
    else:
        return age_in_days, 'days'

Sample code:

print '%d %s' %calculate_age(datetime.date(1933,6,12)) # >=80 years
print '%d %s' %calculate_age(datetime.date(1963,6,12)) # >=12 years
print '%d %s' %calculate_age(datetime.date(2010,6,19)) # >=2 years
print '%d %s' %calculate_age(datetime.date(2010,11,19)) # >=2 years with half
print '%d %s' %calculate_age(datetime.date(2014,11,19)) # >=6 months
print '%d %s' %calculate_age(datetime.date(2015,6,4)) # >=2 weeks
print '%d %s' %calculate_age(datetime.date(2015,7,11)) # days old

80 years or older
52 years
5 years
4 and a half years
7 months
6 weeks
7 days

回答 11

由于我没有看到正确的实现方式,因此以这种方式重新编码了我的代码…

    def age_in_years(from_date, to_date=datetime.date.today()):
  if (DEBUG):
    logger.debug("def age_in_years(from_date='%s', to_date='%s')" % (from_date, to_date))

  if (from_date>to_date): # swap when the lower bound is not the lower bound
    logger.debug('Swapping dates ...')
    tmp = from_date
    from_date = to_date
    to_date = tmp

  age_delta = to_date.year - from_date.year
  month_delta = to_date.month - from_date.month
  day_delta = to_date.day - from_date.day

  if (DEBUG):
    logger.debug("Delta's are : %i  / %i / %i " % (age_delta, month_delta, day_delta))

  if (month_delta>0  or (month_delta==0 and day_delta>=0)): 
    return age_delta 

  return (age_delta-1)

假设在2月28日出生时在29日成为“ 18”是错误的。可以忽略界限…这只是我的代码的个人方便:)

As I did not see the correct implementation, I recoded mine this way…

    def age_in_years(from_date, to_date=datetime.date.today()):
  if (DEBUG):
    logger.debug("def age_in_years(from_date='%s', to_date='%s')" % (from_date, to_date))

  if (from_date>to_date): # swap when the lower bound is not the lower bound
    logger.debug('Swapping dates ...')
    tmp = from_date
    from_date = to_date
    to_date = tmp

  age_delta = to_date.year - from_date.year
  month_delta = to_date.month - from_date.month
  day_delta = to_date.day - from_date.day

  if (DEBUG):
    logger.debug("Delta's are : %i  / %i / %i " % (age_delta, month_delta, day_delta))

  if (month_delta>0  or (month_delta==0 and day_delta>=0)): 
    return age_delta 

  return (age_delta-1)

Assumption of being “18” on the 28th of Feb when born on the 29th is just wrong. Swapping the bounds can be left out … it is just a personal convenience for my code :)


回答 12

扩展到Danny W. Adair答案,还可以获得月份

def calculate_age(b):
    t = date.today()
    c = ((t.month, t.day) < (b.month, b.day))
    c2 = (t.day< b.day)
    return t.year - b.year - c,c*12+t.month-b.month-c2

Extend to Danny W. Adair Answer, to get month also

def calculate_age(b):
    t = date.today()
    c = ((t.month, t.day) < (b.month, b.day))
    c2 = (t.day< b.day)
    return t.year - b.year - c,c*12+t.month-b.month-c2

回答 13

import datetime

今天的日期

td=datetime.datetime.now().date() 

你的生日

bd=datetime.date(1989,3,15)

你的年龄

age_years=int((td-bd).days /365.25)
import datetime

Todays date

td=datetime.datetime.now().date() 

Your birthdate

bd=datetime.date(1989,3,15)

Your age

age_years=int((td-bd).days /365.25)

回答 14

导入日期时间

def age(date_of_birth):
    if date_of_birth > datetime.date.today().replace(year = date_of_birth.year):
        return datetime.date.today().year - date_of_birth.year - 1
    else:
        return datetime.date.today().year - date_of_birth.year

在您的情况下:

import datetime

# your model
def age(self):
    if self.birthdate > datetime.date.today().replace(year = self.birthdate.year):
        return datetime.date.today().year - self.birthdate.year - 1
    else:
        return datetime.date.today().year - self.birthdate.year

import datetime

def age(date_of_birth):
    if date_of_birth > datetime.date.today().replace(year = date_of_birth.year):
        return datetime.date.today().year - date_of_birth.year - 1
    else:
        return datetime.date.today().year - date_of_birth.year

In your case:

import datetime

# your model
def age(self):
    if self.birthdate > datetime.date.today().replace(year = self.birthdate.year):
        return datetime.date.today().year - self.birthdate.year - 1
    else:
        return datetime.date.today().year - self.birthdate.year

回答 15

稍微修改了Danny的解决方案,以便于阅读和理解

    from datetime import date

    def calculate_age(birth_date):
        today = date.today()
        age = today.year - birth_date.year
        full_year_passed = (today.month, today.day) < (birth_date.month, birth_date.day)
        if not full_year_passed:
            age -= 1
        return age

Slightly modified Danny’s solution for easier reading and understanding

    from datetime import date

    def calculate_age(birth_date):
        today = date.today()
        age = today.year - birth_date.year
        full_year_passed = (today.month, today.day) < (birth_date.month, birth_date.day)
        if not full_year_passed:
            age -= 1
        return age

熊猫可以自动识别日期吗?

问题:熊猫可以自动识别日期吗?

今天,我感到惊讶的是,pandas在从数据文件中读取数据时能够识别值的类型:

df = pandas.read_csv('test.dat', delimiter=r"\s+", names=['col1','col2','col3'])

例如,可以通过以下方式检查它:

for i, r in df.iterrows():
    print type(r['col1']), type(r['col2']), type(r['col3'])

特别是整数,浮点数和字符串可以正确识别。但是,我有一列的日期采用以下格式:2013-6-4。这些日期被识别为字符串(而不是python日期对象)。有没有一种方法可以“学习”熊猫到公认的日期?

Today I was positively surprised by the fact that while reading data from a data file (for example) pandas is able to recognize types of values:

df = pandas.read_csv('test.dat', delimiter=r"\s+", names=['col1','col2','col3'])

For example it can be checked in this way:

for i, r in df.iterrows():
    print type(r['col1']), type(r['col2']), type(r['col3'])

In particular integer, floats and strings were recognized correctly. However, I have a column that has dates in the following format: 2013-6-4. These dates were recognized as strings (not as python date-objects). Is there a way to “learn” pandas to recognized dates?


回答 0

您应该添加parse_dates=True,或者parse_dates=['column name']在阅读时通常足以神奇地解析它。但是总有一些奇怪的格式需要手动定义。在这种情况下,您还可以添加日期解析器功能,这是最灵活的方法。

假设您的字符串中有一列“ datetime”,然后:

dateparse = lambda x: pd.datetime.strptime(x, '%Y-%m-%d %H:%M:%S')

df = pd.read_csv(infile, parse_dates=['datetime'], date_parser=dateparse)

这样,您甚至可以将多个列合并为一个datetime列,从而将一个“ date”和一个“ time”列合并为一个“ datetime”列:

dateparse = lambda x: pd.datetime.strptime(x, '%Y-%m-%d %H:%M:%S')

df = pd.read_csv(infile, parse_dates={'datetime': ['date', 'time']}, date_parser=dateparse)

您可以在此页面strptimestrftime 找到指令(即用于不同格式的字母)。

You should add parse_dates=True, or parse_dates=['column name'] when reading, thats usually enough to magically parse it. But there are always weird formats which need to be defined manually. In such a case you can also add a date parser function, which is the most flexible way possible.

Suppose you have a column ‘datetime’ with your string, then:

from datetime import datetime
dateparse = lambda x: datetime.strptime(x, '%Y-%m-%d %H:%M:%S')

df = pd.read_csv(infile, parse_dates=['datetime'], date_parser=dateparse)

This way you can even combine multiple columns into a single datetime column, this merges a ‘date’ and a ‘time’ column into a single ‘datetime’ column:

dateparse = lambda x: datetime.strptime(x, '%Y-%m-%d %H:%M:%S')

df = pd.read_csv(infile, parse_dates={'datetime': ['date', 'time']}, date_parser=dateparse)

You can find directives (i.e. the letters to be used for different formats) for strptime and strftime in this page.


回答 1

自@Rutger回答以来,熊猫界面可能已更改,但是在我使用的版本(0.15.2)中,该date_parser函数接收日期列表,而不是单个值。在这种情况下,他的代码应该这样更新:

dateparse = lambda dates: [pd.datetime.strptime(d, '%Y-%m-%d %H:%M:%S') for d in dates]

df = pd.read_csv(infile, parse_dates=['datetime'], date_parser=dateparse)

Perhaps the pandas interface has changed since @Rutger answered, but in the version I’m using (0.15.2), the date_parser function receives a list of dates instead of a single value. In this case, his code should be updated like so:

dateparse = lambda dates: [pd.datetime.strptime(d, '%Y-%m-%d %H:%M:%S') for d in dates]

df = pd.read_csv(infile, parse_dates=['datetime'], date_parser=dateparse)

回答 2

pandas read_csv方法非常适合解析日期。完整的文档位于http://pandas.pydata.org/pandas-docs/stable/genic/pandas.io.parsers.read_csv.html

您甚至可以在不同的列中包含不同的日期部分,并传递参数:

parse_dates : boolean, list of ints or names, list of lists, or dict
If True -> try parsing the index. If [1, 2, 3] -> try parsing columns 1, 2, 3 each as a
separate date column. If [[1, 3]] -> combine columns 1 and 3 and parse as a single date
column. {‘foo : [1, 3]} -> parse columns 1, 3 as date and call result foo

默认的日期检测效果很好,但似乎偏向于北美日期格式。如果您住在其他地方,您可能偶尔会被结果所吸引。据我所知,2000年1月6日是美国的1月6日,而不是我居住的6月1日。如果使用了2000年6月23日这样的日期,它足够聪明地摆弄它们。不过,使用YYYYMMDD日期变化可能更安全。向熊猫开发者表示歉意,但是最近我还没有在当地进行测试。

您可以使用date_parser参数传递一个函数来转换格式。

date_parser : function
Function to use for converting a sequence of string columns to an array of datetime
instances. The default uses dateutil.parser.parser to do the conversion.

pandas read_csv method is great for parsing dates. Complete documentation at http://pandas.pydata.org/pandas-docs/stable/generated/pandas.io.parsers.read_csv.html

you can even have the different date parts in different columns and pass the parameter:

parse_dates : boolean, list of ints or names, list of lists, or dict
If True -> try parsing the index. If [1, 2, 3] -> try parsing columns 1, 2, 3 each as a
separate date column. If [[1, 3]] -> combine columns 1 and 3 and parse as a single date
column. {‘foo’ : [1, 3]} -> parse columns 1, 3 as date and call result ‘foo’

The default sensing of dates works great, but it seems to be biased towards north american Date formats. If you live elsewhere you might occasionally be caught by the results. As far as I can remember 1/6/2000 means 6 January in the USA as opposed to 1 Jun where I live. It is smart enough to swing them around if dates like 23/6/2000 are used. Probably safer to stay with YYYYMMDD variations of date though. Apologies to pandas developers,here but i have not tested it with local dates recently.

you can use the date_parser parameter to pass a function to convert your format.

date_parser : function
Function to use for converting a sequence of string columns to an array of datetime
instances. The default uses dateutil.parser.parser to do the conversion.

回答 3

您可以pandas.to_datetime()按照文档中的建议使用pandas.read_csv()

如果列或索引包含不可解析的日期,则整个列或索引将按原样作为对象数据类型返回。对于非标准的日期时间解析,请pd.to_datetime在之后使用pd.read_csv

演示:

>>> D = {'date': '2013-6-4'}
>>> df = pd.DataFrame(D, index=[0])
>>> df
       date
0  2013-6-4
>>> df.dtypes
date    object
dtype: object
>>> df['date'] = pd.to_datetime(df.date, format='%Y-%m-%d')
>>> df
        date
0 2013-06-04
>>> df.dtypes
date    datetime64[ns]
dtype: object

You could use pandas.to_datetime() as recommended in the documentation for pandas.read_csv():

If a column or index contains an unparseable date, the entire column or index will be returned unaltered as an object data type. For non-standard datetime parsing, use pd.to_datetime after pd.read_csv.

Demo:

>>> D = {'date': '2013-6-4'}
>>> df = pd.DataFrame(D, index=[0])
>>> df
       date
0  2013-6-4
>>> df.dtypes
date    object
dtype: object
>>> df['date'] = pd.to_datetime(df.date, format='%Y-%m-%d')
>>> df
        date
0 2013-06-04
>>> df.dtypes
date    datetime64[ns]
dtype: object

回答 4

将两列合并为一个datetime列时,可接受的答案会产生错误(pandas版本0.20.3),因为这些列分别发送到date_parser函数。

以下作品:

def dateparse(d,t):
    dt = d + " " + t
    return pd.datetime.strptime(dt, '%d/%m/%Y %H:%M:%S')

df = pd.read_csv(infile, parse_dates={'datetime': ['date', 'time']}, date_parser=dateparse)

When merging two columns into a single datetime column, the accepted answer generates an error (pandas version 0.20.3), since the columns are sent to the date_parser function separately.

The following works:

def dateparse(d,t):
    dt = d + " " + t
    return pd.datetime.strptime(dt, '%d/%m/%Y %H:%M:%S')

df = pd.read_csv(infile, parse_dates={'datetime': ['date', 'time']}, date_parser=dateparse)

回答 5

是的-根据pandas.read_csv 文档

注意:存在iso8601格式日期的快速路径。

因此,如果您的csv有一个名为的列datetime,并且日期看起来像2013-01-01T01:01例如,运行此命令将使熊猫(我在v0.19.2上)自动获取日期和时间:

df = pd.read_csv('test.csv', parse_dates=['datetime'])

请注意,您需要显式传递parse_dates,否则将无法正常运行。

验证:

df.dtypes

您应该看到列的数据类型是 datetime64[ns]

Yes – according to the pandas.read_csv documentation:

Note: A fast-path exists for iso8601-formatted dates.

So if your csv has a column named datetime and the dates looks like 2013-01-01T01:01 for example, running this will make pandas (I’m on v0.19.2) pick up the date and time automatically:

df = pd.read_csv('test.csv', parse_dates=['datetime'])

Note that you need to explicitly pass parse_dates, it doesn’t work without.

Verify with:

df.dtypes

You should see the datatype of the column is datetime64[ns]


回答 6

如果性能对您很重要,请确保您有时间:

import sys
import timeit
import pandas as pd

print('Python %s on %s' % (sys.version, sys.platform))
print('Pandas version %s' % pd.__version__)

repeat = 3
numbers = 100

def time(statement, _setup=None):
    print (min(
        timeit.Timer(statement, setup=_setup or setup).repeat(
            repeat, numbers)))

print("Format %m/%d/%y")
setup = """import pandas as pd
import io

data = io.StringIO('''\
ProductCode,Date
''' + '''\
x1,07/29/15
x2,07/29/15
x3,07/29/15
x4,07/30/15
x5,07/29/15
x6,07/29/15
x7,07/29/15
y7,08/05/15
x8,08/05/15
z3,08/05/15
''' * 100)"""

time('pd.read_csv(data); data.seek(0)')
time('pd.read_csv(data, parse_dates=["Date"]); data.seek(0)')
time('pd.read_csv(data, parse_dates=["Date"],'
     'infer_datetime_format=True); data.seek(0)')
time('pd.read_csv(data, parse_dates=["Date"],'
     'date_parser=lambda x: pd.datetime.strptime(x, "%m/%d/%y")); data.seek(0)')

print("Format %Y-%m-%d %H:%M:%S")
setup = """import pandas as pd
import io

data = io.StringIO('''\
ProductCode,Date
''' + '''\
x1,2016-10-15 00:00:43
x2,2016-10-15 00:00:56
x3,2016-10-15 00:00:56
x4,2016-10-15 00:00:12
x5,2016-10-15 00:00:34
x6,2016-10-15 00:00:55
x7,2016-10-15 00:00:06
y7,2016-10-15 00:00:01
x8,2016-10-15 00:00:00
z3,2016-10-15 00:00:02
''' * 1000)"""

time('pd.read_csv(data); data.seek(0)')
time('pd.read_csv(data, parse_dates=["Date"]); data.seek(0)')
time('pd.read_csv(data, parse_dates=["Date"],'
     'infer_datetime_format=True); data.seek(0)')
time('pd.read_csv(data, parse_dates=["Date"],'
     'date_parser=lambda x: pd.datetime.strptime(x, "%Y-%m-%d %H:%M:%S")); data.seek(0)')

印刷品:

Python 3.7.1 (v3.7.1:260ec2c36a, Oct 20 2018, 03:13:28) 
[Clang 6.0 (clang-600.0.57)] on darwin
Pandas version 0.23.4
Format %m/%d/%y
0.19123052499999993
8.20691274
8.143124389
1.2384357139999977
Format %Y-%m-%d %H:%M:%S
0.5238807110000039
0.9202787830000005
0.9832778819999959
12.002349824999996

因此,与ISO8601格式的日期(%Y-%m-%d %H:%M:%S显然是一个ISO8601格式的日期,我猜的T 可以被丢弃,并用空格代替),你应该指定infer_datetime_format(不使更多常见的两种明显的差异),并通过自己的解析器只会破坏性能。另一方面,date_parser与标准日期格式相比确实有所不同。像往常一样,请务必先确定时间再进行优化。

If performance matters to you make sure you time:

import sys
import timeit
import pandas as pd

print('Python %s on %s' % (sys.version, sys.platform))
print('Pandas version %s' % pd.__version__)

repeat = 3
numbers = 100

def time(statement, _setup=None):
    print (min(
        timeit.Timer(statement, setup=_setup or setup).repeat(
            repeat, numbers)))

print("Format %m/%d/%y")
setup = """import pandas as pd
import io

data = io.StringIO('''\
ProductCode,Date
''' + '''\
x1,07/29/15
x2,07/29/15
x3,07/29/15
x4,07/30/15
x5,07/29/15
x6,07/29/15
x7,07/29/15
y7,08/05/15
x8,08/05/15
z3,08/05/15
''' * 100)"""

time('pd.read_csv(data); data.seek(0)')
time('pd.read_csv(data, parse_dates=["Date"]); data.seek(0)')
time('pd.read_csv(data, parse_dates=["Date"],'
     'infer_datetime_format=True); data.seek(0)')
time('pd.read_csv(data, parse_dates=["Date"],'
     'date_parser=lambda x: pd.datetime.strptime(x, "%m/%d/%y")); data.seek(0)')

print("Format %Y-%m-%d %H:%M:%S")
setup = """import pandas as pd
import io

data = io.StringIO('''\
ProductCode,Date
''' + '''\
x1,2016-10-15 00:00:43
x2,2016-10-15 00:00:56
x3,2016-10-15 00:00:56
x4,2016-10-15 00:00:12
x5,2016-10-15 00:00:34
x6,2016-10-15 00:00:55
x7,2016-10-15 00:00:06
y7,2016-10-15 00:00:01
x8,2016-10-15 00:00:00
z3,2016-10-15 00:00:02
''' * 1000)"""

time('pd.read_csv(data); data.seek(0)')
time('pd.read_csv(data, parse_dates=["Date"]); data.seek(0)')
time('pd.read_csv(data, parse_dates=["Date"],'
     'infer_datetime_format=True); data.seek(0)')
time('pd.read_csv(data, parse_dates=["Date"],'
     'date_parser=lambda x: pd.datetime.strptime(x, "%Y-%m-%d %H:%M:%S")); data.seek(0)')

prints:

Python 3.7.1 (v3.7.1:260ec2c36a, Oct 20 2018, 03:13:28) 
[Clang 6.0 (clang-600.0.57)] on darwin
Pandas version 0.23.4
Format %m/%d/%y
0.19123052499999993
8.20691274
8.143124389
1.2384357139999977
Format %Y-%m-%d %H:%M:%S
0.5238807110000039
0.9202787830000005
0.9832778819999959
12.002349824999996

So with iso8601-formatted date (%Y-%m-%d %H:%M:%S is apparently an iso8601-formatted date, I guess the T can be dropped and replaced by a space) you should not specify infer_datetime_format (which does not make a difference with more common ones either apparently) and passing your own parser in just cripples performance. On the other hand, date_parser does make a difference with not so standard day formats. Be sure to time before you optimize, as usual.


回答 7

加载csv文件中包含date列时,我们有两种方法可以使熊猫识别date列,即

  1. 熊猫通过arg明确识别格式 date_parser=mydateparser

  2. 熊猫通过AGR隐式识别格式 infer_datetime_format=True

一些日期列数据

18/01/18

18/02/02

这里我们不知道前两件事,可能是一个月或一天。因此,在这种情况下,我们必须使用方法1:-显式传递格式

    mydateparser = lambda x: pd.datetime.strptime(x, "%m/%d/%y")
    df = pd.read_csv(file_name, parse_dates=['date_col_name'],
date_parser=mydateparser)

方法2:-隐式或自动识别格式

df = pd.read_csv(file_name, parse_dates=[date_col_name],infer_datetime_format=True)

While loading csv file contain date column.We have two approach to to make pandas to recognize date column i.e

  1. Pandas explicit recognize the format by arg date_parser=mydateparser

  2. Pandas implicit recognize the format by agr infer_datetime_format=True

Some of the date column data

01/01/18

01/02/18

Here we don’t know the first two things It may be month or day. So in this case we have to use Method 1:- Explicit pass the format

    mydateparser = lambda x: pd.datetime.strptime(x, "%m/%d/%y")
    df = pd.read_csv(file_name, parse_dates=['date_col_name'],
date_parser=mydateparser)

Method 2:- Implicit or Automatically recognize the format

df = pd.read_csv(file_name, parse_dates=[date_col_name],infer_datetime_format=True)