TypeError: unhashable type: 'dict'

Question: TypeError: unhashable type: 'dict'

This piece of code gives me the error unhashable type: dict. Can anyone explain what the solution is?

negids = movie_reviews.fileids('neg')
def word_feats(words):
    return dict([(word, True) for word in words])

negfeats = [(word_feats(movie_reviews.words(fileids=[f])), 'neg') for f in negids]
stopset = set(stopwords.words('english'))

def stopword_filtered_word_feats(words):
    return dict([(word, True) for word in words if word not in stopset])

result=stopword_filtered_word_feats(negfeats)

Answer 0

You’re trying to use a dict as a key to another dict or in a set. That does not work because the keys have to be hashable. As a general rule, only immutable objects (strings, integers, floats, frozensets, tuples of immutables) are hashable (though exceptions are possible). So this does not work:

>>> dict_key = {"a": "b"}
>>> some_dict = {}
>>> some_dict[dict_key] = True
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'dict'

To use a dict as a key you need to turn it into something that may be hashed first. If the dict you wish to use as key consists of only immutable values, you can create a hashable representation of it like this:

>>> key = frozenset(dict_key.items())

Now you may use key as a key in a dict or set:

>>> some_dict[key] = True
>>> some_dict
{frozenset([('a', 'b')]): True}

Of course you need to repeat the exercise whenever you want to look up something using a dict:

>>> some_dict[dict_key]                     # Doesn't work
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'dict'
>>> some_dict[frozenset(dict_key.items())]  # Works
True

If the dict you wish to use as key has values that are themselves dicts and/or lists, you need to recursively “freeze” the prospective key. Here’s a starting point:

def freeze(d):
    if isinstance(d, dict):
        return frozenset((key, freeze(value)) for key, value in d.items())
    elif isinstance(d, list):
        return tuple(freeze(value) for value in d)
    return d
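
As a rough usage sketch of freeze (the sample dictionaries and the cache name are made up for illustration), two nested dicts with equal contents freeze to equal, hashable keys. Note that this version only handles dicts and lists; other unhashable values such as sets would need their own branch.

```python
def freeze(d):
    """Recursively convert dicts and lists into hashable equivalents."""
    if isinstance(d, dict):
        return frozenset((key, freeze(value)) for key, value in d.items())
    elif isinstance(d, list):
        return tuple(freeze(value) for value in d)
    return d

# Hypothetical nested dictionaries used only for this illustration.
config = {"name": "x", "tags": ["a", "b"], "opts": {"depth": 2}}
same_config = {"opts": {"depth": 2}, "tags": ["a", "b"], "name": "x"}

cache = {freeze(config): "cached result"}
lookup = cache[freeze(same_config)]  # key insertion order does not matter
print(lookup)
```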

Answer 1

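A possible solution is to serialize the dictionary to a string with json.dumps(). Pass sort_keys=True so that dictionaries with equal contents always produce the same string, whatever order their keys were inserted in:

import json

a = {"a": 10, "b": 20}
b = {"b": 20, "a": 10}
c = [json.dumps(a, sort_keys=True), json.dumps(b, sort_keys=True)]

set(c)
json.dumps(a, sort_keys=True) in c

Output:

{'{"a": 10, "b": 20}'}
True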
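
Relating the answers back to the code in the question: negfeats is a list of (dict, 'neg') tuples, so the test word not in stopset has to hash a tuple that contains a dict, which is presumably where the error comes from. Below is a minimal sketch without NLTK, using a made-up stop word set and feature list, that reproduces the error and then filters each feature dict separately.

```python
stopset = {"the", "a"}  # stand-in for set(stopwords.words('english'))

def stopword_filtered_word_feats(words):
    return {word: True for word in words if word not in stopset}

# Same shape as negfeats in the question: a list of (feature dict, label) tuples.
negfeats = [({"the": True, "plot": True}, "neg"), ({"a": True, "mess": True}, "neg")]

try:
    stopword_filtered_word_feats(negfeats)  # each "word" is a (dict, str) tuple
except TypeError as exc:
    error_message = str(exc)

# Filter the words of each review instead of passing the whole list.
# Iterating over a dict yields its keys, i.e. the words.
filtered = [(stopword_filtered_word_feats(feats), label) for feats, label in negfeats]
print(filtered)
```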

How to pass arguments to a Button command in Tkinter?

Question: How to pass arguments to a Button command in Tkinter?

Suppose I have the following Button made with Tkinter in Python:

import Tkinter as Tk
win = Tk.Toplevel()
frame = Tk.Frame(master=win).grid(row=1, column=1)
button = Tk.Button(master=frame, text='press', command=action)

The method action is called when I press the button, but what if I wanted to pass some arguments to the method action?

I have tried with the following code:

button = Tk.Button(master=frame, text='press', command=action(someNumber))

This just invokes the method immediately, and pressing the button does nothing.


Answer 0

I personally prefer to use lambdas in such a scenario, because imo it’s clearer and simpler and also doesn’t force you to write lots of wrapper methods if you don’t have control over the called method, but that’s certainly a matter of taste.

That’s how you’d do it with a lambda (note there’s also some implementation of currying in the functional module, so you can use that too):

button = Tk.Button(master=frame, text='press', command= lambda: action(someNumber))
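
The difference between the failing attempt and the lambda can be seen without any GUI. In this sketch, fake_button_press is a hypothetical stand-in for what Tkinter does on a click (it is not a Tkinter function), and action simply records its argument.

```python
calls = []

def action(number):
    calls.append(number)

def fake_button_press(command):
    """Hypothetical stand-in for a click: call the command if there is one."""
    if command is not None:
        command()

some_number = 7

# command=action(some_number) runs action right away and stores its result (None).
eager = action(some_number)
calls_after_setup = list(calls)  # action has already run before any "click"
fake_button_press(eager)         # nothing happens, the command is None

# command=lambda: action(some_number) stores a function to be called later.
deferred = lambda: action(some_number)
fake_button_press(deferred)      # now action runs in response to the "click"
print(calls_after_setup, calls)
```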

Answer 1

This can also be done by using partial from the standard library functools, like this:

from functools import partial
#(...)
action_with_arg = partial(action, arg)
button = Tk.Button(master=frame, text='press', command=action_with_arg)
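
A small GUI-free sketch of what partial produces (action and its arguments here are made up): the resulting object stores the function and its arguments and only calls it when invoked, which is what the button does on a click. Unlike a lambda that reads a variable at call time, the arguments are fixed when the partial is created.

```python
from functools import partial

def action(number, label="default"):
    return "{}:{}".format(label, number)

action_with_arg = partial(action, 42, label="pressed")

# Nothing has been called yet; the pieces are just stored on the partial object.
stored = (action_with_arg.func is action, action_with_arg.args, action_with_arg.keywords)

result = action_with_arg()  # what the button would do on a click
print(stored, result)
```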

Answer 2

Example GUI:

Let’s say I have the GUI:

import tkinter as tk

root = tk.Tk()

btn = tk.Button(root, text="Press")
btn.pack()

root.mainloop()

What Happens When a Button Is Pressed

See that when btn is pressed it calls its own function which is very similar to button_press_handle in the following example:

def button_press_handle(callback=None):
    if callback:
        callback()  # This is where the function assigned to btn['command'] gets called

with:

button_press_handle(btn['command'])

Simply put, the command option should be set to a reference to the function we want called, just like callback in button_press_handle.


Calling a Method (Callback) When the Button Is Pressed

Without arguments

So if I wanted to print something when the button is pressed, I would need to set:

btn['command'] = print  # with no arguments, print writes just a newline

Pay close attention to the lack of () after print. Omitting them means: “This is the name of the function I want you to call when the button is pressed, but don't call it right now.” However, I didn't pass any arguments to print, so it prints what it prints when called without arguments: an empty line.

With Argument(s)

Now if I also wanted to pass arguments to the function called on a button press, I could use an anonymous function, created with a lambda expression. Here it wraps the built-in print:

btn['command'] = lambda arg1="Hello", arg2=" ", arg3="World!" : print(arg1 + arg2 + arg3)

Calling Multiple Methods when the Button Is Pressed

Without Arguments

You can also achieve that with a lambda expression, but it is considered bad practice, so I won't include it here. The good practice is to define a separate function, multiple_methods, that calls the functions wanted, and then set it as the callback for the button press:

def multiple_methods():
    print("Vicariously") # the first inner callback
    print("I") # another inner callback

With Argument(s)

To pass arguments to a function that calls other functions, again use a lambda expression, but first define:

def multiple_methods(*args, **kwargs):
    print(args[0]) # the first inner callback
    print(kwargs['opt1']) # another inner callback

and then set:

btn['command'] = lambda arg="live", kw="as the" : multiple_methods(arg, opt1=kw)

Returning Object(s) From the Callback

Also note that the callback can't really return anything to your code, because it is only called inside button_press_handle as callback(), not as return callback(). It does return, but not to anywhere outside that function. So you should instead modify objects that are accessible in the current scope.


Complete Example with global Object Modification(s)

The example below sets a callback that changes btn's text each time the button is pressed:

import tkinter as tk

i = 0
def text_mod():
    global i                # only i is rebound here; btn is just looked up, so it needs no global declaration
    txt = ("Vicariously", "I", "live", "as", "the", "whole", "world", "dies")
    btn['text'] = txt[i]    # the global object that is modified
    i = (i + 1) % len(txt)  # another global object that gets modified

root = tk.Tk()

btn = tk.Button(root, text="My Button")
btn['command'] = text_mod

btn.pack(fill='both', expand=True)

root.mainloop()


Answer 3

Python’s ability to provide default values for function arguments gives us a way out.

def fce(x=myX, y=myY):
    myFunction(x,y)
button = Tk.Button(mainWin, text='press', command=fce)

See: http://infohost.nmt.edu/tcc/help/pubs/tkinter/web/extra-args.html

For more buttons you can create a function which returns a function:

def fce(myX, myY):
    def wrapper(x=myX, y=myY):
        pass
        pass
        pass
        return x+y
    return wrapper

button1 = Tk.Button(mainWin, text='press 1', command=fce(1,2))
button2 = Tk.Button(mainWin, text='press 2', command=fce(3,4))
button3 = Tk.Button(mainWin, text='press 3', command=fce(9,8))

Answer 4

Building on Matt Thompson's answer: a class can be made callable, so an instance can be used in place of a function:

import tkinter as tk

class Callback:
    def __init__(self, func, *args, **kwargs):
        self.func = func
        self.args = args
        self.kwargs = kwargs
    def __call__(self):
        self.func(*self.args, **self.kwargs)

def default_callback(t):
    print("Button '{}' pressed.".format(t))

root = tk.Tk()

buttons = ["A", "B", "C"]

for i, b in enumerate(buttons):
    tk.Button(root, text=b, command=Callback(default_callback, b)).grid(row=i, column=0)

tk.mainloop()

Answer 5

The reason it invokes the method immediately, and pressing the button does nothing, is that action(somenumber) is evaluated and its return value is assigned as the command for the button. So if action prints something to tell you it has run and returns None, you have just run action to evaluate its return value and given None as the command for the button.

To have buttons to call functions with different arguments you can use global variables, although I can’t recommend it:

import Tkinter as Tk

frame = Tk.Frame(width=5, height=2, bd=1, relief=Tk.SUNKEN)
frame.grid(row=2,column=2)
frame.pack(fill=Tk.X, padx=5, pady=5)
def action():
    global output
    global variable
    output.insert(Tk.END,variable.get())
button = Tk.Button(master=frame, text='press', command=action)
button.pack()
variable = Tk.Entry(master=frame)
variable.pack()
output = Tk.Text(master=frame)
output.pack()

if __name__ == '__main__':
    Tk.mainloop()

What I would do is make a class whose objects would contain every variable required and methods to change those as needed:

import Tkinter as Tk
class Window:
    def __init__(self):
        self.frame = Tk.Frame(width=5, height=2, bd=1, relief=Tk.SUNKEN)
        self.frame.grid(row=2,column=2)
        self.frame.pack(fill=Tk.X, padx=5, pady=5)

        self.button = Tk.Button(master=self.frame, text='press', command=self.action)
        self.button.pack()

        self.variable = Tk.Entry(master=self.frame)
        self.variable.pack()

        self.output = Tk.Text(master=self.frame)
        self.output.pack()

    def action(self):
        self.output.insert(Tk.END,self.variable.get())

if __name__ == '__main__':
    window = Window()
    Tk.mainloop()

Answer 6

button = Tk.Button(master=frame, text='press', command=lambda: action(someNumber))

I believe this should fix it.


Answer 7

The best thing to do is use lambda as follows:

button = Tk.Button(master=frame, text='press', command=lambda: action(someNumber))

Answer 8

I am extremely late, but here is a very simple way of accomplishing it.

import tkinter as tk
def function1(param1, param2):
    print(str(param1) + str(param2))

var1 = "Hello "
var2 = "World!"
def function2():
    function1(var1, var2)

root = tk.Tk()

myButton = tk.Button(root, text="Button", command=function2)
myButton.pack()  # without a geometry manager call the button is never displayed
root.mainloop()

You simply wrap the function you want to use in another function and call the second function on the button press.


Answer 9

Lambdas are all well and good, but you can also try this (which works in a for loop btw):

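from tkinter import Tk

root = Tk()

# Placeholder lists: fill in the arguments wanted for each key.
dct = {"1": [], "2": []}
def keypress(event):
    args = dct.get(event.char, [])  # keys without an entry get no arguments
    for arg in args:
        pass
for i in range(10):
    root.bind(str(i), keypress)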

This works because, once the binding is set, a key press passes the event to the handler as an argument. You can then read attributes of the event, such as event.char, to get “1” or “UP” etc. If you need one or more arguments other than the event attributes, just create a dictionary to store them.


Answer 10

I have encountered this problem before, too. You can just use lambda:

button = Tk.Button(master=frame, text='press',command=lambda: action(someNumber))

Answer 11

Use a lambda to pass the entry data to the command function if you have more actions to carry out, like this (I’ve tried to make it generic, so just adapt):

event1 = Entry(master)
button1 = Button(master, text="OK", command=lambda: test_event(event1.get()))

def test_event(event_text):
    if not event_text:
        print("Nothing entered")
    else:
        print(str(event_text))
        #  do stuff

This passes the text in the entry to the button's command function. There may be more Pythonic ways of writing this, but it works for me.


Answer 12

JasonPy – a few things…

If you stick a button in a loop, it will be created over and over again… which is probably not what you want (maybe it is)…

The reason it always gets the last index is that lambdas run when you click the button, not when the program starts, so they see the value the loop variable has by then. I'm not 100% sure what you are doing, but maybe try storing the value when the button is made, then look it up later from the button's lambda.

eg: (don’t use this code, just an example)

for entry in stuff_that_is_happening:
    value_store[entry] = stuff_that_is_happening

then you can say….

button... command: lambda: value_store[1]

hope this helps!
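
The “always gets the last index” behaviour can be reproduced without Tkinter. The sketch below (names are illustrative) builds callbacks in a loop twice: once with a plain lambda, which looks up i when it is called, and once with a default argument, which captures the value of i when the lambda is created.

```python
# Late binding: every lambda reads the variable i when it is called, after the loop has ended.
late = [lambda: i for i in range(3)]
late_results = [f() for f in late]

# Default arguments are evaluated at definition time, so each lambda keeps its own value.
bound = [lambda i=i: i for i in range(3)]
bound_results = [f() for f in bound]

print(late_results, bound_results)
```

In a Tkinter loop the same idea would read command=lambda i=i: action(i), where action stands for whatever function the button should call.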


Answer 13

One simple way is to configure the button with a lambda, using syntax like the following:

button['command'] = lambda arg1 = local_var1, arg2 = local_var2 : function(arg1, arg2)

Answer 14

For posterity: you can also use classes to achieve something similar. For instance:

class Function_Wrapper():
    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z
    def func(self):
        return self.x + self.y + self.z # execute function

The button can then be created simply by:

instance1 = Function_Wrapper(x, y, z)
button1  = Button(master, text = "press", command = instance1.func)

This approach also allows you to change the function arguments later, e.g. by setting instance1.x = 3.
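
A quick GUI-free sketch of that last point: because func is a bound method, it reads the attributes at call time, so changing instance1.x changes what the next click would compute. The values are made up, and since Tkinter ignores a command's return value, a real callback would act through side effects rather than return.

```python
class Function_Wrapper():
    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z
    def func(self):
        return self.x + self.y + self.z

instance1 = Function_Wrapper(1, 2, 3)
command = instance1.func  # what would be passed as command=...

before = command()        # computed from x=1, y=2, z=3
instance1.x = 3           # change an argument after the button exists
after = command()         # the same bound method now sees x=3
print(before, after)
```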


Answer 15

You need to use lambda:

button = Tk.Button(master=frame, text='press', command=lambda: action(someNumber))

Answer 16

Use lambda

import tkinter as tk

root = tk.Tk()
def go(text):
    print(text)

b = tk.Button(root, text="Click", command=lambda: go("hello"))
b.pack()
root.mainloop()

output:

hello

How to append multiple values to a list in Python

Question: How to append multiple values to a list in Python

I am trying to figure out how to append multiple values to a list in Python. I know there are a few ways to do so, such as entering the values manually, putting the append operation in a for loop, or using the append and extend functions.

However, I wonder if there is a neater way to do so? Maybe a certain package or function?


Answer 0

You can use the sequence method list.extend to extend the list by multiple values from any kind of iterable, be it another list or anything else that provides a sequence of values.

>>> lst = [1, 2]
>>> lst.append(3)
>>> lst.append(4)
>>> lst
[1, 2, 3, 4]

>>> lst.extend([5, 6, 7])
>>> lst.extend((8, 9, 10))
>>> lst
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

>>> lst.extend(range(11, 14))
>>> lst
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]

So you can use list.append() to append a single value, and list.extend() to append multiple values.
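
A short sketch of where the two methods differ in practice (the values are arbitrary): append always adds exactly one element, even if that element is a list, while extend iterates over its argument, which for a string means character by character.

```python
lst = [1, 2]
lst.append([3, 4])     # adds ONE element, which is itself a list
nested = list(lst)

lst = [1, 2]
lst.extend([3, 4])     # adds each element of the iterable
flat = list(lst)

lst = ["a"]
lst.extend("bc")       # a string is an iterable of characters
chars = list(lst)

lst = [1, 2]
lst += (3, 4)          # += on a list behaves like extend, so any iterable works
augmented = list(lst)

print(nested, flat, chars, augmented)
```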


Answer 1

Other than the append function, if by “multiple values” you mean another list, you can simply concatenate them like so.

>>> a = [1,2,3]
>>> b = [4,5,6]
>>> a + b
[1, 2, 3, 4, 5, 6]
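
One thing worth keeping in mind with concatenation is that a + b builds a new list and leaves a untouched, whereas extend changes the existing list in place. A small sketch (alias is just a second name introduced here to make the difference visible):

```python
a = [1, 2, 3]
b = [4, 5, 6]
alias = a                # second name for the same list object

c = a + b                # new list; a and alias are unchanged
unchanged = list(alias)

a.extend(b)              # modifies the list in place; alias sees the change
changed = list(alias)

print(c, unchanged, changed)
```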

Answer 2

If you take a look at the official docs, you'll see extend right below append. That's what you're looking for.

There’s also itertools.chain if you are more interested in efficient iteration than ending up with a fully populated data structure.
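
A minimal sketch of the itertools.chain option, with arbitrary sample inputs: chain walks through several iterables one after another without first copying them into a combined list.

```python
from itertools import chain

first = [1, 2]
second = (3, 4)
third = range(5, 7)

# chain yields items lazily from each iterable in turn.
total = sum(chain(first, second, third))

# If a list is needed after all, it can still be materialised.
combined = list(chain(first, second, third))

print(total, combined)
```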


Convert Pandas column containing NaNs to dtype `int`

Question: Convert Pandas column containing NaNs to dtype `int`

I read data from a .csv file to a Pandas dataframe as below. For one of the columns, namely id, I want to specify the column type as int. The problem is the id series has missing/empty values.

When I try to cast the id column to integer while reading the .csv, I get:

df= pd.read_csv("data.csv", dtype={'id': int}) 
error: Integer column has NA values

Alternatively, I tried to convert the column type after reading as below, but this time I get:

df= pd.read_csv("data.csv") 
df[['id']] = df[['id']].astype(int)
error: Cannot convert NA to integer

How can I tackle this?


Answer 0

The lack of a NaN representation in integer columns is a pandas “gotcha”.

The usual workaround is to simply use floats.


Answer 1

Since version 0.24, pandas can hold integer dtypes with missing values.

Nullable Integer Data Type.

Pandas can represent integer data with possibly missing values using arrays.IntegerArray. This is an extension type implemented within pandas. It is not the default dtype for integers and will not be inferred; you must explicitly pass the dtype into array() or Series:

arr = pd.array([1, 2, np.nan], dtype=pd.Int64Dtype())
pd.Series(arr)

0      1
1      2
2    NaN
dtype: Int64

To convert a column to nullable integers, use:

df['myCol'] = df['myCol'].astype('Int64')
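
For the situation in the question, the nullable dtype can also be requested while reading the file. This is a minimal sketch, assuming a reasonably recent pandas (the nullable integer type was experimental when it was introduced) and using an in-memory CSV as a stand-in for data.csv:

```python
import io
import pandas as pd

# In-memory stand-in for data.csv; the second row has no id.
csv_text = "id,name\n1,a\n,b\n3,c\n"

df = pd.read_csv(io.StringIO(csv_text), dtype={"id": "Int64"})

print(df["id"].dtype)         # nullable integer dtype, note the capital "I"
print(df["id"].isna().sum())  # the missing id is kept as a missing value
```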

Answer 2

My use case is munging data prior to loading into a DB table:

df[col] = df[col].fillna(-1)
df[col] = df[col].astype(int)
df[col] = df[col].astype(str)
df[col] = df[col].replace('-1', np.nan)

Replace NaNs with -1, convert to int, convert to str, and then put the NaNs back in place of '-1' (this assumes -1 never occurs as a real value).

It’s not pretty but it gets the job done!


Answer 3

It is now possible to create a pandas column containing NaNs with an integer dtype (the nullable Int64), since this was officially added in pandas 0.24.0.

From the pandas 0.24.x release notes: “Pandas has gained the ability to hold integer dtypes with missing values.”


Answer 4

If you absolutely want to combine integers and NaNs in a column, you can use the ‘object’ data type:

df['col'] = (
    df['col'].fillna(0)
    .astype(int)
    .astype(object)
    .where(df['col'].notnull())
)

This will replace NaNs with an integer (doesn’t matter which), convert to int, convert to object and finally reinsert NaNs.


Answer 5

If you can modify your stored data, use a sentinel value for a missing id. A common case, suggested by the column name, is that id is an integer strictly greater than zero; you could then use 0 as the sentinel value, so that you can write:

if row['id']:
   regular_process(row)
else:
   special_process(row)

Answer 6

You could use .dropna() if it is OK to drop the rows with the NaN values.

df = df.dropna(subset=['id'])

Alternatively, use .fillna() and .astype() to replace the NaN with values and convert them to int.

I ran into this problem when processing a CSV file with large integers, some of which were missing (NaN). Using float as the type was not an option, because I might lose precision.

My solution was to use str as the intermediate type. Then you can convert the string to int as you please later in the code. I replaced NaN with 0, but you could choose any value.

df = pd.read_csv(filename, dtype={'id':str})
df["id"] = df["id"].fillna("0").astype(int)

To illustrate, here is an example of how floats may lose precision:

s = "12345678901234567890"
f = float(s)
i = int(f)
i2 = int(s)
print (f, i, i2)

And the output is:

1.2345678901234567e+19 12345678901234567168 12345678901234567890

Answer 7

Most solutions here tell you how to use a placeholder integer to represent nulls. That approach isn't helpful if you can't be sure that the placeholder integer won't show up in your source data, though. My method formats floats without their decimal part and converts nulls to None. The result is an object dtype column that will look like an integer field with null values when written to a CSV.

keep_df[col] = keep_df[col].apply(lambda x: None if pandas.isnull(x) else '{0:.0f}'.format(pandas.to_numeric(x)))

Answer 8

I ran into this issue working with pyspark. As this is a python frontend for code running on a jvm, it requires type safety and using float instead of int is not an option. I worked around the issue by wrapping the pandas pd.read_csv in a function that will fill user-defined columns with user-defined fill values before casting them to the required type. Here is what I ended up using:

def custom_read_csv(file_path, custom_dtype = None, fill_values = None, **kwargs):
    if custom_dtype is None:
        return pd.read_csv(file_path, **kwargs)
    else:
        assert 'dtype' not in kwargs.keys()
        df = pd.read_csv(file_path, dtype = {}, **kwargs)
        for col, typ in custom_dtype.items():
            if fill_values is None or col not in fill_values.keys():
                fill_val = -1
            else:
                fill_val = fill_values[col]
            df[col] = df[col].fillna(fill_val).astype(typ)
    return df

Answer 9

First remove the rows that contain NaN, then convert the remaining rows to integers, and finally re-insert the removed rows. Note that the re-inserted NaN values will turn the column back into floats unless it uses a nullable integer dtype.


Answer 10

import pandas as pd

df= pd.read_csv("data.csv")
df['id'] = pd.to_numeric(df['id'])

Answer 11

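Assuming your DateColumn holds values formatted like 3312018.0 that should be converted to the string 03/31/2018, and that some records are missing or 0:

# Missing values are filled with 0 first, because astype(int) raises on NaN
df['DateColumn'] = df['DateColumn'].fillna(0).astype(int)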
df['DateColumn'] = df['DateColumn'].astype(str)
df['DateColumn'] = df['DateColumn'].apply(lambda x: x.zfill(8))
df.loc[df['DateColumn'] == '00000000','DateColumn'] = '01011980'
df['DateColumn'] = pd.to_datetime(df['DateColumn'], format="%m%d%Y")
df['DateColumn'] = df['DateColumn'].apply(lambda x: x.strftime('%m/%d/%Y'))

Multiple levels of 'collections.defaultdict' in Python

Question: Multiple levels of 'collections.defaultdict' in Python

Thanks to some great folks on SO, I discovered the possibilities offered by collections.defaultdict, notably in readability and speed. I have put them to use with success.

Now I would like to implement three levels of dictionaries, the two top ones being defaultdict and the lowest one being int. I can’t find the appropriate way to do this. Here is my attempt:

from collections import defaultdict
d = defaultdict(defaultdict)
a = [("key1", {"a1":22, "a2":33}),
     ("key2", {"a1":32, "a2":55}),
     ("key3", {"a1":43, "a2":44})]
for i in a:
    d[i[0]] = i[1]

Now this works, but the following, which is the desired behavior, doesn’t:

d["key4"]["a1"] + 1

I suspect that I should have declared somewhere that the second level defaultdict is of type int, but I didn’t find where or how to do so.

The reason I am using defaultdict in the first place is to avoid having to initialize the dictionary for each new key.

Any more elegant suggestion?

Thanks pythoneers!


Answer 0

Use:

from collections import defaultdict
d = defaultdict(lambda: defaultdict(int))

This will create a new defaultdict(int) whenever a new key is accessed in d.
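
As a minimal sketch of how this plays out with the question's sample data: the loading loop below uses update() instead of the question's plain assignment, which is my own adjustment, because assigning a plain dict would replace the inner defaultdict(int) and lose the int default for that key.

```python
from collections import defaultdict

# Outer level: every missing key gets a fresh defaultdict(int).
# Inner level: every missing key gets 0.
d = defaultdict(lambda: defaultdict(int))

# Sample data from the question. update() copies the values into the
# auto-created inner defaultdict instead of replacing it with a plain dict.
a = [("key1", {"a1": 22, "a2": 33}),
     ("key2", {"a1": 32, "a2": 55}),
     ("key3", {"a1": 43, "a2": 44})]
for key, inner in a:
    d[key].update(inner)

d["key4"]["a1"] += 1      # neither level had to be initialised first
print(d["key4"]["a1"])    # 1
print(d["key1"]["a2"])    # 33
print(d["key1"]["zz"])    # 0
```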


Answer 1

Another way to make a picklable, nested defaultdict is to use a partial object instead of a lambda:

from functools import partial
...
d = defaultdict(partial(defaultdict, int))

This will work because the defaultdict class is globally accessible at the module level:

“You can’t pickle a partial object unless the function [or in this case, class] it wraps is globally accessible … under its __name__ (within its __module__)” — Pickling wrapped partial functions
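
A short round trip through pickle illustrates the difference. This is only a sketch: the keys are made up, and the exact exception raised for the lambda-based version may vary between Python versions, so two likely types are caught.

```python
import pickle
from collections import defaultdict
from functools import partial

# partial(defaultdict, int) is picklable because both defaultdict and int
# can be looked up by name in their modules.
d = defaultdict(partial(defaultdict, int))
d["key4"]["a1"] += 1

restored = pickle.loads(pickle.dumps(d))
restored["key5"]["b"] += 2   # the default factory survived the round trip
print(restored["key4"]["a1"], restored["key5"]["b"])

# The lambda-based version works in memory but cannot be pickled,
# because a lambda has no importable name.
lambda_based = defaultdict(lambda: defaultdict(int))
try:
    pickle.dumps(lambda_based)
    lambda_pickles = True
except (pickle.PicklingError, AttributeError):
    lambda_pickles = False
print(lambda_pickles)
```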


Answer 2

Look at nosklo’s answer here for a more general solution.

class AutoVivification(dict):
    """Implementation of perl's autovivification feature."""
    def __getitem__(self, item):
        try:
            return dict.__getitem__(self, item)
        except KeyError:
            value = self[item] = type(self)()
            return value

Testing:

a = AutoVivification()

a[1][2][3] = 4
a[1][3][3] = 5
a[1][2]['test'] = 6

print(a)

Output (Python 3.7+, where dicts preserve insertion order):

{1: {2: {3: 4, 'test': 6}, 3: {3: 5}}}

Answer 3

As per @rschwieb’s request for D['key'] += 1, we can expand on the previous answer by defining an __add__ method that overrides addition, making this behave more like a collections.Counter().

First __missing__ will be called to create a new empty value, which will be passed into __add__. We test the value, counting on empty values to be False.

See emulating numeric types for more information on overriding.

from numbers import Number


class autovivify(dict):
    def __missing__(self, key):
        value = self[key] = type(self)()
        return value

    def __add__(self, x):
        """ override addition for numeric types when self is empty """
        if not self and isinstance(x, Number):
            return x
        raise ValueError

    def __sub__(self, x):
        if not self and isinstance(x, Number):
            return -1 * x
        raise ValueError

Examples:

>>> import autovivify
>>> a = autovivify.autovivify()
>>> a
{}
>>> a[2]
{}
>>> a
{2: {}}
>>> a[4] += 1
>>> a[5][3][2] -= 1
>>> a
{2: {}, 4: 1, 5: {3: {2: -1}}}

Rather than checking that the argument is a Number (very un-Pythonic, amirite!), we could just provide a default 0 value and then attempt the operation:

class av2(dict):
    def __missing__(self, key):
        value = self[key] = type(self)()
        return value

    def __add__(self, x):
        """ override addition when self is empty """
        if not self:
            return 0 + x
        raise ValueError

    def __sub__(self, x):
        """ override subtraction when self is empty """
        if not self:
            return 0 - x
        raise ValueError

Answer 4

Late to the party, but for arbitrary depth I just found myself doing something like this:

from collections import defaultdict

class DeepDict(defaultdict):
    def __call__(self):
        return DeepDict(self.default_factory)

The trick here is basically to make the DeepDict instance itself a valid factory for constructing missing values. Now we can do things like

dd = DeepDict(DeepDict(list))
dd[1][2].extend([3,4])
sum(dd[1][2])  # 7

ddd = DeepDict(DeepDict(DeepDict(list)))
ddd[1][2][3].extend([4,5])
sum(ddd[1][2][3])  # 9

Answer 5

def _sub_getitem(self, k):
    try:
        # sub.__class__.__bases__[0]
        real_val = self.__class__.mro()[-2].__getitem__(self, k)
        val = '' if real_val is None else real_val
    except Exception:
        val = ''
        real_val = None
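    # An Avoid instance also passes isinstance(..., dict), which would recurse forever, so compare exact types
    if type(val) in (dict, list, str, tuple):
        val = type('Avoid', (type(val),), {'__getitem__': _sub_getitem, 'pop': _sub_pop})(val)
        # Store the wrapped value back under the current key so that later assignments to it can be traced back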
        if all([real_val is not None, isinstance(self, (dict, list)), type(k) is not slice]):
            self[k] = val
    return val


def _sub_pop(self, k=-1):
    try:
        val = self.__class__.mro()[-2].pop(self, k)
        val = '' if val is None else val
    except Exception:
        val = ''
    if type(val) in (dict, list, str, tuple):
        val = type('Avoid', (type(val),), {'__getitem__': _sub_getitem, 'pop': _sub_pop})(val)
    return val


class DefaultDict(dict):
    def __getitem__(self, k):
        return _sub_getitem(self, k)

    def pop(self, k):
        return _sub_pop(self, k)

In[8]: d=DefaultDict()
In[9]: d['a']['b']['c']['d']
Out[9]: ''
In[10]: d['a']="ggggggg"
In[11]: d['a']
Out[11]: 'ggggggg'
In[12]: d['a']['pp']
Out[12]: ''

No more errors, no matter how many levels are nested. pop does not raise either:

dd = DefaultDict({"1": 333333})


How to modify a text file?

Question: How to modify a text file?

I’m using Python, and would like to insert a string into a text file without deleting or copying the file. How can I do that?


Answer 0

Unfortunately there is no way to insert into the middle of a file without re-writing it. As previous posters have indicated, you can append to a file or overwrite part of it using seek but if you want to add stuff at the beginning or the middle, you’ll have to rewrite it.

This is an operating system thing, not a Python thing. It is the same in all languages.

What I usually do is read from the file, make the modifications and write it out to a new file called myfile.txt.tmp or something like that. This is better than reading the whole file into memory because the file may be too large for that. Once the temporary file is completed, I rename it the same as the original file.

This is a good, safe way to do it because if the file write crashes or aborts for any reason, you still have your untouched original file.
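
The approach described above could be sketched roughly as follows. The function name insert_line and its parameters are hypothetical, chosen for illustration; the temporary file is created with tempfile.mkstemp rather than a fixed myfile.txt.tmp name, and os.replace performs the final rename.

```python
import os
import tempfile


def insert_line(path, line_no, text):
    """Insert `text` as a new line before 1-based line `line_no` of `path`.

    If the file has fewer lines than `line_no`, `text` is appended at the end.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the original so the final
    # rename does not have to cross filesystems.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        inserted = False
        with os.fdopen(fd, "w") as dst, open(path) as src:
            # Stream line by line so the whole file never sits in memory.
            for current, line in enumerate(src, start=1):
                if current == line_no:
                    dst.write(text + "\n")
                    inserted = True
                dst.write(line)
            if not inserted:
                dst.write(text + "\n")
        # Swap the finished copy into place; the original stays untouched
        # until this point.
        os.replace(tmp_path, path)
    except BaseException:
        # Something failed before the swap: discard the partial copy.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

For example, insert_line("myfile.txt", 2, "new second line") would rewrite myfile.txt with the extra line in position 2. One trade-off of this sketch is that the new file gets the temporary file's permissions, so code that cares about file modes would need to copy them over before the rename.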


Can anyone explain Python's relative imports?

Question: Can anyone explain Python's relative imports?

I can’t for the life of me get python’s relative imports to work. I have created a simple example of where it does not function:

The directory structure is:

/__init__.py
/start.py
/parent.py
/sub/__init__.py
/sub/relative.py

/start.py contains just: import sub.relative

/sub/relative.py contains just from .. import parent

All other files are blank.

When executing the following on the command line:

$ cd /
$ python start.py

I get:

Traceback (most recent call last):
  File "start.py", line 1, in <module>
    import sub.relative
  File "/home/cvondrick/sandbox/sub/relative.py", line 1, in <module>
    from .. import parent
ValueError: Attempted relative import beyond toplevel package

I am using Python 2.6. Why is this the case? How do I make this sandbox example work?


Answer 0

You are importing from package “sub”. start.py is not itself in a package even if there is a __init__.py present.

You would need to start your program from one directory above parent.py:

./start.py

./pkg/__init__.py
./pkg/parent.py
./pkg/sub/__init__.py
./pkg/sub/relative.py

With start.py:

import pkg.sub.relative

Now pkg is the top level package and your relative import should work.


If you want to stick with your current layout you can just use import parent. Because you use start.py to launch your interpreter, the directory where start.py is located is in your python path. parent.py lives there as a separate module.

You can also safely delete the top level __init__.py, if you don’t import anything into a script further up the directory tree.
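
To check the pkg layout without touching a real project, the following sketch builds it in a temporary directory and runs start.py with the current interpreter. The print statement in parent.py is my own addition so that there is something to observe; in the question all files other than start.py and relative.py are blank.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    root = Path(root)
    # Recreate the layout: start.py sits next to the top-level package pkg.
    (root / "pkg" / "sub").mkdir(parents=True)
    (root / "pkg" / "__init__.py").write_text("")
    (root / "pkg" / "sub" / "__init__.py").write_text("")
    (root / "pkg" / "parent.py").write_text("print('Hello from parent.py')\n")
    (root / "pkg" / "sub" / "relative.py").write_text("from .. import parent\n")
    (root / "start.py").write_text("import pkg.sub.relative\n")

    # Equivalent to: cd <root> && python start.py
    result = subprocess.run([sys.executable, "start.py"], cwd=root,
                            capture_output=True, text=True)
    output = result.stdout.strip()

print(output)  # Hello from parent.py
```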


Answer 1

If you are going to call relative.py directly, i.e. if you really want to import from a top-level module, you have to explicitly add it to the sys.path list.
Here is how it should work:

# Add this line to the beginning of relative.py file
import sys
sys.path.append('..')

# Now you can import from one directory up, because it is on sys.path
import parent

# And even like this:
from parent import Parent

If you think the above can cause some kind of inconsistency you can use this instead:

sys.path.append(sys.path[0] + "/..")

sys.path[0] refers to the path that the entry point was run from.
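
A variant that does not depend on the current working directory or on which script was the entry point is to derive the directory from the file's own location. This is a sketch with a hypothetical helper name (add_parent_dir_to_sys_path), not part of the answer above; inside relative.py it would be called with __file__.

```python
import sys
from pathlib import Path


def add_parent_dir_to_sys_path(script_file):
    """Put the directory above the one containing `script_file` on sys.path."""
    parent_dir = str(Path(script_file).resolve().parent.parent)
    if parent_dir not in sys.path:
        # Insert at the front so this directory wins over later entries.
        sys.path.insert(0, parent_dir)
    return parent_dir


# Inside sub/relative.py one would write:
#     add_parent_dir_to_sys_path(__file__)
#     import parent
```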


Answer 2

Checking it out in python3:

python -V
Python 3.6.5

Example1:

.
├── parent.py
├── start.py
└── sub
    └── relative.py

- start.py
import sub.relative

- parent.py
print('Hello from parent.py')

- sub/relative.py
from .. import parent

If we run it like this (just to make sure PYTHONPATH is empty):

PYTHONPATH='' python3 start.py

Output:

Traceback (most recent call last):
  File "start.py", line 1, in <module>
    import sub.relative
  File "/python-import-examples/so-example-v1/sub/relative.py", line 1, in <module>
    from .. import parent
ValueError: attempted relative import beyond top-level package

If we change the import in sub/relative.py:

- sub/relative.py
import parent

If we run it like this:

PYTHONPATH='' python3 start.py

Output:

Hello from parent.py

Example2:

.
├── parent.py
└── sub
    ├── relative.py
    └── start.py

- parent.py
print('Hello from parent.py')

- sub/relative.py
print('Hello from relative.py')

- sub/start.py
import relative
from .. import parent

Run it like:

PYTHONPATH='' python3 sub/start.py

Output:

Hello from relative.py
Traceback (most recent call last):
  File "sub/start.py", line 2, in <module>
    from .. import parent
ValueError: attempted relative import beyond top-level package

If we change import in sub/start.py:

- sub/start.py
import relative
import parent

Run it like:

PYTHONPATH='' python3 sub/start.py

Output:

Hello from relative.py
Traceback (most recent call last):
  File "sub/start.py", line 3, in <module>
    import parent
ModuleNotFoundError: No module named 'parent'

Run it like:

PYTHONPATH='.' python3 sub/start.py

Output:

Hello from relative.py
Hello from parent.py

It is also better to import from the root folder, i.e.:

- sub/start.py
import sub.relative
import parent

Run it like:

PYTHONPATH='.' python3 sub/start.py

Output:

Hello from relative.py
Hello from parent.py

What is the purpose of the -m switch?

Question: What is the purpose of the -m switch?

Could you explain to me what the difference is between calling

python -m mymod1 mymod2.py args

and

python mymod1.py mymod2.py args

It seems in both cases mymod1.py is called and sys.argv is

['mymod1.py', 'mymod2.py', 'args']

So what is the -m switch for?


Answer 0

The first line of the Rationale section of PEP 338 says:

Python 2.4 adds the command line switch -m to allow modules to be located using the Python module namespace for execution as scripts. The motivating examples were standard library modules such as pdb and profile, and the Python 2.4 implementation is fine for this limited purpose.

So you can specify any module in Python’s search path this way, not just files in the current directory. You’re correct that python mymod1.py mymod2.py args has exactly the same effect. The first line of the Scope of this proposal section states:

In Python 2.4, a module located using -m is executed just as if its filename had been provided on the command line.

With -m more is possible, like working with modules which are part of a package, etc. That’s what the rest of PEP 338 is about. Read it for more info.


Answer 1

It’s worth mentioning that this only works if the package has a __main__.py file. Otherwise, the package cannot be executed directly.

python -m some_package some_arguments

The Python interpreter will look for a __main__.py file in the package path and execute it. It’s equivalent to:

python path_to_package/__main__.py somearguments

It will execute the content after:

if __name__ == "__main__":
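
A throwaway package makes this easy to try. In the sketch below the package contents are made up for illustration; both invocations print the same arguments, although (as other answers here describe) sys.path and __package__ differ between the two.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    pkg = Path(root) / "some_package"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "__main__.py").write_text(
        "import sys\n"
        "if __name__ == '__main__':\n"
        "    print('args:', sys.argv[1:])\n"
    )

    # python -m some_package x   (run from the directory containing the package)
    by_module = subprocess.run(
        [sys.executable, "-m", "some_package", "x"],
        cwd=root, capture_output=True, text=True,
    ).stdout.strip()

    # python path_to_package/__main__.py x
    by_path = subprocess.run(
        [sys.executable, str(pkg / "__main__.py"), "x"],
        cwd=root, capture_output=True, text=True,
    ).stdout.strip()

print(by_module)  # args: ['x']
print(by_path)    # args: ['x']
```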

Answer 2

Despite this question having been asked and answered several times (e.g., here, here, here, and here) in my opinion no existing answer fully or concisely captures all the implications of the -m flag. Therefore, the following will attempt to improve on what has come before.

Introduction (TLDR)

The -m flag does a lot of things, not all of which will be needed all the time. In short it can be used to: (1) execute python code from the command line via modulename rather than filename (2) add a directory to sys.path for use in import resolution and (3) execute python code that contains relative imports from the command line.

Preliminaries

To explain the -m flag we first need to explain a little terminology.

Python’s primary organizational unit is known as a module. Modules come in one of two flavors: code modules and package modules. A code module is any file that contains python executable code. A package module is a directory that contains other modules (either code modules or package modules). The most common type of code modules are *.py files while the most common type of package modules are directories containing an __init__.py file.

Python allows modules to be uniquely identified in two distinct ways: modulename and filename. In general, modules are identified by modulename in Python code (e.g., import <modulename>) and by filename on the command line (e.g., python <filename>). All python interpreters are able to convert modulenames to filenames by following the same few, well-defined rules. These rules hinge on the sys.path variable. By altering this variable one can change how Python resolves modulenames into filenames (for more on how this is done see PEP 302).

All modules (both code and package) can be executed (i.e., code associated with the module will be evaluated by the Python interpreter). Depending on the execution method (and module type) what code gets evaluated, and when, can change quite a bit. For example, if one executes a package module via python <filename> then only <filename>/__main__.py will be evaluated; the package’s own __init__.py is not. On the other hand, if one executes that same package module via import <modulename> then only the package’s __init__.py will be executed.

Historical Development of -m

The -m flag was first introduced in Python 2.4. Initially its only purpose was to provide an alternative means of identifying the python module to execute from the command line. That is, if we knew both the <filename> and <modulename> for a module then the following two commands were equivalent: python <filename> <args> and python -m <modulename> <args>. One constraint with this iteration, according to PEP 338, was that -m only worked with top level modulenames (i.e., modules that could be found directly on sys.path without any intervening package modules).

With the completion of PEP 338 the -m feature was extended to support <modulename> representations beyond the top level. This meant names such as http.server were now fully supported. This extension also meant that each parent package in modulename was now evaluated (i.e., all parent package __init__.py files were evaluated) in addition to the module referenced by the modulename itself.

The final major feature enhancement for -m came with PEP 366. With this upgrade -m gained the ability to support not only absolute imports but also explicit relative imports when executing modules. This was achieved by changing -m so that it set the __package__ variable to the parent module of the given modulename (in addition to everything else it already did).

Use Cases

There are two notable use cases for the -m flag:

  1. To execute modules from the command line for which one may not know the filename. This use case takes advantage of the fact that the Python interpreter knows how to convert modulenames to filenames. This is particularly advantageous when one wants to run stdlib modules or third-party modules from the command line. For example, very few people know the filename for the http.server module but most people do know its modulename, so we can execute it from the command line using python -m http.server.

  2. To execute a local package containing absolute or relative imports without needing to install it. This use case is detailed in PEP 338 and leverages the fact that the current working directory is added to sys.path rather than the module’s directory. This use case is very similar to using pip install -e . to install a package in develop/edit mode.

Shortcomings

With all the enhancements made to -m over the years it still has one major shortcoming — it can only execute modules written in Python (i.e., *.py). For example, if -m is used to execute a C compiled code module the following error will be produced, No code object available for <modulename> (see here for more details).

Detailed Comparisons

Effects of module execution via import statement (i.e., import <modulename>):

  • sys.path is not modified in any way
  • __name__ is set to the absolute form of <modulename>
  • __package__ is set to the immediate parent package in <modulename>
  • __init__.py is evaluated for all packages (including its own for package modules)
  • __main__.py is not evaluated for package modules; the code is evaluated for code modules

Effects of module execution via command line (i.e., python <filename>):

  • sys.path is modified to include the final directory in <filename>
  • __name__ is set to '__main__'
  • __package__ is set to None
  • __init__.py is not evaluated for any package (including its own for package modules)
  • __main__.py is evaluated for package modules; the code is evaluated for code modules.

Effects of module execution via command line with the -m flag (i.e., python -m <modulename>):

  • sys.path is modified to include the current directory
  • __name__ is set to '__main__'
  • __package__ is set to the immediate parent package in <modulename>
  • __init__.py is evaluated for all packages (including its own for package modules)
  • __main__.py is evaluated for package modules; the code is evaluated for code modules
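
The __name__ and __package__ rows of the three lists above can be observed directly. The sketch below uses a made-up package pkg with a single module mod.py that prints both variables, and runs it in each of the three ways with the current interpreter.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run(args, cwd):
    """Run the current interpreter with `args` and return its stdout."""
    done = subprocess.run([sys.executable, *args], cwd=cwd,
                          capture_output=True, text=True, check=True)
    return done.stdout.strip()


with tempfile.TemporaryDirectory() as root:
    pkg = Path(root) / "pkg"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "mod.py").write_text("print(__name__, __package__)\n")

    by_filename = run([str(pkg / "mod.py")], cwd=root)   # python <filename>
    by_module = run(["-m", "pkg.mod"], cwd=root)         # python -m <modulename>
    by_import = run(["-c", "import pkg.mod"], cwd=root)  # import <modulename>

print("python pkg/mod.py  ->", by_filename)  # __main__ None
print("python -m pkg.mod  ->", by_module)    # __main__ pkg
print("import pkg.mod     ->", by_import)    # pkg.mod pkg
```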

Conclusion

The -m flag is, at its simplest, a means to execute python scripts from the command line by using modulenames rather than filenames. The real power of -m, however, is in its ability to combine the power of import statements (e.g., support for explicit relative imports and automatic package __init__ evaluation) with the convenience of the command line.


SQLAlchemy default DateTime

Question: SQLAlchemy default DateTime

This is my declarative model:

import datetime
from sqlalchemy import Column, Integer, DateTime
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Test(Base):
    __tablename__ = 'test'

    id = Column(Integer, primary_key=True)
    created_date = DateTime(default=datetime.datetime.utcnow)

However, when I try to import this module, I get this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "orm/models2.py", line 37, in <module>
    class Test(Base):
  File "orm/models2.py", line 41, in Test
    created_date = sqlalchemy.DateTime(default=datetime.datetime.utcnow)
TypeError: __init__() got an unexpected keyword argument 'default'

If I use an Integer type, I can set a default value. What’s going on?


Answer 0

DateTime does not accept a default keyword argument. The default should be passed to Column instead. Try this:

import datetime
from sqlalchemy import Column, Integer, DateTime
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Test(Base):
    __tablename__ = 'test'

    id = Column(Integer, primary_key=True)
    created_date = Column(DateTime, default=datetime.datetime.utcnow)
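
As a quick check of the point above, the following sketch (an illustration only, not part of the original answer) confirms that the DateTime type rejects default while Column accepts it. The variable name type_accepts_default is made up for this example.

```python
import datetime

from sqlalchemy import Column, DateTime

# The type object itself has no "default" parameter, so this raises TypeError.
try:
    DateTime(default=datetime.datetime.utcnow)
    type_accepts_default = True
except TypeError:
    type_accepts_default = False

# The column is what carries the default; the callable is stored, not called.
created_date = Column("created_date", DateTime, default=datetime.datetime.utcnow)

print(type_accepts_default)              # False
print(created_date.default.is_callable)  # True
```

Because the function is passed without parentheses, SQLAlchemy calls it at INSERT time rather than once at import time.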

Answer 1

Calculate timestamps within your DB, not your client

For sanity, you probably want to have all datetimes calculated by your DB server, rather than the application server. Calculating the timestamp in the application can lead to problems because network latency is variable, clients experience slightly different clock drift, and different programming languages occasionally calculate time slightly differently.

SQLAlchemy allows you to do this by passing func.now() or func.current_timestamp() (they are aliases of each other) which tells the DB to calculate the timestamp itself.

Use SQLAlchemy’s server_default

Additionally, for a default where you’re already telling the DB to calculate the value, it’s generally better to use server_default instead of default. This tells SQLAlchemy to pass the default value as part of the CREATE TABLE statement.

For example, if you write an ad hoc script against this table, using server_default means you won’t need to worry about manually adding a timestamp call to your script; the database will set it automatically.

Understanding SQLAlchemy’s onupdate/server_onupdate

SQLAlchemy also supports onupdate, so that any time the row is updated it inserts a new timestamp. Again, it is best to tell the DB to calculate the timestamp itself:

from sqlalchemy import Column, DateTime
from sqlalchemy.sql import func

time_created = Column(DateTime(timezone=True), server_default=func.now())
time_updated = Column(DateTime(timezone=True), onupdate=func.now())

There is a server_onupdate parameter, but unlike server_default, it doesn’t actually set anything on the server side. It just tells SQLAlchemy that your database will change the column when an update happens (perhaps you created a trigger on the column), so SQLAlchemy will ask for the return value so that it can update the corresponding object.

One other potential gotcha:

You might be surprised to notice that if you make a bunch of changes within a single transaction, they all have the same timestamp. That’s because the SQL standard specifies that CURRENT_TIMESTAMP returns values based on the start of the transaction.

PostgreSQL provides the non-SQL-standard statement_timestamp() and clock_timestamp() which do change within a transaction. Docs here: https://www.postgresql.org/docs/current/static/functions-datetime.html#FUNCTIONS-DATETIME-CURRENT

UTC timestamp

If you want to use UTC timestamps, a stub implementation of func.utcnow() is provided in the SQLAlchemy documentation. You need to provide the appropriate driver-specific functions yourself, though.
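
To see server_default and onupdate working together, here is a minimal sketch using SQLAlchemy Core against an in-memory SQLite database. The table name notes and the column body are hypothetical, and SQLite is chosen only so the example runs without a server; the answer above was written with PostgreSQL in mind, and behaviour may differ in detail between backends.

```python
from sqlalchemy import (
    Column, DateTime, Integer, MetaData, String, Table, create_engine,
)
from sqlalchemy.schema import CreateTable
from sqlalchemy.sql import func

metadata = MetaData()
notes = Table(
    "notes",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("body", String),
    # Rendered into CREATE TABLE, so the database fills it in on INSERT.
    Column("time_created", DateTime(timezone=True), server_default=func.now()),
    # Added by SQLAlchemy to the SET clause of every UPDATE it emits.
    Column("time_updated", DateTime(timezone=True), onupdate=func.now()),
)

engine = create_engine("sqlite://")  # in-memory database
metadata.create_all(engine)

# The DDL shows that the default lives in the table definition itself.
ddl = str(CreateTable(notes).compile(engine))
print(ddl)

with engine.begin() as conn:
    conn.execute(notes.insert().values(body="first draft"))
    after_insert = conn.execute(notes.select()).fetchone()

    conn.execute(
        notes.update().where(notes.c.id == 1).values(body="second draft")
    )
    after_update = conn.execute(notes.select()).fetchone()

# Columns are (id, body, time_created, time_updated).
print(after_insert[2], after_insert[3])  # a timestamp, then None
print(after_update[3])                   # a timestamp set by the UPDATE
```

Note that time_updated stays NULL until the first UPDATE; if a value is wanted from the start, the column would also need its own default or server_default.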


Answer 2

You can also use SQLAlchemy’s built-in func.now() as the default for a DateTime column:

from sqlalchemy import Column, DateTime
from sqlalchemy.sql import func

DT = Column(DateTime(timezone=True), default=func.now())

Answer 3

You likely want to use onupdate=datetime.now so that UPDATEs also change the last_updated field.

SQLAlchemy has two kinds of defaults for Python-executed functions:

  • default sets the value once, on INSERT
  • onupdate also sets the value to the result of the callable on each UPDATE

Answer 4

The default keyword parameter should be given to the Column object.

Example:

Column(u'timestamp', TIMESTAMP(timezone=True), primary_key=False, nullable=False, default=time_now),

The default value can be a callable, which I defined here as follows:

from pytz import timezone
from datetime import datetime

UTC = timezone('UTC')

def time_now():
    return datetime.now(UTC)
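
On Python 3 the same kind of callable can be sketched without pytz. This variant swaps in the standard library’s datetime.timezone.utc, which is an assumption made for illustration; the answer itself uses pytz.

```python
from datetime import datetime, timedelta, timezone


def time_now():
    """Return the current time as a timezone-aware datetime in UTC."""
    return datetime.now(timezone.utc)


stamp = time_now()
print(stamp.tzinfo is not None)           # True: the value is timezone-aware
print(stamp.utcoffset() == timedelta(0))  # True: the offset from UTC is zero
```

Either version is passed to Column as default=time_now, without parentheses, so that it is evaluated for each INSERT.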

Answer 5

As per PostgreSQL documentation, https://www.postgresql.org/docs/9.6/static/functions-datetime.html

now, CURRENT_TIMESTAMP, LOCALTIMESTAMP return the time of transaction.

This is considered a feature: the intent is to allow a single transaction to have a consistent notion of the “current” time, so that multiple modifications within the same transaction bear the same time stamp.

You might want to use statement_timestamp or clock_timestamp if you do not want the transaction timestamp.

statement_timestamp()

returns the start time of the current statement (more specifically, the time of receipt of the latest command message from the client).

clock_timestamp()

returns the actual current time, and therefore its value changes even within a single SQL command.
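
Tying this back to the SQLAlchemy question, one way to request these functions is through func, which renders any attribute name as a SQL function call. The sketch below assumes a PostgreSQL target and a hypothetical events table; it only compiles the DDL and does not connect to a database.

```python
from sqlalchemy import Column, DateTime, Integer, MetaData, Table
from sqlalchemy.dialects import postgresql
from sqlalchemy.schema import CreateTable
from sqlalchemy.sql import func

metadata = MetaData()
events = Table(
    "events",
    metadata,
    Column("id", Integer, primary_key=True),
    # Transaction start time: identical for every row written in one transaction.
    Column("txn_time", DateTime(timezone=True), server_default=func.now()),
    # Actual wall-clock time: can differ between rows of the same transaction.
    Column(
        "wall_time",
        DateTime(timezone=True),
        server_default=func.clock_timestamp(),
    ),
)

# Compile the CREATE TABLE statement for PostgreSQL without connecting.
ddl = str(CreateTable(events).compile(dialect=postgresql.dialect()))
print(ddl)
```

func.statement_timestamp() can be substituted in the same way if per-statement rather than per-call granularity is wanted.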


Set markers for individual points on a line in Matplotlib

Question: Set markers for individual points on a line in Matplotlib

I have used Matplotlib to plot lines on a figure. I would now like to set the style, specifically the marker, for individual points on the line. How do I do this?

To clarify my question, I want to be able to set the style for individual markers on a line, not every marker on said line.


Answer 0

Specify the keyword args linestyle and/or marker in your call to plot.

For example, using a dashed line and blue circle markers:

plt.plot(range(10), linestyle='--', marker='o', color='b')

A shortcut call for the same thing:

plt.plot(range(10), '--bo')

Here is a list of the possible line and marker styles:

================    ===============================
character           description
================    ===============================
   -                solid line style
   --               dashed line style
   -.               dash-dot line style
   :                dotted line style
   .                point marker
   ,                pixel marker
   o                circle marker
   v                triangle_down marker
   ^                triangle_up marker
   <                triangle_left marker
   >                triangle_right marker
   1                tri_down marker
   2                tri_up marker
   3                tri_left marker
   4                tri_right marker
   s                square marker
   p                pentagon marker
   *                star marker
   h                hexagon1 marker
   H                hexagon2 marker
   +                plus marker
   x                x marker
   D                diamond marker
   d                thin_diamond marker
   |                vline marker
   _                hline marker
================    ===============================

Edit: here is an example of marking an arbitrary subset of points, as requested in the comments:

import numpy as np
import matplotlib.pyplot as plt

xs = np.linspace(-np.pi, np.pi, 30)
ys = np.sin(xs)
markers_on = [12, 17, 18, 19]
plt.plot(xs, ys, '-gD', markevery=markers_on)
plt.show()

This last example, using the markevery kwarg, is possible since matplotlib 1.4, due to the merge of this feature branch. If you are stuck on an older version of matplotlib, you can still achieve the result by overlaying a scatterplot on the line plot. See the edit history for more details.


Answer 1

The following script draws a picture showing the name and description of every marker; I hope it helps.

import matplotlib.pyplot as plt

markers = ['.', ',', 'o', 'v', '^', '<', '>', '1', '2', '3', '4', '8', 's', 'p', 'P', '*', 'h', 'H', '+', 'x', 'X', 'D', 'd', '|', '_']
descriptions = ['point', 'pixel', 'circle', 'triangle_down', 'triangle_up', 'triangle_left', 'triangle_right', 'tri_down', 'tri_up', 'tri_left', 'tri_right', 'octagon', 'square', 'pentagon', 'plus (filled)', 'star', 'hexagon1', 'hexagon2', 'plus', 'x', 'x (filled)', 'diamond', 'thin_diamond', 'vline', 'hline']

# Lay the 25 markers out on a 5x5 grid of (x, y) positions.
positions = [(i, j) for i in range(5) for j in range(5)]

plt.figure()
for (i, j), m, label in zip(positions, markers, descriptions):
    plt.scatter(i, j, marker=m)
    # Label each marker with its character and description.
    plt.text(i - 0.15, j + 0.15, s=m + ' : ' + label)
plt.axis([-0.1, 4.8, -0.1, 4.5])
plt.tight_layout()
plt.axis('off')
plt.show()


Answer 2

For future reference: the Line2D artist returned by plot() also has a set_markevery() method, which allows you to set markers on certain points only. See https://matplotlib.org/api/_as_gen/matplotlib.lines.Line2D.html#matplotlib.lines.Line2D.set_markevery
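
A minimal sketch of that approach, reusing the sine data and marker indices from the earlier answer. The Agg backend is selected here only so the example runs without a display; drop that line and call plt.show() for interactive use.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption for this sketch
import matplotlib.pyplot as plt
import numpy as np

xs = np.linspace(-np.pi, np.pi, 30)
ys = np.sin(xs)

# plot() returns a list of Line2D artists; unpack the single line.
(line,) = plt.plot(xs, ys, "-gD")

# Show diamond markers only at these point indices, after the line exists.
line.set_markevery([12, 17, 18, 19])
print(line.get_markevery())  # [12, 17, 18, 19]

plt.gcf().canvas.draw()  # render once to confirm the setting is accepted
```

This is handy when the subset of points to highlight is only known after the line has been drawn.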


Answer 3

A simple trick for changing the marker shape, size, or other style of a particular point is to first plot all the data, then add one more plot containing only that point (or set of points, if you want to restyle several). Suppose we want to change the marker shape of the second point:

import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2, 1, 3, 6, 7]

# Draw the whole line with circle markers.
plt.plot(x, y, "-o")

# Overlay the second point on its own, using a square marker.
x0 = [2]
y0 = [1]
plt.plot(x0, y0, "s")

plt.show()

Result: [Figure: plot with multiple markers]