问题:Python中存在可变的命名元组吗?
任何人都可以修改namedtuple或提供替代类,以使其适用于可变对象吗?
主要是为了提高可读性,我想要执行类似于namedtuple的操作:
from Camelot import namedgroup
Point = namedgroup('Point', ['x', 'y'])
p = Point(0, 0)
p.x = 10
>>> p
Point(x=10, y=0)
>>> p.x *= 10
Point(x=100, y=0)
腌制所得物体必须是可能的。并且根据命名元组的特征,在表示对象时输出的顺序必须与构造对象时参数列表的顺序相匹配。
Can anyone amend namedtuple or provide an alternative class so that it works for mutable objects?
Primarily for readability, I would like something similar to namedtuple that does this:
from Camelot import namedgroup
Point = namedgroup('Point', ['x', 'y'])
p = Point(0, 0)
p.x = 10
>>> p
Point(x=10, y=0)
>>> p.x *= 10
Point(x=100, y=0)
It must be possible to pickle the resulting object. And per the characteristics of named tuple, the ordering of the output when represented must match the order of the parameter list when constructing the object.
回答 0
还有就是一个可变的替代方案collections.namedtuple
– recordclass。
它具有与API相同的API和内存占用量,namedtuple
并且支持分配(它也应该更快)。例如:
from recordclass import recordclass
Point = recordclass('Point', 'x y')
>>> p = Point(1, 2)
>>> p
Point(x=1, y=2)
>>> print(p.x, p.y)
1 2
>>> p.x += 2; p.y += 3; print(p)
Point(x=3, y=5)
对于python 3.6及更高版本recordclass
(从0.5开始)支持typehints:
from recordclass import recordclass, RecordClass
class Point(RecordClass):
x: int
y: int
>>> Point.__annotations__
{'x':int, 'y':int}
>>> p = Point(1, 2)
>>> p
Point(x=1, y=2)
>>> print(p.x, p.y)
1 2
>>> p.x += 2; p.y += 3; print(p)
Point(x=3, y=5)
有一个更完整的示例(还包括性能比较)。
由于0.9 recordclass
库提供了另一个变体- recordclass.structclass
工厂功能。它可以产生类,其实例比__slots__
基于实例的实例占用更少的内存。这对于具有属性值的实例非常重要,该属性值不打算具有参考周期。如果您需要创建数百万个实例,则可能有助于减少内存使用。这是一个说明性的例子。
There is a mutable alternative to collections.namedtuple
– recordclass.
It has the same API and memory footprint as namedtuple
and it supports assignments (It should be faster as well). For example:
from recordclass import recordclass
Point = recordclass('Point', 'x y')
>>> p = Point(1, 2)
>>> p
Point(x=1, y=2)
>>> print(p.x, p.y)
1 2
>>> p.x += 2; p.y += 3; print(p)
Point(x=3, y=5)
For python 3.6 and higher recordclass
(since 0.5) support typehints:
from recordclass import recordclass, RecordClass
class Point(RecordClass):
x: int
y: int
>>> Point.__annotations__
{'x':int, 'y':int}
>>> p = Point(1, 2)
>>> p
Point(x=1, y=2)
>>> print(p.x, p.y)
1 2
>>> p.x += 2; p.y += 3; print(p)
Point(x=3, y=5)
There is a more complete example (it also includes performance comparisons).
Since 0.9 recordclass
library provides another variant — recordclass.structclass
factory function. It can produce classes, whose instances occupy less memory than __slots__
-based instances. This is can be important for the instances with attribute values, which has not intended to have reference cycles. It may help reduce memory usage if you need to create millions of instances. Here is an illustrative example.
回答 1
types.SimpleNamespace在Python 3.3中引入,并支持所要求的要求。
from types import SimpleNamespace
t = SimpleNamespace(foo='bar')
t.ham = 'spam'
print(t)
namespace(foo='bar', ham='spam')
print(t.foo)
'bar'
import pickle
with open('/tmp/pickle', 'wb') as f:
pickle.dump(t, f)
types.SimpleNamespace was introduced in Python 3.3 and supports the requested requirements.
from types import SimpleNamespace
t = SimpleNamespace(foo='bar')
t.ham = 'spam'
print(t)
namespace(foo='bar', ham='spam')
print(t.foo)
'bar'
import pickle
with open('/tmp/pickle', 'wb') as f:
pickle.dump(t, f)
回答 2
作为此任务的一种非常Pythonic的替代方法,从Python-3.7开始,您可以使用
dataclasses
不仅行为可变的模块,NamedTuple
因为它们使用常规的类定义,而且还支持其他类功能。
从PEP-0557:
尽管它们使用了非常不同的机制,但是可以将数据类视为“具有默认值的可变命名元组”。因为数据类使用普通的类定义语法,所以您可以自由使用继承,元类,文档字符串,用户定义的方法,类工厂和其他Python类功能。
提供了一个类装饰器,该类装饰器检查类定义中具有类型注释的变量,如PEP 526 “变量注释的语法”中所定义。在本文档中,此类变量称为字段。装饰器使用这些字段将生成的方法定义添加到类中,以支持实例初始化,repr,比较方法以及(可选)规范部分中描述的其他方法。这样的类称为数据类,但该类实际上没有什么特别的:装饰器将生成的方法添加到该类中并返回给定的相同类。
PEP-0557中引入了此功能,您可以在提供的文档链接上详细了解它。
例:
In [20]: from dataclasses import dataclass
In [21]: @dataclass
...: class InventoryItem:
...: '''Class for keeping track of an item in inventory.'''
...: name: str
...: unit_price: float
...: quantity_on_hand: int = 0
...:
...: def total_cost(self) -> float:
...: return self.unit_price * self.quantity_on_hand
...:
演示:
In [23]: II = InventoryItem('bisc', 2000)
In [24]: II
Out[24]: InventoryItem(name='bisc', unit_price=2000, quantity_on_hand=0)
In [25]: II.name = 'choco'
In [26]: II.name
Out[26]: 'choco'
In [27]:
In [27]: II.unit_price *= 3
In [28]: II.unit_price
Out[28]: 6000
In [29]: II
Out[29]: InventoryItem(name='choco', unit_price=6000, quantity_on_hand=0)
As a very Pythonic alternative for this task, since Python-3.7, you can use
dataclasses
module that not only behaves like a mutable NamedTuple
because they use normal class definitions they also support other classes features.
From PEP-0557:
Although they use a very different mechanism, Data Classes can be thought of as “mutable namedtuples with defaults”. Because Data Classes use normal class definition syntax, you are free to use inheritance, metaclasses, docstrings, user-defined methods, class factories, and other Python class features.
A class decorator is provided which inspects a class definition for variables with type annotations as defined in PEP 526, “Syntax for Variable Annotations”. In this document, such variables are called fields. Using these fields, the decorator adds generated method definitions to the class to support instance initialization, a repr, comparison methods, and optionally other methods as described in the Specification section. Such a class is called a Data Class, but there’s really nothing special about the class: the decorator adds generated methods to the class and returns the same class it was given.
This feature is introduced in PEP-0557 that you can read about it in more details on provided documentation link.
Example:
In [20]: from dataclasses import dataclass
In [21]: @dataclass
...: class InventoryItem:
...: '''Class for keeping track of an item in inventory.'''
...: name: str
...: unit_price: float
...: quantity_on_hand: int = 0
...:
...: def total_cost(self) -> float:
...: return self.unit_price * self.quantity_on_hand
...:
Demo:
In [23]: II = InventoryItem('bisc', 2000)
In [24]: II
Out[24]: InventoryItem(name='bisc', unit_price=2000, quantity_on_hand=0)
In [25]: II.name = 'choco'
In [26]: II.name
Out[26]: 'choco'
In [27]:
In [27]: II.unit_price *= 3
In [28]: II.unit_price
Out[28]: 6000
In [29]: II
Out[29]: InventoryItem(name='choco', unit_price=6000, quantity_on_hand=0)
回答 3
截至2016 年 1月11日,最新的namedlist 1.7通过了Python 2.7和Python 3.5的所有测试。它是纯python实现,而recordclass
C是C扩展。当然,是否需要C扩展名取决于您的要求。
您的测试(也请参见下面的注释):
from __future__ import print_function
import pickle
import sys
from namedlist import namedlist
Point = namedlist('Point', 'x y')
p = Point(x=1, y=2)
print('1. Mutation of field values')
p.x *= 10
p.y += 10
print('p: {}, {}\n'.format(p.x, p.y))
print('2. String')
print('p: {}\n'.format(p))
print('3. Representation')
print(repr(p), '\n')
print('4. Sizeof')
print('size of p:', sys.getsizeof(p), '\n')
print('5. Access by name of field')
print('p: {}, {}\n'.format(p.x, p.y))
print('6. Access by index')
print('p: {}, {}\n'.format(p[0], p[1]))
print('7. Iterative unpacking')
x, y = p
print('p: {}, {}\n'.format(x, y))
print('8. Iteration')
print('p: {}\n'.format([v for v in p]))
print('9. Ordered Dict')
print('p: {}\n'.format(p._asdict()))
print('10. Inplace replacement (update?)')
p._update(x=100, y=200)
print('p: {}\n'.format(p))
print('11. Pickle and Unpickle')
pickled = pickle.dumps(p)
unpickled = pickle.loads(pickled)
assert p == unpickled
print('Pickled successfully\n')
print('12. Fields\n')
print('p: {}\n'.format(p._fields))
print('13. Slots')
print('p: {}\n'.format(p.__slots__))
在Python 2.7上输出
1.字段值的突变
p:10、12
2.字符串
p:点(x = 10,y = 12)
3.陈述
点(x = 10,y = 12)
4. Sizeof
p的大小:64
5.按字段名称访问
p:10、12
6.按索引访问
p:10、12
7.迭代拆包
p:10、12
8.迭代
p:[10、12]
9.有序词典
p:OrderedDict([['x',10),('y',12)])
10.就地更换(更新?)
p:点(x = 100,y = 200)
11.泡菜和腌菜
腌制成功
12.领域
p:('x','y')
13.插槽
p:('x','y')
与Python 3.5的唯一区别是namedlist
变得更小,大小为56(Python 2.7报告64)。
请注意,我已将您的测试10更改为就地更换。该namedlist
有_replace()
哪些做了浅拷贝的方法,这使我感觉良好,因为namedtuple
在标准库的工作方式。更改_replace()
方法的语义会造成混乱。我认为该_update()
方法应用于就地更新。还是我无法理解您的测试10的意图?
The latest namedlist 1.7 passes all of your tests with both Python 2.7 and Python 3.5 as of Jan 11, 2016. It is a pure python implementation whereas the recordclass
is a C extension. Of course, it depends on your requirements whether a C extension is preferred or not.
Your tests (but also see the note below):
from __future__ import print_function
import pickle
import sys
from namedlist import namedlist
Point = namedlist('Point', 'x y')
p = Point(x=1, y=2)
print('1. Mutation of field values')
p.x *= 10
p.y += 10
print('p: {}, {}\n'.format(p.x, p.y))
print('2. String')
print('p: {}\n'.format(p))
print('3. Representation')
print(repr(p), '\n')
print('4. Sizeof')
print('size of p:', sys.getsizeof(p), '\n')
print('5. Access by name of field')
print('p: {}, {}\n'.format(p.x, p.y))
print('6. Access by index')
print('p: {}, {}\n'.format(p[0], p[1]))
print('7. Iterative unpacking')
x, y = p
print('p: {}, {}\n'.format(x, y))
print('8. Iteration')
print('p: {}\n'.format([v for v in p]))
print('9. Ordered Dict')
print('p: {}\n'.format(p._asdict()))
print('10. Inplace replacement (update?)')
p._update(x=100, y=200)
print('p: {}\n'.format(p))
print('11. Pickle and Unpickle')
pickled = pickle.dumps(p)
unpickled = pickle.loads(pickled)
assert p == unpickled
print('Pickled successfully\n')
print('12. Fields\n')
print('p: {}\n'.format(p._fields))
print('13. Slots')
print('p: {}\n'.format(p.__slots__))
Output on Python 2.7
1. Mutation of field values
p: 10, 12
2. String
p: Point(x=10, y=12)
3. Representation
Point(x=10, y=12)
4. Sizeof
size of p: 64
5. Access by name of field
p: 10, 12
6. Access by index
p: 10, 12
7. Iterative unpacking
p: 10, 12
8. Iteration
p: [10, 12]
9. Ordered Dict
p: OrderedDict([('x', 10), ('y', 12)])
10. Inplace replacement (update?)
p: Point(x=100, y=200)
11. Pickle and Unpickle
Pickled successfully
12. Fields
p: ('x', 'y')
13. Slots
p: ('x', 'y')
The only difference with Python 3.5 is that the namedlist
has become smaller, the size is 56 (Python 2.7 reports 64).
Note that I have changed your test 10 for in-place replacement. The namedlist
has a _replace()
method which does a shallow copy, and that makes perfect sense to me because the namedtuple
in the standard library behaves the same way. Changing the semantics of the _replace()
method would be confusing. In my opinion the _update()
method should be used for in-place updates. Or maybe I failed to understand the intent of your test 10?
回答 4
看来这个问题的答案是否定的。
下面的内容非常接近,但从技术上讲并不是可变的。这将创建一个namedtuple()
具有更新的x值的新实例:
Point = namedtuple('Point', ['x', 'y'])
p = Point(0, 0)
p = p._replace(x=10)
另一方面,您可以使用创建一个简单的类__slots__
,该类应该可以很好地用于频繁更新类实例属性:
class Point:
__slots__ = ['x', 'y']
def __init__(self, x, y):
self.x = x
self.y = y
为了补充这个答案,我认为__slots__
在这里很好用,因为当您创建许多类实例时,它的内存使用效率很高。唯一的缺点是您不能创建新的类属性。
这是一个说明内存效率的相关线程 -Dictionary vs Object-效率更高,为什么?
该线程答案中引用的内容非常简洁地解释了为什么__slots__
内存效率更高-Python插槽
It seems like the answer to this question is no.
Below is pretty close, but it’s not technically mutable. This is creating a new namedtuple()
instance with an updated x value:
Point = namedtuple('Point', ['x', 'y'])
p = Point(0, 0)
p = p._replace(x=10)
On the other hand, you can create a simple class using __slots__
that should work well for frequently updating class instance attributes:
class Point:
__slots__ = ['x', 'y']
def __init__(self, x, y):
self.x = x
self.y = y
To add to this answer, I think __slots__
is good use here because it’s memory efficient when you create lots of class instances. The only downside is that you can’t create new class attributes.
Here’s one relevant thread that illustrates the memory efficiency – Dictionary vs Object – which is more efficient and why?
The quoted content in the answer of this thread is a very succinct explanation why __slots__
is more memory efficient – Python slots
回答 5
以下是适用于Python 3的良好解决方案:最小类使用__slots__
和Sequence
抽象基类;不会执行类似的错误检测,但它可以工作,并且其行为基本上类似于可变元组(类型检查除外)。
from collections import Sequence
class NamedMutableSequence(Sequence):
__slots__ = ()
def __init__(self, *a, **kw):
slots = self.__slots__
for k in slots:
setattr(self, k, kw.get(k))
if a:
for k, v in zip(slots, a):
setattr(self, k, v)
def __str__(self):
clsname = self.__class__.__name__
values = ', '.join('%s=%r' % (k, getattr(self, k))
for k in self.__slots__)
return '%s(%s)' % (clsname, values)
__repr__ = __str__
def __getitem__(self, item):
return getattr(self, self.__slots__[item])
def __setitem__(self, item, value):
return setattr(self, self.__slots__[item], value)
def __len__(self):
return len(self.__slots__)
class Point(NamedMutableSequence):
__slots__ = ('x', 'y')
例:
>>> p = Point(0, 0)
>>> p.x = 10
>>> p
Point(x=10, y=0)
>>> p.x *= 10
>>> p
Point(x=100, y=0)
如果需要,您也可以使用一种方法来创建类(尽管使用显式类更为透明):
def namedgroup(name, members):
if isinstance(members, str):
members = members.split()
members = tuple(members)
return type(name, (NamedMutableSequence,), {'__slots__': members})
例:
>>> Point = namedgroup('Point', ['x', 'y'])
>>> Point(6, 42)
Point(x=6, y=42)
在Python 2中,您需要稍作调整-如果您和__slots__
将停止工作。
Python 2中的解决方案是不继承Sequence
,而是继承object
。如果isinstance(Point, Sequence) == True
需要,您需要将NamedMutableSequence
作为基本类注册 到Sequence
:
Sequence.register(NamedMutableSequence)
The following is a good solution for Python 3: A minimal class using __slots__
and Sequence
abstract base class; does not do fancy error detection or such, but it works, and behaves mostly like a mutable tuple (except for typecheck).
from collections import Sequence
class NamedMutableSequence(Sequence):
__slots__ = ()
def __init__(self, *a, **kw):
slots = self.__slots__
for k in slots:
setattr(self, k, kw.get(k))
if a:
for k, v in zip(slots, a):
setattr(self, k, v)
def __str__(self):
clsname = self.__class__.__name__
values = ', '.join('%s=%r' % (k, getattr(self, k))
for k in self.__slots__)
return '%s(%s)' % (clsname, values)
__repr__ = __str__
def __getitem__(self, item):
return getattr(self, self.__slots__[item])
def __setitem__(self, item, value):
return setattr(self, self.__slots__[item], value)
def __len__(self):
return len(self.__slots__)
class Point(NamedMutableSequence):
__slots__ = ('x', 'y')
Example:
>>> p = Point(0, 0)
>>> p.x = 10
>>> p
Point(x=10, y=0)
>>> p.x *= 10
>>> p
Point(x=100, y=0)
If you want, you can have a method to create the class too (though using an explicit class is more transparent):
def namedgroup(name, members):
if isinstance(members, str):
members = members.split()
members = tuple(members)
return type(name, (NamedMutableSequence,), {'__slots__': members})
Example:
>>> Point = namedgroup('Point', ['x', 'y'])
>>> Point(6, 42)
Point(x=6, y=42)
In Python 2 you need to adjust it slightly – if you and the __slots__
will stop from working.
The solution in Python 2 is to not inherit from Sequence
, but object
. If isinstance(Point, Sequence) == True
is desired, you need to register the NamedMutableSequence
as a base class to Sequence
:
Sequence.register(NamedMutableSequence)
回答 6
让我们通过动态类型创建来实现这一点:
import copy
def namedgroup(typename, fieldnames):
def init(self, **kwargs):
attrs = {k: None for k in self._attrs_}
for k in kwargs:
if k in self._attrs_:
attrs[k] = kwargs[k]
else:
raise AttributeError('Invalid Field')
self.__dict__.update(attrs)
def getattribute(self, attr):
if attr.startswith("_") or attr in self._attrs_:
return object.__getattribute__(self, attr)
else:
raise AttributeError('Invalid Field')
def setattr(self, attr, value):
if attr in self._attrs_:
object.__setattr__(self, attr, value)
else:
raise AttributeError('Invalid Field')
def rep(self):
d = ["{}={}".format(v,self.__dict__[v]) for v in self._attrs_]
return self._typename_ + '(' + ', '.join(d) + ')'
def iterate(self):
for x in self._attrs_:
yield self.__dict__[x]
raise StopIteration()
def setitem(self, *args, **kwargs):
return self.__dict__.__setitem__(*args, **kwargs)
def getitem(self, *args, **kwargs):
return self.__dict__.__getitem__(*args, **kwargs)
attrs = {"__init__": init,
"__setattr__": setattr,
"__getattribute__": getattribute,
"_attrs_": copy.deepcopy(fieldnames),
"_typename_": str(typename),
"__str__": rep,
"__repr__": rep,
"__len__": lambda self: len(fieldnames),
"__iter__": iterate,
"__setitem__": setitem,
"__getitem__": getitem,
}
return type(typename, (object,), attrs)
这将在允许操作继续之前检查属性以查看它们是否有效。
那么,这是可腌制的吗?是(且仅当您执行以下操作时):
>>> import pickle
>>> Point = namedgroup("Point", ["x", "y"])
>>> p = Point(x=100, y=200)
>>> p2 = pickle.loads(pickle.dumps(p))
>>> p2.x
100
>>> p2.y
200
>>> id(p) != id(p2)
True
该定义必须在您的命名空间中,并且必须存在足够长的时间,以便pickle可以找到它。因此,如果您将其定义在包中,则应该可以使用。
Point = namedgroup("Point", ["x", "y"])
如果您执行以下操作,或者将定义设为临时定义,则Pickle将失败(例如,函数结束时超出范围):
some_point = namedgroup("Point", ["x", "y"])
是的,它确实保留了类型创建中列出的字段的顺序。
Let’s implement this with dynamic type creation:
import copy
def namedgroup(typename, fieldnames):
def init(self, **kwargs):
attrs = {k: None for k in self._attrs_}
for k in kwargs:
if k in self._attrs_:
attrs[k] = kwargs[k]
else:
raise AttributeError('Invalid Field')
self.__dict__.update(attrs)
def getattribute(self, attr):
if attr.startswith("_") or attr in self._attrs_:
return object.__getattribute__(self, attr)
else:
raise AttributeError('Invalid Field')
def setattr(self, attr, value):
if attr in self._attrs_:
object.__setattr__(self, attr, value)
else:
raise AttributeError('Invalid Field')
def rep(self):
d = ["{}={}".format(v,self.__dict__[v]) for v in self._attrs_]
return self._typename_ + '(' + ', '.join(d) + ')'
def iterate(self):
for x in self._attrs_:
yield self.__dict__[x]
raise StopIteration()
def setitem(self, *args, **kwargs):
return self.__dict__.__setitem__(*args, **kwargs)
def getitem(self, *args, **kwargs):
return self.__dict__.__getitem__(*args, **kwargs)
attrs = {"__init__": init,
"__setattr__": setattr,
"__getattribute__": getattribute,
"_attrs_": copy.deepcopy(fieldnames),
"_typename_": str(typename),
"__str__": rep,
"__repr__": rep,
"__len__": lambda self: len(fieldnames),
"__iter__": iterate,
"__setitem__": setitem,
"__getitem__": getitem,
}
return type(typename, (object,), attrs)
This checks the attributes to see if they are valid before allowing the operation to continue.
So is this pickleable? Yes if (and only if) you do the following:
>>> import pickle
>>> Point = namedgroup("Point", ["x", "y"])
>>> p = Point(x=100, y=200)
>>> p2 = pickle.loads(pickle.dumps(p))
>>> p2.x
100
>>> p2.y
200
>>> id(p) != id(p2)
True
The definition has to be in your namespace, and must exist long enough for pickle to find it. So if you define this to be in your package, it should work.
Point = namedgroup("Point", ["x", "y"])
Pickle will fail if you do the following, or make the definition temporary (goes out of scope when the function ends, say):
some_point = namedgroup("Point", ["x", "y"])
And yes, it does preserve the order of the fields listed in the type creation.
回答 7
根据定义,元组是不可变的。
但是,您可以创建一个字典子类,在其中可以使用点符号访问属性。
In [1]: %cpaste
Pasting code; enter '--' alone on the line to stop or use Ctrl-D.
:class AttrDict(dict):
:
: def __getattr__(self, name):
: return self[name]
:
: def __setattr__(self, name, value):
: self[name] = value
:--
In [2]: test = AttrDict()
In [3]: test.a = 1
In [4]: test.b = True
In [5]: test
Out[5]: {'a': 1, 'b': True}
Tuples are by definition immutable.
You can however make a dictionary subclass where you can access the attributes with dot-notation;
In [1]: %cpaste
Pasting code; enter '--' alone on the line to stop or use Ctrl-D.
:class AttrDict(dict):
:
: def __getattr__(self, name):
: return self[name]
:
: def __setattr__(self, name, value):
: self[name] = value
:--
In [2]: test = AttrDict()
In [3]: test.a = 1
In [4]: test.b = True
In [5]: test
Out[5]: {'a': 1, 'b': True}
回答 8
If you want similar behavior as namedtuples but mutable try namedlist
Note that in order to be mutable it cannot be a tuple.
回答 9
如果性能并不重要,则可以使用如下愚蠢的方法:
from collection import namedtuple
Point = namedtuple('Point', 'x y z')
mutable_z = Point(1,2,[3])
Provided performance is of little importance, one could use a silly hack like:
from collection import namedtuple
Point = namedtuple('Point', 'x y z')
mutable_z = Point(1,2,[3])
声明:本站所有文章,如无特殊说明或标注,均为本站原创发布。任何个人或组织,在未征得本站同意时,禁止复制、盗用、采集、发布本站内容到任何网站、书籍等各类媒体平台。如若本站内容侵犯了原著者的合法权益,可联系我们进行处理。