Question: Is there an easy way to pickle a Python function (or otherwise serialize its code)?
I'm trying to transfer a function across a network connection (using asyncore). Is there an easy way to serialize a Python function (one that, in this case at least, will have no side effects) for transfer like this?
I would ideally like to have a pair of functions similar to these:
import pickle

def transmit(func):
    obj = pickle.dumps(func)
    [send obj across the network]

def receive():
    [receive obj from the network]
    func = pickle.loads(obj)
    func()
Answer 0
You could serialise the function's bytecode and then reconstruct it on the receiving side. The marshal module can be used to serialise code objects, which can then be reassembled into a function, i.e.:
import marshal

def foo(x):
    return x * x

# foo.__code__ is the Python 3 name; in Python 2 it was foo.func_code
code_string = marshal.dumps(foo.__code__)
Then in the remote process (after transferring code_string):
import marshal, types

code = marshal.loads(code_string)
func = types.FunctionType(code, globals(), "some_func_name")
func(10)  # gives 100
A few caveats:
marshal's format (and Python bytecode in general) may not be compatible between Python versions, so both ends should run the same version.
This will only work with the CPython implementation.
If the function references globals (including imported modules, other functions etc) that you need to pick up, you’ll need to serialise these too, or recreate them on the remote side. My example just gives it the remote process’s global namespace.
You’ll probably need to do a bit more to support more complex cases, like closures or generator functions.
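As a rough illustration of the caveats above, here is a minimal sketch that ships the function name and default arguments along with the code object and rebuilds the function in an explicit namespace. The helper names dump_function and load_function are hypothetical, and the sketch assumes both ends run the same CPython version and that the function has no closure and only marshal-compatible defaults.

```python
import builtins
import marshal
import types


def dump_function(func):
    """Pack the code object, name and default arguments into bytes.

    Assumption: func has no closure and its defaults are simple values
    (numbers, strings, tuples, ...) that marshal can handle.
    """
    return marshal.dumps((func.__code__, func.__name__, func.__defaults__))


def load_function(payload, namespace=None):
    """Rebuild a function from bytes produced by dump_function."""
    code, name, defaults = marshal.loads(payload)
    if namespace is None:
        # Any globals the function refers to must be supplied here by hand.
        namespace = {"__builtins__": builtins}
    return types.FunctionType(code, namespace, name, defaults)


def square(x):
    return x * x


payload = dump_function(square)   # bytes, ready to be sent over the network
remote_square = load_function(payload)
print(remote_square(10))          # prints 100
```

Passing the namespace explicitly makes it visible which globals the receiver has to provide, rather than silently borrowing the remote process's own globals().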
Answer 1
Check out Dill, which extends Python’s pickle library to support a greater variety of types, including functions:
>>> import dill as pickle
>>> def f(x): return x + 1
...
>>> g = pickle.dumps(f)
>>> f(1)
2
>>> pickle.loads(g)(1)
2
It also supports references to other objects that the function uses, such as the function f called inside plusTwo:
>>> def plusTwo(x): return f(f(x))
...
>>> pickle.loads(pickle.dumps(plusTwo))(1)
3
Answer 2
Answer 3
The simplest way is probably inspect.getsource(object) (see the inspect module), which returns a string with the source code of a function or method.
Answer 4
It all depends on whether you generate the function at runtime or not:
If you do, inspect.getsource(object) won't work for dynamically generated functions, as it reads the object's source from the .py file, so only functions defined before execution can be retrieved as source.
And if your functions are placed in files anyway, why not give the receiver access to them and only pass around module and function names?
The only solution for dynamically created functions that I can think of is to construct the function as a string before transmission, transmit the source, and then exec() it on the receiver side (eval() only handles expressions, not def statements).
Edit: the marshal solution also looks pretty smart; I didn't know you could serialize something other than built-ins.
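The idea of passing only module and function names could look roughly like the sketch below. It assumes the receiver can import the same module; the helper names encode_reference and resolve_reference and the JSON wire format are illustrative choices, not part of any library.

```python
import importlib
import json


def encode_reference(module_name, func_name):
    """Describe a function by where it lives instead of sending its code."""
    return json.dumps({"module": module_name, "name": func_name}).encode("utf-8")


def resolve_reference(payload):
    """Import the named module on the receiver and look the function up."""
    ref = json.loads(payload.decode("utf-8"))
    module = importlib.import_module(ref["module"])
    return getattr(module, ref["name"])


# math.sqrt stands in for a function that both sides have installed.
payload = encode_reference("math", "sqrt")
func = resolve_reference(payload)
print(func(9.0))  # prints 3.0
```

Because only names travel over the wire, no code is executed from the payload itself, but the receiver should still restrict which modules it is willing to import.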
Answer 5
Answer 6
import pickle

code_string = '''
def foo(x):
    return x * 2

def bar(x):
    return x ** 2
'''
obj = pickle.dumps(code_string)
Now:
exec(pickle.loads(obj))
foo(1)
> 2
bar(3)
> 9
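A variation on the same idea, sketched under the assumption that the receiver would rather not define the functions directly in its own module globals: exec the source in a fresh dictionary and pick the functions out of it. The helper name load_functions is hypothetical.

```python
import pickle


def load_functions(payload):
    """Unpickle source text, exec it in a fresh namespace, return its callables."""
    source = pickle.loads(payload)
    namespace = {}
    exec(source, namespace)
    # Skip entries such as __builtins__ that exec adds to the namespace.
    return {
        name: obj
        for name, obj in namespace.items()
        if callable(obj) and not name.startswith("__")
    }


code_string = "def foo(x):\n    return x * 2\n\ndef bar(x):\n    return x ** 2\n"
functions = load_functions(pickle.dumps(code_string))
print(functions["foo"](1))  # prints 2
print(functions["bar"](3))  # prints 9
```

Keeping the definitions in a separate dictionary avoids overwriting names that already exist on the receiving side.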
Answer 7
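You can do this:
def fn_generator():
    def fn(x, y):
        return x + y
    return fn
Now, transmit(fn_generator()) will send the actual definition of fn(x, y) instead of a reference to the module name, provided transmit() uses a serializer that pickles functions by value, such as dill or cloudpickle; the standard pickle module refuses to pickle a nested function.
You can use the same trick to send classes across the network.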
Answer 8
The basic functions used by this module cover your query, plus you get the best compression over the wire; see the instructive source code:
y_serial.py module :: warehouse Python objects with SQLite
“Serialization + persistance :: in a few lines of code, compress and annotate Python objects into SQLite; then later retrieve them chronologically by keywords without any SQL. Most useful “standard” module for a database to store schema-less data.”
http://yserial.sourceforge.net
Answer 9
cloudpickle is probably what you are looking for. It is described as follows:
cloudpickle is especially useful for cluster computing where Python code is shipped over the network to execute on remote hosts, possibly close to the data.
Usage example:
import pickle
import cloudpickle

def add_one(n):
    return n + 1

pickled_function = cloudpickle.dumps(add_one)
pickle.loads(pickled_function)(42)
Answer 10
Here is a helper class you can use to wrap functions in order to make them picklable. The caveats already mentioned for marshal will apply, but an effort is made to use pickle whenever possible. No effort is made to preserve globals or closures across serialization.
import marshal
import pickle
import types

class PicklableFunction:
    def __init__(self, fun):
        self._fun = fun

    def __call__(self, *args, **kwargs):
        return self._fun(*args, **kwargs)

    def __getstate__(self):
        # Prefer pickle; fall back to marshalling the raw code object.
        try:
            return pickle.dumps(self._fun)
        except Exception:
            return marshal.dumps((self._fun.__code__, self._fun.__name__))

    def __setstate__(self, state):
        try:
            self._fun = pickle.loads(state)
        except Exception:
            code, name = marshal.loads(state)
            # Rebuilt with empty globals: the function cannot see module-level names.
            self._fun = types.FunctionType(code, {}, name)