Can you patch just a nested function with a closure, or must the whole outer function be repeated?

徐英锐

2023-03-14

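Question:

We use a third-party library that includes a rather long function, and that function uses a nested function inside it. Our use of the library triggers a bug in that function, and we would very much like to fix it.

Unfortunately, the library maintainers are somewhat slow with fixes, but we don't want to fork the library. We also cannot hold up our release until they have fixed the issue.

We would prefer to use monkey-patching to work around the issue, because it is easier to track than patching the source. However, repeating a very large function when replacing just the inner function would be enough feels like overkill, and it makes it harder for others to see exactly what we changed. Are we stuck with a static patch to the library egg?

The inner function relies on closing over a variable; a contrived example would be: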

def outerfunction(*args):
    def innerfunction(val):
        return someformat.format(val)

    someformat = 'Foo: {}'
    for arg in args:
        yield innerfunction(arg)

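We only want to replace the implementation of innerfunction(). The actual outer function is far, far longer. We would, of course, reuse the closed-over variable and keep the function signature.

Answer:

Yes, you can replace an inner function, even if it uses a closure. You will have to jump through a few hoops, though. Please take into account:

1. You need to create the replacement function as a nested function too, to ensure that Python creates the same closure. If the original function has a closure over the names foo and bar, you need to define the replacement as a nested function that closes over the same names. More importantly, you need to use those names in the same order; closures are referenced by index.
2. Monkey patching is always fragile and can break when the implementation changes. This is no exception. Retest your monkey patch whenever you change versions of the patched library.

To understand how this works, I'll first explain how Python handles nested functions. Python uses code objects to produce function objects as needed. Each code object has an associated sequence of constants, and the code objects for nested functions are stored in that sequence: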

>>> def outerfunction(*args):
...     def innerfunction(val):
...         return someformat.format(val)
...     someformat = 'Foo: {}'
...     for arg in args:
...         yield innerfunction(arg)
... 
>>> outerfunction.__code__
<code object outerfunction at 0x105b27ab0, file "<stdin>", line 1>
>>> outerfunction.__code__.co_consts
(None, <code object innerfunction at 0x10f136ed0, file "<stdin>", line 2>, 'outerfunction.<locals>.innerfunction', 'Foo: {}')

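The co_consts sequence is an immutable object, a tuple, so we cannot simply swap out the inner code object. I'll show later how to produce a new function object with just that code object replaced.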
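As a minimal sketch of that lookup, the snippet below walks co_consts to find the nested code object by name and shows that the tuple rejects item assignment. The helper name find_nested_code is made up for this illustration; it is not part of Python or of the library in question.

```python
import types


def outerfunction(*args):
    def innerfunction(val):
        return someformat.format(val)

    someformat = 'Foo: {}'
    for arg in args:
        yield innerfunction(arg)


def find_nested_code(outer, name):
    """Return the code object of the nested function called `name`.

    Hypothetical helper for illustration only.
    """
    for const in outer.__code__.co_consts:
        if isinstance(const, types.CodeType) and const.co_name == name:
            return const
    raise LookupError('no nested function named %r' % name)


inner_code = find_nested_code(outerfunction, 'innerfunction')
print(inner_code.co_name)

# co_consts is a tuple, so item assignment raises TypeError
try:
    outerfunction.__code__.co_consts[0] = None
except TypeError as exc:
    print('cannot assign:', exc)
```

Searching by name rather than by a fixed index is deliberate here: the position of the nested code object inside co_consts is an implementation detail and may differ between Python versions.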

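Next, we need to cover closures. At compile time, Python determines that a) someformat is not a local name in innerfunction and b) it closes over the same name in outerfunction. Python not only generates the bytecode to produce the correct name lookups; it also annotates the code objects of both the nested and the outer function to record that someformat is to be closed over: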

>>> outerfunction.__code__.co_cellvars
('someformat',)
>>> outerfunction.__code__.co_consts[1].co_freevars
('someformat',)

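You want to make sure that the replacement inner code object only ever lists those same names as free variables, and does so in the same order.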
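One way to check this by hand, sketched here under the assumption that the replacement is built by a small factory function (create_inner and nested_code are illustrative names), is to compare the two co_freevars tuples directly:

```python
import types


def outerfunction(*args):
    def innerfunction(val):
        return someformat.format(val)

    someformat = 'Foo: {}'
    for arg in args:
        yield innerfunction(arg)


def create_inner():
    someformat = None  # placeholder; only the name matters here

    def innerfunction(val):
        return someformat.format(val + 2)

    return innerfunction


def nested_code(func, name):
    # hypothetical helper: fetch a nested code object by name
    return next(
        const for const in func.__code__.co_consts
        if isinstance(const, types.CodeType) and const.co_name == name)


orig = nested_code(outerfunction, 'innerfunction')
new = create_inner().__code__

print(orig.co_freevars)
print(new.co_freevars)

# the replacement is only safe to slot in if the tuples agree
same = orig.co_freevars == new.co_freevars
print(same)
```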

Closures are created at run time; the bytecode that produces them is part of the outer function:

>>> import dis
>>> dis.dis(outerfunction)
  2           0 LOAD_CLOSURE             0 (someformat)
              2 BUILD_TUPLE              1
              4 LOAD_CONST               1 (<code object innerfunction at 0x10f136ed0, file "<stdin>", line 2>)
              6 LOAD_CONST               2 ('outerfunction.<locals>.innerfunction')
              8 MAKE_FUNCTION            8 (closure)
             10 STORE_FAST               1 (innerfunction)

# ... rest of disassembly omitted ...

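The LOAD_CLOSURE bytecode there creates a closure for the someformat variable; Python creates as many closures as the function uses, in the order they are first used in the inner function. This is an important fact to remember for later. The function itself looks up these closures by position: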

>>> dis.dis(outerfunction.__code__.co_consts[1])
  3           0 LOAD_DEREF               0 (someformat)
              2 LOAD_METHOD              0 (format)
              4 LOAD_FAST                0 (val)
              6 CALL_METHOD              1
              8 RETURN_VALUE

The LOAD_DEREF opcode picked the closure at position 0 here to gain access to the someformat closure.
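The positional pairing can also be observed at run time without reading bytecode. The sketch below uses a hypothetical make_inner factory that returns the nested function, so that its __closure__ tuple can be inspected next to co_freevars:

```python
def make_inner():
    someformat = 'Foo: {}'

    def innerfunction(val):
        return someformat.format(val)

    return innerfunction


inner = make_inner()

# co_freevars holds the names, __closure__ holds the cells;
# entry i of one belongs to entry i of the other
for name, cell in zip(inner.__code__.co_freevars, inner.__closure__):
    print(name, '->', repr(cell.cell_contents))
```

The names only appear in co_freevars; the function body itself finds each cell by its index.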

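In theory this also means you can use entirely different names for the closures in your inner function, but for debugging purposes it makes much more sense to stick to the same names. It also makes it easier to verify that the replacement function will slot in properly, because you can simply compare the co_freevars tuples if you use the same names.

Now for the swapping trick. Functions are objects like any other in Python, instances of a specific type. The type isn't normally exposed, but the type() call still returns it. The same applies to code objects, and both types even have documentation: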

>>> type(outerfunction)
<class 'function'>
>>> print(type(outerfunction).__doc__)
Create a function object.

  code
    a code object
  globals
    the globals dictionary
  name
    a string that overrides the name from the code object
  argdefs
    a tuple that specifies the default argument values
  closure
    a tuple that supplies the bindings for free variables
>>> type(outerfunction.__code__)
<class 'code'>
>>> print(type(outerfunction.__code__).__doc__)
code(argcount, posonlyargcount, kwonlyargcount, nlocals, stacksize,
      flags, codestring, constants, names, varnames, filename, name,
      firstlineno, lnotab[, freevars[, cellvars]])

Create a code object.  Not for the faint of heart.

(The exact argument count and docstring vary between Python versions; Python 3.0 added the kwonlyargcount argument, and Python 3.8 added posonlyargcount.)
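The same two types are also available as types.FunctionType and types.CodeType in the standard library. As a small sketch of calling the function type directly (greet and greet_clone are made-up names for illustration), the arguments follow the order given in the docstring above:

```python
import types


def greet(name, greeting='Hello'):
    return '{}, {}!'.format(greeting, name)


# type() returns the same objects that the types module exposes
assert type(greet) is types.FunctionType
assert type(greet.__code__) is types.CodeType

# build a second function object around the same code object
clone = types.FunctionType(
    greet.__code__,      # code
    greet.__globals__,   # globals
    'greet_clone',       # name
    greet.__defaults__,  # argdefs
    greet.__closure__,   # closure (None here, greet has no free variables)
)

print(clone('world'))
print(clone.__name__)
```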

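We'll use these type objects to produce a new code object with updated constants, and then a new function object with the updated code object; the following function is compatible with Python versions 2.7 through 3.8.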

def replace_inner_function(outer, new_inner):
    """Replace a nested function code object used by outer with new_inner

    The replacement new_inner must use the same name and must at most use the
    same closures as the original.

    """
    if hasattr(new_inner, '__code__'):
        # support both functions and code objects
        new_inner = new_inner.__code__

    # find original code object so we can validate the closures match
    ocode = outer.__code__
    function, code = type(outer), type(ocode)
    iname = new_inner.co_name
    orig_inner = next(
        const for const in ocode.co_consts
        if isinstance(const, code) and const.co_name == iname)

    # you can ignore later closures, but since they are matched by position
    # the new sequence must match the start of the old.
    assert (orig_inner.co_freevars[:len(new_inner.co_freevars)] ==
            new_inner.co_freevars), 'New closures must match originals'

    # replace the code object for the inner function
    new_consts = tuple(
        new_inner if const is orig_inner else const
        for const in outer.__code__.co_consts)

    # create a new code object with the new constants
    try:
        # Python 3.8 added code.replace(), so much more convenient!
        ncode = ocode.replace(co_consts=new_consts)
    except AttributeError:
        # older Python versions, argument counts vary so we need to check
        # for specifics.
        args = [
            ocode.co_argcount, ocode.co_nlocals, ocode.co_stacksize,
            ocode.co_flags, ocode.co_code,
            new_consts,  # replacing the constants
            ocode.co_names, ocode.co_varnames, ocode.co_filename,
            ocode.co_name, ocode.co_firstlineno, ocode.co_lnotab,
            ocode.co_freevars, ocode.co_cellvars,
        ]
        if hasattr(ocode, 'co_kwonlyargcount'):
            # Python 3+, insert after co_argcount
            args.insert(1, ocode.co_kwonlyargcount)
        # Python 3.8 adds co_posonlyargcount, but also has code.replace(), used above
        ncode = code(*args)

    # and a new function object using the updated code object
    return function(
        ncode, outer.__globals__, outer.__name__,
        outer.__defaults__, outer.__closure__
    )

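The above function validates that the new inner function (which can be passed in as either a code object or a function) will indeed use the same closures as the original. It then creates new code and function objects to match the old outer function object, but with the nested function (located by name) replaced by your monkey patch.

To demonstrate that all of the above works, let's replace innerfunction with one that increments each formatted value by 2: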

>>> def create_inner():
...     someformat = None  # the actual value doesn't matter
...     def innerfunction(val):
...         return someformat.format(val + 2)
...     return innerfunction
... 
>>> new_inner = create_inner()

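The new inner function is created as a nested function too; this is important because it ensures that Python uses the correct bytecode to look up the someformat closure. I used a return statement to extract the function object, but you could also look at create_inner.__code__.co_consts to grab the code object.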
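Both routes lead to the same code object, which the following sketch illustrates. It looks the constant up by name rather than by a fixed index, since the exact layout of co_consts is an implementation detail:

```python
import types


def create_inner():
    someformat = None  # the actual value doesn't matter

    def innerfunction(val):
        return someformat.format(val + 2)

    return innerfunction


# Option 1: call the factory and read __code__ off the returned function
via_call = create_inner().__code__

# Option 2: pick the code object out of the factory's constants by name
via_consts = next(
    const for const in create_inner.__code__.co_consts
    if isinstance(const, types.CodeType) and const.co_name == 'innerfunction')

print(via_call is via_consts)
```

The second option avoids running the factory at all, which can matter if the placeholder setup in it is not trivially cheap.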

Now we can patch the original outer function, swapping out just the inner function:

>>> new_outer = replace_inner_function(outerfunction, new_inner)
>>> list(outerfunction(6, 7, 8))
['Foo: 6', 'Foo: 7', 'Foo: 8']
>>> list(new_outer(6, 7, 8))
['Foo: 8', 'Foo: 9', 'Foo: 10']

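The original function echoed the original values, but the values returned by the new function are incremented by 2.

You can even create replacement inner functions that use fewer closures: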

>>> def demo_outer():
...     closure1 = 'foo'
...     closure2 = 'bar'
...     def demo_inner():
...         print(closure1, closure2)
...     demo_inner()
...
>>> def create_demo_inner():
...     closure1 = None
...     def demo_inner():
...         print(closure1)
...
>>> replace_inner_function(demo_outer, create_demo_inner.__code__.co_consts[1])()
foo

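So, to complete the picture:

1. Create your monkey-patch inner function as a nested function with the same closures.
2. Use replace_inner_function() to produce a new outer function.
3. Monkey patch the original outer function to use the new outer function produced in step 2.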
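Put together, the three steps might look like the sketch below. It makes several assumptions: thirdparty is a made-up stand-in module built in memory, the condensed replace_inner_function only supports Python 3.8 and later (it relies on code.replace()), and the replacement is built by a create_inner factory as in the demonstration earlier.

```python
import types

# Stand-in for the third-party module; "thirdparty" is a made-up name.
thirdparty = types.ModuleType('thirdparty')
exec('''
def outerfunction(*args):
    def innerfunction(val):
        return someformat.format(val)

    someformat = 'Foo: {}'
    for arg in args:
        yield innerfunction(arg)
''', thirdparty.__dict__)


def replace_inner_function(outer, new_inner):
    # Condensed, Python 3.8+ only variant of replace_inner_function().
    new_code = new_inner.__code__
    ocode = outer.__code__
    orig = next(
        c for c in ocode.co_consts
        if isinstance(c, types.CodeType) and c.co_name == new_code.co_name)
    assert orig.co_freevars[:len(new_code.co_freevars)] == new_code.co_freevars
    consts = tuple(new_code if c is orig else c for c in ocode.co_consts)
    return types.FunctionType(
        ocode.replace(co_consts=consts), outer.__globals__, outer.__name__,
        outer.__defaults__, outer.__closure__)


# Step 1: the replacement, created as a nested function
def create_inner():
    someformat = None  # placeholder, only the name matters

    def innerfunction(val):
        return someformat.format(val + 2)

    return innerfunction


# Steps 2 and 3: build the new outer function and rebind the module attribute
thirdparty.outerfunction = replace_inner_function(
    thirdparty.outerfunction, create_inner())

print(list(thirdparty.outerfunction(6, 7, 8)))
```

Rebinding the module attribute only affects lookups that go through the module. Code that ran `from thirdparty import outerfunction` before the patch was applied still holds a reference to the original function, so apply the patch as early as possible.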
