当前位置: 首页 > 面试题库 >

从Python中不安全的用户输入评估数学方程式

公良运锋
2023-03-14
问题内容

我有一个网站,用户可以在其中输入数学方程式(表达式),然后根据该网站提供的数据(常数)对这些方程式进行评估。所需的数学运算,包括符号,算术运算min()max()以及其他一些基本功能。一个示例方程式可以是:

max(a * b + 100, a / b - 200)

可以eval()使用Python来简单地做到这一点,但是众所周知,这会导致站点受损。做数学方程式评估的安全方法是什么?

如果选择使用Python本身来评估表达式,那么将存在任何限制Python的Python沙箱,因此仅用户供应商运算符和函数可用。像定义函数一样,应完全禁用成熟的Python。子流程还可以(请参见PyPy沙箱)。特别是,对于用于利用内存和CPU使用率的循环和其他漏洞,应该将其关闭。


问题答案:

在没有第三方软件包的情况下,使用Python进行此操作相对容易。

  • 使用compile()以制备单一行Python表达是用于字节码eval()

  • 不通过运行字节码eval(),而是在自定义操作码循环中运行它,而仅实现您真正需要的操作码。例如,没有内置插件,没有属性访问权限,因此无法逃避沙箱。

但是,还有一些陷阱,例如为CPU耗尽和内存耗尽做准备,它们并不是特定于此方法的,在其他方法上也存在问题。

这是有关该主题的完整博客文章。这是一个相关要点。下面是缩短的示例代码。

""""

    The orignal author: Alexer / #python.fi

"""

import opcode
import dis
import sys
import multiprocessing
import time

# Python 3 required
assert sys.version_info[0] == 3, "No country for old snakes"


class UnknownSymbol(Exception):
    """ There was a function or constant in the expression we don't support. """


class BadValue(Exception):
    """ The user tried to input dangerously big value. """

    MAX_ALLOWED_VALUE = 2**63


class BadCompilingInput(Exception):
    """ The user tried to input something which might cause compiler to slow down. """


def disassemble(co):
    """ Loop through Python bytecode and match instructions  with our internal opcodes.

    :param co: Python code object
    """
    code = co.co_code
    n = len(code)
    i = 0
    extended_arg = 0
    result = []
    while i < n:
        op = code[i]

        curi = i
        i = i+1
        if op >= dis.HAVE_ARGUMENT:
            # Python 2
            # oparg = ord(code[i]) + ord(code[i+1])*256 + extended_arg
            oparg = code[i] + code[i+1] * 256 + extended_arg
            extended_arg = 0
            i = i+2
            if op == dis.EXTENDED_ARG:
                # Python 2
                #extended_arg = oparg*65536L
                extended_arg = oparg*65536
        else:
            oparg = None

        # print(opcode.opname[op])

        opv = globals()[opcode.opname[op].replace('+', '_')](co, curi, i, op, oparg)

        result.append(opv)

    return result

# For the opcodes see dis.py
# (Copy-paste)
# https://docs.python.org/2/library/dis.html

class Opcode:
    """ Base class for out internal opcodes. """
    args = 0
    pops = 0
    pushes = 0
    def __init__(self, co, i, nexti, op, oparg):
        self.co = co
        self.i = i
        self.nexti = nexti
        self.op = op
        self.oparg = oparg

    def get_pops(self):
        return self.pops

    def get_pushes(self):
        return self.pushes

    def touch_value(self, stack, frame):
        assert self.pushes == 0
        for i in range(self.pops):
            stack.pop()


class OpcodeArg(Opcode):
    args = 1


class OpcodeConst(OpcodeArg):
    def get_arg(self):
        return self.co.co_consts[self.oparg]


class OpcodeName(OpcodeArg):
    def get_arg(self):
        return self.co.co_names[self.oparg]


class POP_TOP(Opcode):
    """Removes the top-of-stack (TOS) item."""
    pops = 1
    def touch_value(self, stack, frame):
        stack.pop()


class DUP_TOP(Opcode):
    """Duplicates the reference on top of the stack."""
    # XXX: +-1
    pops = 1
    pushes = 2
    def touch_value(self, stack, frame):
        stack[-1:] = 2 * stack[-1:]


class ROT_TWO(Opcode):
    """Swaps the two top-most stack items."""
    pops = 2
    pushes = 2
    def touch_value(self, stack, frame):
        stack[-2:] = stack[-2:][::-1]


class ROT_THREE(Opcode):
    """Lifts second and third stack item one position up, moves top down to position three."""
    pops = 3
    pushes = 3
    direct = True
    def touch_value(self, stack, frame):
        v3, v2, v1 = stack[-3:]
        stack[-3:] = [v1, v3, v2]


class ROT_FOUR(Opcode):
    """Lifts second, third and forth stack item one position up, moves top down to position four."""
    pops = 4
    pushes = 4
    direct = True
    def touch_value(self, stack, frame):
        v4, v3, v2, v1 = stack[-3:]
        stack[-3:] = [v1, v4, v3, v2]


class UNARY(Opcode):
    """Unary Operations take the top of the stack, apply the operation, and push the result back on the stack."""
    pops = 1
    pushes = 1


class UNARY_POSITIVE(UNARY):
    """Implements TOS = +TOS."""
    def touch_value(self, stack, frame):
        stack[-1] = +stack[-1]


class UNARY_NEGATIVE(UNARY):
    """Implements TOS = -TOS."""
    def touch_value(self, stack, frame):
        stack[-1] = -stack[-1]


class BINARY(Opcode):
    """Binary operations remove the top of the stack (TOS) and the second top-most stack item (TOS1) from the stack. They perform the operation, and put the result back on the stack."""
    pops = 2
    pushes = 1


class BINARY_POWER(BINARY):
    """Implements TOS = TOS1 ** TOS."""
    def touch_value(self, stack, frame):
        TOS1, TOS = stack[-2:]
        print(TOS1, TOS)
        if abs(TOS1) > BadValue.MAX_ALLOWED_VALUE or abs(TOS) > BadValue.MAX_ALLOWED_VALUE:
            raise BadValue("The value for exponent was too big")

        stack[-2:] = [TOS1 ** TOS]


class BINARY_MULTIPLY(BINARY):
    """Implements TOS = TOS1 * TOS."""
    def touch_value(self, stack, frame):
        TOS1, TOS = stack[-2:]
        stack[-2:] = [TOS1 * TOS]


class BINARY_DIVIDE(BINARY):
    """Implements TOS = TOS1 / TOS when from __future__ import division is not in effect."""
    def touch_value(self, stack, frame):
        TOS1, TOS = stack[-2:]
        stack[-2:] = [TOS1 / TOS]


class BINARY_MODULO(BINARY):
    """Implements TOS = TOS1 % TOS."""
    def touch_value(self, stack, frame):
        TOS1, TOS = stack[-2:]
        stack[-2:] = [TOS1 % TOS]


class BINARY_ADD(BINARY):
    """Implements TOS = TOS1 + TOS."""
    def touch_value(self, stack, frame):
        TOS1, TOS = stack[-2:]
        stack[-2:] = [TOS1 + TOS]


class BINARY_SUBTRACT(BINARY):
    """Implements TOS = TOS1 - TOS."""
    def touch_value(self, stack, frame):
        TOS1, TOS = stack[-2:]
        stack[-2:] = [TOS1 - TOS]


class BINARY_FLOOR_DIVIDE(BINARY):
    """Implements TOS = TOS1 // TOS."""
    def touch_value(self, stack, frame):
        TOS1, TOS = stack[-2:]
        stack[-2:] = [TOS1 // TOS]


class BINARY_TRUE_DIVIDE(BINARY):
    """Implements TOS = TOS1 / TOS when from __future__ import division is in effect."""
    def touch_value(self, stack, frame):
        TOS1, TOS = stack[-2:]
        stack[-2:] = [TOS1 / TOS]


class BINARY_LSHIFT(BINARY):
    """Implements TOS = TOS1 << TOS."""
    def touch_value(self, stack, frame):
        TOS1, TOS = stack[-2:]
        stack[-2:] = [TOS1 << TOS]


class BINARY_RSHIFT(BINARY):
    """Implements TOS = TOS1 >> TOS."""
    def touch_value(self, stack, frame):
        TOS1, TOS = stack[-2:]
        stack[-2:] = [TOS1 >> TOS]


class BINARY_AND(BINARY):
    """Implements TOS = TOS1 & TOS."""
    def touch_value(self, stack, frame):
        TOS1, TOS = stack[-2:]
        stack[-2:] = [TOS1 & TOS]


class BINARY_XOR(BINARY):
    """Implements TOS = TOS1 ^ TOS."""
    def touch_value(self, stack, frame):
        TOS1, TOS = stack[-2:]
        stack[-2:] = [TOS1 ^ TOS]


class BINARY_OR(BINARY):
    """Implements TOS = TOS1 | TOS."""
    def touch_value(self, stack, frame):
        TOS1, TOS = stack[-2:]
        stack[-2:] = [TOS1 | TOS]


class RETURN_VALUE(Opcode):
    """Returns with TOS to the caller of the function."""
    pops = 1
    final = True
    def touch_value(self, stack, frame):
        value = stack.pop()
        return value


class LOAD_CONST(OpcodeConst):
    """Pushes co_consts[consti] onto the stack.""" # consti
    pushes = 1
    def touch_value(self, stack, frame):
        # XXX moo: Validate type
        value = self.get_arg()
        assert isinstance(value, (int, float))
        stack.append(value)


class LOAD_NAME(OpcodeName):
    """Pushes the value associated with co_names[namei] onto the stack.""" # namei
    pushes = 1
    def touch_value(self, stack, frame):
        # XXX moo: Get name from dict of valid variables/functions
        name = self.get_arg()
        if name not in frame:
            raise UnknownSymbol("Does not know symbol {}".format(name))
        stack.append(frame[name])


class CALL_FUNCTION(OpcodeArg):
    """Calls a function. The low byte of argc indicates the number of positional parameters, the high byte the number of keyword parameters. On the stack, the opcode finds the keyword parameters first. For each keyword argument, the value is on top of the key. Below the keyword parameters, the positional parameters are on the stack, with the right-most parameter on top. Below the parameters, the function object to call is on the stack. Pops all function arguments, and the function itself off the stack, and pushes the return value.""" # argc
    pops = None
    pushes = 1

    def get_pops(self):
        args = self.oparg & 0xff
        kwargs = (self.oparg >> 8) & 0xff
        return 1 + args + 2 * kwargs

    def touch_value(self, stack, frame):
        argc = self.oparg & 0xff
        kwargc = (self.oparg >> 8) & 0xff
        assert kwargc == 0
        if argc > 0:
            args = stack[-argc:]
            stack[:] = stack[:-argc]
        else:
            args = []
        func = stack.pop()

        assert func in frame.values(), "Uh-oh somebody injected bad function. This does not happen."

        result = func(*args)
        stack.append(result)


def check_for_pow(expr):
    """ Python evaluates power operator during the compile time if its on constants.

    You can do CPU / memory burning attack with ``2**999999999999999999999**9999999999999``.
    We mainly care about memory now, as we catch timeoutting in any case.

    We just disable pow and do not care about it.
    """
    if "**" in expr:
        raise BadCompilingInput("Power operation is not allowed")


def _safe_eval(expr, functions_and_constants={}, check_compiling_input=True):
    """ Evaluate a Pythonic math expression and return the output as a string.

    The expr is limited to 1024 characters / 1024 operations
    to prevent CPU burning or memory stealing.

    :param functions_and_constants: Supplied "built-in" data for evaluation
    """

    # Some safety checks
    assert len(expr) < 1024

    # Check for potential bad compiler input
    if check_compiling_input:
        check_for_pow(expr)

    # Compile Python source code to Python code for eval()
    code = compile(expr, '', 'eval')

    # Dissect bytecode back to Python opcodes
    ops = disassemble(code)
    assert len(ops) < 1024

    stack = []
    for op in ops:
        value = op.touch_value(stack, functions_and_constants)

    return value


 类似资料:
  • 问题内容: 我正在编写一个程序,其中将方程式作为字符串输入,然后求值。到目前为止,我已经提出了: 我既需要此方程式的字符串版本,也需要评估的版本。但是,这是非常危险的功能。但是,使用是行不通的,因为这是一个方程式。是否存在一个Python函数,该函数将评估字符串中的数学表达式,就像输入数字一样? 问题答案: 一种方法是使用numexpr。它主要是用于优化(和多线程)numpy操作的模块,但它也可以

  • 问题内容: 它们各自的答案使我思考如何有效地解析(或多或少受信任的)用户给出的单个数学表达(按照该答案的大致术语)来自数据库的20k到30k的输入值。我实施了一个快速而肮脏的基准测试,因此可以比较不同的解决方案。 #解决方案#1:评估[是的,完全不安全] #解决方案#2a:sympy-evalf( http://www.sympy.org) #解决方案#2b:sympy-lambdify( htt

  • 问题内容: 什么是实现将采用字符串并根据运算符优先级输出结果的python程序的最佳方法(例如:“ 4 + 3 * 5”将输出19)。我在谷歌上寻找解决这个问题的方法,但是它们都太复杂了,我正在寻找一个(相对)简单的方法。 澄清:我需要比eval()稍微先进的东西-我希望能够添加其他运算符(例如,最大运算符-4 $ 2 = 4),或者,我对此在学术上比对专业更感兴趣-我想知道 该怎么 做。 问题答

  • 问题内容: 我需要在Javascript中评估用户输入的算术表达式,例如“ 2 *(3 + 4)”,但出于安全原因,我不想使用它。 我可以去除所有不是数字或运算符的字符,但是我不确定这是否安全,如果用户可以使用,,等功能,那会很好。 是否有进行算术表达式评估的Javascript库? 问题答案: 您可以尝试使用JavaScript Expression Evaluator: 该库是Raphael

  • 问题内容: 是否有适用于Python的数学表达式解析器+计算器? 我不是第一个提出这个问题的人,但是答案通常指向。例如,可以这样做: 但这并不安全: 另外,用于解析和评估数学表达式对我来说似乎是错误的。 我找到了PyMathParser,但它也可以在后台使用,并且没有更好的效果: 是否有可用的库可以在不使用Python解析器的情况下解析和评估数学表达式? 问题答案: 查看Paul McGuire的

  • 问题内容: 可以安全地假设函数参数在Python中是从左到右求值的吗? 参考文献指出,这种情况会发生,但是也许有某种方法可以更改此顺序,这可能会破坏我的代码。 我想做的是为函数调用添加时间戳: 我知道我可以按顺序评估参数: 但是它看起来不太优雅,因此如果可以依靠它,我宁愿采用第一种方法。 问题答案: 是的,Python始终从左到右评估函数参数。 据我所知,这适用于任何逗号分隔的列表: