问题：

“中位数”算法的Python实现

谢宸

2023-03-14

我已经用python编写了medians算法的median的实现，但是它似乎没有输出正确的结果，而且对我来说它似乎也没有线性复杂度，知道我哪里出错了吗？

def select(L):
    if len(L) < 10:
        L.sort()
        return L[int(len(L)/2)]
    S = []
    lIndex = 0
    while lIndex+5 < len(L)-1:
        S.append(L[lIndex:lIndex+5])
        lIndex += 5
    S.append(L[lIndex:])
    Meds = []
    for subList in S:
        print(subList)
    Meds.append(select(subList))
    L2 = select(Meds)
    L1 = L3 = []
    for i in L:
        if i < L2:
            L1.append(i)
        if i > L2:
            L3.append(i)
    if len(L) < len(L1):
        return select(L1)
    elif len(L) > len(L1) + 1:
        return select(L3)
    else:
        return L2

这个函数是这样调用的:

L = list(range(100))
shuffle(L)
print(select(L))

乐:不好意思。GetMed是一个简单地对列表排序并返回len(list)处的元素的函数，它应该在那里被选择，我现在修复了它，但我仍然得到错误的输出。至于缩进，代码工作没有错误，我看不出有什么问题:-？？

LE2：我期望50（对于当前L），它给我30到70的输出，不多不少（还没有）

LE3：非常感谢，这就成功了。现在它成功了。但是我很困惑，我试图在这个方法和简单的方法之间进行比较，我只是对数组排序并输出结果。现在，从我到目前为止读到的内容来看，select方法的时间复杂度应该是O（n）确定性选择。虽然我可能无法与python开发人员的优化相比，但我确实期望得到比我更接近的结果，例如，如果我将列表的范围更改为10000000，select将在84.1083711625552秒内输出结果，而排序和返回方法将在18.92556029528825秒内完成。有哪些好方法可以使该算法更快？

共有2个答案

越信鸥

2023-03-14

下面是我的PYTHON实现。为了获得更快的速度，您可能需要使用PYPY。

对于您关于 SPEED 的问题：每列 5 个数字的理论速度是 ~10N，因此我每列使用 15 个数字，对于 ~5N 时的 2 倍速度，而最佳速度是 ~4N。但是，关于最先进解决方案的最佳速度，我可能是错的。在我自己的测试中，我的程序运行速度略快于使用 sort（）的程序。当然，您的里程可能会有所不同。

假设python程序是“median.py”，运行它的例子是“python ./median.py 100”。对于速度基准测试，您可能需要注释掉验证代码，并使用PYPY。

#!/bin/python
#
# TH @stackoverflow, 2016-01-20, linear time "median of medians" algorithm
#
import sys, random


items_per_column = 15


def find_i_th_smallest( A, i ):
    t = len(A)
    if(t <= items_per_column):
        # if A is a small list with less than items_per_column items, then:
        #     1. do sort on A
        #     2. return the i-th smallest item of A
        #
        return sorted(A)[i]
    else:
        # 1. partition A into columns of items_per_column items each. items_per_column is odd, say 15.
        # 2. find the median of every column
        # 3. put all medians in a new list, say, B
        #
        B = [ find_i_th_smallest(k, (len(k) - 1)/2) for k in [A[j:(j + items_per_column)] for j in range(0,len(A),items_per_column)]]

        # 4. find M, the median of B
        #
        M = find_i_th_smallest(B, (len(B) - 1)/2)

        # 5. split A into 3 parts by M, { < M }, { == M }, and { > M }
        # 6. find which above set has A's i-th smallest, recursively.
        #
        P1 = [ j for j in A if j < M ]
        if(i < len(P1)):
            return find_i_th_smallest( P1, i)
        P3 = [ j for j in A if j > M ]
        L3 = len(P3)
        if(i < (t - L3)):
            return M
        return find_i_th_smallest( P3, i - (t - L3))


# How many numbers should be randomly generated for testing?
#
number_of_numbers = int(sys.argv[1])


# create a list of random positive integers
#
L = [ random.randint(0, number_of_numbers) for i in range(0, number_of_numbers) ]


# Show the original list
#
print L


# This is for validation
#
print sorted(L)[int((len(L) - 1)/2)]


# This is the result of the "median of medians" function.
# Its result should be the same as the validation.
#
print find_i_th_smallest( L, (len(L) - 1) / 2)

王念

2023-03-14

1)您的代码缩进是错误的，试试这个:

def select(L):
    if len(L) < 10:
        L.sort()
        return L[int(len(L)/2)]
    S = []
    lIndex = 0
    while lIndex+5 < len(L)-1:
        S.append(L[lIndex:lIndex+5])
        lIndex += 5
    S.append(L[lIndex:])
    Meds = []
    for subList in S:
        print(subList)
        Meds.append(select(subList))
    L2 = select(Meds)
    L1 = L3 = []
    for i in L:
        if i < L2:
            L1.append(i)
        if i > L2:
            L3.append(i)
    if len(L) < len(L1):
        return select(L1)
    elif len(L) > len(L1) + 1:
        return select(L3)
    else:
        return L2

2）您使用的方法不返回中位数，它只是返回一个与中位数相差不远的数字。要获得中位数，您需要计算有多少个数字大于伪中位数，如果多数大于伪中位数，则重复该算法，将数字大于伪中位数，否则重复其他数字。

def select(L, j):
    if len(L) < 10:
        L.sort()
        return L[j]
    S = []
    lIndex = 0
    while lIndex+5 < len(L)-1:
        S.append(L[lIndex:lIndex+5])
        lIndex += 5
    S.append(L[lIndex:])
    Meds = []
    for subList in S:
        Meds.append(select(subList, int((len(subList)-1)/2)))
    med = select(Meds, int((len(Meds)-1)/2))
    L1 = []
    L2 = []
    L3 = []
    for i in L:
        if i < med:
            L1.append(i)
        elif i > med:
            L3.append(i)
        else:
            L2.append(i)
    if j < len(L1):
        return select(L1, j)
    elif j < len(L2) + len(L1):
        return L2[0]
    else:
        return select(L3, j-len(L1)-len(L2))

警告：L=M = []不是L = []和M = []

类似资料：

python移位运算的实现

本文向大家介绍python移位运算的实现，包括了python移位运算的实现的使用技巧和注意事项，需要的朋友参考一下密码算法程序设计实践选的SHA-1。在写的过程中遇到一丢丢关于python移位的问题，记录一下。 SHA-1其中第一步需要填充消息。简单阐述一下sha1填充消息的过程：如输入消息“123”，先转成ascii码——313233，消息长度为3*8=24。即00110001 0011
python实现数独算法实例

本文向大家介绍python实现数独算法实例，包括了python实现数独算法实例的使用技巧和注意事项，需要的朋友参考一下本文实例讲述了python实现数独算法的方法。分享给大家供大家参考。具体如下：希望本文所述对大家的Python程序设计有所帮助。
python 换位密码算法的实例详解

本文向大家介绍python 换位密码算法的实例详解，包括了python 换位密码算法的实例详解的使用技巧和注意事项，需要的朋友参考一下 python 换位密码算法的实例详解一前言：换位密码基本原理：先把明文按照固定长度进行分组，然后对每一组的字符进行换位操作，从而实现加密。例如，字符串“Error should never pass silently”，使用秘钥1432进行加密时，首先将字符
数学 - 位运算实现除法[M]

029. Divide Two Integers[M] 问题 Divide two integers without using multiplication, division and mod operator. If it is overflow, return MAX_INT. 思路这道题难在不能使用乘除取余操作，所以我们只能手动的实现除法，我们来看看如何实现。我们假设被除数为D，除数为
算法python实现

线性回归python实现 1.算法python代码包含Normal Equations，批量梯度下降和随机梯度下降，这里的代码跟Logistic回归的代码类似 # -*- coding: utf-8 -*- import matplotlib.pyplot as plt import numpy as np class LinearRegression(object): def _
算法python实现

Logistic回归python实现 1.算法python代码 # -*- coding: utf-8 -*- import matplotlib.pyplot as plt import numpy as np class Logistic(object): def __init__(self): self._history_w = [] self.

“中位数”算法的Python实现

共有2个答案

相关问答

相关文章

相关阅读

相关工具

相关文档