首页 > 试题广场 >

数据流中的中位数

[编程题]数据流中的中位数
  • 热度指数:419957 时间限制:C/C++ 1秒,其他语言2秒 空间限制:C/C++ 256M,其他语言512M
  • 算法知识视频讲解
如何得到一个数据流中的中位数?如果从数据流中读出奇数个数值,那么中位数就是所有数值排序之后位于中间的数值。如果从数据流中读出偶数个数值,那么中位数就是所有数值排序之后中间两个数的平均值。我们使用Insert()方法读取数据流,使用GetMedian()方法获取当前读取数据的中位数。

数据范围:数据流中数个数满足 ,大小满足

进阶: 空间复杂度 , 时间复杂度
示例1

输入

[5,2,3,4,1,6,7,0,8]

输出

"5.00 3.50 3.00 3.50 3.00 3.50 4.00 3.50 4.00 "

说明

数据流里面不断吐出的是5,2,3...,则得到的平均数分别为5,(5+2)/2,3...   
示例2

输入

[1,1,1]

输出

"1.00 1.00 1.00 "
希尔排序
class Solution:
    def __init__(self):
        self.arr = []
    def Insert(self, num):
        self.arr.append(num)
#         self.arr.sort()
        n = len(self.arr)
        gap = n//2
        while gap > 0:
            for i in range(gap, n):
                while self.arr[i-gap] > self.arr[i] and i>0:
                    self.arr[i-gap], self.arr[i] = self.arr[i], self.arr[i-gap]
                    i -= gap
            gap = gap//2
    def GetMedian(self):
        n = len(self.arr)
        if n%2 == 1:
            return self.arr[n//2]
        else:
            return (self.arr[n//2] + self.arr[n//2 - 1])/2


发表于 2021-06-10 11:59:59 回复(0)
Python /是浮点除法(最好加上小数点),//是整数除法。
# -*- coding:utf-8 -*-
class Solution:
    def __init__(self):
        self.s = []
    def Insert(self, num):
        # write code here
        self.s.append(num)
    def GetMedian(self):
        # write code here
        self.s.sort()
        idx = len(self.s)//2
        if len(self.s)%2==1:
            return self.s[idx]
        else: #0123
            return (self.s[idx-1]+self.s[idx])/2.0


发表于 2021-04-18 03:14:52 回复(0)
暴力直接的方法:生成列表不断读入数据流,排序找中位数
class Solution:
    def __init__(self):
        self.dflow=[]
    def Insert(self, num):
        # write code here
        self.dflow.append(num)
        
    def GetMedian(self,data):
        # write code here
        self.Insert(data)
        if len(self.dflow)==0:
            return None
        else:
            sort_dflow=sorted(self.dflow)
            if len(sort_dflow)%2==0:
                med=sum(sort_dflow[len(sort_dflow)//2-1:len(sort_dflow)//2+1])/2.0
            else:
                med=sort_dflow[len(sort_dflow)//2]
            return med


发表于 2020-08-13 21:44:02 回复(0)
# -*- coding:utf-8 -*-
class Solution:
    def __init__(self):
        self.array = []
    def Insert(self, num):
        # write code here
        self.array.append(num)
        self.array.sort()
    def GetMedian(self,array):
        # write code here
        length = len(self.array)
        if (length % 2 == 1):
            return self.array[length // 2]
        return (self.array[length // 2]+self.array[length // 2 -1]) / 2.0
最后一行,如果换成2,就会报错。。。猜测可能这是Python2.7的编译器,如果写2的话,会直接取整,省略小数点后面的数。
发表于 2020-07-27 19:44:34 回复(0)
思路:

最大堆保留较小的一半,最小堆保留较大的一半。最后得到的中位数就一定是最大堆的堆顶+最小堆的堆顶或者是最大堆的堆顶或者是最小堆的堆顶。

具体操作是先把数据读入最大堆。然后把堆顶元素放到最小堆中,判定如果最大堆与最小堆的差距为1的时候,把最小堆堆顶(最小值)放入最大堆。这样周而复始。,最终就得到了最大堆保留较小的一半,最小堆保留最大的一半。

在python中只有最小堆,所以最大堆要上负号,最小堆也要上负号(负负为正)。

python中除法这里要写2.0不然会被整除法为整数。(python2是这样,python3就是得到小数)



代码是抄的某位大佬的。他提供了一个参考leetcode的参考链接:https://www.nowcoder.com/questionTerminal/9be0172896bd43948f8a32fb954e1be1?answerType=1&f=discussion

import heapq
class Solution:
    def __init__(self):
        self.count=0
        self.max_heap=[]
        self.min_heap=[]
    def Insert(self, num):
        # write code here
        self.count+=1
        heapq.heappush(self.max_heap,-num)
        max_heap_top=heapq.heappop(self.max_heap)
        heapq.heappush(self.min_heap,-max_heap_top)
        if self.count&1:
            min_heap_top=heapq.heappop(self.min_heap)
            heapq.heappush(self.max_heap,-min_heap_top)
        
    def GetMedian(self,s):
        # write code here
        if self.count&1:
            return -self.max_heap[0]
        else:
            return (self.min_heap[0]-self.max_heap[0])/2.0
        


编辑于 2020-06-29 11:54:12 回复(0)

python用的自己实现的最大和最小堆的class,GetMedian需要加个参数,否则python版会报错。

# -*- coding:utf-8 -*-
# 最小堆
class minheap:
    def __init__(self):
        self.minheap = []

    def __len__(self):
        return len(self.minheap)

    def insert(self, elem):
        self.minheap.append(elem)
        l = len(self.minheap) - 1
        while l != 0 and self.minheap[int((l - 1) / 2)] > self.minheap[l]:
            self.minheap[int((l - 1) / 2)], self.minheap[l] = self.minheap[l], self.minheap[int((l - 1) / 2)]
            l = int((l - 1) / 2)

    def pophead(self):
        self.minheap[0] = self.minheap[-1]
        self.minheap = self.minheap[:-1]
        i, l = 0, len(self.minheap) - 1
        while 1:
            if 2 * i + 1 > l:  # 没有子节点
                break
            elif 2 * i + 2 > l:  # 只有左节点
                if self.minheap[2 * i + 1] < self.minheap[i]:
                    self.minheap[2 * i + 1], self.minheap[i] = self.minheap[i], self.minheap[2 * i + 1]
                    i = 2 * i + 1
                else:
                    break
            else:  # 儿女双全
                if self.minheap[2 * i + 1] < self.minheap[i] > self.minheap[2 * i + 2]:  # 比两个孩子都大
                    if self.minheap[2 * i + 1] < self.minheap[2 * i + 2]:  # 左孩子小于右孩子
                        self.minheap[2 * i + 1], self.minheap[i] = self.minheap[i], self.minheap[2 * i + 1]
                        i = 2 * i + 1
                    else:
                        self.minheap[2 * i + 2], self.minheap[i] = self.minheap[i], self.minheap[2 * i + 2]
                        i = 2 * i + 2
                elif self.minheap[2 * i + 1] < self.minheap[i] < self.minheap[2 * i + 2]:
                    self.minheap[2 * i + 1], self.minheap[i] = self.minheap[i], self.minheap[2 * i + 1]
                    i = 2 * i + 1
                elif self.minheap[2 * i + 1] > self.minheap[i] > self.minheap[2 * i + 2]:
                    self.minheap[2 * i + 2], self.minheap[i] = self.minheap[i], self.minheap[2 * i + 2]
                    i = 2 * i + 2
                else:
                    break

    def head(self):
        return self.minheap[0]


# 最大堆
class maxheap:
    def __init__(self):
        self.maxheap = []

    def __len__(self):
        return len(self.maxheap)

    def insert(self, elem):
        self.maxheap.append(elem)
        l = len(self.maxheap) - 1
        while l != 0 and self.maxheap[int((l - 1) / 2)] < self.maxheap[l]:
            self.maxheap[int((l - 1) / 2)], self.maxheap[l] = self.maxheap[l], self.maxheap[int((l - 1) / 2)]
            l = int((l - 1) / 2)

    def pophead(self):
        self.maxheap[0] = self.maxheap[-1]
        self.maxheap = self.maxheap[:-1]
        i, l = 0, len(self.maxheap) - 1
        while 1:
            if 2 * i + 1 > l:  # 没有子节点
                break
            elif 2 * i + 2 > l:  # 只有左节点
                if self.maxheap[2 * i + 1] > self.maxheap[i]:
                    self.maxheap[2 * i + 1], self.maxheap[i] = self.maxheap[i], self.maxheap[2 * i + 1]
                    i = 2 * i + 1
                else:
                    break
            else:  # 儿女双全
                if self.maxheap[2 * i + 1] > self.maxheap[i] < self.maxheap[2 * i + 2]:  # 比两个孩子都大
                    if self.maxheap[2 * i + 1] > self.maxheap[2 * i + 2]:  # 左孩子小于右孩子
                        self.maxheap[2 * i + 1], self.maxheap[i] = self.maxheap[i], self.maxheap[2 * i + 1]
                        i = 2 * i + 1
                    else:
                        self.maxheap[2 * i + 2], self.maxheap[i] = self.maxheap[i], self.maxheap[2 * i + 2]
                        i = 2 * i + 2
                elif self.maxheap[2 * i + 1] > self.maxheap[i] > self.maxheap[2 * i + 2]:
                    self.maxheap[2 * i + 1], self.maxheap[i] = self.maxheap[i], self.maxheap[2 * i + 1]
                    i = 2 * i + 1
                elif self.maxheap[2 * i + 1] < self.maxheap[i] < self.maxheap[2 * i + 2]:
                    self.maxheap[2 * i + 2], self.maxheap[i] = self.maxheap[i], self.maxheap[2 * i + 2]
                    i = 2 * i + 2
                else:
                    break

    def head(self):
        return self.maxheap[0]
class Solution:
    def __init__(self):
        self.minH = minheap()
        self.maxH = maxheap()
    def Insert(self, num):
        self.maxH.insert(num)
        if len(self.maxH) > len(self.minH) + 1:
            self.minH.insert(self.maxH.head())
            self.maxH.pophead()
        if len(self.minH) > 0 and self.maxH.head() > self.minH.head():
            self.maxH.insert(self.minH.head())
            self.minH.insert(self.maxH.head())
            self.maxH.pophead()
            self.minH.pophead()

    def GetMedian(self,n = None):
        if len(self.maxH) is len(self.minH):
            return float(self.maxH.head() + self.minH.head())/2
        elif len(self.maxH) is len(self.minH) + 1:
            return self.maxH.head()

发表于 2020-06-09 02:24:55 回复(0)

Python Solution

# -*- coding:utf-8 -*-
class Solution:
    def __init__(self):
        self.arr=[]
    def Insert(self, num):
        self.arr.append(num)
        self.arr.sort()
    def GetMedian(self,p):
        n = len(self.arr)
        return self.arr[int(n/2)] if n&1==1 else (self.arr[int(n/2)]+self.arr[int(n/2-1)])/2.0



编辑于 2020-06-05 18:11:14 回复(0)
class Solution:
    array = list()

    # 按照降序排序插入
    def Insert(self, num):
        if len(self.array) == 0:
            self.array.append(num)
            return
        for i in range(len(self.array)):
            if self.array[i] >= num:
                self.array.insert(i, num)
                return
        self.array.append(num)

    def GetMedian(self, n=None):
        m = len(self.array)
        if m == 1:
            return self.array[0]
        if m % 2 == 0:
            median = int(m / 2)
            return (self.array[median - 1] + self.array[median]) / 2.00
        else:
            return self.array[int(m / 2)] / 1.00

个人感觉最不费脑子的方法😂
发表于 2020-06-04 14:19:12 回复(0)
# -*- coding:utf-8 -*-
class Solution:
    def __init__(self):
        self.data=[]
    def Insert(self, num):
        if self.data==[]:
            self.data.append(num)
        else:
            last=self.data[-1]
            if num>=last:
                self.data.append(num)
            else:
                for i in range(len(self.data)):
                    if num>=self.data[i]:
                        pass
                    else:
                        self.data.insert(i,num)
                        break
        # write code here
    def GetMedian(self,*kw,**kww):
        lenn=len(self.data)
        if lenn<=0:
            return None
        if lenn==1:
            return self.data[0]
        if lenn%2==1:
            return self.data[(lenn-1)//2]
        if lenn%2==0:
            return (self.data[lenn//2]+self.data[(lenn//2)-1])/2.0
        # write code here

发表于 2020-05-28 16:06:05 回复(0)
# -*- coding:utf-8 -*-
"""
有很多种解法
这里采用的是排序的数组取中位数
Insert O(n)
get O(1)
"""

from bisect import bisect_left


class Solution:
    def __init__(self):
        self.nums = []

    def Insert(self, num):
        index = bisect_left(self.nums,num)
        self.nums.insert(index,num)

    def GetMedian(self,data):
        n = len(self.nums)
        if n%2 == 0:
            return (self.nums[int((n-1)//2)] + self.nums[int(n//2)])/2.0
        else:
            return self.nums[int(n/2)]

#5,2,3,4,1,6,7,0,8
s = Solution()
s.Insert(5)
print(s.GetMedian())
s.Insert(2)
print(s.GetMedian())
s.Insert(3)
print(s.GetMedian())
s.Insert(4)
print(s.GetMedian())
编辑于 2020-03-01 14:24:53 回复(0)
# -*- coding:utf-8 -*-
class Solution:
    def __init__(self):
        self.list1=[]
    def GetMedian(self,data):
        # write code here
        length=len(self.list1)
        if length==1:
            return self.list1[0]
        if length%2!=0:
            return self.list1[(length+1)/2-1]
        else:
            return (self.list1[(len(self.list1))/2-1]+self.list1[(len(self.list1))/2])/2.0
    def Insert(self, num):
        # write code here
        self.list1.append(num)
        self.list1.sort()
        
    
该题的难点就是在于求浮点数,不能除以2,而是应该除以2.0.另外,geimedian方法是单独的一个函数,而不是放在insert中的一个方法
发表于 2020-02-06 16:27:22 回复(0)
# -*- coding:utf-8 -*-
class Solution:
    def __init__(self):
        self.data = []

    def Insert(self, num):
        # write code here
        self.data.append(num)
        self.data.sort()

    def GetMedian(self, data):
        # write code here
        if len(self.data) % 2 == 1:
            return self.data[len(self.data) // 2]
        else:
            return (self.data[len(self.data) // 2] + self.data[len(self.data) // 2 - 1]) / 2.0

发表于 2020-01-19 18:58:19 回复(0)
初始化一个大根堆和一个小根堆,小根堆里的所有元素大于大根堆里的所有元素,这样只要在总数为奇数时取大根堆顶的元素即为中位数,偶数时取两个根顶元素的平均值。
当元素总数为奇数时,把该元素放入大根堆,偶数时放入小根堆。为了满足小根堆里的所有元素大于大根堆里的所有元素,先放入另一堆,再取出堆顶元素放入应该放入的堆中。
# -*- coding:utf-8 -*-
import heapq
class Solution:
    def __init__(self):
        self.small=[]
        self.large=[]
        self.c=0
    def Insert(self, num):
        self.c+=1
        if self.c%2:
            heapq.heappush(self.large,-heapq.heappushpop(self.small,num))
        else:
            heapq.heappush(self.small,-heapq.heappushpop(self.large,-num))
    def GetMedian(self,x=None):
        if self.c%2:
            return -self.large[0]
        else:
            return (-self.large[0]+self.small[0])/2.0


编辑于 2019-12-14 00:00:11 回复(0)
代码写的比较详细,思路同剑指offer上的思想
然后唯一比较绕的就是heap库函数本身只能实现最小堆,但是我们可以通过将num变成负数,那么全是负数的最小堆实际上就是最大堆了,不清楚的你细品细细品^_^
import heapq
class Solution:
    def __init__(self):
        #数组右侧, 最小堆
        self.small = []
        #数组左侧, 最大堆(值取反后插入最小堆可以实现)
        self.large = []
        self.length = 0
        self.odd = True
        
    def Insert(self, num):
        #奇数时,插入最大堆
        if self.odd:
            #如果当前数字比最小堆的堆顶元素要大, 先弹出最小堆堆顶元素, 将其插入最大堆, 再把当前元素插入最小堆即可
            if not self.small:
                heapq.heappush(self.large, -num)
            elif self.small[0]<num:
                heapq.heappush(self.large, -heapq.heappop(self.small))
                heapq.heappush(self.small, num)
            else:
                heapq.heappush(self.large, -num)
        #偶数时,插入最小堆
        else:
            #如果当前数字比最大堆的堆顶元素要小, 先弹出最大堆堆顶元素, 将其插入最小堆, 再把当前元素插入最大堆即可
            if not self.large:
                heapq.heappush(self.small, num)
            elif -self.large[0]>num:
                heapq.heappush(self.small, -heapq.heappop(self.large))
                heapq.heappush(self.large, -num)
            else:
                heapq.heappush(self.small, num)
        self.odd = not self.odd
        self.length += 1
            
    def GetMedian(self,ss=None):
        #因为奇数先插的最大堆,因此最大堆长度>=最小堆长度
        if len(self.large) > len(self.small):
            return float(-self.large[0])
        return (self.small[0] - self.large[0]) /2.0


编辑于 2019-11-14 20:14:22 回复(0)
# -*- coding:utf-8 -*-
# 两个队列,一个存大的数,一个存小的数
class Solution:
    def __init__(self):
        self.small = []
        self.big = []
    def Insert(self, num):
        # write code here
        if self.small == [] or num <= max(self.small):
            self.small.append(num)
        else:
            self.big.append(num)
        
        if len(self.small) == len(self.big) + 2:
            self.big.append(max(self.small))
            self.small.pop(self.small.index(max(self.small)))
        
        if len(self.small) + 1 == len(self.big):
            self.small.append(min(self.big))
            self.big.pop(self.big.index(min(self.big)))
    def GetMedian(self,n=None):
        # write code here
        return (max(self.small) + min(self.big))/2.0 if len(self.big) == len(self.small) else max(self.small)

发表于 2019-09-28 14:48:01 回复(0)
太累了。。。
用最大堆,最小堆分别存放中位数左边和右边的数据。主要是维持两个堆元素个数相等或相差1
class Solution:
    def __init__(self):
        self.n=0
        self.left=[]
        self.right=[]
    def Insert(self, num):
        self.n=(self.n+1)%2
        if self.n==0:
            if self.right:
                if num>=self.right[-1]:
                    self.right.append(num)
                    self.right[-1],self.right[-2]=self.right[-2],self.right[-1]
                else:
                    if num>=self.left[-1]:
                        self.right.append(num)
                    else:
                        self.right.append(self.left.pop())
                        self.left.append(num)
                        nn=len(self.left)
                        for i in range(nn-1):
                            if self.left[i]>self.left[i+1]:
                                self.left[i],self.left[i+1]=self.left[i+1],self.left[i]
            else:
                if num>=self.left[-1]:
                    self.right.append(num)
                else:
                    self.right.append(self.left.pop())
                    self.left.append(num)
                    nn=len(self.left)
                    for i in range(nn-1):
                        if self.left[i]>self.left[i+1]:
                            self.left[i],self.left[i+1]=self.left[i+1],self.left[i]
        elif self.n==1:
            if self.left:
                if num<self.left[-1]:
                    self.left.append(num)
                    self.left[-1],self.left[-2]=self.left[-2],self.left[-1]
                elif num<=self.right[-1]:
                    self.left.append(num)
                else:
                    self.left.append(self.right.pop())
                    self.right.append(num)
                    nn=len(self.right)
                    for i in range(nn-1):
                        if self.right[i]<self.right[i+1]:
                            self.right[i],self.right[i+1]=self.right[i+1],self.right[i]
            else:
                self.left.append(num)
    def GetMedian(self,nb):
        if self.n==1:
            return self.left[-1]
        else:
            return (self.right[-1]+0.0+self.left[-1])/2


发表于 2019-09-16 21:59:19 回复(0)
题目有问题。。给GetMedian加一个参数就对了。
def __init__(self):
    self.res=[]
def Insert(self, num):
    # write code here
    self.res.append(num)
def GetMedian(self,nb):
    # write code here
    self.res.sort()
    n=len(self.res)//2
    return self.res[n] if len(self.res)%2==1 else float(self.res[n]+self.res[n-1])/2


发表于 2019-08-29 21:47:42 回复(0)
# -*- coding:utf-8 -*-
class Solution:
    def __init__(self):
        self.list = []
    def Insert(self, num):
        # write code here
        self.list.append(num)
        self.list.sort()
    def GetMedian(self,num):
        # write code here
        if len(self.list)%2 == 1:
            return self.list[len(self.list)/2]
        elif len(self.list)%2 == 0:
            return (self.list[len(self.list)/2]+self.list[len(self.list)/2-1])/2.0


发表于 2019-06-27 11:39:11 回复(0)
大顶堆p,小顶堆q,使用heapq模块实现。
import heapq
# -*- coding:utf-8 -*-
class Solution:
    def __init__(self):
        self.q = []
        self.p = []
    def Insert(self, num):
        # write code here
        if not self.p or num <= -self.p[0]: heapq.heappush(self.p, -num)
        else: heapq.heappush(self.q, num)
        if len(self.p) == len(self.q) + 2: heapq.heappush(self.q, -heapq.heappop(self.p))
        elif len(self.p) + 1 == len(self.q): heapq.heappush(self.p, -heapq.heappop(self.q))
    def GetMedian(self, data):
        # write code here
        return (self.q[0] - self.p[0]) / 2.0 if len(self.q) == len(self.p) else -self.p[0]
对数据流进行二分法插入,之后返回中间值。
# -*- coding:utf-8 -*-
class Solution:
    def __init__(self):
        self.arr = []
    def Insert(self, num):
        # write code here
        i, j = 0, len(self.arr)-1
        while i <= j:
            m = (i+j) // 2
            if self.arr[m] < num: i = m + 1
            else: j = m - 1
        self.arr = self.arr[:i] + [num] + self.arr[i:]
    def GetMedian(self, data):
        # write code here
        m = len(self.arr)//2
        return (self.arr[m]+self.arr[m-1])/2.0 if not len(self.arr) % 2 else self.arr[m]
编辑于 2019-05-30 17:45:20 回复(1)
    def __init__(self):
        """
        initialize your data structure here.
        """
        self.max_heap = []
        self.min_heap = []

    def Insert(self, num):
        """
        :type num: int
        :rtype: None
        """
        # 维护两个堆,大根堆保存最小的1/2数据,小根堆保存最大的1/2数据
        # 两个堆的长度不超过1个
        # 注意: python中没有大根堆,用小根堆取负代替

        # Corner case:初始化
        if len(self.max_heap) == 0 and len(self.min_heap) == 0:
            heapq.heappush(self.min_heap, num)
            return
        # 初始化的时候必须保证小根堆的数比大根堆的树要大
        if len(self.max_heap) == 0 and len(self.min_heap) == 1:
            if num > self.min_heap[0]:
                temp = heapq.heappop(self.min_heap)
                heapq.heappush(self.max_heap, -temp)
                heapq.heappush(self.min_heap, num)
            else:
                heapq.heappush(self.max_heap, -num)
            return

        if len(self.min_heap) >= len(self.max_heap):
            if num < -self.max_heap[-1]:
                heapq.heappush(self.max_heap, -num)
            else:
                heapq.heappush(self.min_heap, num)
                temp = heapq.heappop(self.min_heap)
                heapq.heappush(self.max_heap, -temp)
        else:
            if num > self.min_heap[-1]:
                heapq.heappush(self.min_heap, num)
            else:
                heapq.heappush(self.max_heap, -num)
                temp = heapq.heappop(self.max_heap)
                heapq.heappush(self.min_heap, -temp)

    def GetMedian(self,n=None):
        # 注意取值的时候应该是heap[0]
        if len(self.min_heap) == len(self.max_heap):
            if len(self.min_heap) != 0:
                return float(self.min_heap[0] - self.max_heap[0]) / 2
            else:
                return None
        elif len(self.min_heap) > len(self.max_heap):
            return self.min_heap[0]
        else:
            return -self.max_heap[0]
发表于 2019-05-13 09:01:49 回复(0)

问题信息

难度:
41条回答 109088浏览

热门推荐

通过挑战的用户

查看代码