首页 > 试题广场 >

两个有序数组的中位数

[编程题]两个有序数组的中位数
  • 热度指数:25705 时间限制:C/C++ 1秒,其他语言2秒 空间限制:C/C++ 32M,其他语言64M
  • 算法知识视频讲解
有两个大小分别为m和n的有序数组AB。请找出这两个数组的中位数。你需要给出时间复杂度在O(log (m+n))以内的算法。
示例1

输入

[],[1]

输出

1.00000

title: LeetCode刷题笔记(链表):median-of-two-sorted-arrays
comments: true
mathjax: true
categories:

  • Leetcode
    tags:
  • Leetcode
  • List



[toc]

题目描述

There are two sorted arrays A and B of size m and n respectively. Find the median of the two sorted arrays. The overall run time complexity should be O(log (m+n)).

解题思路

要求在O(log (m+n))时间内找到中位数,所以像那些合并之后再二分查找、或者一边比较一边合并到总量一半的方法肯定是不行的。

我们可以将原问题转变成一个寻找第k小数的问题(假设两个原序列升序排列),这样中位数实际上是第(m+n)/2小的数。所以只要解决了第k小数的问题,原问题也得以解决。

首先假设数组A和B的元素个数都大于k/2,我们比较A[k/2-1]和B[k/2-1]两个元素,这两个元素分别表示A的第k/2小的元素和B的第k/2小的元素。这两个元素比较共有三种情况:>、<和=。如果A[k/2-1] < B[k/2-1],这表示A[0]到A[k/2-1]的元素都在A和B合并之后的前k小的元素中。换句话说,A[k/2-1]不可能大于两数组合并之后的第k小值,所以我们可以将其抛弃。

当A[k/2-1]=B[k/2-1]时,我们已经找到了第k小的数,也即这个相等的元素,我们将其记为m。由于在A和B中分别有k/2-1个元素小于m,所以m即是第k小的数。(这里可能有人会有疑问,如果k为奇数,则m不是中位数。这里是进行了理想化考虑,在实际代码中略有不同,是先求k/2,然后利用k-k/2获得另一个数。)

通过上面的分析,我们即可以采用递归的方式实现寻找第k小的数。此外我们还需要考虑几个边界条件:

  • 如果A或者B为空,则直接返回B[k-1]或者A[k-1];
  • 如果k为1,我们只需要返回A[0]和B[0]中的较小值;
  • 如果A[k/2-1]=B[k/2-1],返回其中一个;

C++版代码实现

class Solution {
public:

    double findKth(int A[], int m, int B[], int n, int k){
        if(m > n)
            return findKth(B, n, A, m, k);  //始终保持元素较少的数组位于前面的位置
        if(m == 0)
            return B[k-1];                  //如果位于前面的数组为空,则直接返回后面数组的第k-1个元素
        if(k == 1)
            return min(A[0], B[0]);         //如果k等于1,则返回两个数组头元素的最小值

        int pa = min(k / 2, m), pb = k - pa;
        if(A[pa-1] < B[pb-1])
            return findKth(A + pa, m - pa, B, n, k - pa);
        else if(A[pa - 1] > B[pb - 1])
            return findKth(A, m, B + pb, n - pb, k - pb);
        else
            return A[pa-1];
    }

    double findMedianSortedArrays(int A[], int m, int B[], int n) {
        int total = m + n;
        if(total & 0x1)
            return findKth(A, m, B, n, total / 2 + 1);
        else
            return (findKth(A, m, B, n, total/2)
                   + findKth(A, m, B, n, total / 2 + 1)) / 2;
    }
};

<font color="red" size=3 face="仿宋">系列教程持续发布中,欢迎订阅、关注、收藏、评论、点赞哦~~( ̄▽ ̄~)~</font>

<font color="red" size=3 face="仿宋">完的汪(∪。∪)。。。zzz</font>

发表于 2018-01-22 11:03:35 回复(3)
public double findMedianSortedArrays(int[] A, int[] B) {
		int m = A.length, n = B.length;
		// 不论总数是奇数还是偶数,以l和r为下标的两数的均值都是medium
		int l = (m + n + 1) / 2;
		int r = (m + n + 2) / 2;

		return (getkth(A, 0, B, 0, l) + getkth(A, 0, B, 0, r)) / 2.0;
	}

	private int getkth(int[] A, int aStart, int[] B, int bStart, int k) {
		if (aStart >= A.length)
			return B[bStart + k - 1];
		if (bStart >= B.length)
			return A[aStart + k - 1];
		if (k == 1)
			return Math.min(A[aStart], B[bStart]);
		int aMin = Integer.MAX_VALUE, bMin = Integer.MAX_VALUE;
		if (aStart + k / 2 - 1 < A.length)
			aMin = A[aStart + k / 2 - 1];
		if (bStart + k / 2 - 1 < B.length)
			bMin = B[bStart + k / 2 - 1];

		if (aMin < bMin)
			return getkth(A, aStart + k / 2, B, bStart, k - k / 2);
		else
			return getkth(A, aStart, B, bStart + k / 2, k - k / 2);
	}

发表于 2017-06-17 11:51:36 回复(6)
//归并数组,但是不需要完全合并,只需要合并到中位数即可
//但是,如果m+n为偶数,那么返回中间两个值的平均数
class Solution {
public:
    double findMedianSortedArrays(int A[], int m, int B[], int n) {
        int mid = (m + n) / 2+1;
        int a = 0;
        int b = 0;
        int result;  //永远比result2慢一步
        int result2=0;
        for(int i=0;i<mid;i++){  //复杂度为(m+n)/2+1,也就是m+n
            if(b>=n||(a<m  &&A[a]<B[b] )){
                result = result2;
                result2 = A[a]; //result2为奇数个数的中位数
                a++;
            }            
            else{
                result = result2;
                result2 = B[b];
                b++;
            }
        }
        if((m+n)%2 == 0) return double(result+result2)/2.0;  //如果为偶数长度
        return result2;
    }
};

编辑于 2017-05-19 14:57:26 回复(2)
用了最笨的方法,ac是通过了,就是没有什么技术含量。
import java.util.*;


public class Solution {
    /**
     * 
     * @param A int整型一维数组 
     * @param B int整型一维数组 
     * @return double浮点型
     */
    public double findMedianSortedArrays (int[] A, int[] B) {
        if(A.length == 0 && B.length == 0){
            return 0;
        }
        double result = 0d;
        List<Integer> list = new ArrayList<Integer>();
        //将A数组中的值存入list中
        for(int i = 0 ; i < A.length; i++){
            list.add(A[i]);
        }
        //将B数组中的值存入list中
        for(int i = 0 ; i < B.length; i++){
            list.add(B[i]);
        }
        //对list排序
        Collections.sort(list);
        if(list.size() % 2 == 0){ //如果list长度是偶数,则为最中间的两个数之和除以2
            int temp = list.size() / 2;
            result = (double)(list.get(temp) + list.get(temp - 1))/2;
        }else{ //如果list长度是奇数,则为最中间那个数本身
            int temp = list.size() / 2;
            result = (double)list.get(temp);
        }
        return result;
    }
}

发表于 2020-06-11 15:49:12 回复(1)
法1:数据流中找中位数,并没有利用有序数组这个特征,创建一个大顶堆一个小顶堆,大顶堆维护比中位数小的数,小顶堆维护比中位数大的数,维护两个堆的元素个数差小于等于1,最后在结果能在堆顶中找到

private static double findMedianSortedArrays(int[] A, int[] B) {

        // 最小堆存放比中位数大的数

        PriorityQueue<Integer> minHeap = new PriorityQueue<>();     

        // 最大堆存放比中位数小的数

        PriorityQueue<Integer> maxHeap = new PriorityQueue<>(A.length+B.length, new Comparator<Integer>() {

            @Override

            public int compare(Integer o1, Integer o2) {

                // TODO Auto-generated method stub

                return o2.compareTo(o1);

            }

        });

        

        for (int n : A) {

            

            // 当堆为空时先插入堆

            if(maxHeap.isEmpty()) {

                maxHeap.add(n);

                continue;

            } else if(minHeap.isEmpty()) {

                minHeap.add(n);

                continue;

            }

            

            

            // 调整 less 和 more 的边界, 处理less大于more的情况,交换一下边界好了

            if(maxHeap.peek() > minHeap.peek()) {

                int tmp = minHeap.remove();

                minHeap.add(maxHeap.remove());

                maxHeap.add(tmp);

            }

            

            if(n <= maxHeap.peek()) {

                maxHeap.add(n);

            } else {

                minHeap.add(n);

            }

            

            // 调整两堆的平衡

            if(maxHeap.size()-minHeap.size() >= 2) {

                minHeap.add(maxHeap.remove());

            } else if (minHeap.size()-maxHeap.size() >= 2) {

                maxHeap.add(minHeap.remove());

            }

        }

        

        for (int n : B) {

            

            // 当堆为空时先插入堆

            if(maxHeap.isEmpty()) {

                maxHeap.add(n);

                continue;

            } else if(minHeap.isEmpty()) {

                minHeap.add(n);

                continue;

            }

            

            

            // 调整 less 和 more 的边界, 处理less大于more的情况,交换一下边界好了

            if(maxHeap.peek() > minHeap.peek()) {

                int tmp = minHeap.remove();

                minHeap.add(maxHeap.remove());

                maxHeap.add(tmp);

            }

            

            if(n <= maxHeap.peek()) {

                maxHeap.add(n);

            } else {

                minHeap.add(n);

            }

            

            // 调整两堆的平衡

            if(maxHeap.size()-minHeap.size() >= 2) {

                minHeap.add(maxHeap.remove());

            } else if (minHeap.size()-maxHeap.size() >= 2) {

                maxHeap.add(minHeap.remove());

            }

        }

        

        // 存放完毕

        if(maxHeap.size() > minHeap.size()) { // 小于中位数的多

            return maxHeap.remove();

        } else if(minHeap.size() > maxHeap.size()) { // 大于等于中位数的多

            return minHeap.remove();

        } else {

            return ((float)minHeap.remove() + (float)maxHeap.remove())/2;

        }

    }

法2:利用有序数组这个特征,稍后再补上


发表于 2019-02-27 11:45:13 回复(0)
//合并数组,然后快速确定中间值
/*解题思路:
(1)第一步将两个有序数组合并成一个有序的数组(或者向量)(类似于两个有序链表的合并)
(2)得到最终的数组(或者向量)长度为m+n,然后判断是有奇数个值,还是有偶数个值
(3)如果有奇数个值,那么只有一个中间值,对应的编号为 (数组(或者向量)长度 -1)/2,取出值,直接返回
(4)如果有偶数个值,那么有两个中间值,对应编号为:
    1)数组(或者向量)长度 /2 2)数组(或者向量)长度 /2 - 1
(5)取出对应的值,然后求平均,得到最终结果 */
class Solution {
public:
    double findMedianSortedArrays(int A[], int m, int B[], int n) {
        int C[m+n];
        memset(C,0,(m+n)*sizeof(int));
        int k=0;
        int i=0,j=0;//定义下标,其中i是向量1的下标,j是向量2的下标
        while(i<m&&j<n)
        {
            
            if(A[i]<=B[j])
                C[k++]=A[i++];
            else
                C[k++]=B[j++];

        }
         //数组B还有剩余部分,将数组B剩余部分依次插入
        if(i==m)
        {
            while(j<n)
                C[k++]=B[j++];
        }    
        else if(j==n)//数组A还有剩余部分,将数组A剩余部分依次插入
        {
            while(i<m)
                C[k++]=A[i++];
        }
        if((m+n)&1)//有奇数个值(一个中间值)
            return C[(m+n)>>1]*1.0;
        else////有偶数个值(两个中间值,求平均值)
        	return (C[(m+n)>>1]+C[((m+n)>>1)-1])*0.5;   ////-号优先级比>>高
    }
};

Let me add some interpretation of the find kth function based on my understanding

We have two arrays:

nums1[0], nums1[1]....nums1[m - 1];

nums2[0], nums2[2]....nums2[n - 1];

the result after merging:

num[0],num[1],num[2]...num[m + n - 1];

Let‘s compare nums1[k / 2 - 1] and nums2[k / 2 - 1]

if nums1[k / 2 - 1] < nums2 [k / 2 - 1]

then the nums1[k / 2 - 1] and it's left side elements must smaller than kth number in num arrary(num[k - 1]).
Why?
Assume that nums1[k / 2 - 1] == num[k - 1];

Let's count the number of elements which smaller than nums1[k / 2 - 1].

Consider an extreme case : nums1[0]....nums1[k / 2 - 2] and nums2[0]...nums2[k / 2 - 2] smaller than nums1[k / 2 - 1];

In this special case, we only have k / 2 - 1 + k / 2 - 1 = k - 2 elements smaller than the nums1[k / 2 - 1]. so nums1[k / 2 - 1] only can be (k - 1)th smallest number (num[k - 2]);
So, it's a contradiction with our assumption.

And now we could say, The num[k / 2 - 1] and it's left side elements must smaller than the Kth smallest number.
so we could remove the elements which in this range and shrink the problem set.
same idea when nums1[k / 2 - 1] > nums2 [k / 2 - 1]. we could remove the elements in the nums2;

Correct me, if I'm wrong. Thanks



class Solution {
public:
    int find_kth(int A[], int m, int B[], int n,int k)
    {
    	//假设m总是<=n
        if(m>n)
            return find_kth(B,n,A,m,k);
        if(m==0)
            return B[k-1];
        if(k==1) return min(A[0],B[0]);
        
        //将k分为两部分
        int ia=min(m,k/2);
        int ib=k-ia;
        if(A[ia-1]<B[ib-1])
            return find_kth(A+ia,m-ia,B,n,k-ia);
        else if(A[ia-1]>B[ib-1])
            return find_kth(A,m,B+ib,n-ib,k-ib);
        else
            return A[ia-1];
    }
    
    double findMedianSortedArrays(int A[], int m, int B[], int n) {
        int total=m+n;
        if(total&1)
            return find_kth(A,m,B,n,total/2+1);
        else
            return(find_kth(A,m,B,n,total/2)+find_kth(A,m,B,n,total/2+1))/2.0;
            
    }
};

编辑于 2016-07-14 14:22:51 回复(10)
public class Solution {
    public double findMedianSortedArrays(int A[], int B[]) {
        int m = A.length,n = B.length;
        int[] c = new int[m+n];
        int i = 0,j = 0,k = 0;
        for(;i < m && j < n && k < m+n;k++){
            if(A[i] < B[j]) c[k] = A[i++];
            else c[k] = B[j++];
        }
        while(k < m+n && j < n) c[k++] = B[j++];
        while(k < m+n && i < m) c[k++] = A[i++];
        if((m+n)%2 == 1) return (double)c[(m+n)/2];
        else return (double)(c[(m+n)/2 - 1] + c[(m+n)/2]) / 2.0;
    }
}

发表于 2018-03-29 21:50:21 回复(4)
思路:
(1)把A,B两个数组复制到同一个数组c
(2)Arrays.sort()方法对c数组排个序。
(3)根据c中的元素个数分步:
        如果是偶数个元素,居中的两个数字相加在除2(乘1.0是为了转换成浮点数);
        如果是奇数个元素,直接取中间位置的数(没有求平均值不需要再转换浮点数)。
import java.util.*;

public class Solution {
    public double findMedianSortedArrays (int[] A, int[] B) {
        double result = 0.0;
        int[] c = new int[A.length + B.length];
        System.arraycopy(A, 0, c, 0, A.length);
        System.arraycopy(B, 0, c, A.length, B.length);
        Arrays.sort(c);
        if(c.length%2==0) {
            result = (c[c.length/2-1]+c[c.length/2])*1.0/2;
        } else {
            result = c[c.length/2];
        }
        return result;
    }
}


发表于 2020-07-08 10:25:18 回复(0)
这个题有点像牛顿逼近法。。
是log(m+n)的算法,
每次找A或者B中最小的n/2个(n/2个在一行中取完),更新起始端点,
然后n=n-n/2,重复上一步
直到n为1,返回

        int N,M;
    int f(int A[],int i,int B[],int j,int k){
        int re=1e9;
        if(k==1){
            if(i<N)re=min(re,A[i]);
            if(j<M)re=min(re,B[j]);
            return re;
        }
        
        int te=k/2;
        if(i+te<=N&&j+te<=M){
            if(A[i+te-1]>B[j+te-1])re=f(A,i,B,j+te,k-te);
            else re=f(A,i+te,B,j,k-te);
        }
        else if(i+te<=N)re=f(A,i+te,B,j,k-te);
        else re=f(A,i,B,j+te,k-te);
        
        return re;
    }
    double findMedianSortedArrays(int A[], int n, int B[], int m) {
        int l=(m+n+1)/2;
        int r=(m+n+2)/2;
        N=n,M=m;
        return (f(A,0,B,0,l)+f(A,0,B,0,r))/2.0;
    }


发表于 2020-03-18 15:47:45 回复(0)
public class Solution {
    //思路:合并两个数组,然后找到中间值。
    public double findMedianSortedArrays(int A[], int B[]) {
        int pa = 0;
        int pb = 0;
        int alength = A.length;
        int blength = B.length;
        int[] temp = new int[alength + blength];
        int ptemp = 0;
        //合并A与B两个有序数组到临时数组temp中
        while(pa<alength && pb<blength) {
            if(A[pa] <= B[pb]) {
                temp[ptemp] = A[pa];
                pa++;
            } else {
                temp[ptemp] = B[pb];
                pb++;
            }
            ptemp++;
        }
        while(pa<alength) {
            temp[ptemp] = A[pa];
            pa++;
            ptemp++;
        }
        while(pb<blength) {
            temp[ptemp] = B[pb];
            pb++;
            ptemp++;
        }
        //如果数组长度为奇数,则返回下标为中间值的数组元素值;
        if((alength+blength)%2 != 0) return (double)temp[(alength+blength)/2];
        //如果数组长度为偶数,则返回下标在中间的两个数组元素值的平均值;
        else return (temp[(alength+blength)/2]+temp[(alength+blength)/2 - 1])/2.0;
    }
}

发表于 2019-01-02 13:34:39 回复(1)
睶头像
public class Solution {
    public double findMedianSortedArrays(int[] nums1, int[] nums2) {
        int[] nums = new int[nums1.length + nums2.length];
        if(nums.length == 0)
            return 0.0;
        int i1 = 0, i2 = 0, i = 0;
        while(i < nums.length){
            if(i1 >= nums1.length){
                nums[i++] = nums2[i2++];
            }
            else if(i2 >= nums2.length){
                nums[i++] = nums1[i1++];
            }
            else{
                if(nums1[i1] > nums2[i2])
                    nums[i++] = nums2[i2++];
                else
                    nums[i++] = nums1[i1++];
            }
        }
        if(nums.length % 2 == 1)
            return nums[(nums.length-1) / 2];
        else
            return (nums[(nums.length-1) / 2] + nums[(nums.length-1) / 2 + 1]) / 2.0;
    }
}

发表于 2018-08-22 17:03:00 回复(0)
将两个有序数组合并,最多只需要保存两个数(总长度为偶数时),因此只需要用两个数来保存遍历过的前两个数就行。
最大的问题还会在i,j为何值时到了中间区域,这个可以用短数组验证得到
    public double findMedianSortedArrays(int A[], int B[]) {
        int cur = 0;
        int pre = 0;
        int i = 0, j = 0;
        int total = A.length + B.length;
        while(i < A.length && j<B.length){
            pre = cur;
            if(A[i] < B[j]){
                cur = A[i++];
            }
            else{
                cur = B[j++];
            }
            if( i + j == total / 2 + 1){
                if(total % 2 == 0) return (double)(pre + cur) / 2;
                else return (double)cur;
            }
        }
        while(i<A.length){
            pre = cur;
            cur = A[i++];
            if( i + j == total / 2 + 1){
                if(total % 2 == 0) return (double)(pre + cur) / 2;
                else return (double)cur;
            }
        }
        while(j<B.length){
            pre = cur;
            cur = B[j++];
            if( i + j == total / 2 + 1){
                if(total % 2 == 0) return (double)(pre + cur) / 2;
                else return (double)cur;
            }
        }
        if(total % 2 == 0) return (double)(pre + cur) / 2;
        else return (double)cur;
    }

发表于 2018-08-03 22:49:56 回复(0)
思路:
与归并排序类似,两个有序序列依次合并。
两个序列长度之和若为奇数,则合并至(m+n)/2+1即可,若为偶数,则需要将中间两位全部合并,同时取平均值。代码如下
double findMedianSortedArrays(int A[], int m, int B[], int n) {
    int mid = (m + n) / 2 + 1;
    int i,j;
    vector<int> vec;
    for ( i = 0, j = 0; i<m&&j<n;)
    {
        if (A[i] > B[j])
            vec.push_back(B[j++]);
        else
            vec.push_back(A[i++]);
        if (vec.size() == mid)
            break;
    }
    while(vec.size() != mid)
    {
        if (i < m)
            vec.push_back(A[i++]);
        if (j < n)
            vec.push_back(B[j++]);
    }
    if ((m+n) % 2 == 0)
        return (vec[vec.size() - 1] + vec[vec.size() - 2]) / 2.0;
    return *(vec.end()-1);
}

发表于 2017-09-28 20:33:33 回复(1)
class Solution {
public:
    double findMedianSortedArrays(int A[], int m, int B[], int n) {
        if(m<0 || n<0)
            return 0;
        double r = 0;
        if((m+n) & 1)
            r = findK(A,m,B,n,(m+n+1)/2);
        else
            r = (double)(findK(A,m,B,n,(m+n)/2) + findK(A,m,B,n,(m+n)/2+1) )/2;
        return r;     }          int findK(int A[], int m, int B[], int n, int k)     {         if(n < m)             return findK(B,n,A,m,k);         if(m == 0)             return B[k-1];         if(k == 1)             return (A[0]<B[0])?A[0]:B[0];         int a = min(m, k/2);         int b = k - a;         if(A[a-1] < B[b-1])             return findK(A+a, m-a, B, n, k-a);         else if(A[a-1] > B[b-1])             return findK(A, m, B+b, n-b, k-b);         else             return A[a-1];     }
};

发表于 2017-09-20 15:13:34 回复(0)
class Solution {
private:
    int kthMin(int A[], int m, int B[], int n, int k) {
        if(n < m)
            return kthMin(B, n, A, m, k);
        if(m == 0)
            return B[k - 1];
        if(k == 1)
            return min(A[0], B[0]);
        int pa = min(m, k / 2);
        int pb = k - pa;
        if(A[pa - 1] == B[pb - 1]) {
            return A[pa - 1];
        }
        else if(A[pa - 1] < B[pb - 1])
            return kthMin(A + pa, m - pa, B, n, k - pa);
        else
            return kthMin(A, m, B + pb, n - pb, k - pb);
    }
public:
    double findMedianSortedArrays(int A[], int m, int B[], int n) {
        if(m <= 0 && n <= 0)
            return 0.0;
        return (kthMin(A, m, B, n, (m + n + 2) >> 1) + kthMin(A, m, B, n, (m + n + 1) >> 1)) / 2.0;
    }
};

发表于 2017-07-22 18:13:15 回复(1)

This problem is notoriously hard to implement due to all the corner cases. Most implementations consider odd-lengthed and even-lengthed arrays as two different cases and treat them separately. As a matter of fact, with a little mind twist. These two cases can be combined as one, leading to a very simple solution where (almost) no special treatment is needed.

First, let's see the concept of 'MEDIAN' in a slightly unconventional way. That is:

"if we cut the sorted array to two halves of EQUAL LENGTHS, then
median is the AVERAGE OF Max(lower_half) and Min(upper_half), i.e. the
two numbers immediately next to the cut
".

For example, for [2 3 5 7], we make the cut between 3 and 5:

[2 3 / 5 7]

then the median = (3+5)/2. Note that I'll use '/' to represent a cut, and (number / number) to represent a cut made through a number in this article.

for [2 3 4 5 6], we make the cut right through 4 like this:

[2 3 (4/4) 5 7]

Since we split 4 into two halves, we say now both the lower and upper subarray contain 4. This notion also leads to the correct answer: (4 + 4) / 2 = 4;

For convenience, let's use L to represent the number immediately left to the cut, and R the right counterpart. In [2 3 5 7], for instance, we have L = 3 and R = 5, respectively.

We observe the index of L and R have the following relationship with the length of the array N:

N        Index of L / R 1 0 / 0 2 0 / 1 3 1 / 1 4 1 / 2 5 2 / 2 6 2 / 3 7 3 / 3 8 3 / 4 

It is not hard to conclude that index of L = (N-1)/2, and R is at N/2. Thus, the median can be represented as

(L + R)/2 = (A[(N-1)/2] + A[N/2])/2

To get ready for the two array situation, let's add a few imaginary 'positions' (represented as #'s) in between numbers, and treat numbers as 'positions' as well.

[6 9 13 18]  ->   [# 6 # 9 # 13 # 18 #]    (N = 4)
position index 0 1 2 3 4 5 6 7 8 (N_Position = 9)
		  
[6 9 11 13 18]->   [# 6 # 9 # 11 # 13 # 18 #]   (N = 5)
position index 0 1 2 3 4 5 6 7 8 9 10 (N_Position = 11)

As you can see, there are always exactly 2*N+1 'positions' regardless of length N. Therefore, the middle cut should always be made on the Nth position (0-based). Since index(L) = (N-1)/2 and index(R) = N/2 in this situation, we can infer that index(L) = (CutPosition-1)/2, index(R) = (CutPosition)/2.


Now for the two-array case:

A1: [# 1 # 2 # 3 # 4 # 5 #]    (N1 = 5, N1_positions = 11)

A2: [# 1 # 1 # 1 # 1 #]     (N2 = 4, N2_positions = 9)

Similar to the one-array problem, we need to find a cut that divides the two arrays each into two halves such that

"any number in the two left halves" <= "any number in the two right
halves".

We can also make the following observations:

  1. There are 2N1 + 2N2 + 2 position altogether. Therefore, there must be exactly N1 + N2 positions on each side of the cut, and 2 positions directly on the cut.

  2. Therefore, when we cut at position C2 = K in A2, then the cut position in A1 must be C1 = N1 + N2 - k. For instance, if C2 = 2, then we must have C1 = 4 + 5 - C2 = 7.

    [# 1 # 2 # 3 # (4/4) # 5 #]    
    
     [# 1 / 1 # 1 # 1 #]
  3. When the cuts are made, we'd have two L's and two R's. They are

    L1 = A1[(C1-1)/2]; R1 = A1[C1/2];
     L2 = A2[(C2-1)/2]; R2 = A2[C2/2];

In the above example,

L1 = A1[(7-1)/2] = A1[3] = 4; R1 = A1[7/2] = A1[3] = 4;
    L2 = A2[(2-1)/2] = A2[0] = 1; R2 = A1[2/2] = A1[1] = 1;

Now how do we decide if this cut is the cut we want? Because L1, L2 are the greatest numbers on the left halves and R1, R2 are the smallest numbers on the right, we only need

L1 <= R1 && L1 <= R2 && L2 <= R1 && L2 <= R2 

to make sure that any number in lower halves <= any number in upper halves. As a matter of fact, since
L1 <= R1 and L2 <= R2 are naturally guaranteed because A1 and A2 are sorted, we only need to make sure:

L1 <= R2 and L2 <= R1.

Now we can use simple binary search to find out the result.

If we have L1 > R1, it means there are too many large numbers on the left half of A1, then we must move C1 to the left (i.e. move C2 to the right); If L2 > R1, then there are too many large numbers on the left half of A2, and we must move C2 to the left.
Otherwise, this cut is the right one. 
After we find the cut, the medium can be computed as (max(L1, L2) + min(R1, R2)) / 2;

Two side notes:

A. since C1 and C2 can be mutually determined from each other, we might as well select the shorter array (say A2) and only move C2 around, and calculate C1 accordingly. That way we can achieve a run-time complexity of O(log(min(N1, N2)))

B. The only edge case is when a cut falls on the 0th(first) or the 2Nth(last) position. For instance, if C2 = 2N2, then R2 = A2[2*N2/2] = A2[N2], which exceeds the boundary of the array. To solve this problem, we can imagine that both A1 and A2 actually have two extra elements, INT_MAX at A[-1] and INT_MAX at A[N]. These additions don't change the result, but make the implementation easier: If any L falls out of the left boundary of the array, then L = INT_MIN, and if any R falls out of the right boundary, then R = INT_MAX.


I know that was not very easy to understand, but all the above reasoning eventually boils down to the following concise code:

double findMedianSortedArrays(vector<int>& nums1, vector<int>& nums2) { int N1 = nums1.size(); int N2 = nums2.size(); if (N1 < N2) return findMedianSortedArrays(nums2, nums1); // Make sure A2 is the shorter one. if (N2 == 0) return ((double)nums1[(N1-1)/2] + (double)nums1[N1/2])/2; // If A2 is empty int lo = 0, hi = N2 * 2; while (lo <= hi) { int mid2 = (lo + hi) / 2; // Try Cut 2  int mid1 = N1 + N2 - mid2; // Calculate Cut 1 accordingly double L1 = (mid1 == 0) ? INT_MIN : nums1[(mid1-1)/2]; // Get L1, R1, L2, R2 respectively double L2 = (mid2 == 0) ? INT_MIN : nums2[(mid2-1)/2]; double R1 = (mid1 == N1 * 2) ? INT_MAX : nums1[(mid1)/2]; double R2 = (mid2 == N2 * 2) ? INT_MAX : nums2[(mid2)/2]; if (L1 > R2) lo = mid2 + 1; // A1's lower half is too big; need to move C1 left (C2 right) else if (L2 > R1) hi = mid2 - 1; // A2's lower half too big; need to move C2 left. else return (max(L1,L2) + min(R1, R2)) / 2; // Otherwise, that's the right cut. } return -1;
}
发表于 2017-03-13 00:41:24 回复(2)

To solve this problem, we need to understand "What is the use of median". In statistics, the median is used for dividing a set into two equal length subsets, that one subset is always greater than the other. If we understand the use of median for dividing, we are very close to the answer.

First let's cut A into two parts at a random position i:

left_A             |        right_A
A[0], A[1], ..., A[i-1] |  A[i], A[i+1], ..., A[m-1] 

Since A has m elements, so there are m+1 kinds of cutting( i = 0 ~ m ). And we know: len(left_A) = i, len(right_A) = m - i . Note: when i = 0 , left_A is empty, and when i = m , right_A is empty.

With the same way, cut B into two parts at a random position j:

left_B             |        right_B
B[0], B[1], ..., B[j-1] |  B[j], B[j+1], ..., B[n-1] 

Put left_A and left_B into one set, and put right_A and right_B into another set. Let's name them left_part and right_part :

left_part          |        right_part
A[0], A[1], ..., A[i-1] |  A[i], A[i+1], ..., A[m-1] B[0], B[1], ..., B[j-1] |  B[j], B[j+1], ..., B[n-1] 

If we can ensure:

1) len(left_part) == len(right_part) 2) max(left_part) <= min(right_part)

then we divide all elements in {A, B} into two parts with equal length, and one part is always greater than the other. Then median = (max(left_part) + min(right_part))/2.

To ensure these two conditions, we just need to ensure:

(1) i + j == m - i + n - j (or: m - i + n - j + 1) if n >= m, we just need to set: i = 0 ~ m, j = (m + n + 1)/2 - i (2) B[j-1] <= A[i] and A[i-1] <= B[j]

ps.1 For simplicity, I presume A[i-1],B[j-1],A[i],B[j] are always valid even if i=0/i=m/j=0/j=n . I will talk about how to deal with these edge values at last.

ps.2 Why n >= m? Because I have to make sure j is non-nagative since 0 <= i <= m and j = (m + n + 1)/2 - i. If n < m , then j may be nagative, that will lead to wrong result.

So, all we need to do is:

Searching i in [0, m], to find an object `i` that:
    B[j-1] <= A[i] and A[i-1] <= B[j], ( where j = (m + n + 1)/2 - i )

And we can do a binary search following steps described below:

<1> Set imin = 0, imax = m, then start searching in [imin, imax] <2> Set i = (imin + imax)/2, j = (m + n + 1)/2 - i <3> Now we have len(left_part)==len(right_part). And there are only 3 situations
     that we may encounter: <a> B[j-1] <= A[i] and A[i-1] <= B[j]
        Means we have found the object `i`, so stop searching. <b> B[j-1] > A[i]
        Means A[i] is too small. We must `ajust` i to get `B[j-1] <= A[i]`.
        Can we `increase` i?
            Yes. Because when i is increased, j will be decreased.
            So B[j-1] is decreased and A[i] is increased, and `B[j-1] <= A[i]` may be satisfied.
        Can we `decrease` i?
            `No!` Because when i is decreased, j will be increased.
            So B[j-1] is increased and A[i] is decreased, and B[j-1] <= A[i] will be never satisfied.
        So we must `increase` i. That is, we must ajust the searching range to [i+1, imax]. So, set imin = i+1, and goto <2>. <c> A[i-1] > B[j]
        Means A[i-1] is too big. And we must `decrease` i to get `A[i-1]<=B[j]`.
        That is, we must ajust the searching range to [imin, i-1].
        So, set imax = i-1, and goto <2>.

When the object i is found, the median is:

max(A[i-1], B[j-1]) (when m + n is odd)
or (max(A[i-1], B[j-1]) + min(A[i], B[j]))/2 (when m + n is even)

Now let's consider the edges values i=0,i=m,j=0,j=n where A[i-1],B[j-1],A[i],B[j] may not exist. Actually this situation is easier than you think.

What we need to do is ensuring that max(left_part) <= min(right_part). So, if i and j are not edges values(means A[i-1],B[j-1],A[i],B[j] all exist), then we must check both B[j-1] <= A[i] and A[i-1] <= B[j]. But if some of A[i-1],B[j-1],A[i],B[j] don't exist, then we don't need to check one(or both) of these two conditions. For example, if i=0, then A[i-1] doesn't exist, then we don't need to check A[i-1] <= B[j]. So, what we need to do is:

Searching i in [0, m], to find an object `i` that:
    (j == 0 or i == m or B[j-1] <= A[i]) and (i == 0 or j == n or A[i-1] <= B[j])
    where j = (m + n + 1)/2 - i

And in a searching loop, we will encounter only three situations:

<a> (j == 0 or i == m or B[j-1] <= A[i]) and  (i == 0 or j = n or A[i-1] <= B[j])  Means i is perfect, we can stop searching.

<b> j > 0 and i < m and B[j - 1] > A[i]
    Means i is too small, we must increase it.

<c> i > 0 and j < n and A[i - 1] > B[j]  Means i is too big, we must decrease it.

Thank @Quentin.chen , him pointed out that: i < m ==> j > 0 and i > 0 ==> j < n . Because:

m <= n, i < m ==> j = (m+n+1)/2 - i > (m+n+1)/2 - m >= (2*m+1)/2 - m >= 0 m <= n, i > 0 ==> j = (m+n+1)/2 - i < (m+n+1)/2 <= (2*n+1)/2 <= n

So in situation <b> and <c>, we don't need to check whether j > 0 and whether j < n.

Below is the accepted code:

def median(A, B):
    m, n = len(A), len(B) if m > n:
        A, B, m, n = B, A, n, m if n == 0:
        raise ValueError

    imin, imax, half_len = 0, m, (m + n + 1) / 2 while imin <= imax: i = (imin + imax) / 2 j = half_len - i if i < m and B[j-1] > A[i]: # i is too small, must increase it imin = i + 1 elif i > 0 and A[i-1] > B[j]: # i is too big, must decrease it imax = i - 1 else: # i is perfect if i == 0: max_of_left = B[j-1]
            elif j == 0: max_of_left = A[i-1] else: max_of_left = max(A[i-1], B[j-1]) if (m + n) % 2 == 1:
                return max_of_left if i == m: min_of_right = B[j]
            elif j == n: min_of_right = A[i] else: min_of_right = min(A[i], B[j])

            return (max_of_left + min_of_right) / 2.0
发表于 2017-03-13 00:40:05 回复(0)
class Solution {
public:
    double findMedianSortedArrays(int A[], int m, int B[], int n) {
        if(m < n) return findMedianSortedArrays(B,n,A,m);
        if(n == 0) return (double)(A[(m - 1)/2] + A[(m/2)])/2.0;
        int lo = 0, hi = 2 * n, c2 = n;
        while(c2 >= 0 && c2 <= 2*n) {
            int c1 = m + n - c2;
            double L1 = (c1 == 0) ? INT_MIN : A[(c1 - 1)/2];
            double L2 = (c2 == 0) ? INT_MIN : B[(c2 - 1)/2];
            double R1 = (c1 == 2 * m) ? INT_MAX : A[c1/2];
            double R2 = (c2 == 2 * n) ? INT_MAX : B[c2/2];
            if(L1 > R2) c2++;
            else if(L2 > R1) c2--;
            else return (double)(max(L1,L2) + min(R1,R2)) / 2.0;
        }
        return -1;
    }
};
编辑于 2017-03-13 10:48:45 回复(0)
/*如果我们可以在两个数列中求出第K小的元素,便可以解决该问题
    不妨设数列A元素个数为n,数列B元素个数为m,各自升序排序,求第k小元素
    取A[k / 2] B[k / 2] 比较,
    如果 A[k / 2] > B[k / 2] 那么,所求的元素必然不在B的前k / 2个元素中(证明反证法)
    反之,必然不在A的前k / 2个元素中,于是我们可以将A或B数列的前k / 2元素删去,求剩下两个数列的
    k - k / 2小元素,于是得到了数据规模变小的同类问题,递归解决
    如果 k / 2 大于某数列个数,所求元素必然不在另一数列的前k / 2个元素中,同上操作就好。
    */
class Solution {
public:
    int findKth(int A[], int m, int B[], int n, int ast, int bst, int k){
        if(ast >= m)
            return B[bst + k -1];
        if(bst >= n)
            return A[ast + k -1];
        if(k == 1)
            return (A[ast] < B[bst] ? A[ast] : B[bst]);

        int Ak = (ast + k/2 - 1 >= m) ? INT_MAX : A[ast + k/2 - 1];
        int Bk = (bst + k/2 - 1 >= n) ? INT_MAX : B[bst + k/2 - 1];
        if(Ak < Bk)
            return findKth(A,m,B,n,ast+k/2,bst,k-k/2);
        else
            return findKth(A,m,B,n,ast,bst+k/2,k-k/2);

    }
    double findMedianSortedArrays(int A[], int m, int B[], int n){
        double median = 0.0;
        if((m+n) & 1){
            median = (double)findKth(A,m,B,n,0,0,(m+n+1)/2);
        }
        else{
            median = (double)(findKth(A,m,B,n,0,0,(m+n)/2) + findKth(A,m,B,n,0,0,(m+n)/2+1)) / 2;
        }
        return median;
   }
};

发表于 2016-12-05 15:24:54 回复(0)
class Solution:
    def findMedianSortedArrays(self , A , B ):
        # write code here
        A.extend(B)
        A.sort()
        if len(A)%2==0:
            res = (A[len(A)//2-1]+A[len(A)//2])/2
            return res
        else:
            res = A[len(A)//2]
            return res

发表于 2021-07-03 16:43:29 回复(0)