Median of medians algorithm in Java

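I am trying to implement median of medians in Java for a method like this:

Select(Comparable[] list, int pos, int colSize, int colMed)
  • list is the list of values in which to find the value at a specified position
  • pos is the specified position
  • colSize is the size of the columns that I create in the first stage
  • colMed is the position in those columns that I use as the medX

I am not sure which sorting algorithm would be best to use, or how exactly to implement this.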


5 answers

  1. Answer #1

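    I know this is a very old post and you might not remember it any more. But I wonder: did you measure the running time of your implementation when you wrote it?

    I tried this algorithm and compared it with the simple approach of sorting with Java's sort method (Arrays.sort()) and then picking the k-th element from the sorted array. The result I got was that this algorithm only beats Java's sort when the array has about a hundred thousand elements or more. Even then it is only 2 to 3 times faster, which is obviously not a log(n) factor faster.

    Do you have any comment on that?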
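
    For reference, the sort-based baseline described above can be sketched as follows. This is a minimal sketch, not the poster's benchmark code: the class name SortBaseline and the method name kthSmallestBySorting are made up for illustration, and k is assumed to be counted from 0.

    ```java
    import java.util.Arrays;

    // Hypothetical baseline for comparison; names are illustrative only.
    final class SortBaseline {
        /** Returns the k-th smallest element (k counted from 0) by sorting a copy: O(n log n). */
        static double kthSmallestBySorting(double[] values, int k) {
            if (k < 0 || k >= values.length)
                throw new IllegalArgumentException("k out of range: " + k);
            double[] copy = Arrays.copyOf(values, values.length); // leave the caller's array untouched
            Arrays.sort(copy);
            return copy[k];
        }
    }
    ```

    Because sorting costs O(n log n) and selection O(n), the best possible gain is a factor proportional to log n, and the larger constant factors of median of medians can use up much of it.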

  2. Answer #2

    The question asked for Java, so here it is:

    import java.util.*;
    
    public class MedianOfMedians {
        private MedianOfMedians() {
    
        }
    
        /**
         * Returns median of list in linear time.
         * 
         * @param list list to search, which may be reordered on return
         * @return median of array in linear time.
         */
        public static Comparable getMedian(ArrayList<Comparable> list) {
            int s = list.size();
            if (s < 1)
                throw new IllegalArgumentException();
            int pos = select(list, 0, s, s / 2);
            return list.get(pos);
        }
    
        /**
         * Returns position of the k'th smallest element (k counted from 0) of sub-list.
         * 
         * @param list list to search, whose sub-list may be shuffled before
         *            returning
         * @param lo first element of sub-list in list
         * @param hi just after last element of sub-list in list
         * @param k zero-based rank, within the sub-list, of the element to find
         * @return position of k'th smallest element of (possibly shuffled) sub-list.
         */
        public static int select(ArrayList<Comparable> list, int lo, int hi, int k) {
            if (lo >= hi || k < 0 || lo + k >= hi)
                throw new IllegalArgumentException();
            if (hi - lo < 10) {
                Collections.sort(list.subList(lo, hi));
                return lo + k;
            }
            int s = hi - lo;
            int np = s / 5; // Number of partitions
            for (int i = 0; i < np; i++) {
                // For each partition, move its median to front of our sublist
                int lo2 = lo + i * 5;
                int hi2 = (i + 1 == np) ? hi : (lo2 + 5);
                int pos = select(list, lo2, hi2, 2);
                Collections.swap(list, pos, lo + i);
            }
    
            // Partition medians were moved to front, so we can recurse without making another list.
            int pos = select(list, lo, lo + np, np / 2);
    
            // Re-partition list to [<pivot][pivot][>pivot]
            int m = triage(list, lo, hi, pos);
            int cmp = lo + k - m;
            if (cmp > 0)
                return select(list, m + 1, hi, k - (m - lo) - 1);
            else if (cmp < 0)
                return select(list, lo, m, k);
            return lo + k;
        }
    
        /**
         * Partition sub-list into 3 parts [<pivot][pivot][>pivot].
         * 
         * @param list
         * @param lo
         * @param hi
         * @param pos input position of pivot value
         * @return output position of pivot value
         */
        private static int triage(ArrayList<Comparable> list, int lo, int hi,
                int pos) {
            Comparable pivot = list.get(pos);
            int lo3 = lo;
            int hi3 = hi;
            while (lo3 < hi3) {
                Comparable e = list.get(lo3);
                int cmp = e.compareTo(pivot);
                if (cmp < 0)
                    lo3++;
                else if (cmp > 0)
                    Collections.swap(list, lo3, --hi3);
                else {
                    while (hi3 > lo3 + 1) {
                        assert (list.get(lo3).compareTo(pivot) == 0);
                        e = list.get(--hi3);
                        cmp = e.compareTo(pivot);
                        if (cmp <= 0) {
                            if (lo3 + 1 == hi3) {
                                Collections.swap(list, lo3, lo3 + 1);
                                lo3++;
                                break;
                            }
                            Collections.swap(list, lo3, lo3 + 1);
                            assert (list.get(lo3 + 1).compareTo(pivot) == 0);
                            Collections.swap(list, lo3, hi3);
                            lo3++;
                            hi3++;
                        }
                    }
                    break;
                }
            }
            assert (list.get(lo3).compareTo(pivot) == 0);
            return lo3;
        }
    
    }
    

    Here is a unit test to check that it works:

    import java.util.*;
    
    import junit.framework.TestCase;
    
    public class MedianOfMedianTest extends TestCase {
        public void testMedianOfMedianTest() {
            Random r = new Random(1);
            int n = 87;
            for (int trial = 0; trial < 1000; trial++) {
                ArrayList list = new ArrayList();
                int[] a = new int[n];
                for (int i = 0; i < n; i++) {
                    int v = r.nextInt(256);
                    a[i] = v;
                    list.add(v);
                }
                int m1 = (Integer)MedianOfMedians.getMedian(list);
                Arrays.sort(a);
                int m2 = a[n/2];
                assertEquals(m1, m2);
            }
        }
    }
    

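    However, the code above is too slow for practical use.

    Here is a simpler way to get the k'th element. It does not guarantee performance, but it is much faster in practice: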

    /**
     * Returns position of the k'th smallest element (k counted from 0) of sub-list.
     * 
     * @param list list to search, whose sub-list may be shuffled before
     *            returning
     * @param lo first element of sub-list in list
     * @param hi just after last element of sub-list in list
     * @param k zero-based rank, within the sub-list, of the element to find
     * @return position of k'th smallest element of (possibly shuffled) sub-list.
     */
    static int select(double[] list, int lo, int hi, int k) {
        int n = hi - lo;
        if (n < 2)
            return lo;
    
        // Pick a pseudo-random pivot; long arithmetic avoids int overflow for large k
        double pivot = list[lo + (int) ((k * 7919L) % n)];
    
        // Triage list to [<pivot][=pivot][>pivot]
        int nLess = 0, nSame = 0, nMore = 0;
        int lo3 = lo;
        int hi3 = hi;
        while (lo3 < hi3) {
            double e = list[lo3];
            int cmp = Double.compare(e, pivot);
            if (cmp < 0) {
                nLess++;
                lo3++;
            } else if (cmp > 0) {
                swap(list, lo3, --hi3);
                if (nSame > 0)
                    swap(list, hi3, hi3 + nSame);
                nMore++;
            } else {
                nSame++;
                swap(list, lo3, --hi3);
            }
        }
        assert (nSame > 0);
        assert (nLess + nSame + nMore == n);
        assert (list[lo + nLess] == pivot);
        assert (list[hi - nMore - 1] == pivot);
        if (k >= n - nMore)
            return select(list, hi - nMore, hi, k - nLess - nSame);
        else if (k < nLess)
            return select(list, lo, lo + nLess, k);
        return lo + k;
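    }

    /** Swaps list[i] and list[j]; helper used by select above. */
    static void swap(double[] list, int i, int j) {
        double t = list[i];
        list[i] = list[j];
        list[j] = t;
    }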
    
  3. Answer #3

    I agree with Chip Uni's answer/solution. I will only comment on the sorting part and add some further explanation:

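    You do not need any sorting algorithm. The algorithm is similar to quicksort, with the difference that only one partition (left or right) is processed further. We only need to find a good pivot so that the left and right parts are as equal as possible, which means N + N/2 + N/4 + N/8 + ... = 2N iterations, and hence O(N) time complexity. The algorithm above is called median of medians: it computes the median of the medians of 5-element groups, and that choice of pivot is what gives the algorithm its linear time complexity.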
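
    The counting argument can be written out as follows. This is a sketch that ignores rounding; the constant c for the linear work per level is an assumption introduced here. The first line is the geometric series for perfectly even splits; the second is the standard recurrence for the median-of-medians pivot, where one recursive call handles the n/5 group medians and the other at most about 7n/10 elements.

    ```latex
    N + \frac{N}{2} + \frac{N}{4} + \frac{N}{8} + \dots = 2N

    T(n) \le T\!\left(\frac{n}{5}\right) + T\!\left(\frac{7n}{10}\right) + cn
    \;\Longrightarrow\;
    T(n) \le cn \sum_{i \ge 0} \left(\frac{9}{10}\right)^{i} = 10\,cn = O(n)
    ```

    The sum converges because 1/5 + 7/10 = 9/10 is less than 1.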

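    However, a sorting algorithm is used to speed things up once the range being searched for the n-th smallest/largest element (which I suppose is what you are implementing with this algorithm) becomes small. Insertion sort is particularly fast on small arrays of up to 7 to 10 elements.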
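
    As an illustration of that base case, here is a minimal insertion sort over a sub-range. The class name SmallRangeSort and the half-open range convention [lo, hi) are assumptions made for this sketch.

    ```java
    // Hypothetical helper; sorts only the half-open range a[lo..hi).
    final class SmallRangeSort {
        /** Sorts a[lo..hi) in place with insertion sort; intended for ranges of about 10 elements or fewer. */
        static void insertionSort(double[] a, int lo, int hi) {
            for (int i = lo + 1; i < hi; i++) {
                double x = a[i];
                int j = i - 1;
                // Shift larger elements one slot to the right to make room for x.
                while (j >= lo && a[j] > x) {
                    a[j + 1] = a[j];
                    j--;
                }
                a[j + 1] = x;
            }
        }
    }
    ```

    After sorting a small range this way, the element of rank k (counted from 0) within the range is simply a[lo + k].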

    Implementation note:

    M = select({x[i]}, n/10)
    

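    actually means taking the median of all the medians of the 5-element groups. You can do that by creating another array of size (n - 1)/5 + 1 and calling the same algorithm recursively to find its n/10-th element (which is the median of the newly created array).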
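
    Building that auxiliary array might look like the sketch below. The names GroupMedians and mediansOfFive are hypothetical, and each group is sorted with Arrays.sort purely for brevity; the last group may hold fewer than five elements, so its median index is derived from its actual length.

    ```java
    import java.util.Arrays;

    // Hypothetical helper that collects one median per group of up to five elements.
    final class GroupMedians {
        /** Returns the (lower) median of each group of up to five elements of a[lo..hi). */
        static double[] mediansOfFive(double[] a, int lo, int hi) {
            int n = hi - lo;
            if (n <= 0)
                return new double[0];
            double[] medians = new double[(n - 1) / 5 + 1]; // same as ceiling(n / 5) for n >= 1
            for (int g = 0; g < medians.length; g++) {
                int from = lo + 5 * g;
                int to = Math.min(from + 5, hi);
                double[] group = Arrays.copyOfRange(a, from, to);
                Arrays.sort(group);
                medians[g] = group[(group.length - 1) / 2];
            }
            return medians;
        }
    }
    ```

    The pivot M is then the median of the returned array, found by calling the selection routine recursively on it.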

  4. Answer #4

    @android developer:

    for (i = 1 to n/5) do
        x[i] = select(S[i],3)
    

    is really

    for (i = 1 to ceiling(n/5)) do
        x[i] = select(S[i],3)
    

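    with a ceiling function appropriate for your data (e.g. in Java, one that operates on doubles). This also affects the median with respect to simply taking n/10, but we are finding the value closest to the mean that occurs in the array, not the true mean. Another thing to note is that S[i] may have fewer than 3 elements, so we want to find the median with respect to the group's length; passing it to select with k=3 will not always work. (For example, with n = 11 we have 3 subgroups: 2 with 5 elements and 1 with 1 element.)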
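
    Both adjustments can also be done in plain integer arithmetic, with no floating-point ceiling call. The sketch below assumes n >= 0 and a 1-based rank as in the pseudocode; the names GroupIndexing, groupCount and medianRank are made up for illustration.

    ```java
    // Hypothetical helpers for the group bookkeeping discussed above.
    final class GroupIndexing {
        /** Number of groups of at most five elements needed to cover n elements: ceiling(n / 5). */
        static int groupCount(int n) {
            return (n + 4) / 5;
        }

        /** 1-based rank of the (lower) median of a group with the given length. */
        static int medianRank(int groupLength) {
            return (groupLength + 1) / 2;
        }
    }
    ```

    For n = 11 this gives 3 groups, and the final one-element group gets rank 1 instead of the fixed k = 3.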

  5. Answer #5

    I don't know if you still need this problem solved, but http://www.ics.uci.edu/~eppstein/161/960130.html has an algorithm:

    select(L,k)
    {
        if (L has 10 or fewer elements)
        {
            sort L
            return the element in the kth position
        }
    
        partition L into subsets S[i] of five elements each
            (there will be n/5 subsets total).
    
        for (i = 1 to n/5) do
            x[i] = select(S[i],3)
    
        M = select({x[i]}, n/10)
    
        partition L into L1<M, L2=M, L3>M
        if (k <= length(L1))
            return select(L1,k)
        else if (k > length(L1)+length(L2))
            return select(L3,k-length(L1)-length(L2))
        else return M
    }
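
    One way to turn this pseudocode into Java is sketched below. It is an illustrative reading, not code from the linked page: the class name SelectSketch is hypothetical, k is counted from 1 as in the pseudocode, elements are assumed to be Integer values, and the median rank of each group is derived from the group's actual length so that a short last group is handled.

    ```java
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;

    // Hypothetical Java reading of the pseudocode; favors clarity over speed.
    final class SelectSketch {
        /** Returns the k-th smallest element of values, with k counted from 1. */
        static int select(List<Integer> values, int k) {
            if (k < 1 || k > values.size())
                throw new IllegalArgumentException("k out of range: " + k);
            if (values.size() <= 10) {
                List<Integer> sorted = new ArrayList<>(values); // sort a copy, not the input
                Collections.sort(sorted);
                return sorted.get(k - 1);
            }
            // x[i]: the median of each group of up to five elements.
            List<Integer> medians = new ArrayList<>();
            for (int from = 0; from < values.size(); from += 5) {
                List<Integer> group = values.subList(from, Math.min(from + 5, values.size()));
                medians.add(select(group, (group.size() + 1) / 2));
            }
            // M: the median of the group medians.
            int pivot = select(medians, (medians.size() + 1) / 2);

            // Partition into L1 < M, L2 = M (only counted), L3 > M.
            List<Integer> less = new ArrayList<>();
            List<Integer> more = new ArrayList<>();
            int same = 0;
            for (int v : values) {
                if (v < pivot) less.add(v);
                else if (v > pivot) more.add(v);
                else same++;
            }
            if (k <= less.size())
                return select(less, k);
            if (k > less.size() + same)
                return select(more, k - less.size() - same);
            return pivot;
        }
    }
    ```

    Copying into new lists keeps the sketch close to the pseudocode; an in-place partition, as in the earlier answer, avoids the extra allocations.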
    

    Good luck!