What is an efficient in-place way to sort an array based on an index array in Java?

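In some machine learning algorithms, the columns of a matrix are rotated and sorted according to the relevance of each column. New data should then be transformed into the same order. So if my initial sort yields [0,2,1,3] as the index array, new data should be ordered the same way: first, third, second, fourth element. That is why I want to create a sorted index array that can later be used as the source for reordering new data. I managed to do this in the implementation below.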

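My question is about how to reorder new data using the index array. In my implementation, I first create a clone of the new data array. That makes it easy to copy each element from the source array to the proper index in the target array. Is this the most efficient approach, or is there a more efficient one, such as sorting the data in place?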

import java.util.stream.*;
import java.util.*;

public class IndexSorter<T> {

   private final int[] indices;
   private final int[] reverted;

   public IndexSorter(T[] data, Comparator<T> comparator){
     
     // generate index array based on initial data and a comparator:
     indices = IntStream.range(0, data.length)
                        .boxed()
                        .sorted( (a, b) -> comparator.compare(data[a],data[b]))
                        .mapToInt(a -> a)
                        .toArray();

     // also create an index array to be able to revert the sort
     reverted = new int[indices.length];
     for(int i=0;i<indices.length;i++){
       reverted[indices[i]] = i;
     }
   }

   // sort new data based on initial array
   public T[] sort(T[] data){
     return sortUsing(data, indices);
   }
   
   // revert sorted data 
   public T[] revert(T[] data){
     return sortUsing(data, reverted);
   }

   private T[] sortUsing(T[] data, int[] ind){
     if(data.length != indices.length){
       throw new IllegalArgumentException(
         String.format("Data length does not match: (%s, should be: %s) "
         ,  data.length, indices.length));
     }
     // create a copy of the data (effectively this just creates a new array of the same type)
     T[] sorted = data.clone();
     // fill the copy with the sorted data
     IntStream.range(0, ind.length)
              .forEach(i -> sorted[i]=data[ind[i]]);
     return sorted;
   }
}

class App {
  public static void main(String[] args){
      IndexSorter<String> sorter = new IndexSorter<>(args, String::compareTo);
      String[] data = sorter.sort(args);
      System.out.println(Arrays.toString(data));
      data = sorter.revert(data);
      System.out.println(Arrays.toString(data));
      data = IntStream.range(0, data.length)
                                .mapToObj(Integer::toString)
                                .toArray(String[]::new);
      data = sorter.sort(data);
      System.out.println(Arrays.toString(data));
      data = sorter.revert(data);
      System.out.println(Arrays.toString(data));
  }
}
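
To make the index-array convention concrete, here is a minimal standalone sketch. The class and method names (`IndexArrayDemo`, `apply`, `invert`) are hypothetical and not part of the code above; it assumes the index array is a valid permutation of 0..n-1. It shows that slot i of the result receives `data[ind[i]]`, and that the inverse array satisfies `inverse[ind[i]] == i`, which is what the `reverted` field relies on.

```java
import java.util.Arrays;

// Hypothetical demo class; names are illustrative only.
final class IndexArrayDemo {

    // Returns a new array whose slot i holds data[ind[i]].
    static String[] apply(String[] data, int[] ind) {
        String[] out = new String[data.length];
        for (int i = 0; i < ind.length; i++) {
            out[i] = data[ind[i]];
        }
        return out;
    }

    // Builds the inverse permutation, so that inverse[ind[i]] == i.
    static int[] invert(int[] ind) {
        int[] inverse = new int[ind.length];
        for (int i = 0; i < ind.length; i++) {
            inverse[ind[i]] = i;
        }
        return inverse;
    }

    public static void main(String[] args) {
        // The example from the question: first, third, second, fourth element.
        String[] data = {"a", "b", "c", "d"};
        System.out.println(Arrays.toString(apply(data, new int[]{0, 2, 1, 3}))); // [a, c, b, d]

        // A three-element cycle, where the inverse differs from the original.
        System.out.println(Arrays.toString(invert(new int[]{1, 2, 0})));         // [2, 0, 1]
    }
}
```

With a concrete element type such as `String`, the target array can be allocated directly. For a generic `T[]`, `new T[n]` does not compile, which is presumably why the code above falls back on `data.clone()`.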

2 answers

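  1. # Answer 1

    I found a way to sort in place, using a BitSet to track which indices already hold the correct element. It is in the method sortUsing. I hope someone can make use of this algorithm.

    You can test it like this: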

    java App this is just some random test to show the result
    

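    The output first shows the sorted result, then the reverted result. The same index array is then used to sort an array of the indices themselves, followed by its reverted version: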

    [is, just, random, result, show, some, test, the, this, to]
    [this, is, just, some, random, test, to, show, the, result]
    [1, 2, 4, 9, 7, 3, 5, 8, 0, 6]
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    

    The code is as follows:

    import java.util.stream.*;
    import java.util.*;
    
    public class IndexSorter<T> {
    
      private final int[] indices;
      private final int[] reverted;
      private final BitSet done;
    
      public IndexSorter(T[] data, Comparator<T> comparator){
    
        // generate index array based on initial data and a comparator:
        indices = IntStream.range(0, data.length)
                           .boxed()
                           .sorted( (a, b) -> comparator.compare(data[a],data[b]))
                           .mapToInt(a -> a)
                           .toArray();
    
        // also create an index array to be able to revert the sort
        reverted = new int[indices.length];
        for(int i=0;i<indices.length;i++){
          reverted[indices[i]] = i;
        }
        done = new BitSet(data.length);
      }
    
      // sort new data based on initial array
      public void sort(T[] data){
        sortUsing(data, indices);
      }
    
      // revert sorted data 
      public void revert(T[] data){
        sortUsing(data, reverted);
      }
    
      private void sortUsing(T[] data, int[] ind){
        if(data.length != indices.length){
          throw new IllegalArgumentException(
              String.format("Data length does not match: (%s, should be: %s) "
                ,  data.length, indices.length));
        }
        int ia=0, ib=0, x = 0;
        T a = null, b = null;
        for (int i=0; i< data.length && done.cardinality()<data.length; i++){
          ia = i;
          ib = ind[ia];
          if(done.get(ia)){ // index is already done
            continue;
          } 
          if(ia==ib){       // element is at the right place
            done.set(ia);
            continue;
          }
          x = ia;           // start a loop at x = ia 
                            // some next index will be x again eventually
          a = data[ia];     // keep element a as the last value after the loop 
          while(ib!=x && !done.get(ia) ){
            b = data[ib];   // element from index b must go to index a 
            data[ia]=b;
            done.set(ia);
            ia = ib;
            ib = ind[ia];   // get next index
          }
          data[ia]=a;       // set value a to last index
          done.set(ia);
        }
        done.clear();
      }
    }
    
    class App {
      public static void main(String[] args){
        IndexSorter<String> sorter = new IndexSorter<>(args, String::compareTo);
        sorter.sort(args);
        System.out.println(Arrays.toString(args));
        sorter.revert(args);
        System.out.println(Arrays.toString(args));
        String[] data = IntStream.range(0, args.length)
          .mapToObj(Integer::toString)
          .toArray(String[]::new);
        sorter.sort(data);
        System.out.println(Arrays.toString(data));
        sorter.revert(data);
        System.out.println(Arrays.toString(data));
      }
    }
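
    As a point of comparison, the same cycle-following idea can be sketched without a separate BitSet. This is a different marking technique, not the code above: each visited slot of the index array is temporarily overwritten with its bitwise complement (always negative for a valid index), and the array is restored at the end. The class name `InPlacePermutation` and method `apply` are hypothetical. The sketch assumes `ind` is a valid permutation of 0..n-1 and, because it writes to `ind` while running, that no other thread uses the same index array concurrently.

    ```java
    import java.util.Arrays;

    // Hypothetical helper; a sketch of in-place reordering by following cycles.
    final class InPlacePermutation {

        // Rearranges data so that afterwards data[i] == oldData[ind[i]].
        // ind is modified during the call and restored before returning.
        static <T> void apply(T[] data, int[] ind) {
            if (data.length != ind.length) {
                throw new IllegalArgumentException(
                    "Data length does not match: " + data.length + ", should be: " + ind.length);
            }
            for (int start = 0; start < ind.length; start++) {
                if (ind[start] < 0) {
                    continue;               // slot was already filled by an earlier cycle
                }
                T first = data[start];      // saved, because slot 'start' is overwritten first
                int i = start;
                while (true) {
                    int next = ind[i];
                    ind[i] = ~next;         // mark slot i as done (complement is negative)
                    if (next == start) {
                        data[i] = first;    // cycle closed: last slot gets the saved element
                        break;
                    }
                    data[i] = data[next];   // element from index 'next' moves to index i
                    i = next;
                }
            }
            for (int i = 0; i < ind.length; i++) {
                ind[i] = ~ind[i];           // restore the original index values
            }
        }

        public static void main(String[] args) {
            String[] data = {"this", "is", "just", "some", "random",
                             "test", "to", "show", "the", "result"};
            int[] ind = {1, 2, 4, 9, 7, 3, 5, 8, 0, 6};
            apply(data, ind);
            System.out.println(Arrays.toString(data));
        }
    }
    ```

    Each element is moved once and each index is visited a constant number of times, so the work is linear in the array length. The cost of this variant is that the index array is not read-only during the call.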
    
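  2. # Answer 2

    I would not recommend copying the data, because that is a very expensive memory allocation. Sorting the data in place with a library method (such as Arrays.sort) is much more efficient.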
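
    For reference, a minimal sketch of the library call mentioned here. The wrapper class `ArraysSortDemo` and method `sortInPlace` are hypothetical names. Note that `Arrays.sort` orders the array by comparing its own elements; on its own it does not apply an order stored in a separate index array, so whether it fits depends on whether new data may be sorted by its own values.

    ```java
    import java.util.Arrays;
    import java.util.Comparator;

    // Hypothetical wrapper around the standard library call.
    final class ArraysSortDemo {

        // Sorts the given array in place using the supplied comparator.
        static <T> void sortInPlace(T[] data, Comparator<? super T> comparator) {
            Arrays.sort(data, comparator);
        }

        public static void main(String[] args) {
            String[] data = {"this", "is", "just", "some"};
            sortInPlace(data, String::compareTo);
            System.out.println(Arrays.toString(data)); // [is, just, some, this]
        }
    }
    ```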