Counting duplicate rows in an Excel column with Java

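I have an Excel document and I want to achieve two things:

  1. Get the number of violations for each category
  2. Get the earliest and latest violation date for each category

I can already read the document with Apache POI, but I can't work out how to compute these.

My code for reading the xlsx file looks like this: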

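import java.io.File;
import java.io.IOException;
import java.util.Iterator;

import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;

public class App {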
    public static final String SAMPLE_XLSX_FILE_PATH = "./data.xlsx";

    public static void main(String[] args) throws IOException, InvalidFormatException {

        // Creating a Workbook from an Excel file (.xls or .xlsx)
        Workbook workbook = WorkbookFactory.create(new File(SAMPLE_XLSX_FILE_PATH));

        // Retrieving the number of sheets in the Workbook
        System.out.println("Workbook has " + workbook.getNumberOfSheets() + " Sheets : ");



        Iterator<Sheet> sheetIterator = workbook.sheetIterator();
        System.out.println("Retrieving Sheets using Iterator");
        while (sheetIterator.hasNext()) {
            Sheet sheet = sheetIterator.next();
            System.out.println("=> " + sheet.getSheetName());
        }


        // Getting the Sheet at index zero
        Sheet sheet = workbook.getSheetAt(0);


        DataFormatter dataFormatter = new DataFormatter();


        System.out.println("\n\nIterating over Rows and Columns using Iterator\n");
        Iterator<Row> rowIterator = sheet.rowIterator();
        while (rowIterator.hasNext()) {
            Row row = rowIterator.next();

            // Now let's iterate over the columns of the current row
            Iterator<Cell> cellIterator = row.cellIterator();

            while (cellIterator.hasNext()) {
                Cell cell = cellIterator.next();
                String cellValue = dataFormatter.formatCellValue(cell);
                System.out.print(cellValue + "\t");
            }
            System.out.println();
        }

        // Closing the workbook
        workbook.close();
    }


}

My Excel file looks like this: [screenshot: Excel file]


1 answer

  1. # Answer 1

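    1. You have to skip the first row, because it appears to be a header row with no actual data.
    2. Since you are interested in the number of violations and their dates (earliest and latest), you only need to evaluate columns C and D (or possibly E; it is hard to tell from the Excel screenshot).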
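
The column letters above have to be turned into the zero-based indexes that `row.getCell(...)` expects (C is 2, D is 3, E is 4). A minimal sketch of that mapping follows; `ColumnIndex` is a hypothetical helper written for illustration, not part of Apache POI. POI also ships `CellReference.convertColStringToIndex`, which should do the same job if you prefer a library call.

```java
import java.util.Locale;

// Hypothetical helper (not part of Apache POI): converts an Excel column
// letter such as "D" or "AA" into the zero-based index used by Row.getCell.
class ColumnIndex {

    static int of(String letters) {
        if (letters == null || letters.isEmpty()) {
            throw new IllegalArgumentException("Column letters must not be empty");
        }
        int index = 0;
        for (char c : letters.toUpperCase(Locale.ROOT).toCharArray()) {
            if (c < 'A' || c > 'Z') {
                throw new IllegalArgumentException("Not a column letter: " + letters);
            }
            // Column letters are a base-26 number whose digits run from A=1 to Z=26
            index = index * 26 + (c - 'A' + 1);
        }
        return index - 1; // shift from one-based to zero-based
    }

    public static void main(String[] args) {
        System.out.println("C -> " + ColumnIndex.of("C"));
        System.out.println("D -> " + ColumnIndex.of("D"));
    }
}
```

With this in place, `row.getCell(ColumnIndex.of("D"))` reads more clearly than a bare `3` when the column may still change.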

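    So, assuming the relevant date is in column D (which has index 3) and that this column is a date column, you can use something like the following: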

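    // Requires: java.util.Date, java.util.Iterator, java.util.Map, java.util.TreeMap,
    // java.util.concurrent.atomic.AtomicInteger, org.apache.poi.ss.usermodel.Row
    Iterator<Row> rowIterator = sheet.rowIterator();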
    boolean isFirstRow = true;
    Map<String, AtomicInteger> numbersByCategory = new TreeMap<>();
    Map<String, Date> earliestDateByCategory = new TreeMap<>();
    Map<String, Date> latestDateByCategory = new TreeMap<>();
    while (rowIterator.hasNext()) {
        Row row = rowIterator.next();
        if (isFirstRow) {
            isFirstRow = false; //skip the header row
            continue;
        }
    
        //create the counter on the first row seen for this category, then increment it for every row
        String category = row.getCell(2).getStringCellValue();
        AtomicInteger numberToCategory = numbersByCategory.get(category);
        if (numberToCategory == null) {
            numberToCategory = new AtomicInteger(0);
            numbersByCategory.put(category, numberToCategory);
        }
        numberToCategory.incrementAndGet();
    
        Date relevantDateOfRow = row.getCell(3).getDateCellValue();
    
        //now check if the date is a new 'earliest' for the category
        Date earliestDateToCategory = earliestDateByCategory.get(category);
        if (earliestDateToCategory == null || relevantDateOfRow.before(earliestDateToCategory)) {
            earliestDateByCategory.put(category, relevantDateOfRow); //we found a new earliest date
        }
    
        //now check if the date is a new 'latest' for the category
        Date latestDateToCategory = latestDateByCategory.get(category);
        if (latestDateToCategory == null || relevantDateOfRow.after(latestDateToCategory)) {
            latestDateByCategory.put(category, relevantDateOfRow); //we found a new latest date
        }
    }
    
    //now you have 3 results that can be used for further processing (e.g. show on screen)
    System.out.println("Number: " + numbersByCategory);
    System.out.println("Earliest: " + earliestDateByCategory);
    System.out.println("Latest: " + latestDateByCategory);
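
The three maps above can also be folded into a single map that holds one summary object per category. The sketch below shows only the aggregation step on plain Java data, so it runs without Apache POI; it assumes the category and date of each row have already been read from the sheet as strings, with dates in ISO format (`yyyy-MM-dd`). `ViolationStats`, `Summary` and the sample category names are made up for illustration.

```java
import java.time.LocalDate;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper (not part of Apache POI): aggregates (category, date)
// pairs that were already extracted from the spreadsheet rows.
class ViolationStats {

    /** Running count plus earliest and latest date for one category. */
    static final class Summary {
        int count;
        LocalDate earliest;
        LocalDate latest;

        void add(LocalDate date) {
            count++;
            if (earliest == null || date.isBefore(earliest)) {
                earliest = date;
            }
            if (latest == null || date.isAfter(latest)) {
                latest = date;
            }
        }

        @Override
        public String toString() {
            return "count=" + count + ", earliest=" + earliest + ", latest=" + latest;
        }
    }

    /** Each row is {category, ISO date}; the result is sorted by category name. */
    static Map<String, Summary> summarize(String[][] rows) {
        Map<String, Summary> byCategory = new TreeMap<>();
        for (String[] row : rows) {
            // Create the summary on first sight of a category, then update it
            byCategory.computeIfAbsent(row[0], k -> new Summary())
                      .add(LocalDate.parse(row[1]));
        }
        return byCategory;
    }

    public static void main(String[] args) {
        // Made-up sample rows; the header row is assumed to be skipped already
        String[][] rows = {
            {"Garbage", "2012-03-04"},
            {"Vermin", "2012-01-15"},
            {"Garbage", "2012-01-02"},
        };
        summarize(rows).forEach((category, s) -> System.out.println(category + ": " + s));
    }
}
```

Keeping the count and both dates in one object means a category can never appear in one result and be missing from another, which is possible when three separate maps are updated by hand.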