Java: how can I parse delimited text lines with differing field counts into objects while still allowing for extension?

For example:

SEG1|asdasd|20111212|asdsad
SEG2|asdasd|asdasd
SEG3|sdfsdf|sdfsdf|sdfsdf|sdfsfsdf
SEG4|sdfsfs|

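Essentially, each SEG* line needs to be parsed into a corresponding object that defines what each field is. Some fields, such as the third field in SEG1, will be parsed as a Date.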

It is common for an additional field to be appended to a segment later on, for example:

SEG1|asdasd|20111212|asdsad|12334455

At the moment, I am thinking of using the following type of algorithm:

List<String> segments = Arrays.asList(string.split("\r")); // Will always be a CR.
List<String> fields;
String fieldName;
for (String segment : segments) {
    fields = Arrays.asList(segment.split("\\|"));
    fieldName = fields.get(0); // The first field is the segment name.
    SEG1 seg1;
    if (fieldName.equals("SEG1")) {
        seg1 = new SEG1();
        seg1.setField1(fields.get(1));
        seg1.setField2(fields.get(2));
        seg1.setField3(fields.get(3));
    } else if (fieldName.equals("SEG2")) {
        ...
    } else if (fieldName.equals("SEG3")) {
        ...
    } else {
        // Erroneous/failure case.
    }
}

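Depending on the object being populated, some fields may also be optional. My concern is that if I add a new field to a class, any checks that rely on the expected field count will also need to be updated. How can I parse these lines while still allowing new fields to be added to the populated classes, or existing field types to be changed?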

4 answers

# Answer 1

If you can define a common interface for all the classes to be parsed, I would suggest the following:

interface Segment {}

class SEG1 implements Segment {
    void setField1(final String field) {}
    void setField2(final String field) {}
    void setField3(final String field) {}
}

enum Parser {
    SEGMENT1("SEG1") {
        @Override
        protected Segment parse(final String[] fields) {
            final SEG1 segment = new SEG1();
            segment.setField1(fields[0]);
            segment.setField2(fields[1]);
            segment.setField3(fields[2]);
            return segment;
        }
    },
    ...
    ;

    // Segment name as it appears at the start of a line.
    private final String name;

    private Parser(final String name) {
        this.name = name;
    }

    // Each enum element converts the fields of its own segment type.
    protected abstract Segment parse(String[] fields);

    public static Segment parse(final String segment) {
        final int firstSeparator = segment.indexOf('|');
        final String name = segment.substring(0, firstSeparator);
        final String[] fields = segment.substring(firstSeparator + 1).split("\\|");
        for (final Parser parser : values()) {
            if (parser.name.equals(name)) {
                return parser.parse(fields);
            }
        }
        return null; // No parser is registered for this segment name.
    }
}

For each type of segment, add an element to the enum and handle the different field types in its parse(String[]) method.
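
As a rough illustration of handling different field types inside `parse(String[])`, the sketch below parses the third SEG1 field as a date and treats trailing fields as optional, so a line that gains an extra field needs no field-count check. The names `Seg1`, `SegmentParser`, `field` and `parseLine`, the `yyyyMMdd` date format, and the meaning assigned to each field are assumptions made for this example, not part of the original answer.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;

interface Segment {}

// Hypothetical SEG1 layout: text, date, text, plus one optional trailing field.
final class Seg1 implements Segment {
    String text;
    LocalDate date;
    String note;
    String extra; // stays null when the line does not supply it
}

enum SegmentParser {
    // The constant name doubles as the segment tag found at the start of a line.
    SEG1 {
        @Override
        Segment parse(String[] f) {
            Seg1 s = new Seg1();
            s.text = field(f, 0);
            String rawDate = field(f, 1);
            // Assumption: dates are written as yyyyMMdd, e.g. 20111212.
            s.date = rawDate == null
                    ? null
                    : LocalDate.parse(rawDate, DateTimeFormatter.BASIC_ISO_DATE);
            s.note = field(f, 2);
            s.extra = field(f, 3);
            return s;
        }
    };

    abstract Segment parse(String[] fields);

    // Returns null for a missing or empty field instead of throwing.
    static String field(String[] fields, int index) {
        return index < fields.length && !fields[index].isEmpty() ? fields[index] : null;
    }

    // Splits one line and dispatches on the segment tag.
    // Throws IllegalArgumentException when no enum element matches the tag.
    static Segment parseLine(String line) {
        String[] parts = line.split("\\|", -1); // -1 keeps trailing empty fields
        String[] fields = Arrays.copyOfRange(parts, 1, parts.length);
        return valueOf(parts[0]).parse(fields);
    }
}
```

With this shape, `SegmentParser.parseLine("SEG1|asdasd|20111212|asdsad")` and the five-field variant of the same line both go through the same code path; only the element for the affected segment changes when a field is added.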

# Answer 2
I would add a header line to the file format containing the names of the fields stored in the file, so that it looks more like this:
```
(1) field1|field2|field3|field4|field5
(2) SEG1|asdasd|20111212|asdsad|
(3) SEG2|asdasd||asdasd|
(4) SEG3|sdfsdf|sdfsdf|sdfsdf|sdfsfsdf
(5) SEG4|sdfsfs|||
```
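This is very common in CSV files. I have also added more delimiters so that every line has five "values". That way, a null value can be specified simply by entering two delimiters in a row (see the third line above for an example where the null value is not the last value).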

Now the parsing code knows which fields need to be set and can call the setters using reflection in a loop. Pseudocode:
```
get the field names from the first line in the file

for (every line in the file except the first one) {

    for (every value in the line) {

        if (the value is not empty) {

            use reflection to get the setter for the field and invoke it with the
            value
        }
    }
}
```
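This allows you to extend the file with additional fields without changing the code. It also means you can have meaningful field names. Reflection can get a bit complicated with different types such as int, String, boolean and so on. So I have to say that, if you can, follow @sethu's advice and use a ready-made, proven library that does this for you.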
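
The pseudocode above could be sketched in Java roughly as follows. This is a minimal sketch under several assumptions: the target class `Row`, the helper `HeaderDrivenParser.parse`, and the convention that a header name `field1` maps to a public setter `setField1(String)` are all invented for illustration, and only String-typed setters are handled. Supporting int, boolean or Date setters would need a conversion step per parameter type, which is the complication the answer mentions.

```java
import java.lang.reflect.Method;

// Hypothetical bean: one String property per header name.
class Row {
    private String field1, field2, field3, field4, field5;
    public void setField1(String v) { field1 = v; }
    public void setField2(String v) { field2 = v; }
    public void setField3(String v) { field3 = v; }
    public void setField4(String v) { field4 = v; }
    public void setField5(String v) { field5 = v; }
    public String getField1() { return field1; }
    public String getField2() { return field2; }
    public String getField3() { return field3; }
    public String getField4() { return field4; }
    public String getField5() { return field5; }
}

final class HeaderDrivenParser {
    // Creates an instance of type and calls set<Name>(String) for every non-empty value.
    static <T> T parse(String[] header, String line, Class<T> type) {
        try {
            T target = type.getDeclaredConstructor().newInstance();
            String[] values = line.split("\\|", -1); // keep empty trailing values
            for (int i = 0; i < header.length && i < values.length; i++) {
                if (values[i].isEmpty()) {
                    continue; // an empty value leaves the property null
                }
                String setterName = "set"
                        + Character.toUpperCase(header[i].charAt(0))
                        + header[i].substring(1);
                Method setter = type.getMethod(setterName, String.class);
                setter.invoke(target, values[i]);
            }
            return target;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot populate " + type.getName(), e);
        }
    }
}
```

Adding a sixth column then means adding its name to the header line and a matching setter to the class; the parsing loop itself stays unchanged.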
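# Answer 3
1. You can use collections such as ArrayList
2. You can use var-args
If you want to make it extensible, you may want to process each segment in a loop rather than handling each occurrence separately.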
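
One possible reading of these suggestions, sketched under assumptions: a single generic class stores the segment name plus however many fields the line supplies, built through a var-args constructor and collected in a loop. `GenericSegment` and `GenericSegmentReader` are hypothetical names, and the CR separator is taken from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Holds a segment name and any number of raw fields.
final class GenericSegment {
    private final String name;
    private final List<String> fields;

    // Var-args constructor: accepts zero or more fields.
    GenericSegment(String name, String... fields) {
        this.name = name;
        this.fields = Collections.unmodifiableList(new ArrayList<>(Arrays.asList(fields)));
    }

    String name() { return name; }

    int fieldCount() { return fields.size(); }

    // Returns null when the line did not supply a field at this index.
    String field(int index) {
        return index >= 0 && index < fields.size() ? fields.get(index) : null;
    }
}

final class GenericSegmentReader {
    // Splits the input on CR and turns every non-empty line into a GenericSegment.
    static List<GenericSegment> read(String input) {
        List<GenericSegment> result = new ArrayList<>();
        for (String line : input.split("\r")) {
            if (line.isEmpty()) {
                continue;
            }
            String[] parts = line.split("\\|", -1);
            result.add(new GenericSegment(parts[0], Arrays.copyOfRange(parts, 1, parts.length)));
        }
        return result;
    }
}
```

The trade-off is that fields stay untyped strings addressed by position, so conversions such as the SEG1 date would have to happen wherever the values are read.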
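# Answer 4

Is it necessary to use this same string format with | as the delimiter? If the same classes are used to create the string, then this is an ideal case for XStream. XStream converts a Java object to XML and back again, and it takes care of the scenario where some fields are optional. You will not have to write any code that parses your text. Here is a link: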

http://x-stream.github.io/
