java OutOfOrderScannerNextException when filtering results in HBase
I am trying to filter results in HBase like this:
```java
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.CompareFilter;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.util.Bytes;

// Range filters on the "source" and "target" columns of family "cf".
List<Filter> andFilterList = new ArrayList<>();
SingleColumnValueFilter sourceLowerFilter = new SingleColumnValueFilter(Bytes.toBytes("cf"), Bytes.toBytes("source"), CompareFilter.CompareOp.GREATER, Bytes.toBytes(lowerLimit));
sourceLowerFilter.setFilterIfMissing(true);
SingleColumnValueFilter sourceUpperFilter = new SingleColumnValueFilter(Bytes.toBytes("cf"), Bytes.toBytes("source"), CompareFilter.CompareOp.LESS_OR_EQUAL, Bytes.toBytes(upperLimit));
sourceUpperFilter.setFilterIfMissing(true);
SingleColumnValueFilter targetLowerFilter = new SingleColumnValueFilter(Bytes.toBytes("cf"), Bytes.toBytes("target"), CompareFilter.CompareOp.GREATER, Bytes.toBytes(lowerLimit));
targetLowerFilter.setFilterIfMissing(true);
SingleColumnValueFilter targetUpperFilter = new SingleColumnValueFilter(Bytes.toBytes("cf"), Bytes.toBytes("target"), CompareFilter.CompareOp.LESS_OR_EQUAL, Bytes.toBytes(upperLimit));
targetUpperFilter.setFilterIfMissing(true);

andFilterList.add(sourceUpperFilter);
andFilterList.add(targetUpperFilter);
FilterList andFilter = new FilterList(FilterList.Operator.MUST_PASS_ALL, andFilterList);

List<Filter> orFilterList = new ArrayList<>();
orFilterList.add(sourceLowerFilter);
orFilterList.add(targetLowerFilter);
FilterList orFilter = new FilterList(FilterList.Operator.MUST_PASS_ONE, orFilterList);

FilterList fl = new FilterList(FilterList.Operator.MUST_PASS_ALL);
fl.addFilter(andFilter);
fl.addFilter(orFilter);

Scan edgeScan = new Scan();
edgeScan.setFilter(fl);
ResultScanner edgeScanner = table.getScanner(edgeScan);
Result edgeResult;
logger.info("Writing edges...");
while ((edgeResult = edgeScanner.next()) != null) {
    // Some code
}
```
This code throws the following error:
```
org.apache.hadoop.hbase.DoNotRetryIOException: Failed after retry of OutOfOrderScannerNextException: was there a rpc timeout?
    at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:402)
    at org.deustotech.internet.phd.framework.rdf2subdue.RDF2Subdue.writeFile(RDF2Subdue.java:150)
    at org.deustotech.internet.phd.framework.rdf2subdue.RDF2Subdue.run(RDF2Subdue.java:39)
    at org.deustotech.internet.phd.Main.main(Main.java:32)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:297)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected nextCallSeq: 1 But the nextCallSeq got from client: 0; request=scanner_id: 178 number_of_rows: 100 close_scanner: false next_call_seq: 0
    at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3098)
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29497)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2012)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
    at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:168)
    at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:39)
    at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:111)
    at java.lang.Thread.run(Thread.java:745)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
    at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
    at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:285)
    at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:204)
    at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:59)
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:114)
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:90)
    at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:354)
    ... 9 more
Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException): org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected nextCallSeq: 1 But the nextCallSeq got from client: 0; request=scanner_id: 178 number_of_rows: 100 close_scanner: false next_call_seq: 0
    at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3098)
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29497)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2012)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
    at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:168)
    at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:39)
    at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:111)
    at java.lang.Thread.run(Thread.java:745)
    at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1453)
    at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1657)
    at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1715)
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:29900)
    at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:174)
    ... 13 more
```
The RPC timeout is set to 600000. Given these results, I have tried removing some of the filters:
- sourceUpperFilter && (sourceLowerFilter || targetLowerFilter) --> success
- targetUpperFilter && (sourceLowerFilter || targetLowerFilter) --> success
- (sourceUpperFilter && targetUpperFilter) && (sourceLowerFilter) --> fail
- (sourceUpperFilter && targetUpperFilter) && (targetLowerFilter) --> fail
Any help would be appreciated. Thank you.
# Answer 1
Reason: the client requests a few rows from a big region, and filling that number of rows server-side takes time. Meanwhile the client hits its RPC timeout, so it retries the call on the same scanner. Remember that the next call from the client says "give me the next N rows from where you are now" — but the old, timed-out call was still in progress and may have already advanced past some rows, so the retried call would silently miss them. To avoid this and distinguish the situation, HBase keeps a scan sequence number and throws this exception. On seeing it, the client closes the scanner and creates a new one with the proper start row. But this recovery can only happen once; the new call can time out again.
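The sequence-number check described above can be sketched in plain Java. This is a simplified illustration of the mechanism, not HBase's actual implementation; the class and method names are invented for the demo:

```java
import java.util.ArrayList;
import java.util.List;

public class ScannerSeqDemo {
    static class OutOfOrderScannerNextException extends RuntimeException {
        OutOfOrderScannerNextException(String msg) { super(msg); }
    }

    // Stand-in for the server-side scanner: tracks the call sequence it expects.
    static class ServerScanner {
        private long nextCallSeq = 0;
        private int position = 0;
        private final List<String> rows;

        ServerScanner(List<String> rows) { this.rows = rows; }

        // Each next(batch, seq) must carry the sequence number the server expects;
        // a retry that reuses a stale number is rejected instead of skipping rows.
        List<String> next(int batch, long clientCallSeq) {
            if (clientCallSeq != nextCallSeq) {
                throw new OutOfOrderScannerNextException(
                    "Expected nextCallSeq: " + nextCallSeq
                    + " But the nextCallSeq got from client: " + clientCallSeq);
            }
            nextCallSeq++;
            int end = Math.min(position + batch, rows.size());
            List<String> out = new ArrayList<>(rows.subList(position, end));
            position = end;
            return out;
        }
    }

    public static void main(String[] args) {
        ServerScanner scanner = new ServerScanner(List.of("r1", "r2", "r3", "r4"));
        System.out.println(scanner.next(2, 0));  // first call: seq 0 succeeds
        // A timed-out client retries with the same seq (0); the server has
        // already advanced to 1, so the retry fails fast instead of losing rows.
        try {
            scanner.next(2, 0);
        } catch (OutOfOrderScannerNextException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println(scanner.next(2, 1));  // correct seq resumes the scan
    }
}
```

This is why the client-side fix is always about making each `next()` RPC finish inside the timeout, not about the filter logic itself.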
So we have to tune the timeout and/or the scan caching value. Newer HBase versions also add a scanner heartbeat mechanism that avoids timeouts during long scans.
In our case, querying a large amount of data from HBase, we used RPC timeout = 1800000 and lease period = 1800000 together with fuzzy row filters and

```java
scan.setCaching(xxxx); // value needs to be adjusted
```
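A per-scan tuning sketch along those lines, using the classic (pre-1.0) HBase client API — the property names are real, but all values here are illustrative, not recommendations:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Scan;

public class ScanTuning {
    static Scan tunedScan(Configuration conf) {
        // Longer server-side budgets for heavy scans (illustrative values).
        conf.setLong("hbase.rpc.timeout", 1800000);
        conf.setLong("hbase.regionserver.lease.period", 1800000);

        Scan scan = new Scan();
        // Fewer rows per next() RPC keeps each round trip well under the timeout.
        scan.setCaching(500);
        // Large one-off scans usually should not pollute the block cache.
        scan.setCacheBlocks(false);
        return scan;
    }
}
```

The trade-off in `setCaching` is throughput versus RPC duration: a large value means fewer round trips but each one does more server-side work, which is exactly what triggers the timeout with slow filters.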
Note: value filters are slower than row filters (because they require a full table scan).
With all of the above precautions, we successfully queried large amounts of data from HBase using MapReduce.
Hope this explanation helps.
# Answer 2
I solved this problem by setting `hbase.client.scanner.caching` (see also).
Hope this answer helps.
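For reference, a minimal sketch of setting that property client-wide on the `Configuration` (it can also go in `hbase-site.xml`); the value 100 is illustrative and should be sized so one batch comfortably fits in the RPC timeout:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class ClientCaching {
    static Configuration withScannerCaching() {
        Configuration conf = HBaseConfiguration.create();
        // Default rows returned per scanner next() RPC for all scans on this client;
        // lower values mean shorter RPCs, at the cost of more round trips.
        conf.setInt("hbase.client.scanner.caching", 100);
        return conf;
    }
}
```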