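Listing objects in S3 with a wildcard/star (*) pattern (Scala/Java)
I am trying to list all objects in an S3 bucket using a star pattern, but neither the AWS SDK nor Hadoop FS seems to support it.
First, I tried the SDK: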
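import com.amazonaws.regions.Regions
import com.amazonaws.services.s3.AmazonS3ClientBuilder
import scala.collection.JavaConverters._

// Note: the prefix contains a '*' character
val prefix = "xyz/staging/*/2020/04/05/"
val s3 = AmazonS3ClientBuilder.standard.withRegion(Regions.US_EAST_1).build
val result = s3.listObjectsV2("MyBucket", prefix)
// Convert the Java list to a Scala collection explicitly
val objects = result.getObjectSummaries.asScala
for (os <- objects) {
  println("* " + os.getKey)
}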
This approach returns nothing.
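For context, the empty result is consistent with how S3 listing works: the prefix is compared literally, so the `*` is treated as an ordinary character in the key. One possible workaround, sketched here under assumptions, is to list with only the literal part of the prefix (`xyz/staging/`) and filter the returned keys client-side. `S3Glob`, `literalPrefix`, `globToRegex`, and `filterKeys` are hypothetical helper names, not SDK APIs, and `*` is assumed to match exactly one path segment.

```scala
import java.util.regex.Pattern

// Hypothetical helpers (not part of the AWS SDK) for client-side glob filtering of S3 keys.
object S3Glob {
  // Leading part of the pattern up to the first '*'; safe to pass to S3 as a literal prefix.
  def literalPrefix(glob: String): String = glob.takeWhile(_ != '*')

  // Assumption: '*' matches one path segment (any characters except '/').
  // The pattern is treated as a key prefix, so anything may follow it.
  def globToRegex(glob: String): Pattern = {
    val body = glob
      .split("\\*", -1)
      .map(part => Pattern.quote(part))
      .mkString("[^/]*")
    Pattern.compile(body + ".*")
  }

  // Keep only the keys that start with something matching the glob.
  def filterKeys(keys: Seq[String], glob: String): Seq[String] = {
    val pattern = globToRegex(glob)
    keys.filter(key => pattern.matcher(key).matches())
  }
}
```

The keys would come from the SDK, for example by calling `listObjectsV2` with `S3Glob.literalPrefix(prefix)` as the prefix and collecting `getKey` from each summary. A listing that returns more than 1000 keys would also need pagination via the continuation token.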
Then I tried Hadoop FS:
import java.net.URI
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.fs.Path
import org.apache.hadoop.conf.Configuration
val path = "s3://MyBucket/xyz/staging/*/2020/04/05/"
// ss is the SparkSession already available in my job
val fileSystem = FileSystem.get(URI.create(path), ss.sparkContext.hadoopConfiguration)
// Recursive listing; the path still contains the '*'
val it = fileSystem.listFiles(new Path(path), true)
while (it.hasNext()) {
  ...
}
This one complains that the path does not exist, so I assume it does not understand the * pattern.
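A possible direction, as a sketch rather than a confirmed fix: `listFiles` takes a concrete path, whereas Hadoop's `FileSystem.globStatus` is the call that expands glob patterns such as `*`. The example assumes the S3 connector in use supports globbing the way the stock Hadoop file systems do; `GlobListing` and `listFilesMatching` are hypothetical names introduced for illustration.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileStatus, FileSystem, Path}
import scala.collection.mutable.ArrayBuffer

// Hypothetical helper: expand a glob first, then list files under each match.
object GlobListing {
  def listFilesMatching(fs: FileSystem, pattern: String): Seq[Path] = {
    // globStatus can return null when a pattern-free path does not exist,
    // so guard against that before iterating.
    val matches: Array[FileStatus] =
      Option(fs.globStatus(new Path(pattern))).getOrElse(Array.empty[FileStatus])

    val found = ArrayBuffer.empty[Path]
    for (status <- matches) {
      if (status.isDirectory) {
        // Recursively collect every file below the matched directory.
        val it = fs.listFiles(status.getPath, true)
        while (it.hasNext()) {
          found += it.next().getPath
        }
      } else {
        found += status.getPath
      }
    }
    found.toSeq
  }
}
```

With the values from the question, this would be called as `GlobListing.listFilesMatching(fileSystem, path)`.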