Google Cloud Storage Java API is much slower than gsutil cp for large files (20 GB)

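Moving a 20 GB file to a Google bucket with the Java Storage API takes more than 40 minutes. The same upload with gsutil cp took 4 minutes. Any idea what is going wrong with the Java Storage API?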

First attempt with the Java API

    BlobInfo blobInfo = null;
    try (BufferedInputStream inputStream = new BufferedInputStream(new FileInputStream(fileToUpload))) {
        blobInfo =
            BlobInfo.newBuilder(bucketName, bucketFilePath)
                .setContentType("application/octet-stream")
                .setContentDisposition(String.format("attachment; filename=\"%s\"", bucketFilePath))
                .setMd5(fileToUploadMd5)
                .build();
        try (WriteChannel writer = storage.writer(blobInfo, Storage.BlobWriteOption.md5Match())) {
            ByteStreams.copy(inputStream, Channels.newOutputStream(writer));
        }
    } catch (StorageException ex) {
        if (!(400 == ex.getCode() && "invalid".equals(ex.getReason()))) {
            throw ex;
        }
    }

Second attempt with the Java API

    BlobInfo blobInfo =
        BlobInfo.newBuilder(bucketName, bucketFilePath)
            .setContentType("application/octet-stream")
            .setContentDisposition(String.format("attachment; filename=\"%s\"", bucketFilePath))
            .setMd5(fileToUploadMd5)
            .build();

    // Write the file to the bucket
    writeFileToBucket(storage, fileToUpload.toPath(), blobInfo);

    private void writeFileToBucket(Storage storage, Path fileToUpload, BlobInfo blobInfo) throws Exception {
        // Code from: https://github.com/googleapis/google-cloud-java/blob/master/google-cloud-examples/src/main/java/com/google/cloud/examples/storage/StorageExample.java
        if (Files.size(fileToUpload) > 1_000_000) {
            // When content is not available or large (1MB or more) it is recommended
            // to write it in chunks via the blob's channel writer.
            try (WriteChannel writer = storage.writer(blobInfo)) {
                byte[] buffer = new byte[1024];
                try (InputStream input = Files.newInputStream(fileToUpload)) {
                    int limit;
                    while ((limit = input.read(buffer)) >= 0) {
                        try {
                            writer.write(ByteBuffer.wrap(buffer, 0, limit));
                        } catch (Exception ex) {
                            ex.printStackTrace();
                        }
                    }
                }
            }
        } else {
            byte[] bytes = Files.readAllBytes(fileToUpload);
            // create the blob in one request.
            storage.create(blobInfo, bytes);
        }
    }

Both Java API attempts took more than 40 minutes.

gsutil commands

    gcloud auth activate-service-account --key-file serviceAccountJsonKeyFile

    gsutil cp fileToUpload gs://google-bucket-name


2 Answers

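  1. # Answer 1

    gsutil has built-in optimizations for large file uploads. In particular, it makes better use of the available bandwidth by splitting the file and sending multiple parts in parallel.

    More details here

    Similar functionality is hard to implement yourself.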

  2. # Answer 2

    You have to increase the buffer size. With a 100 MB buffer, my upload speed reached 120 MB/s.
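The second answer's suggestion can be sketched as follows. This is a minimal, standard-library-only sketch, not the answerer's actual code: the class name `LargeBufferUpload` and the method `copy` are hypothetical names introduced for illustration. It replaces the 1024-byte buffer from the question with a caller-chosen size and fills that buffer completely before each write, so the target channel sees far fewer, much larger writes. Because `WriteChannel` in the Google Cloud client is a `WritableByteChannel`, the channel returned by `storage.writer(blobInfo)` could presumably be passed as `target`; the channel's own chunk size can also be raised with `WriteChannel.setChunkSize`. How much this helps will depend on the network and the client version.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of the Google Cloud client library.
public class LargeBufferUpload {

    /**
     * Copies a file to the target channel using a buffer of the given size.
     * Returns the total number of bytes written.
     */
    static long copy(Path source, WritableByteChannel target, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        try (InputStream in = Files.newInputStream(source)) {
            int n;
            // readNBytes (Java 9+) keeps reading until the buffer is full or the
            // stream ends, so every write except the last carries bufferSize bytes.
            while ((n = in.readNBytes(buffer, 0, buffer.length)) > 0) {
                ByteBuffer chunk = ByteBuffer.wrap(buffer, 0, n);
                // A channel may accept fewer bytes than offered; loop until drained.
                while (chunk.hasRemaining()) {
                    target.write(chunk);
                }
                total += n;
            }
        }
        return total;
    }
}
```

With the client from the question, a call might look like `LargeBufferUpload.copy(fileToUpload.toPath(), writer, 100 * 1024 * 1024)` inside the `try (WriteChannel writer = storage.writer(blobInfo))` block. Unlike the loop in the question, this sketch lets write failures propagate instead of printing them and continuing, since a swallowed failure would leave a corrupt upload.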