What is the recommended way to use a FileDataset on AmlCompute when submitting an estimator-based run (with Docker enabled)?
My FileDataset is about 1.5 GB and contains 1000 images.
I also have a TabularDataset that references the images in that FileDataset. Depending on the model I am trying to train, the TabularDataset contains either classes or references to other (mask) images.
So, to load the images into memory (as np.array), I have to read them from the file location based on the filenames in the TabularDataset.
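For reference, the loading step looks roughly like this; a minimal sketch only, assuming a hypothetical TabularDataset name ('chart-tabulardata'), a 'filename' column, a local image root, and Pillow for decoding:
import os
import numpy as np
from PIL import Image
tabular_dataset = ws.datasets['chart-tabulardata']  # hypothetical TabularDataset name
image_root = 'chartimages'  # local folder after mounting or downloading the FileDataset
def load_image(path):
    # decode one image file into a numpy array
    return np.asarray(Image.open(path))
df = tabular_dataset.to_pandas_dataframe()
images = [load_image(os.path.join(image_root, name)) for name in df['filename']]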
At this point I see two options, but neither is feasible, because both take far too long (more than an hour) to complete:
Mount the FileDataset
image_dataset = ws.datasets['imagedata']
mounted_images = image_dataset.mount()
mounted_images.start()
print('Data set mounted', datetime.datetime.now())
load_image(mounted_images.mount_point + '/myfilename.png')
Download the FileDataset
image_dataset = ws.datasets['chart-imagedata']
image_dataset.download(target_path='chartimages', overwrite=False)
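To see where the time actually goes, each option can be wrapped in a simple timer; a minimal sketch (nothing Azure-specific, shown here for the download variant):
import time
import datetime
start = time.perf_counter()
image_dataset = ws.datasets['chart-imagedata']
image_dataset.download(target_path='chartimages', overwrite=False)
print('Download took %.1f s' % (time.perf_counter() - start), datetime.datetime.now())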
I want to start the estimator on AmlCompute as fast as possible, and to access the files as quickly and easily as possible.
I looked at this post on Stack Overflow, where they suggest updating the azureml SDK packages inside the train.py script; I applied that, but it made no difference.
Edit (more info):
The VM size of my compute target is STANDARD_D2_V2 (a cluster that scales between 0 and 4 nodes, but only 1 node is used).
The train.py I am using (for reproduction purposes only):
# Force latest prerelease version of certain packages
import subprocess
import sys
def install(package):
    subprocess.check_call([sys.executable, "-m", "pip", "install", "--upgrade", "--pre", package])
install('azureml-core')
install('azureml-sdk')
# General references
import argparse
import os
import numpy as np
import pandas as pd
import datetime
from azureml.core import Workspace, Dataset, Datastore, Run, Experiment
import time
ws = Run.get_context().experiment.workspace
# Download file data set
print('Downloading data set', datetime.datetime.now())
image_dataset = ws.datasets['chart-imagedata']
image_dataset.download(target_path='chartimages', overwrite=False)
print('Data set downloaded', datetime.datetime.now())
# mount file data set
print('Mounting data set', datetime.datetime.now())
image_dataset = ws.datasets['chart-imagedata']
mounted_images = image_dataset.mount()
mounted_images.start()
print('Data set mounted', datetime.datetime.now())
print('Training finished')
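A variation I have not shown above, sketched only under the assumption that the FileDataset is passed to the run as a named input at submission time ('chart_images' is a hypothetical input name): the script would then read the mount/download location from the run context instead of calling mount()/download() itself:
# only valid if the dataset was passed via as_named_input('chart_images') when submitting
run = Run.get_context()
images_path = run.input_datasets['chart_images']
print('Images available under', images_path)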
I am using the TensorFlow estimator:
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException
from azureml.train.dnn import TensorFlow
# Choose a name for your training cluster
gpu_cluster_name = "g-train-cluster"
# Verify that cluster does not exist already
try:
    gpu_cluster = ComputeTarget(workspace=ws, name=gpu_cluster_name)
    print('Found existing cluster, use it.')
except ComputeTargetException:
    compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_D2_V2',
                                                           max_nodes=4, min_nodes=0)
    gpu_cluster = ComputeTarget.create(ws, gpu_cluster_name, compute_config)
    print('Creating new cluster')
constructor_parameters = {
    'source_directory': training_name,
    'script_params': script_parameters,
    'compute_target': gpu_cluster,
    'entry_script': 'train.py',
    'pip_requirements_file': 'requirements.txt',
    'use_gpu': True,
    'framework_version': '2.0',
    'use_docker': True}
estimator = TensorFlow(**constructor_parameters)
run = self.__experiment.submit(estimator)
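For completeness, the submission-side counterpart of the variation sketched after train.py above; again only a sketch, assuming the v1 azureml-sdk Dataset API and that the estimator accepts an inputs list ('chart_images' is a hypothetical input name):
image_dataset = ws.datasets['chart-imagedata']
# hand the FileDataset to the run as a mounted named input so the runtime handles mounting
constructor_parameters['inputs'] = [image_dataset.as_named_input('chart_images').as_mount()]
estimator = TensorFlow(**constructor_parameters)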