How to allow only specific machines to run a task in Luigi

import luigi


class DownloadSQLData(luigi.Task):
    # ...
    def run(self):
        # Only Machine A can do this
        # ...


class TransformData(luigi.Task):
    # ...
    def requires(self):
        return DownloadSQLData(date=self.date)


class UploadToDrive(luigi.Task):
    # ...
    def requires(self):
        return TransformData(date=self.date)

    def run(self):
        # Only Machine B can do this
        # ...


class DoSomethingElseWithData(luigi.Task):
    # ...
    def requires(self):
        return TransformData(date=self.date)

1 Answer

User

#1 · Posted on 2024-09-23 16:25:32

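Luigi itself does not do this kind of scheduling: it cannot pin certain tasks to certain machines, and it cannot schedule tasks to run at a particular time. That said, there are several ways to get what you want.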

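Solution 1: Introduce a machine C that has access to both machine A and machine B. Using one of the many SSH tools available (https://wiki.python.org/moin/SecureShell), machine C can run a task that retrieves the data from A, transform the data on C, and then transfer the result to B before the upload.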
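A minimal sketch of how machine C might drive Solution 1 using plain `ssh` and `scp` through `subprocess`. The host names, file paths, and the scripts `download_sql.py`, `transform.py`, and `upload_to_drive.py` are all hypothetical placeholders, not part of the question; it also assumes key-based SSH access from C to A and B.

```python
import subprocess
from typing import Callable, List

# Hypothetical host names; replace with your own.
MACHINE_A = "user@machine-a"
MACHINE_B = "user@machine-b"


def build_steps(date: str) -> List[List[str]]:
    """Return the commands machine C runs, in order, for one date."""
    raw = f"/tmp/raw_{date}.csv"          # hypothetical path
    out = f"/tmp/transformed_{date}.csv"  # hypothetical path
    return [
        # 1. Export the SQL data on A (hypothetical script living on A).
        ["ssh", MACHINE_A, f"python download_sql.py --date {date} --out {raw}"],
        # 2. Copy the export from A to C.
        ["scp", f"{MACHINE_A}:{raw}", raw],
        # 3. Transform locally on C (hypothetical script).
        ["python", "transform.py", raw, out],
        # 4. Copy the transformed file from C to B.
        ["scp", out, f"{MACHINE_B}:{out}"],
        # 5. Upload from B (hypothetical script living on B).
        ["ssh", MACHINE_B, f"python upload_to_drive.py {out}"],
    ]


def run_steps(steps: List[List[str]], runner: Callable = subprocess.run) -> None:
    """Run each command in order, stopping at the first failure."""
    for cmd in steps:
        runner(cmd, check=True)
```

Calling `run_steps(build_steps("2024-09-23"))` on machine C would execute the five steps in order. Each step could equally be wrapped in the `run()` method of a Luigi task on C, so that Luigi still tracks the dependencies.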

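Solution 2: This solution is most likely overkill and/or infeasible. Set up machines A, B, and C under a network scheduler (something like Slurm, https://www.schedmd.com/), with C as the head scheduler and A and B designated as specific types of resource (perhaps SQL and GDrive). Then, from C, schedule the Luigi jobs as Slurm tasks (https://github.com/pharmbio/sciluigi can help with this). Each Slurm task should specify the resource it requires. That's it!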
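As a rough illustration of Solution 2, the sketch below builds `sbatch` commands that restrict each Luigi invocation to nodes carrying a given feature, using Slurm's `--constraint` and `--wrap` options. The feature names `sql` and `gdrive` and the module name `pipeline` are assumptions; the features would have to be assigned to nodes A and B in the Slurm configuration, and the task outputs would need to live somewhere both nodes can see so that Luigi can tell upstream tasks are complete.

```python
import shlex
from typing import List


def sbatch_command(feature: str, luigi_args: List[str]) -> List[str]:
    """Build an sbatch command that runs `luigi ...` on a node with `feature`."""
    wrapped = shlex.join(["luigi", *luigi_args])
    return ["sbatch", f"--constraint={feature}", f"--wrap={wrapped}"]


# Hypothetical module name "pipeline" and feature names "sql" / "gdrive".
download = sbatch_command(
    "sql", ["--module", "pipeline", "DownloadSQLData", "--date", "2024-09-23"]
)
upload = sbatch_command(
    "gdrive", ["--module", "pipeline", "UploadToDrive", "--date", "2024-09-23"]
)
```

Each list can be passed to `subprocess.run(..., check=True)` on machine C to submit the job.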
