Lambda: automatically delete a transcription job once it completes

Published 2024-10-05 17:44:03


I want to edit my Lambda so that it deletes the transcription job once the job status reads "COMPLETED". I have the following code:

    import json
    import time
    import boto3
    from urllib.request import urlopen

    def lambda_handler(event, context):
        transcribe = boto3.client("transcribe")
        s3 = boto3.client("s3")

        if event:
            file_obj = event["Records"][0]
            bucket_name = str(file_obj["s3"]["bucket"]["name"])
            file_name = str(file_obj["s3"]["object"]["key"])
            s3_uri = create_uri(bucket_name, file_name)
            file_type = file_name.split("2019.")[1]
            job_name = file_name
            transcribe.start_transcription_job(TranscriptionJobName=job_name,
                                                Media ={"MediaFileUri": s3_uri},
                                                MediaFormat = file_type,
                                                LanguageCode = "en-US",
                                                Settings={
                                                    "VocabularyName": "Custom_Vocabulary_by_Brand_Other_Brands",
                                                    "ShowSpeakerLabels": True,
                                                    "MaxSpeakerLabels": 4
                                                })


            while True:
                status = transcribe.get_transcription_job(TranscriptionJobName=job_name)
                if status["TranscriptionJob"]["TranscriptionJobStatus"] in ["FAILED"]:
                    break
                print("It's in progress")
            while True:
                status = transcribe.get_transcription_job(TranscriptionJobName=job_name)
                if status["TranscriptionJob"]["TranscriptionJobStatus"] in ["COMPLETED"]:
                    transcribe.delete_transcription_job(TranscriptionJobName=job_name
                )

                time.sleep(5)

            load_url = urlopen(status["TranscriptionJob"]["Transcript"]["TranscriptFileUri"])
            load_json = json.dumps(json.load(load_url))

            s3.put_object(Bucket = bucket_name, Key = "transcribeFile/{}.json".format(job_name), Body=load_json)


        # TODO implement
        return {
            'statusCode': 200,
            'body': json.dumps('Hello from Lambda!')
        }

    def create_uri(bucket_name, file_name):
        return "s3://"+bucket_name+"/"+file_name

The part that handles the job is:

    while True:
        status = transcribe.get_transcription_job(TranscriptionJobName=job_name)
        if status["TranscriptionJob"]["TranscriptionJobStatus"] in ["FAILED"]:
            break
        print("It's in progress")
    while True:
        status = transcribe.get_transcription_job(TranscriptionJobName=job_name)
        if status["TranscriptionJob"]["TranscriptionJobStatus"] in ["COMPLETED"]:
            transcribe.delete_transcription_job(TranscriptionJobName=job_name
        )

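While the job is running, it should print "It's in progress"; once the status reads "COMPLETED", it should delete the job.

Any idea why my current code doesn't work? It completes the transcription job but doesn't delete it.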


2 answers

Sorry folks, I took another look and had made a very silly mistake. My transcribe.delete_transcription_job(TranscriptionJobName=job_name call was completely wrong.

Please find the corrected, working code below:

import json
import time
import boto3
from urllib.request import urlopen

def lambda_handler(event, context):
    transcribe = boto3.client("transcribe")
    s3 = boto3.client("s3")

    if event:
        file_obj = event["Records"][0]
        bucket_name = str(file_obj["s3"]["bucket"]["name"])
        file_name = str(file_obj["s3"]["object"]["key"])
        s3_uri = create_uri(bucket_name, file_name)
        file_type = file_name.split("2019.")[1]
        job_name = file_name
        transcribe.start_transcription_job(TranscriptionJobName=job_name,
                                            Media ={"MediaFileUri": s3_uri},
                                            MediaFormat = file_type,
                                            LanguageCode = "en-US",
                                            Settings={
                                                "VocabularyName": "Custom_Vocabulary_by_Brand_Other_Brands",
                                                "ShowSpeakerLabels": True,
                                                "MaxSpeakerLabels": 4
                                            })


        # Poll until the job reaches a terminal state.
        while True:
            status = transcribe.get_transcription_job(TranscriptionJobName=job_name)
            job_status = status["TranscriptionJob"]["TranscriptionJobStatus"]
            if job_status in ["COMPLETED", "FAILED"]:
                break
            print("It's in progress")
            time.sleep(5)

        # Only a COMPLETED job has a transcript to copy into the bucket.
        if job_status == "COMPLETED":
            load_url = urlopen(status["TranscriptionJob"]["Transcript"]["TranscriptFileUri"])
            load_json = json.dumps(json.load(load_url))
            s3.put_object(Bucket = bucket_name, Key = "transcribeFile/{}.json".format(job_name), Body=load_json)

        # Delete the job only after its transcript has been saved (or it has failed).
        transcribe.delete_transcription_job(TranscriptionJobName=job_name)


    # TODO implement
    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!')
    }

def create_uri(bucket_name, file_name):
    return "s3://"+bucket_name+"/"+file_name

You shouldn't poll for information if you can avoid it, especially in Lambda.

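The correct way to respond to a transcription job state change is to use CloudWatch Events. For example, you can configure a rule that routes an event to an AWS Lambda function when a transcription job completes successfully.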
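As a rough sketch of what such a rule could look like when created from Python: the event pattern below matches the "Transcribe Job State Change" events from `aws.transcribe`. The helper names `build_event_pattern` and `create_rule`, the target Id `transcribe-cleanup`, and the `rule_name` / `lambda_arn` arguments are placeholders of mine, not anything from the question. `events_client` is assumed to be `boto3.client("events")`.

```python
import json


def build_event_pattern(statuses=("COMPLETED",)):
    """Return an event pattern matching Transcribe job state changes."""
    return {
        "source": ["aws.transcribe"],
        "detail-type": ["Transcribe Job State Change"],
        "detail": {"TranscriptionJobStatus": list(statuses)},
    }


def create_rule(events_client, rule_name, lambda_arn, statuses=("COMPLETED",)):
    """Create the rule and point it at the Lambda function.

    events_client is assumed to be boto3.client("events"); rule_name and
    lambda_arn are placeholders supplied by the caller.
    """
    events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(build_event_pattern(statuses)),
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "transcribe-cleanup", "Arn": lambda_arn}],
    )
```

Note that the Lambda function also needs a resource-based permission allowing `events.amazonaws.com` to invoke it; that step is left out here. The same rule can be set up in the console instead if you prefer.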

When the Lambda function is invoked because of a state change in a transcription job, it receives event data such as:

{
    "version": "0",
    "id": "1a234567-1a6d-3ab4-1234-abf8b19be1234",
    "detail-type": "Transcribe Job State Change",
    "source": "aws.transcribe",
    "account": "123456789012",
    "time": "2019-11-19T10:00:05Z",
    "region": "us-east-1",
    "resources": [],
    "detail": {
        "TranscriptionJobName": "my-transcribe-test",
        "TranscriptionJobStatus": "COMPLETED"
    }
}

Use TranscriptionJobName to correlate the state change back to the original job.
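For illustration, a minimal handler for that event might look like the following. It reads the job name and status from `detail` and deletes the job once it is finished, with no polling loop. The helper name `handle_state_change` and its return dict are my own invention; the client is passed in so the logic can be exercised without AWS access.

```python
def handle_state_change(event, transcribe_client):
    """Delete the transcription job named in a state-change event.

    transcribe_client is assumed to be boto3.client("transcribe").
    """
    detail = event.get("detail", {})
    job_name = detail.get("TranscriptionJobName")
    job_status = detail.get("TranscriptionJobStatus")

    # Only delete jobs that have reached a terminal state.
    if job_name and job_status in ("COMPLETED", "FAILED"):
        transcribe_client.delete_transcription_job(TranscriptionJobName=job_name)
        return {"job": job_name, "status": job_status, "deleted": True}
    return {"job": job_name, "status": job_status, "deleted": False}


def lambda_handler(event, context):
    # Imported here so the helper above can be tested without boto3 installed.
    import boto3

    return handle_state_change(event, boto3.client("transcribe"))
```

If you still need the transcript, copy it to your own bucket inside the handler before calling `delete_transcription_job`.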
