如何创建红移卸载头

2024-09-27 00:14:31 发布

您现在位置:Python中文网/ 问答频道 /正文

我有一个redshift的unload查询,它提供了所需的数据。我希望在python脚本中添加一个头行,该脚本将写入每个文件。提前谢谢。在

标题行:应该与此完全相同。 ~id~从~to~标签开始时间:日期结束时间:日期accountid:字符串srcaddr:字符串dstaddr:字符串srcport:字符串dsport:字符串协议:字符串数据包:Int字节:内景操作:字符串在

    def handler(event, context):
        now   = datetime.utcnow().replace(second=0, microsecond=0)
        past  = now - timedelta(minutes=120)

        cmd = """unload (SELECT 'pk' AS id,
        'version' AS version,
        'accountid' AS accountid,
        'interfaceid' AS interfaceid,
        'srcaddr' AS srcaddr,
        'dstaddr' AS dstaddr,
        'srcport' AS srcport,
        'dstport' AS dstport,
        'protocol' AS protocol,
        'packets' AS packets,
        'bytes' AS bytes,
        'starttime' AS startime,
        'endtime' AS endtime,
        'action' AS action,
        'logstatus' AS logstatus,
        'fromeni' AS fromeni,
        'toeni' AS toeni
    UNION ALL SELECT
        cast(pk AS varchar(255)) AS id,
        cast(version AS varchar(255)) AS version,
        cast(accountid AS varchar(255)) AS accountid,
        cast(interfaceid AS varchar(255)) AS interfaceid,
        cast(srcaddr AS varchar(255)) AS srcaddr,
        cast(dstaddr AS varchar(255)) AS dstaddr,
        cast(srcport AS varchar(255)) AS srcport,
        cast(dstport AS varchar(255)) AS dstport,
        cast(protocol AS varchar(255)) AS protocol,
        cast(packets AS varchar(255)) AS packets,
        cast(bytes AS varchar(255)) AS bytes,
        cast(starttime AS varchar(255)) AS starttime,
        cast(endtime AS varchar(255)) AS endtime,
        action,
        logstatus,
        cast(fromeni AS varchar(255)) AS fromeni,
        cast(toeni AS varchar(255)) AS toeni
    FROM dimension.vpc_flow_logs)\
            to 's3://xxxxxxxxxx/' \
            iam_role 'arn:aws:iam::xxxxxxxxx:role/task' \
            delimiter as ',' \
            PARALLEL OFF \
            ESCAPE \
            manifest allowoverwrite"""

        try:
            cursor = conn.cursor()
            cursor.execute(cmd)
            conn.commit()
        finally:
            cursor.close()
            conn.close()

    if __name__ == '__main__':
        handler({},{})

Tags: 字符串bytesversionasprotocolpacketscastvarchar
1条回答
网友
1楼 · 发布于 2024-09-27 00:14:31

好消息!截至2018年10月2日,Redshift开始在UNLOAD命令中推出对HEADER选项的支持,该命令将列标题导出为csv的标题。在

以下是他们的声明: https://forums.aws.amazon.com/ann.jspa?annID=6136

文档也已更新: https://docs.aws.amazon.com/redshift/latest/dg/r_UNLOAD.html

例如

UNLOAD ('SELECT * FROM customers')
TO 's3://<bucket-name>/customers.csv'
CREDENTIALS 'aws_iam_role=arn:aws:iam::<IAM_USER_ID#>:role/myRedshiftRole'
PARALLEL OFF DELIMITER ',' HEADER;

我刚在一个有100-300列的数据库上试过,效果不错。UNION列定义的日子一去不复返了!为任何使用红移的数据科学家节省了很多时间,并且需要为第三方平台导出数据!在

也可以使用默认的PARALLEL ON处理多个输出文件。在

相关问题 更多 >

    热门问题