Redshift COPY command does not copy values for some fields

Posted 2024-09-27 00:16:54


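I have created a staging table and want to populate it from JSON files in an S3 bucket using the COPY command. It works, but the values of firstName, itemInSession, lastName, sessionId, userAgent, and userId are not copied.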

table.py

from dwh_conn import *

drop_staging_event = "DROP TABLE IF EXISTS staging_event;"

create_staging_event = ("""
CREATE TABLE IF NOT EXISTS staging_event(
    artist VARCHAR,
    auth VARCHAR,
    firstName VARCHAR,
    gender TEXT,
    itemInSession INTEGER,
    lastName VARCHAR,
    length FLOAT,
    level TEXT,
    location VARCHAR,
    method TEXT,
    page TEXT,
    registration FLOAT,
    sessionId INTEGER,
    song VARCHAR,
    status INTEGER,
    ts BIGINT,
    userAgent VARCHAR,
    userId VARCHAR
    );""")

load_staging_event = ("""COPY staging_event FROM 's3://dwh-training/data/log_data/2018/11/'
    credentials 'aws_iam_role={}' region 'us-east-2' JSON 'auto';""").format(ROLE_ARN)

The following query returns empty values:

SELECT firstName, itemInSession, lastName, sessionId, userAgent, userId FROM staging_event LIMIT 5;
 firstname | iteminsession | lastname | sessionid | useragent | userid 
-----------+---------------+----------+-----------+-----------+--------
           |               |          |           |           | 
           |               |          |           |           | 
           |               |          |           |           | 
           |               |          |           |           | 
           |               |          |           |           | 
(5 rows)

But the other fields do have values:

SELECT artist, auth, gender, length, level, location, method, page, registration, song, status, ts FROM staging_event LIMIT 5;
          artist           |   auth    | gender |  length   | level |         location         | method |   page   | registration  |              song               | status |      ts       
---------------------------+-----------+--------+-----------+-------+--------------------------+--------+----------+---------------+---------------------------------+--------+---------------
 N.E.R.D. FEATURING MALICE | Logged In | M      |  288.9922 | free  | New Orleans-Metairie, LA | PUT    | NextSong | 1541033612796 | Am I High (Feat. Malice)        |    200 | 1541121934796
                           | Logged In | F      |           | free  | Lubbock, TX              | GET    | Home     | 1540708070796 |                                 |    200 | 1541122176796
 Death Cab for Cutie       | Logged In | F      | 216.42404 | free  | Lubbock, TX              | PUT    | NextSong | 1540708070796 | A Lack Of Color (Album Version) |    200 | 1541122241796
 Tracy Gang Pussy          | Logged In | F      | 221.33506 | free  | Lubbock, TX              | PUT    | NextSong | 1540708070796 | I Have A Wish                   |    200 | 1541122457796
 Skillet                   | Logged In | M      | 178.02404 | free  | Harrisburg-Carlisle, PA  | PUT    | NextSong | 1540006905796 | Monster (Album Version)         |    200 | 1541126568796
(5 rows)

Does anyone know why the empty fields are not being populated? Thanks.


1 answer
User
#1 · posted 2024-09-27 00:16:54

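Redshift COPY with JSON 'auto' loads only the JSON values it can match to the table. The match is made between column names and top-level JSON keys, and the names must match exactly, including case. Redshift folds unquoted column names to lowercase (the query output above shows firstname, iteminsession, and so on), so camelCase keys such as firstName will not match. Check whether the names match exactly.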
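To make the exact-match point concrete, here is a minimal sketch that checks which JSON keys would fail to match once the column names have been folded to lowercase. It assumes the log records use the same camelCase keys as the CREATE TABLE statement in the question; the actual JSON was not posted, so that is an assumption. The helper names are hypothetical. The second function shows the 'auto ignorecase' option of COPY, which is not part of the answer above but is one way to get a case-insensitive match.

```python
# Column names exactly as written in the CREATE TABLE statement in the question.
DECLARED_COLUMNS = [
    "artist", "auth", "firstName", "gender", "itemInSession", "lastName",
    "length", "level", "location", "method", "page", "registration",
    "sessionId", "song", "status", "ts", "userAgent", "userId",
]


def keys_without_exact_match(json_keys, declared_columns):
    """Return JSON keys with no case-sensitive match among the column names
    after those names have been folded to lowercase, as Redshift does for
    unquoted identifiers."""
    stored_names = {name.lower() for name in declared_columns}
    return [key for key in json_keys if key not in stored_names]


# Assumption: the JSON records use the same camelCase keys as the DDL.
assumed_json_keys = list(DECLARED_COLUMNS)
unmatched = keys_without_exact_match(assumed_json_keys, DECLARED_COLUMNS)
print(unmatched)


def build_copy_ignorecase(role_arn):
    """Build a COPY statement that matches JSON keys case-insensitively.
    The S3 path and region are the ones used in the question."""
    return (
        "COPY staging_event FROM 's3://dwh-training/data/log_data/2018/11/' "
        "credentials 'aws_iam_role={}' region 'us-east-2' "
        "JSON 'auto ignorecase';"
    ).format(role_arn)
```

Under that assumption, the printed list contains the same six columns that came back empty in the question.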

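A better way to do the matching is to use a jsonpaths file, which maps JSON keys to Redshift columns. You can change names there, and you can also load data from JSON keys that are not at the top level.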
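A jsonpaths file could be generated roughly as follows. This is a sketch under stated assumptions: the JSON keys are assumed to be the camelCase names from the question's CREATE TABLE statement, and the S3 location of the jsonpaths file is a hypothetical placeholder. The path expressions must be listed in the same order as the table's columns, because COPY maps them to columns by position.

```python
import json

# JSON keys in the same order as the columns of staging_event.
# Assumption: the keys in the log files are spelled like this.
JSON_KEYS_IN_COLUMN_ORDER = [
    "artist", "auth", "firstName", "gender", "itemInSession", "lastName",
    "length", "level", "location", "method", "page", "registration",
    "sessionId", "song", "status", "ts", "userAgent", "userId",
]

# Hypothetical location; upload the generated file wherever suits your bucket.
JSONPATHS_S3_URI = "s3://dwh-training/staging_event_jsonpaths.json"


def build_jsonpaths(keys):
    """Build the jsonpaths document: one path expression per table column,
    in column order, using bracket notation for top-level keys."""
    return {"jsonpaths": ["$['{}']".format(key) for key in keys]}


def build_copy_with_jsonpaths(role_arn, jsonpaths_uri=JSONPATHS_S3_URI):
    """Build a COPY statement that points at the jsonpaths file instead of
    'auto'. The data path and region are the ones used in the question."""
    return (
        "COPY staging_event FROM 's3://dwh-training/data/log_data/2018/11/' "
        "credentials 'aws_iam_role={}' region 'us-east-2' "
        "JSON '{}';"
    ).format(role_arn, jsonpaths_uri)


jsonpaths_text = json.dumps(build_jsonpaths(JSON_KEYS_IN_COLUMN_ORDER), indent=2)
print(jsonpaths_text)
```

The printed text is what would be saved and uploaded to S3 before running the COPY statement. Because the mapping is positional, the column names in the table no longer need to resemble the JSON keys at all.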

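This is the most common cause I know of. If you need more information, reply here; you will probably need to provide a sample of the JSON being loaded.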
