Python csv module and embedded JSON strings (python+Oracle+csv+JSON)

Posted 2024-10-02 12:29:23


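I'm importing a small amount of baseline data into my tables right after creating them. Only one table is giving me trouble, and that is because one of its fields contains JSON.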

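I haven't yet found a csv dialect that correctly interprets the escaped quotes and the commas inside the JSON. I haven't tried every approach, of course, and I'm open to suggestions based on experience with any similar problem.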

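I don't know whether it matters, but I'm using Toad for Oracle to export the CSV files that serve as the baseline for rebuilding the development data. Toad has no option for changing the delimiter in the CSV, and although hand-editing a single CSV file wouldn't be hard for me, doing so as a recurring maintenance task would be a PITA.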

Here is a sample of the CSV data that causes the problem:

"RULE_ID","NAME","DISPLAY_DESC","NOTES","RULE","SOURCE_ID","RULE_META","RULE_SCOPE","ACTIVE"
265.00,"RoadKill Report Processor","Report Processor","Loads a long-run-thread for each report matched by the handler method.","MvcsReportProcessManager",41.00,"{
                \"handler\"        : \"processReports\",
                \"consumer_prototype\" :
                \"RoadKill_report_processor.AssetDataReportHostConsumer\",
                \"match_expression\" : \"^MVCS_.*\",
                \"schedule\"    : [
                        \"0:30-4:00|mon-sun|*|*\",
                        \"!*|*|1|jan\",
                        \"!*|*|25|dec\",
                        \"!*|thu|22-28|nov\"
                    ],
                \"wake_interval\" : \"30m\",
                \"interval\"      : \"24h\"
            }","INST",0.00
321.00,"RoadKill AG Processor","Asset Group Reflection","Loads a long-run-thread to download Asset Groups daily.","MvcsAssetGroupDownloader",41.00,"{
                \"handler\"        : \"replicateAssetGroups\",
                \"consumer_prototype\" :
                \"RoadKill_report_processor.AssetGroupConsumer\",
                \"schedule\"    : [
                        \"00:30-17:00|mon-sun|*|*\",
                        \"!*|*|1|jan\",
                        \"!*|*|25|dec\",
                        \"!*|thu|22-28|nov\"
                    ],
                \"wake_interval\" : \"30m\",
                \"interval\"      : \"24h\"
            }","INST",1.00
322.00,"RoadKill Asset Processor","Asset Reflection","Loads a long-run-thread to download Assets daily.","MvcsAssetAPIHostDownloader",41.00,"{
                \"handler\"        : \"replicateAssets\",
                \"consumer_prototype\" :
                \"RoadKill_report_processor.\",
                \"schedule\"    : [
                        \"00:30-17:00|mon-sun|*|*\",
                        \"!*|*|1|jan\",
                        \"!*|*|25|dec\",
                        \"!*|thu|22-28|nov\"
                    ],
                \"wake_interval\" : \"30m\",
                \"interval\"      : \"24h\"
            }","INST",1.00
323.00,"RoadKill Vuln Processor","Vuln Reflection","Loads a long-run-thread to download Vulns daily.","MvcsAssetAPIVulnDownloader",41.00,"{
                \"handler\"        : \"replicateVulns\",
                \"consumer_prototype\" :
                \"RoadKill_report_processor.AssetAPIHostDetectionConsumer\",
                \"schedule\"    : [
                        \"00:30-17:00|mon-sun|*|*\",
                        \"!*|*|1|jan\",
                        \"!*|*|25|dec\",
                        \"!*|thu|22-28|nov\"
                    ],
                \"wake_interval\" : \"30m\",
                \"interval\"      : \"24h\"
            }","INST",1.00
141.00,"RoadKill Manager","RoadKill Sync","Loads RoadKill instances and dispatches an entry point for that source + instance (one for each instance rule).","MvcsInstanceDispatchRule",41.00,"{
            \"handler\"        : \"startInstanceRules\",
            \"schedule\"    : [
                    \"0:00-23:59|mon-sun|*|*\"
                ],
            \"wake_interval\" : \"30m\"
        }","CORE",1.00

Here is what the Python csv module returns as the row when it tries to parse the first data record:

>>> [(o,v) for o,v in enumerate(row)]
[(0, '265.00'), (1, 'RoadKill Report Processor'), (2, 'Report Processor'), (3, 'Loads a long-run-thread for each report matched by the handler method.'), (4, 'MvcsReportProcessManager'), (5, '41.00'), (6, '{\n                \\handler\\"        : \\"processReports\\"'), (7, '')]
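
The truncated field above is consistent with the reader treating the first `\"` as the end of the quoted field: by default `csv.reader` expects an embedded quote to be doubled (`""`) and has no escape character set. A minimal sketch, assuming the export uses backslashes only to escape quotes, compares the default settings with `escapechar='\\'`. The shortened sample and the `parse` helper are illustrative and not part of the original code:

```python
import csv
import io
import json

# Shortened sample in the same style as the Toad export: the JSON field is
# wrapped in double quotes and its inner quotes are written as \" (not "").
sample = (
    '"RULE_ID","RULE_META","ACTIVE"\n'
    '141.00,"{\n'
    '    \\"handler\\" : \\"startInstanceRules\\",\n'
    '    \\"wake_interval\\" : \\"30m\\"\n'
    '}",1.00\n'
)

def parse(text, **fmtparams):
    """Parse CSV text with csv.reader and return all rows as lists."""
    return list(csv.reader(io.StringIO(text), **fmtparams))

# Default settings: the first \" closes the quoted field early, so the
# record is cut at the next comma and the remaining lines become extra rows.
default_rows = parse(sample)

# With a backslash escape character, \" is read as a literal quote and the
# whole JSON object stays in one field.
escaped_rows = parse(sample, escapechar='\\')

rule_meta = json.loads(escaped_rows[1][1])
print(len(default_rows), len(escaped_rows))
print(rule_meta['handler'])
```

With `escapechar` set, the JSON object comes back as a single field, embedded newlines included, and can be handed to `json.loads`. If a real export also contains backslashes that are not escapes, this setting would need to be revisited.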

Finally, here is the csv reader code:

import csv
import os

col_offsets = None
for f in os.listdir(testdatadir):
    # Split the filename to get the table name.
    fname = os.path.basename(f)
    if fname and\
            fname.startswith('mvcs_') and\
            fname.endswith('.csv'):
        tblname = fname.split('.')[0]
        tobj = get_class_by_tablename(tblname)
        with open(os.path.join(testdatadir, fname), 'r') as csvfile:
            csvreader = csv.reader(csvfile, delimiter=',',
                    quotechar='"')
            for count,row in enumerate(csvreader):
                if not count:
                    # The first row is the header: map column names to offsets.
                    col_offsets = getColumnOffsets(row)
                elif not col_offsets:
                    raise Exception('Missing column offsets.')
                else:
                    # Build the mapped object from the row, keyed by lowercased column name.
                    tinst = tobj(
                        **{colname.lower() : row[offset] for
                            offset,colname in col_offsets})
                    try:
                        session.add(tinst)
                    except Exception as e:
                        logger.warning(str(e))
                        logger.warning('on adding:')
                        logger.warning(str(tinst))
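
One possible way to make a loader like this fail early instead of inserting truncated rows is to check the field count of every row and validate the JSON column before building the object. The following is only a sketch under stated assumptions: `iter_records` and its `json_columns` parameter are hypothetical names, the backslash escape character is inferred from the sample data, and the lowercased column name `rule_meta` is taken from the CSV header shown earlier.

```python
import csv
import io
import json

def iter_records(csvfile, json_columns=('rule_meta',)):
    """Yield one dict per data row, keyed by lowercased column name.

    Hypothetical helper: assumes quotes inside fields are escaped as \\".
    """
    reader = csv.reader(csvfile, delimiter=',', quotechar='"',
                        escapechar='\\')
    header = next(reader, None)
    if header is None:
        return  # empty file: nothing to yield
    header = [name.lower() for name in header]
    for row in reader:
        # A mis-parsed record usually shows up as a wrong field count.
        if len(row) != len(header):
            raise ValueError(
                f'line {reader.line_num}: expected {len(header)} fields, '
                f'got {len(row)}')
        record = dict(zip(header, row))
        for col in json_columns:
            if record.get(col):
                # Validate only; the original string is kept in the record.
                # json.JSONDecodeError is a subclass of ValueError.
                json.loads(record[col])
        yield record

# Usage with an in-memory sample instead of a real export file.
sample = io.StringIO(
    '"RULE_ID","RULE_META","ACTIVE"\n'
    '141.00,"{ \\"wake_interval\\" : \\"30m\\" }",1.00\n'
)
records = list(iter_records(sample))
print(records[0]['rule_meta'])
```

Each yielded dict could then be passed as keyword arguments in place of the `col_offsets` comprehension, provided the mapped class accepts those column names.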
