Python and Elasticsearch: converting CSV to JSON with an index line

Published 2024-10-03 00:17:10


I want to convert a batch of CSV files into a specific JSON file format in Python.

Here is a sample of my CSV file:

L1-CR109 Security Counter,has been forced,2019-02-26
L1-CR109 Security Counter,has been forced,2019-02-26
L1-CR109 Security Counter,has been forced,2019-02-26
L1-CR109 Security Counter,has been forced,2019-02-26

This is the JSON output I want:

[desired output block not preserved in the source page]

At the moment, I can produce output in the following JSON format:

[{"location": "L1-CR109 Security Counter", "door_status": "has been forced", "date": "2019-02-21"}, 
{"location": "L1-CR109 Security Counter", "door_status": "has been forced", "date": "2019-02-21"}, 
{"location": "L1-CR109 Security Counter", "door_status": "has been forced", "date": "2019-02-21"}, 
{"location": "L1-CR109 Security Counter", "door_status": "has been forced", "date": "2019-02-21"}]

And this is my Python code:

import csv
import json
import os


def csv_to_json():
    in_file = '/Elastic Search/Converted Detection/Converted CSV'
    out_file = '/Elastic Search/Converted Detection/Converted JSON'

    for filename in os.listdir(in_file):
        # Skip anything that is not a CSV file before opening it.
        if not filename.endswith(".csv"):
            continue
        print("\n")
        print("Converting " + filename + " file...")
        with open(os.path.join(in_file, filename), 'r') as f:
            reader = csv.DictReader(f, fieldnames=("location", "door_status", "date"))
            out = json.dumps([row for row in reader])

        # Write one JSON array per CSV file, named after the source file.
        with open(os.path.join(out_file, filename[:-4] + '.json'), 'w') as text_file:
            text_file.write(out + "\n")

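I have looked for a solution without success. What am I missing in my code? Also, could someone explain why Elasticsearch only accepts the JSON format with the index line, rather than the plain JSON that Python produces?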


2 answers

Here is a version that uses the Python pandas package:

import json
import pandas as pd

in_file = '/Elastic Search/Converted Detection/Converted CSV'
out_file = '/Elastic Search/Converted Detection/Converted JSON'
index_line = '{"index": {"_index": "test", "_type": "_doc", "_id": "1"}}\n'

Read the data:

[code block not preserved in the source page]

Or read directly from a string:

from io import StringIO

text = "L1-CR109 Security Counter,has been forced,2019-02-26\n" * 4
df = pd.read_csv(StringIO(text), header=None)

Now write the desired format (note that I added a "date" key so that the result is valid JSON):

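# Assumed column names; "date" is the key added for the third column.
title = ("location", "door_status", "date")
with open('outfile.json', 'w') as outfile:
    for row in df.to_dict('records'):
        data = json.dumps(dict(zip(title, row.values())))
        # End every document with a newline so the next index line starts a new line.
        outfile.write(index_line + data + "\n")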
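
The file written here is meant to follow the newline-delimited layout that Elasticsearch's bulk endpoint reads: an action line, then a document line, each on its own line. Below is a minimal sketch of a sanity check for that layout, using only the standard library. The function name `check_bulk_pairs` is hypothetical (it is not part of any Elasticsearch client), and the field names in the sample are assumed from the question.

```python
import json


def check_bulk_pairs(text):
    """Return the number of (action, document) pairs in bulk-style text.

    Raises ValueError if a line is not valid JSON, if the lines do not
    pair up, or if an action line lacks an "index" key.
    """
    lines = [line for line in text.split("\n") if line.strip()]
    if len(lines) % 2 != 0:
        raise ValueError("expected an even number of non-empty lines")
    for action_text, doc_text in zip(lines[0::2], lines[1::2]):
        action = json.loads(action_text)  # JSONDecodeError is a ValueError
        json.loads(doc_text)
        if "index" not in action:
            raise ValueError("action line has no 'index' key")
    return len(lines) // 2


# Sample built from the index line used in this answer; field names are assumed.
index_line = '{"index": {"_index": "test", "_type": "_doc", "_id": "1"}}\n'
doc = json.dumps({"location": "L1-CR109 Security Counter",
                  "door_status": "has been forced",
                  "date": "2019-02-26"})
sample = (index_line + doc + "\n") * 4
print(check_bulk_pairs(sample))
```

If the newline after each document is left out, the next index line ends up glued to the previous document and the check raises `ValueError`.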

Here is one way to do it. Note that you did not give your date field a name, so I added one to make the output valid JSON.

import json
import csv
from collections import OrderedDict

# Action line written before every document.
index_line = {"index": {"_index": "test", "_type": "_doc", "_id": "1"}}
with open('input.csv', 'r', newline='') as infile, open('outfile.json', 'w') as outfile:

    inreader = csv.reader(infile, delimiter=',', quotechar='"')

    for line in inreader:
        # Build the document with its keys in a fixed order.
        document = OrderedDict()
        document['location'] = line[0]
        document['door_activity'] = line[1]
        document['date'] = line[2]
        # One action line, then one document line, each ending in a newline.
        json.dump(index_line, outfile)
        outfile.write("\n")
        json.dump(document, outfile)
        outfile.write("\n")
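
Both answers write the same `"_id": "1"` on every action line. When such a file is indexed, each document with that ID replaces the previous one, so only the last row would remain. Below is a minimal sketch that numbers the rows instead, assuming sequential IDs starting at 1 are acceptable. The function name `rows_to_bulk` and the field names are illustrative, and the action line mirrors the one used in the answers.

```python
import csv
import io
import json

FIELDS = ("location", "door_status", "date")  # assumed column names


def rows_to_bulk(csv_file, index_name="test", start_id=1):
    """Yield one action line and one document line per CSV row."""
    reader = csv.DictReader(csv_file, fieldnames=FIELDS)
    for doc_id, row in enumerate(reader, start=start_id):
        # Same action line as in the answers, but with a per-row "_id".
        action = {"index": {"_index": index_name,
                            "_type": "_doc",
                            "_id": str(doc_id)}}
        yield json.dumps(action)
        yield json.dumps(row)


csv_text = "L1-CR109 Security Counter,has been forced,2019-02-26\n" * 2
# Join with newlines and end with a final newline.
bulk_body = "\n".join(rows_to_bulk(io.StringIO(csv_text))) + "\n"
print(bulk_body, end="")
```

To convert a whole directory, the same generator could be called once per file inside the `os.listdir` loop from the question, passing a running `start_id` so that IDs stay unique across files.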
