How do I get JSON from a CSV file?

Published 2024-09-29 17:46:51


I have a CSV file containing city names and country names, as shown below:

Input

Kabul                   Afghanistan
Kandahar                Afghanistan
Herat                   Afghanistan
Tirana                  Albania
Algiers                 Algeria
Luanda                  Angola
Huambo                  Angola
Cabinda                 Angola
Benguela                Angola
Lobito                  Angola
Buenos Aires            Argentina
Cordoba                 Argentina
Rosario                 Argentina
San Miguel de Tucuman   Argentina
...                     ...

I want to get a JSON file from this data using any language or library. I've heard it can be done with Python or JavaScript. (There are about 2,600 cities.)

Output

cities = {
    "Afghanistan": ["Kabul", "Kandahar", "Herat"],
    "Albania"    : ["Tirana"],
    "Algeria"    : ["Algiers"],
    "Angola"     : ["Luanda", "Huambo", "Cabinda", "Benguela", "Lobito"],
    "Argentina"  : ["Buenos Aires", "Cordoba", "Rosario", "San Miguel de Tucuman"],
    ...            ...
}

How can I get this?

I tried using pandas, but I don't know how to continue because I'm new to Python. Is there a way?

import pandas as pd 
data = pd.read_csv("filename.csv") 

2 answers

This might work. It assumes that the city and country on each line are separated by two or more spaces.

import fs from "fs";

// Read the whole file as text.
const fileData = fs.readFileSync("./file.csv", "utf8");

const convertToJson = (text) => {
    const dictionary = {};
    text.split(/\r?\n/).forEach((line) => {
        // City and country are separated by two or more whitespace characters.
        const lineSplit = line.trim().split(/\s{2,}/);
        if (lineSplit.length !== 2) return; // skip blank or malformed lines
        const [city, country] = lineSplit;
        if (!dictionary[country]) {
            dictionary[country] = [];
        }
        dictionary[country].push(city);
    });
    return dictionary;
};
console.log(convertToJson(fileData));

You can use pandas groupby.

I use io only to create a minimal working example; you should use the filename instead.

text = '''Kabul                   Afghanistan
Kandahar                Afghanistan
Herat                   Afghanistan
Tirana                  Albania
Algiers                 Algeria
Luanda                  Angola
Huambo                  Angola
Cabinda                 Angola
Benguela                Angola
Lobito                  Angola
Buenos Aires            Argentina
Cordoba                 Argentina
Rosario                 Argentina
San Miguel de Tucuman   Argentina'''

import pandas as pd
import io

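# A regex separator needs a raw string and the Python parsing engine.
#fh = "filename.csv"
#df = pd.read_csv(fh, sep=r'\s{2,}', names=['city', 'country'], engine='python')

fh = io.StringIO(text)
df = pd.read_csv(fh, sep=r'\s{2,}', names=['city', 'country'], engine='python')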

cities = {}

for country, group in df.groupby('country'):
    cities[country] = group['city'].to_list()

print(cities)

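Without pandas, using a plain open() and iterating over the lines.

Because the names are separated by several spaces, I use a regex. I can't use the standard csv module because it requires a single character as the delimiter.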
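As a quick check of that splitting rule, here is a minimal sketch. It assumes the columns are always separated by at least two spaces, and `split_line` is a hypothetical helper name used only for illustration. Splitting on runs of two or more spaces keeps multi-word city names intact, whereas a plain `str.split()` would break them apart.

```python
import re

def split_line(line):
    """Split one 'city   country' line on runs of two or more spaces."""
    return re.split(r" {2,}", line.strip())

sample = "Buenos Aires            Argentina"
print(split_line(sample))  # ['Buenos Aires', 'Argentina']
print(sample.split())      # ['Buenos', 'Aires', 'Argentina']
```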

text = '''Kabul                   Afghanistan
Kandahar                Afghanistan
Herat                   Afghanistan
Tirana                  Albania
Algiers                 Algeria
Luanda                  Angola
Huambo                  Angola
Cabinda                 Angola
Benguela                Angola
Lobito                  Angola
Buenos Aires            Argentina
Cordoba                 Argentina
Rosario                 Argentina
San Miguel de Tucuman   Argentina'''

import re
import io

#fh = open('filename.csv')

fh = io.StringIO(text)
 
cities = {}

for line in fh:
    line = line.strip()
    if not line:
        continue  # skip blank lines, e.g. a trailing newline at the end of the file
    city, country = re.split(r' {2,}', line)
    if country not in cities:
        cities[country] = []

    cities[country].append(city)

print(cities)
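The same grouping can also be sketched with `collections.defaultdict`, which removes the explicit "country not in cities" check. This is only one possible variant, not part of the original answer; `group_cities` is a hypothetical function name, and skipping lines that do not split into exactly two fields is an assumption about how malformed input should be handled.

```python
import re
from collections import defaultdict

def group_cities(lines):
    """Group city names by country from 'city   country' lines."""
    cities = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        parts = re.split(r" {2,}", line)
        if len(parts) != 2:
            continue  # ignore lines without exactly two fields
        city, country = parts
        cities[country].append(city)
    return dict(cities)

sample = [
    "Kabul                   Afghanistan",
    "Tirana                  Albania",
    "",
    "Buenos Aires            Argentina",
    "Cordoba                 Argentina",
]
print(group_cities(sample))
```

Because any iterable of lines works, an open file handle can be passed directly, for example `group_cities(open('filename.csv'))`.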

Edit:

If you need it as JSON data:

import json

data = json.dumps(cities)
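Since the question asks for a JSON file, the dictionary can also be written straight to disk with `json.dump`. The sketch below uses a small stand-in dictionary, and the output path (a `cities.json` file in the system temp directory) is an assumption chosen only so the example runs anywhere.

```python
import json
import os
import tempfile

# Small stand-in for the dictionary built above.
cities = {
    "Afghanistan": ["Kabul", "Kandahar", "Herat"],
    "Albania": ["Tirana"],
}

# Assumed output location; replace with any path you like.
path = os.path.join(tempfile.gettempdir(), "cities.json")

# indent=4 makes the file readable; ensure_ascii=False keeps accented names as-is.
with open(path, "w", encoding="utf-8") as fh:
    json.dump(cities, fh, indent=4, ensure_ascii=False)

# Read the file back to confirm it round-trips.
with open(path, encoding="utf-8") as fh:
    loaded = json.load(fh)

print(loaded == cities)  # True
```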
