Python subprocess call: comparing two CSV files with grep, awk, or sed

Posted 2024-10-02 00:31:49


I have two CSV files in the /tmp/ directory.

The first CSV file is produced by the Python code below; the second is the master file to match against.

>>> import json
>>> resp = { "status":"success", "msg":"", "data":[ { "website":"https://www.blahblah.com", "severity":"low", "location":"unknown", "asn_number":"AS4134 Chinanet", "longitude":121.3997000000, "epoch_timestamp":1530868957, "id":"c1e15eccdd1f31395506fb85" }, { "website":"https://www.jhonedoe.co.uk/sample.pdf", "severity":"low", "location":"unknown", "asn_number":"AS4134 Chinanet", "longitude":120.1613998413, "epoch_timestamp":1530868957, "id":"933bf229e3e95a78d38223b2" } ] }
>>> response = json.loads(json.dumps(resp))
>>> KEYS = 'website', 'asn_number' , 'severity'
>>> x = []
>>> for attribute in response['data']:
...     csv_response = ','.join(attribute[key] for key in KEYS)
...     with open('/tmp/processed_results.csv', 'a') as score:
...             score.write(csv_response + '\n')

$cat processed_results.csv

https://www.blahblah.com,AS4134 Chinanet,low
https://www.jhonedoe.co.uk/sample.pdf,AS4134 Chinanet,low

The master metadata file to match against:

$cat master_meta.csv
http://download2.freefiles-10.de,AS24940 Hetzner Online GmbH,high
https://www.jhonedoe.co.uk/sample.pdf,AS4134 Chinanet,low
http://download2.freefiles-11.de,AS24940 Hetzner Online GmbH,high
www.solener.com,AS20718 ARSYS INTERNET S.L.,low
https://www.blahblah.com,AS4134 Chinanet,low
www.telewizjairadio.pl,AS29522 Krakowskie e-Centrum Informatyczne JUMP Dziedzic,high

I know how to use grep to compare the two files and get the matching lines:

$grep -Ff processed_results.csv master_meta.csv

https://www.jhonedoe.co.uk/sample.pdf,AS4134 Chinanet,low
https://www.blahblah.com,AS4134 Chinanet,low
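
For reference, here is a rough pure-Python sketch of what `grep -Ff` does in the command above: every line of the first file is treated as a literal string, and a line of the master file is kept if it contains any of them. The function name `fixed_string_matches` is made up for illustration, and skipping blank pattern lines is a simplification (grep itself treats an empty pattern as matching every line).

```python
def fixed_string_matches(patterns_path, master_path):
    """Return lines of master_path that contain any line of patterns_path.

    Approximates `grep -F -f patterns_path master_path`: patterns are
    literal strings, not regular expressions, and a master line matches
    if any pattern occurs anywhere inside it.
    """
    with open(patterns_path, encoding="utf-8") as f:
        # Simplification: blank pattern lines are skipped here.
        patterns = [line.rstrip("\n") for line in f if line.strip()]

    with open(master_path, encoding="utf-8") as f:
        # Results come back in master-file order, as with grep.
        return [
            line.rstrip("\n")
            for line in f
            if any(pattern in line for pattern in patterns)
        ]
```

Calling `fixed_string_matches('/tmp/processed_results.csv', '/tmp/master_meta.csv')` should give the same two lines as the grep output, in the same order. If only whole-line matches are wanted, grep's `-x` flag (or `line.rstrip("\n") in set(patterns)` in the sketch) would be the stricter variant.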

Any suggestions on how to pass a grep/sed/awk command through a Python subprocess call to compare the two files and capture the matching lines in a variable?


1 answer
User
#1 · Posted 2024-10-02 00:31:49

Most people wouldn't call this "good", but if you're only using it for yourself, you shouldn't worry about it too much.

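from subprocess import Popen, PIPE

def sh(cmd, verbose=True):
    """Run cmd through the shell and return its stdout as one stripped string.

    The call blocks until the process exits. Use .splitlines() on the
    result to iterate over the individual lines.
    """
    if verbose:
        print("[INFO]: executing: " + cmd)
    # shell=True hands the whole string to the shell, so only pass trusted input
    out, err = Popen(cmd, stdout=PIPE, stderr=PIPE, shell=True).communicate()
    if err:
        print("[ERROR]: while executing: " + cmd)
        print(err.decode('utf-8', errors='replace').strip())
    # decode as UTF-8 so non-ASCII bytes in the CSV data do not raise
    return out.decode('utf-8', errors='replace').strip()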
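
If you would rather avoid `shell=True`, something along these lines should also work. It is only a sketch: the helper name `grep_matches` is invented here, it assumes Python 3.7+ (for `capture_output`) and that `grep` is on the PATH, and it passes the same `-F -f` options as the command in the question as an argument list instead of a shell string.

```python
import subprocess

def grep_matches(patterns_path, master_path):
    """Run `grep -F -f patterns_path master_path` and return the matching lines."""
    result = subprocess.run(
        ["grep", "-F", "-f", patterns_path, master_path],
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode output to str using the locale encoding
    )
    # grep exits with 0 when something matched, 1 when nothing matched,
    # and a value above 1 when a real error occurred (e.g. missing file).
    if result.returncode > 1:
        raise RuntimeError(result.stderr.strip())
    return result.stdout.splitlines()
```

With the files from the question, `matches = grep_matches('/tmp/processed_results.csv', '/tmp/master_meta.csv')` would leave the two matching rows in `matches` as a list of strings. Passing a list rather than a single command string means the file names are never interpreted by a shell.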
