Merging CSV rows with Python 2 while keeping the data from one arbitrary column

Posted 2024-10-16 20:48:27

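I know there are plenty of questions on this topic, but the answers aren't explained very well, so they are hard to adapt to my use case. The one linked here looks promising, but its syntax is fairly complex and I'm struggling to understand and adapt it.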

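I need to convert the raw CSV output from Nessus into a standard format, which essentially drops many of the columns and keeps only the severity, IP address, and output of each finding. I've written a script that does this, but if a finding occurs on multiple hosts/ports, there is a separate row for each host/port.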

What I need is to merge rows that share a vulnerability name, keeping the IP address from every merged row.

Example input (shortened for convenience)

High,10.10.10.10,MS12-345(this is the name),Hackers can do bad things
High,10.10.10.11,MS12-345(this is the name),Hackers can do bad things

Example output

(The example output block is missing from this copy of the post.)
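As an illustration of the merge being asked for, here is a minimal sketch that combines the two sample input rows. It assumes the IP addresses are joined with a single space, which is the convention used in the answer further down and not necessarily the asker's intended format. The names `SAMPLE` and `merge_sample` are hypothetical.

```python
import csv
from collections import OrderedDict

# The two shortened sample rows from the question.
SAMPLE = (
    "High,10.10.10.10,MS12-345(this is the name),Hackers can do bad things\n"
    "High,10.10.10.11,MS12-345(this is the name),Hackers can do bad things\n"
)

def merge_sample(text):
    """Merge rows that share a name (column 2), collecting the IPs (column 1)."""
    merged = OrderedDict()  # name -> [severity, [ips], name, output]
    for severity, ip, name, output in csv.reader(text.splitlines()):
        if name not in merged:
            merged[name] = [severity, [ip], name, output]
        elif ip not in merged[name][1]:
            merged[name][1].append(ip)
    # Assumption: IPs are joined with a space; pick any separator you like.
    return [[sev, " ".join(ips), name, out]
            for sev, ips, name, out in merged.values()]

for row in merge_sample(SAMPLE):
    print(",".join(row))
# -> High,10.10.10.10 10.10.10.11,MS12-345(this is the name),Hackers can do bad things
```

The sketch uses the shortened four-column layout of the sample, not the column positions of a real Nessus export.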

Below is my script so far. If you could make your answer easy for future readers to adapt (read: idiot-proof), I'd appreciate it, and I'm sure they would too.

Bonus:

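For findings with the same name, the output field is sometimes different and sometimes identical. If you have time, why not help a guy out and check for that too: if the outputs differ, append them the same way as the IP addresses?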

import sys
import csv

def manipulate(inFile):

    with open(inFile, 'rb') as csvFile:
        fileReader = csv.reader(csvFile, dialect='excel')

        # Check for multiple instances of findings and merge the rows
        # This happens when the finding is on multiple hosts/ports

        # YOUR CODE WILL GO HERE (Probably...)

        # Place findings into lists: crits, highs, meds, lows for sorting later
        crits = []
        highs = []
        meds = []
        lows = []

        for row in fileReader:

            if row[3] == "Critical":    
                crits.append(row)
            elif row[3] == "High":
                highs.append(row)
            elif row[3] == "Medium":
                meds.append(row)
            elif row[3] == "Low":
                lows.append(row)

        # Open an output file for writing
        with open('output.csv', 'wb') as outFile: 
            fileWriter = csv.writer(outFile)

            # Add in findings from lists in order of severity. Only relevant columns included
            for c in crits:
                fileWriter.writerow( (c[3], c[4], c[7], c[12]) )

            for h in highs:
                fileWriter.writerow( (h[3], h[4], h[7], h[12]) )

            for m in meds:
                fileWriter.writerow( (m[3], m[4], m[7], m[12]) )

            for l in lows:
                fileWriter.writerow( (l[3], l[4], l[7], l[12]) )


# Input validation
if len(sys.argv) != 2:
    print 'You must provide a csv file to process'
    raw_input('Example: python nesscsv.py foo.csv')
else:
    print "Working..."
    # Store filename for use in manipulate function
    inFile = str(sys.argv[1])
    # Call manipulate function passing csv
    manipulate(inFile)

print "Done!"   
raw_input("Output in output.csv. Hit return to finish.")

1 answer
User
Answer #1 · posted 2024-10-16 20:48:27

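Here is a solution that collects the rows in an OrderedDict, which preserves the order of the rows while also letting you look up any row by its vulnerability name.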

import sys
import csv
from collections import OrderedDict

def manipulate(inFile):

    with open(inFile, 'rb') as csvFile:
        fileReader = csv.reader(csvFile, dialect='excel')

        # Check for multiple instances of findings and merge the rows
        # This happens when the finding is on multiple hosts/ports

        # Dictionary mapping vulns to merged rows.
        # It's ordered to preserve the order of rows.
        mergedRows = OrderedDict()

        for newRow in fileReader:
            vuln = newRow[7]
            if vuln not in mergedRows:
                # Convert the host and output fields into lists so we can easily
                # append values from rows that get merged with this one.
                newRow[4] = [newRow[4], ]
                newRow[12] = [newRow[12], ]
                # Add row for new vuln to dict.
                mergedRows[vuln] = newRow
            else:
                # Look up existing row for merging.
                mergedRow = mergedRows[vuln]
                # Append values of host and output fields, if they're new.
                if newRow[4] not in mergedRow[4]:
                    mergedRow[4].append(newRow[4])
                if newRow[12] not in mergedRow[12]:
                    mergedRow[12].append(newRow[12])

        # Flatten the lists of host and output field values into strings.
        for row in mergedRows.values():
            row[4] = ' '.join(row[4])
            row[12] = ' // '.join(row[12])

        # Place findings into lists: crits, highs, meds, lows for sorting later
        crits = []
        highs = []
        meds = []
        lows = []

        for row in mergedRows.values():

            if row[3] == "Critical":
                crits.append(row)
            elif row[3] == "High":
                highs.append(row)
            elif row[3] == "Medium":
                meds.append(row)
            elif row[3] == "Low":
                lows.append(row)

        # Open an output file for writing
        with open('output.csv', 'wb') as outFile:
            fileWriter = csv.writer(outFile)

            # Add in findings from lists in order of severity. Only relevant columns included
            for c in crits:
                fileWriter.writerow( (c[3], c[4], c[7], c[12]) )

            for h in highs:
                fileWriter.writerow( (h[3], h[4], h[7], h[12]) )

            for m in meds:
                fileWriter.writerow( (m[3], m[4], m[7], m[12]) )

            for l in lows:
                fileWriter.writerow( (l[3], l[4], l[7], l[12]) )


# Input validation
if len(sys.argv) != 2:
    print 'You must provide a csv file to process'
    raw_input('Example: python nesscsv.py foo.csv')
else:
    print "Working..."
    # Store filename for use in manipulate function
    inFile = str(sys.argv[1])
    # Call manipulate function passing csv
    manipulate(inFile)
    # Only report success when a file was actually processed
    print("Done!")
    raw_input("Output in output.csv. Hit return to finish.")
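For readers who want to adapt this to other column layouts, the merging and severity-ordering steps of the script above can be pulled into two small helpers. This is only a sketch under stated assumptions: `merge_rows`, `sort_by_severity`, and `SEVERITY_RANK` are hypothetical names, and the sample rows are invented for illustration. It uses a rank dictionary with a stable sort in place of the four per-severity lists, and runs on both Python 2 and Python 3.

```python
from collections import OrderedDict

# Severity labels taken from the script above, in output order.
SEVERITY_RANK = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}

def merge_rows(rows, key_col, merge_cols):
    """Merge rows that share the same value in column key_col.

    merge_cols maps a column index to the separator used to join the
    distinct values seen in that column. All other columns keep the
    value from the first row seen for each key.
    """
    merged = OrderedDict()
    for row in rows:
        key = row[key_col]
        if key not in merged:
            new_row = list(row)
            for col in merge_cols:
                new_row[col] = [row[col]]  # start a list of distinct values
            merged[key] = new_row
        else:
            for col in merge_cols:
                if row[col] not in merged[key][col]:
                    merged[key][col].append(row[col])
    result = []
    for row in merged.values():
        for col, sep in merge_cols.items():
            row[col] = sep.join(row[col])  # flatten the lists into strings
        result.append(row)
    return result

def sort_by_severity(rows, sev_col):
    """Drop rows with an unknown severity and order the rest Critical to Low."""
    known = [r for r in rows if r[sev_col] in SEVERITY_RANK]
    # sorted() is stable, so rows of equal severity keep their input order.
    return sorted(known, key=lambda r: SEVERITY_RANK[r[sev_col]])

# Invented sample rows: severity, host, vulnerability name, output.
rows = [
    ["Low", "10.0.0.1", "Vuln B", "out 1"],
    ["High", "10.0.0.1", "Vuln A", "out 2"],
    ["High", "10.0.0.2", "Vuln A", "out 3"],
    ["Low", "10.0.0.2", "Vuln B", "out 1"],
]
merged = sort_by_severity(merge_rows(rows, 2, {1: " ", 3: " // "}), 0)
for row in merged:
    print(row)
```

With the column positions used in the script above, the equivalent call would be `merge_rows(fileReader, 7, {4: ' ', 12: ' // '})` followed by `sort_by_severity(..., 3)`. Changing the key column or the merged columns then only means changing those arguments.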
