Processing .txt files of ML data in Python

Posted 2024-09-28 03:13:39


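I have .txt files containing ML data. Each file has 19 entries, and these entries repeat throughout the text file. I need to load these files in Python so that each entry becomes a column, with its corresponding numeric values placed in that column. I need to do this for all 19 entries in the txt file (as in "1" below). A text file containing only the numeric data would also be fine (as in "2" below). It would be great if you could write the code in Python; otherwise, please describe in some detail how to do it.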

Desired output:

[Image 1] or [Image 2]

TXT file:

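Frame.id: 263126, Timestamp: 697287019071, Hand_number: 1 , hand_Id_type: 238right hand, hand_finger's_number: 2, hand direction: (-0.142081, 0.865413, -0.480493), Palm position: (-35.2841, 284.522, 330.828), Palm normal: (-0.686854, -0.435733, -0.581694) ,Finger_type(tipposition) TYPE_THUMB(-36.7239, 301.602, 330.845) ,Finger_type(tipposition) TYPE_INDEX(-14.9321, 347.039, 280.375) ,Finger_type(tipposition) TYPE_MIDDLE(5.5661, 258.191, 321.318) ,Finger_type(tipposition) TYPE_RING(20.0886, 251.219, 320.136) ,Finger_type(tipposition) TYPE_PINKY(27.5919, 259.584, 310.508)
Frame.id: 263127, Timestamp: 697287037765, Hand_number: 1 , hand_Id_type: 238right hand, hand_finger's_number: 2, hand direction: (-0.167599, 0.827034, -0.536587), Palm position: (31.7441, 283.449, 322.619), Palm normal: (-0.776892, -0.445873, -0.444562) ,Finger_type(tipposition) TYPE_THUMB(-38.8083, 304.444, 330.446) ,Finger_type(tipposition) TYPE_INDEX(-22.0532, 344.008, 273.496) ,Finger_type(tipposition) TYPE_MIDDLE(-0.161068, 258.09, 319.489) ,Finger_type(tipposition) TYPE_RING(13.8233, 250.236, 317.138) ,Finger_type(tipposition) TYPE_PINKY(20.1892, 257.352, 305.739)
Frame.id: 263128, Timestamp: 697287057570, Hand_number: 1 , hand_Id_type: 238right hand, hand_finger's_number: 2, hand direction: (-0.179139, 0.817551, -0.547284), Palm position: (30.8754, 280.871, 3115.444), Palm normal: (-0.750039, -0.473481, -0.461797) ,Finger_type(tipposition) TYPE_THUMB(-40.4781, 299.689, 321.24) ,Finger_type(tipposition) TYPE_INDEX(-23.5209, 339.286, 264.435) ,Finger_type(tipposition) TYPE_MIDDLE(-0.157164, 254.483, 311.627) ,Finger_type(tipposition) TYPE_RING(14.1716, 247.067, 309.742) ,Finger_type(tipposition) TYPE_PINKY(20.45, 254.358, 298.274)
Frame.id: 263129, Timestamp: 697287076710, Hand_number: 1 , hand_Id_type: 238right hand, hand_finger's_number: 2, hand direction: (-0.191306, 0.830611, -0.522961), Palm position: (-28.6877, 277.055, 299.705), Palm normal: (-0.739979, -0.472093, -0.479124) ,Finger_type(tipposition) TYPE_THUMB(-42.9838, 294.545, 305.915) ,Finger_type(tipposition) TYPE_INDEX(-26.5976, 335.972, 250.27) ,Finger_type(tipposition) TYPE_MIDDLE(-1.90294, 250.26, 294.934) ,Finger_type(tipposition) TYPE_RING(12.5955, 243.157, 292.909) ,Finger_type(tipposition) TYPE_PINKY(18.5831, 250.968, 281.465)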

Text file link: https://drive.google.com/file/d/1X1NdQNYlQWuNpmzGL6Wwi_SqFjIXnRbb/view?usp=sharing


2 answers

This isn't the most sophisticated way to parse the file, but it seems to work well:

data = []

with open('data_file_1.txt') as f:
    text = f.read()  # Read the whole file into one string

rows = text.split('Frame.')  # Every record starts with 'Frame.'

for row in rows[1:]:  # The chunk before the first 'Frame.' is empty, so start at the second
    row_data = {}
    expected = 0  # Number of tuple values still to be read
    store = []    # [name, values] of the tuple currently being collected
    for item in row.split(','):  # Split at commas. Note: the 3-value tuples get split too
        if expected:  # The previous item opened a tuple, so add the remaining values here
            expected -= 1
            store[1].append(float(item.strip().strip(')')))
            if not expected:  # Once all values are in place, save the whole item to row_data
                row_data[store[0]] = store[1]
                store = []

        # In single value cases, parse and save them
        elif any(x in item for x in ['id', 'Timestamp', 'Hand_number', 'hand_Id_type', "hand_finger's_number"]):
            name, value = item.split(':')
            try:
                value = int(value)
            except ValueError:
                value = value.strip()
            row_data[name.strip()] = value

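        # Fields holding a tuple of 3 floats: parse the first value, then collect the other two
        elif any(x in item for x in ['Finger_type(tipposition)', 'Palm', 'direction']):
            # str.strip() removes a set of characters, not a prefix, so drop the prefix explicitly
            name, value = item.replace('Finger_type(tipposition)', '').split('(')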
            store = [name.strip().strip(':'), [float(value)]]
            expected = 2

    data.append(row_data)

for row in data:  # Final data as a list of dicts
    print(row)

# Export the data in whatever format suits your application, e.g. CSV, JSON or XML

This prints rows like this:

{'id': 263126, 'Timestamp': 697287019071, 'Hand_number': 1, 'hand_Id_type': '238right hand', "hand_finger's_number": 2, 'hand direction': [-0.142081, 0.865413, -0.480493], 'Palm position': [35.2841, 284.522, 330.828], 'Palm normal': [-0.686854, -0.435733, -0.581694], 'TYPE_THUMB': [-36.7239, 301.602, 330.845], 'TYPE_INDEX': [-14.9321, 347.039, 280.375], 'TYPE_MIDDLE': [5.5661, 258.191, 321.318], 'TYPE_RING': [20.0886, 251.219, 320.136], 'TYPE_PINKY': [27.5919, 259.584, 310.508]}
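The export step mentioned in the final comment could look roughly like the sketch below, which uses the standard `csv` module. The helper names `flatten_row` and `rows_to_csv` and the `_x`/`_y`/`_z` column suffixes are my own assumptions, not part of the answer; the idea is only that each 3-value list becomes three scalar columns so every CSV cell holds one number. The sample row is an abbreviated copy of the printed row above.

```python
import csv
import io


def flatten_row(row, axes=("x", "y", "z")):
    """Expand every list-valued field into one scalar column per axis."""
    flat = {}
    for key, value in row.items():
        if isinstance(value, (list, tuple)):
            for axis, component in zip(axes, value):
                flat[f"{key}_{axis}"] = component  # hypothetical naming scheme
        else:
            flat[key] = value
    return flat


def rows_to_csv(rows, out):
    """Write flattened rows to a file-like object, one CSV column per scalar."""
    flat_rows = [flatten_row(r) for r in rows]
    if not flat_rows:
        return
    # Column order follows the key order of the first row
    writer = csv.DictWriter(out, fieldnames=list(flat_rows[0]))
    writer.writeheader()
    writer.writerows(flat_rows)


# Abbreviated sample row in the shape produced by the parser above
sample = [{
    'id': 263126,
    'Timestamp': 697287019071,
    'hand direction': [-0.142081, 0.865413, -0.480493],
    'TYPE_THUMB': [-36.7239, 301.602, 330.845],
}]

buffer = io.StringIO()
rows_to_csv(sample, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write a real file instead of an in-memory buffer, pass `open("output.csv", "w", newline="")` as `out`; the `newline=""` argument is what the `csv` module documentation recommends for files it writes to.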

I did some simple wrangling and turned it into a CSV. Hope this is useful to you:

with open("data.txt") as f:
    lines = f.readlines()

# Columns in the file
header = "Frame.id,Timestamp,Hand_number,hand_Id_type,hand_finger's_number,hand direction,\
Palm position,Palm normal,Finger_type(tipposition) TYPE_THUMB,\
Finger_type(tipposition) TYPE_INDEX,Finger_type(tipposition) TYPE_MIDDLE,\
Finger_type(tipposition) TYPE_RING,Finger_type(tipposition) TYPE_PINKY\n"
out_lines = [header]  # rows of the output file
tmp = []  # pieces of the row currently being assembled
for line in lines[1:]:  # note: this skips the first line of the file
    tmp.append(line.strip())  # strip and append
    # When the line holds the last column of a row, join tmp into one line,
    # add that line to the list of rows and clear tmp
    if line.strip().startswith(",Finger_type(tipposition) TYPE_PINKY"):
        new_line = ''.join(tmp)+'\n'
        for column in header.split(','):
            # replace header keywords and :
            new_line = new_line.replace(column.strip(), '').replace(':','')
        out_lines.append(new_line)
        tmp=[]
        
print(''.join(out_lines))

# Uncomment the block below to write the CSV file
# with open("output.csv","w+") as f:
#     f.writelines(out_lines)
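One thing to watch with this approach: the 3-value groups keep their parentheses and internal commas, so each output row ends up with more comma-separated values than the header has columns. A possible workaround, sketched here under assumptions, is to protect the commas inside numeric tuples with a regular expression before splitting on the remaining commas. The names `parse_frame` and `parse_text`, the `_x`/`_y`/`_z` suffixes, and the sample text are illustrative only; the field layout is assumed from the field names used in the code above.

```python
import re

# A number such as -0.142081, 301.602 or 1e-05
NUMBER = r"-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?"
# A parenthesised group of exactly three numbers, e.g. (-36.7239, 301.602, 330.845)
TUPLE = re.compile(rf"\(\s*({NUMBER})\s*,\s*({NUMBER})\s*,\s*({NUMBER})\s*\)")


def parse_frame(record):
    """Parse one record (the text after 'Frame.') into a flat dict."""
    # Replace the commas inside numeric tuples with spaces, so that the
    # split below only sees the commas that separate fields
    protected = TUPLE.sub(lambda m: "(" + " ".join(m.groups()) + ")", record)
    row = {}
    for field in protected.split(","):
        field = field.strip()
        if not field:
            continue
        if field.endswith(")"):  # tuple field: "name: (x y z)" or "name(x y z)"
            name, _, values = field[:-1].rpartition("(")
            name = name.replace("Finger_type(tipposition)", "").strip().rstrip(":").strip()
            for axis, value in zip("xyz", values.split()):
                row[f"{name}_{axis}"] = float(value)
        else:  # scalar field: "name: value"
            name, _, value = field.partition(":")
            value = value.strip()
            # Non-negative integers become int, everything else stays a string
            row[name.strip()] = int(value) if value.isdigit() else value
    return row


def parse_text(text):
    """Split the text at 'Frame.' and parse every record after the first split."""
    return [parse_frame(chunk) for chunk in text.split("Frame.")[1:]]


# Shortened, made-up sample in the assumed layout
sample_text = (
    "Frame.id: 263126, Timestamp: 697287019071, Hand_number: 1 , "
    "hand_Id_type: 238right hand, hand direction: (-0.142081, 0.865413, -0.480493) "
    ",Finger_type(tipposition) TYPE_THUMB(-36.7239, 301.602, 330.845) "
    "Frame.id: 263127, Timestamp: 697287037765"
)
rows = parse_text(sample_text)
for row in rows:
    print(row)
```

Each resulting dict has one scalar per key, so it can be handed to `csv.DictWriter` or a DataFrame constructor without further clean-up.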
