<p>You can use the following workaround:</p>
<pre><code>import numpy as np
import pandas as pd

with open('yahoo_arts.arff', 'r') as fp:
    file_content = fp.readlines()


def parse_row(line, len_row):
    # Expand one sparse ARFF row such as "{0 1,5 1}" into a dense array
    line = line.replace('{', '').replace('}', '')
    row = np.zeros(len_row)
    for data in line.split(','):
        if not data.strip():
            continue  # skip empty rows such as "{}"
        index, value = data.split()
        row[int(index)] = float(value)
    return row


columns = []
len_attr = len('@attribute')

# get the columns
for line in file_content:
    if line.startswith('@attribute '):
        col_name = line[len_attr:].split()[0]
        columns.append(col_name)

rows = []
len_row = len(columns)

# get the rows
for line in file_content:
    if line.startswith('{'):
        rows.append(parse_row(line, len_row))

df = pd.DataFrame(data=rows, columns=columns)
df.head()
</code></pre>
<p>Output:
<a href="https://i.stack.imgur.com/ecK8s.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/ecK8s.png" alt="enter image description here"/></a></p>
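<p>To check the parsing logic without the data file, here is a minimal sketch of the same approach run on a small in-memory sample. The sample content, the column names <code>a</code>, <code>b</code>, <code>c</code>, and the helper name <code>parse_sparse_row</code> are made up for illustration and are not taken from <code>yahoo_arts.arff</code>; it also assumes every attribute is numeric.</p>

```python
import numpy as np
import pandas as pd

# Hypothetical sparse ARFF content, used only for illustration.
sample = """@relation demo
@attribute a numeric
@attribute b numeric
@attribute c numeric
@data
{0 1,2 3.5}
{1 2}
"""


def parse_sparse_row(line, n_cols):
    # Turn "{index value,index value,...}" into a dense row of length n_cols.
    row = np.zeros(n_cols)
    body = line.strip().strip('{}')
    for pair in body.split(','):
        if not pair.strip():
            continue  # tolerate an empty row such as "{}"
        index, value = pair.split()
        row[int(index)] = float(value)
    return row


lines = sample.splitlines()

# The attribute name is the second whitespace-separated token on the line.
columns = [l.split()[1] for l in lines if l.startswith('@attribute ')]

# Sparse data rows are the lines that start with "{".
rows = [parse_sparse_row(l, len(columns)) for l in lines if l.startswith('{')]

demo_df = pd.DataFrame(rows, columns=columns)
print(demo_df)
```

<p>Indices that do not appear in a row keep the default value 0, which is how the sparse ARFF format represents omitted entries.</p>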