Loading mixed data from a CSV file into a NumPy array in Python

Posted 2024-09-27 17:37:34


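I have a CSV file with 10 columns and 6 rows that I want to convert to a NumPy array. It loads, but I can't work with the data yet, so I think I'm missing a step.

My current code looks like this: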

import numpy as np

filename = "test_ioc.csv"

# open file
f=open(filename)

# initialize this 
myfile = [] 

# Convert to numpy array
mat = np.vstack([signal for signal in f.readlines()])
print(mat)

Alternatively, I also tried this:

[code block missing from the source]

The first version prints the following:

print a
    [['2043l0.wav,0.115,0.169,0.222,0.23,2043l0.wav,0.21,0.169,0.238,0.23']
 [ 'dn2001l0.wav,0.105,0.161,0.242,0.222,dn2001l0.wav,0.153,0.176,0.207,0.207']
 ['2694l0.wav,0.13,0.192,0.33,0.314,2694l0.wav,0.192,0.184,0.207,0.238']
 ['2641l0.wav,0.123,0.146,0,0.407,2641l0.wav,0.199,0.199,0.199,0.176']
 ['2622l0.wav,0.284,0.353,0.582,0.582,2622l0.wav,0.268,0.161,0.176,0.184']
 ['dn2047l0.wav,0.12,0.23,0.368,0.322,dn2047l0.wav,0.369,0.169,0.207,0.222']]

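What I really need is to split each row further into 2 groups of 4 and convert every number in the row to a float. I'm new to Python and just want to be able to do some basic operations on my data and plot it with matplotlib. Thanks for your help!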


2 answers

OK, I figured out how to solve my problem using pandas.

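import pandas as pd

filename = 'test_ioc.csv'
# Column names; the two file-name columns get distinct names because
# recent pandas versions reject duplicate entries in `names`.
headings = ['filename1', 'l0', 'l1', 'l2', 'l3', 'filename2', 'r0', 'r1', 'r2', 'r3']

# Read the CSV (it has no header row) into a DataFrame
data = pd.read_csv(filename, names=headings)

print(data)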
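
To get from the DataFrame to the two groups of four floats the question asks for, something like the following may work. It is only a sketch: the sample rows are copied from the question and read from an in-memory string instead of `test_ioc.csv`, and the column names (`filename1`, `filename2`, `l0`..`l3`, `r0`..`r3`) are illustrative choices, with the two file-name columns given distinct names.

```python
import io

import pandas as pd

# Two sample rows copied from the question; a real script would pass the file path.
sample = io.StringIO(
    "2043l0.wav,0.115,0.169,0.222,0.23,2043l0.wav,0.21,0.169,0.238,0.23\n"
    "dn2001l0.wav,0.105,0.161,0.242,0.222,dn2001l0.wav,0.153,0.176,0.207,0.207\n"
)

# Illustrative column names (not part of the CSV itself).
headings = ["filename1", "l0", "l1", "l2", "l3",
            "filename2", "r0", "r1", "r2", "r3"]
data = pd.read_csv(sample, names=headings)

# Pull each group of four numeric columns out as a plain float NumPy array.
left = data[["l0", "l1", "l2", "l3"]].to_numpy()
right = data[["r0", "r1", "r2", "r3"]].to_numpy()

print(left.shape, left.dtype)
print(right.mean(axis=1))  # per-row mean of the second group
```

pandas infers the float columns on its own, so no explicit conversion is needed before calling `to_numpy()`.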

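The problem is that you are reading each line as one whole string instead of splitting it on the commas, so every row ends up as a single-element array. You need to split on the delimiter to get the individual elements: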

# strip the trailing newline, then split each line on commas
mat = np.vstack([signal.strip().split(",") for signal in f])

Or let the csv library do the parsing:

[code block missing from the source]
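
As a rough illustration of csv-based parsing (a sketch, not the answer's original snippet), the standard-library `csv.reader` can split each row, after which the numeric fields can be converted to floats. The sample rows are copied from the question and read from an in-memory string; the slice positions assume the 10-column layout shown there.

```python
import csv
import io

# Two sample rows copied from the question; a real script would open the CSV file.
sample = io.StringIO(
    "2043l0.wav,0.115,0.169,0.222,0.23,2043l0.wav,0.21,0.169,0.238,0.23\n"
    "dn2001l0.wav,0.105,0.161,0.242,0.222,dn2001l0.wav,0.153,0.176,0.207,0.207\n"
)

# csv.reader yields each row as a list of strings, already split on commas.
rows = list(csv.reader(sample))

# Column 0 and column 5 hold file names; columns 1-4 and 6-9 hold the numbers.
names = [row[0] for row in rows]
left = [[float(x) for x in row[1:5]] for row in rows]
right = [[float(x) for x in row[6:10]] for row in rows]

print(names)
print(left[0])
print(right[0])
```

The resulting lists of floats can be passed to `np.array` if NumPy arrays are needed.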

But there is a simpler way to read the file, using np.loadtxt:

import numpy as np

arr = np.loadtxt("in.csv", delimiter=",", dtype=object)

print(arr)

That gives you an array like this:

[['2043l0.wav' '0.115' '0.169' '0.222' '0.23' '2043l0.wav' '0.21' '0.169'
  '0.238' '0.23']
 ['dn2001l0.wav' '0.105' '0.161' '0.242' '0.222' 'dn2001l0.wav' '0.153'
  '0.176' '0.207' '0.207']
 ['2694l0.wav' '0.13' '0.192' '0.33' '0.314' '2694l0.wav' '0.192' '0.184'
  '0.207' '0.238']
 ['2641l0.wav' '0.123' '0.146' '0' '0.407' '2641l0.wav' '0.199' '0.199'
  '0.199' '0.176']
 ['2622l0.wav' '0.284' '0.353' '0.582' '0.582' '2622l0.wav' '0.268' '0.161'
  '0.176' '0.184']
 ['dn2047l0.wav' '0.12' '0.23' '0.368' '0.322' 'dn2047l0.wav' '0.369'
  '0.169' '0.207' '0.222']]
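
Every element in that array is still a string, so the numeric columns need converting before any arithmetic. A minimal sketch of one way to do this, assuming the column layout from the question (file name, four numbers, file name, four numbers) and using `dtype=str` on an in-memory copy of two sample rows:

```python
import io

import numpy as np

# Two sample rows copied from the question; a real script would pass the file path.
sample = io.StringIO(
    "2043l0.wav,0.115,0.169,0.222,0.23,2043l0.wav,0.21,0.169,0.238,0.23\n"
    "dn2001l0.wav,0.105,0.161,0.242,0.222,dn2001l0.wav,0.153,0.176,0.207,0.207\n"
)

# Load everything as strings first, since the columns mix text and numbers.
arr = np.loadtxt(sample, delimiter=",", dtype=str)

names = arr[:, 0]                    # first file-name column
left = arr[:, 1:5].astype(float)     # first group of four numbers
right = arr[:, 6:10].astype(float)   # second group of four numbers

print(left.mean(axis=0))  # column means of the first group
print(right - left)       # element-wise difference between the groups
```

Slicing by position keeps the code short, at the cost of relying on the column order never changing.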

There is also np.genfromtxt, which offers more options, including creating a structured array.

import numpy as np

# Field names and types for the structured array. Both string fields are wide
# enough for the longest file name, so no name gets truncated.
headings = [('filename1', "U20"), ('l0', float), ('l1', float), ('l2', float), ('l3', float),
            ('filename2', "U20"), ('r0', float), ('r1', float), ('r2', float), ('r3', float)]

arr = np.genfromtxt("in.csv", delimiter=",", dtype=headings)

print(arr)
[('2043l0.wav', 0.115, 0.169, 0.222, 0.23, '2043l0.wav', 0.21, 0.169, 0.238, 0.23)
 ('dn2001l0.wav', 0.105, 0.161, 0.242, 0.222, 'dn2001l0.wav', 0.153, 0.176, 0.207, 0.207)
 ('2694l0.wav', 0.13, 0.192, 0.33, 0.314, '2694l0.wav', 0.192, 0.184, 0.207, 0.238)
 ('2641l0.wav', 0.123, 0.146, 0.0, 0.407, '2641l0.wav', 0.199, 0.199, 0.199, 0.176)
 ('2622l0.wav', 0.284, 0.353, 0.582, 0.582, '2622l0.wav', 0.268, 0.161, 0.176, 0.184)
 ('dn2047l0.wav', 0.12, 0.23, 0.368, 0.322, 'dn2047l0.wav', 0.369, 0.169, 0.207, 0.222)]

You can also look up columns by name, as in pandas, e.g. arr["filename1"].
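
Building on the by-name lookup, the two groups of four floats from the question could be assembled roughly as follows. This is a sketch under assumptions: the sample rows are copied from the question and read from an in-memory string, the field names follow the ones used in this answer, and the string fields use a Unicode type (`"U20"`) so the names come back as ordinary `str` values.

```python
import io

import numpy as np

# Two sample rows copied from the question; a real script would pass the file path.
sample = io.StringIO(
    "2043l0.wav,0.115,0.169,0.222,0.23,2043l0.wav,0.21,0.169,0.238,0.23\n"
    "dn2001l0.wav,0.105,0.161,0.242,0.222,dn2001l0.wav,0.153,0.176,0.207,0.207\n"
)

fields = [("filename1", "U20"), ("l0", float), ("l1", float), ("l2", float), ("l3", float),
          ("filename2", "U20"), ("r0", float), ("r1", float), ("r2", float), ("r3", float)]
arr = np.genfromtxt(sample, delimiter=",", dtype=fields)

# Stack the named float fields side by side into ordinary 2-D float arrays.
left = np.column_stack([arr[name] for name in ("l0", "l1", "l2", "l3")])
right = np.column_stack([arr[name] for name in ("r0", "r1", "r2", "r3")])

print(arr["filename1"])
print(left.shape)
print(left.mean(axis=1))  # per-row mean of the first group
```

Arrays such as `left` and `right` support the usual NumPy arithmetic and can be handed to matplotlib for plotting.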
