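I am reading in a large file, and to save memory I need to specify the data type for each column in the DataFrame. I want to take the types from a list of dtypes that I have already created.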
import pandas as pd
headers=['Record Identifier','Respondent_ID','Agency Code','Loan Type','Property Type','Loan Purpose','Owner Occupancy',
'Loan Amount','Preapprovals','Type of Action Taken','Metropolitan Statistical Area/Metropolitan Division','State Code',
'County Code','Census Tract','Applicant Ethnicity','Co-applicant Ethnicity','Applicant Race: 1','Applicant Race: 2',
'Applicant Race: 3','Applicant Race: 4','Applicant Race: 5','Co-applicant Race: 1','Co-applicant Race: 2',
'Co-applicant Race: 3','Co-applicant Race: 4','Co-applicant Race: 5','Applicant Sex','Co-applicant Sex',
'Applicant Income','Type of Purchaser','Denial Reason: 1','Denial Reason: 2','Denial Reason: 3','Rate Spread',
'HOEPA Status','Lien Status','Population','Minority Population %','FFIEC Median Family Income',
'Tract to MSA/MD Median Family Income %','Number of Owner Occupied Units','Number of 1- to 4-Family units']
dtypes=['int64','object','int64','int64','int64','int64','int64','int64','int64','int64','object','object','object','object',
'int64','int64','int64','int64','int64','int64','int64','int64','int64','int64','int64','int64','int64','int64',
'object','int64','int64','int64','int64','object','object','object','object','float64','int64','float64','int64',
'int64']
df = pd.read_csv('2017_lar.txt', sep="|", header=None, names=headers, dtype=dtypes, nrows=100)
print(df)
Error: TypeError: data type not understood
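You are using the dtype parameter incorrectly. It accepts either a single type name, which is applied to every column, or a dict that maps column names to types. The documentation for read_csv states this explicitly.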
Because you are passing a list, pandas treats the whole list as a single data type, and that cannot be interpreted as one.
The correct usage is to pass a dict that pairs each column name with its type.
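The original answer's code did not survive on this page, so the following is only a minimal sketch of the dict approach, not the answerer's exact code. It assumes the two lists are in the same order and have the same length. The three-column in-memory sample and the name dtype_map are hypothetical and stand in for 2017_lar.txt and the full lists from the question.

```python
import io
import pandas as pd

# Hypothetical three-column sample standing in for 2017_lar.txt
sample = io.StringIO("1|ABC123|9\n2|DEF456|7\n")

# Shortened versions of the lists from the question
headers = ['Record Identifier', 'Respondent_ID', 'Agency Code']
dtypes = ['int64', 'object', 'int64']

# Pair each column name with its type to build the dict that dtype expects
dtype_map = dict(zip(headers, dtypes))

df = pd.read_csv(sample, sep="|", header=None, names=headers, dtype=dtype_map)
print(df.dtypes)
```

With the full lists from the question, the same change would be dtype=dict(zip(headers, dtypes)) in the existing read_csv call.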