我试图创建一个数据帧,其中一个字段是用函数计算的。为此,我使用以下代码:
import pandas as pd
def didSurvive(sex):
return int(sex == "female")
titanic_df = pd.read_csv("test.csv")
submission = pd.DataFrame({
"PassengerId": titanic_df["PassengerId"],
"Survived": didSurvive(titanic_df["Sex"])
})
submission.to_csv('titanic-predictions.csv', index=False)
运行此代码时,出现以下错误:
D:\Documents\kaggle\titanic>python predictor.py
File "predictor.py", line 3
def didSurvive() {
^
SyntaxError: invalid syntax
D:\Documents\kaggle\titanic>python predictor.py
D:\Documents\kaggle\titanic>python predictor.py
D:\Documents\kaggle\titanic>python predictor.py
Traceback (most recent call last):
File "predictor.py", line 10, in
"Survived": didSurvive(titanic_df["Sex"])
File "predictor.py", line 4, in didSurvive
return int(sex == "female")
File "C:\Python34\lib\site-packages\pandas\core\series.py", line 92, in wrapper
"{0}".format(str(converter)))
TypeError: cannot convert the series to
D:\Documents\kaggle\titanic>
我想我正在尝试对一系列布尔函数而不是单个布尔函数运行int()。我该怎么解决这个问题?在
要转换序列的数据类型,可以使用
astype()
函数,这应该可以:也可以在从csv文件导入期间重新格式化数据
相关问题 更多 >
编程相关推荐