Unknown character problem

User

Answer 1 · edited 2024-10-02 00:43:42

You cannot use the euro sign in a variable name:

Identifiers (also referred to as names) are described by the following lexical definitions:

identifier ::=  (letter|"_") (letter | digit | "_")*
letter     ::=  lowercase | uppercase
lowercase  ::=  "a"..."z"
uppercase  ::=  "A"..."Z"
digit      ::=  "0"..."9"

You need to use a string as the column key, for example df['price_€'] rather than df[price_€].
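A quick way to check the identifier rule is `str.isidentifier()` in Python 3. The sketch below is illustrative only: the candidate names are example strings, not taken from the asker's code. Python 3 accepts many Unicode letters in names, but € is a currency symbol (Unicode category Sc), so it is still rejected.

```python
import unicodedata

# Example names only; they are not from the original question.
candidates = ["price", "_price2", "price_€", "€"]

# str.isidentifier() reports whether a string is a legal Python 3 name.
valid = {name: name.isidentifier() for name in candidates}

# The euro sign is in Unicode category "Sc" (currency symbol),
# which is not permitted anywhere in an identifier.
euro_category = unicodedata.category("€")

for name, ok in valid.items():
    print(name, "->", "valid" if ok else "invalid")
print("category of €:", euro_category)
```

As a string, 'price_€' is fine as a dictionary key or column label; it only fails when written as a bare name.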

pandas itself has no problem with the euro sign for me:

import pandas as pd

df = pd.DataFrame([[1, 2]], columns=["£", "€"])

print(df["€"])
print(df["£"])
0    2
Name: €, dtype: int64
0    1
Name: £, dtype: int64

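The file is cp1252-encoded, so you need to specify the encoding:

import pandas as pd

# The € sign is stored as the single byte 0x80 in cp1252, which is not valid UTF-8.
df = pd.read_csv("PPR-2015.csv", header=0, encoding="cp1252")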

print(df.columns)
Index([u'Date of Sale (dd/mm/yyyy)', u'Address', u'Postal Code', u'County', 
u'Price (€)', u'Not Full Market Price', u'VAT Exclusive', u'Description of Property', u'Property Size Description'], dtype='object')

print(df[u'Price (€)'])
0     €138,000.00
1     €270,000.00
2      €67,000.00
3     €900,000.00
4     €176,000.00
5     €155,000.00
6     €100,000.00
7     €120,000.00
8     €470,000.00
9     €140,000.00
10    €592,000.00
11     €85,000.00
12    €422,500.00
13    €225,000.00
14     €55,000.00
...
17433    €262,000.00
17434    €155,000.00
17435    €750,000.00
17436     €96,291.69
17437    €112,000.00
17438    €350,000.00
17439    €190,000.00
17440     €25,000.00
17441    €100,000.00
17442     €75,000.00
17443     €46,000.00
17444    €175,000.00
17445     €48,500.00
17446    €150,000.00
17447    €400,000.00
Name: Price (€), Length: 17448, dtype: object
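To see why the encoding argument matters, here is a minimal standard-library sketch. It only uses the header text "Price (€)" from the output above; it does not read the actual CSV file.

```python
# The same character has different byte representations in the two encodings.
euro_cp1252 = "€".encode("cp1252")  # one byte: 0x80
euro_utf8 = "€".encode("utf-8")     # three bytes: 0xE2 0x82 0xAC

# Simulate the raw bytes of the column header as stored in a cp1252 file.
header = "Price (€)".encode("cp1252")

# Decoding those bytes as UTF-8 fails, because 0x80 cannot start a UTF-8 sequence.
try:
    header.decode("utf-8")
    decoded_as_utf8 = True
except UnicodeDecodeError:
    decoded_as_utf8 = False

print(euro_cp1252, euro_utf8)
print(header.decode("cp1252"), decoded_as_utf8)
```

Decoding with the encoding the file was actually written in recovers the header intact, which is what encoding="cp1252" asks read_csv to do.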

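Then convert to float:

# Strip the euro sign and the thousands separators, then cast.
# regex=True is needed on pandas >= 2.0, where str.replace defaults to literal matching.
df[u'Price (€)'] = df[u'Price (€)'].str.replace(u'[€,]', u'', regex=True).astype('float')

print(df[u'Price (€)'])

Output: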

0     138000
1     270000
2      67000
3     900000
4     176000
5     155000
6     100000
7     120000
8     470000
9     140000
10    592000
11     85000
12    422500
13    225000
14     55000
...
17433    262000.00
17434    155000.00
17435    750000.00
17436     96291.69
17437    112000.00
17438    350000.00
17439    190000.00
17440     25000.00
17441    100000.00
17442     75000.00
17443     46000.00
17444    175000.00
17445     48500.00
17446    150000.00
17447    400000.00
Name: Price (€), Length: 17448, dtype: float64
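If some rows might not be clean prices, a slightly more defensive variant is possible. This is a sketch, not the answerer's method: `parse_euro_prices` is a hypothetical helper, the first three values are copied from the output above, and the "n/a" entry is made up to show the failure case.

```python
import pandas as pd

# Three values from the output above plus one deliberately bad, invented entry.
prices = pd.Series(["€138,000.00", "€96,291.69", "€25,000.00", "n/a"])


def parse_euro_prices(series):
    """Hypothetical helper: strip '€' and ',' and convert to float.

    Values that still cannot be parsed become NaN instead of raising.
    """
    cleaned = series.str.replace(r"[€,]", "", regex=True)
    return pd.to_numeric(cleaned, errors="coerce")


parsed = parse_euro_prices(prices)
print(parsed)
```

With astype('float') the invented "n/a" row would raise a ValueError; errors="coerce" turns it into NaN so the bad rows can be inspected afterwards with parsed.isna().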

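User

Answer 2 · edited 2024-10-02 00:43:42

Filter the string with a lambda that keeps only printable characters:

import string
s = "some\x00string. with\x15 funny characters"
# In Python 3, filter() returns an iterator, so join the result back into a str.
print("".join(filter(lambda x: x in string.printable, s)))

Output:

somestring. with funny characters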
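One caveat worth checking before applying this filter to the price data: `string.printable` contains ASCII characters only, so it would also remove the euro sign. The sketch below compares it with `str.isprintable()`, which is Unicode-aware. The sample string is invented for illustration.

```python
import string

# Invented sample: a header-like string with two control characters embedded.
s = "Price (€)\x00 138,000\x15"

# ASCII-only filter, as in the answer above: this drops the euro sign too.
ascii_only = "".join(ch for ch in s if ch in string.printable)

# Unicode-aware filter: keeps € but still drops the control characters.
unicode_printable = "".join(ch for ch in s if ch.isprintable())

print(repr(ascii_only))
print(repr(unicode_printable))
```

Note that `str.isprintable()` also treats tabs and newlines as non-printable, so it suits single field values better than whole multi-line texts.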

User

Answer 3 · edited 2024-10-02 00:43:42

You should use rename to rename the column:

In [189]:
df = pd.DataFrame(columns = ['price_€'])
df

Out[189]:
Empty DataFrame
Columns: [price_€]
Index: []

In [191]:
df.rename(columns = {'price_€':'price'},inplace=True)
df

Out[191]:
Empty DataFrame
Columns: [price]
Index: []
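The same rename can be written without inplace=True, and `rename` also accepts a function when several columns need the same clean-up. This is a sketch with made-up data; the clean-up rule in the lambda is a hypothetical example, not something the answer prescribes.

```python
import pandas as pd

# Made-up frame with the column name used in the answer above.
df = pd.DataFrame({"price_€": [1, 2]})

# Mapping form: returns a new frame and leaves df untouched.
renamed = df.rename(columns={"price_€": "price"})

# Callable form (hypothetical rule): drop "€", then trim leftover "_" or spaces.
cleaned = df.rename(columns=lambda name: name.replace("€", "").strip("_ "))

print(list(df.columns), list(renamed.columns), list(cleaned.columns))

# Once the label is a valid identifier, attribute access works as well.
print(renamed.price.tolist())
```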

Also, df[price_€] is not a valid way to select a column; you need to pass a string, so df['price_€'] is the correct form.

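As for your use of df[0]: it is not clear what you are trying to do there. df[0] will again raise a KeyError, because it looks up a column labelled 0; to index a column you need to pass its name as a string.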
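The difference between label-based and position-based selection can be sketched as follows. The frame is made up for illustration, and `iloc` is offered as an assumption about what df[0] may have been meant to do (select the first column).

```python
import pandas as pd

# Made-up frame whose only column is labelled with a string.
df = pd.DataFrame({"price_€": [10, 20]})

# df[0] searches the column labels for 0, finds none, and raises KeyError.
try:
    df[0]
    raised = False
except KeyError:
    raised = True

# Select by label with the column name as a string...
by_label = df["price_€"]

# ...or by position with iloc, if "the first column" was the intent.
by_position = df.iloc[:, 0]

print(raised, by_label.tolist(), by_position.tolist())
```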

I also don't understand why you want to convert this column to a float; you haven't explained that part.
