I want to pick the first 4 words of each row in a column and, based on that value, assign a new value to a newly created column using Python

Posted 2024-09-30 14:19:28


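I want to create a new column called "Parking Type" and set its value to "Meter", "Ticket" or "Other" depending on another column called "Sign". The "Sign" column holds strings: some contain MTR, some contain TKT, and some contain neither. So I want to put "Meter" in the "Parking Type" column whenever the "Sign" value in that row contains the string "MTR", and so on. This is what I did: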

pSignInfringe['Parking Type'] = pSignInfringe.Sign.apply(lambda x: "Meter" if x == "1P MTR M-SAT 7:30-19:30" or x == "1/2P MTR SAT 7:30-1930" else "Ticket")

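However, this would need far too many "or" conditions. Is there a better way? I am new to Python, so I apologise if this is a beginner question. The DataFrame data is shown below: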

,Area Name,Street Name,Between Street 1,Between Street 2,Side Of Street,Street Marker,Arrival Time,Departure Time,Duration of Parking Event (in seconds),Sign,In Violation?,Street ID,Device ID,Month Number
8,City Square,FLINDERS STREET,SWANSTON STREET,RUSSELL STREET,3,1630N,2012-05-19 18:20:01,2012-05-19 19:19:58,3597,1/2P MTR SAT 7:30-1930,1,670,1123,5
10,Chinatown,RUSSELL STREET,Lt BOURKE STREET,BOURKE STREET,2,770E,2012-02-25 18:30:31,2012-02-25 21:02:36,9125,2P DIS M-SUN 0:00-23:59,1,1221,504,2
11,Princes Theatre,LONSDALE STREET,RUSSELL STREET,EXHIBITION STREET,1,C2858,2011-11-17 09:00:00,2011-11-17 10:41:06,6066,1P MTR M-SAT 7:30-19:30,1,894,1996,11
15,Southbank,COVENTRY STREET,DODDS STREET,WELLS STREET,4,9317S,2012-02-20 13:50:40,2012-02-20 16:33:33,9773,2P TKT A M-F 7:30-18:30,1,547,4054,2
28,Queensberry,VICTORIA STREET,KING STREET,HAWKE STREET,3,7642N,2012-02-15 11:32:34,2012-02-15 12:09:35,2221,1/4P M-SAT 7:30-18:30,1,1381,4001,2
30,Rialto,COLLINS STREET,KING STREET,WILLIAM STREET,3,2066N,2012-09-03 09:24:51,2012-09-03 10:45:41,4850,1/2P M-SAT 7:30-19:30,1,528,1290,9
45,Victoria Market,FRANKLIN STREET,QUEEN STREET,ELIZABETH STREET,1,C6628,2011-11-11 17:42:32,2011-11-11 19:50:44,7692,2P MTR M-SAT 7:30-20:30,1,681,2812,11
53,Hardware,LONSDALE STREET,QUEEN STREET,ELIZABETH STREET,1,C2942,2012-05-05 13:17:55,2012-05-05 14:59:35,6100,1P MTR M-SAT 7:30-19:30,1,894,2019,5
55,Hyatt,EXHIBITION STREET,Lt COLLINS STREET,COLLINS STREET,1,C364,2011-01-11 08:11:48,2011-01-11 16:48:39,31011,1P MTR M-SAT 7:30-19:30,1,647,243,1
56,Banks,QUEEN STREET,FLINDERS LANE,FLINDERS STREET,5,975W,2012-03-03 12:53:27,2012-03-03 14:06:27,4380,1P MTR M-SAT 7:30-19:30,1,1171,693,3

3 Answers

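Try a for loop. You could also do this with a list comprehension, but I don't recommend that since you are just starting out with Python.

I have written some code based on your description.

Take a look and let me know whether it works.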

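# Collect one label per row, then assign the whole column once
parking_types = []
for item in pSignInfringe['Sign']:
    if "MTR" in item:
        parking_types.append("Meter")
    elif "TKT" in item:
        parking_types.append("Ticket")
    else:
        parking_types.append("Other")
pSignInfringe['Parking Type'] = parking_types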

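If the desired "Parking Type" value depends only on whether "MTR" is present, you may find this better. It covers every case where MTR appears in the Sign field, without hard-coding all the possible values.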

pSignInfringe['Parking Type'] = pSignInfringe.Sign.apply(lambda x: "Meter" if 'MTR' in x else "Ticket")
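The question also asks for an "Other" category, which the one-line lambda above does not produce. A minimal sketch of one way to extend it, assuming a small stand-in DataFrame built from Sign values in the question and a hypothetical helper named classify_sign (not part of the original answer):

```python
import pandas as pd

# Stand-in data: three Sign values copied from the question's sample rows
pSignInfringe = pd.DataFrame({
    "Sign": [
        "1/2P MTR SAT 7:30-1930",
        "2P TKT A M-F 7:30-18:30",
        "1/4P M-SAT 7:30-18:30",
    ]
})

def classify_sign(sign):
    """Hypothetical helper: map one Sign string to a parking type."""
    if "MTR" in sign:
        return "Meter"
    if "TKT" in sign:
        return "Ticket"
    return "Other"

# apply calls classify_sign once per row of the Sign column
pSignInfringe["Parking Type"] = pSignInfringe["Sign"].apply(classify_sign)
print(pSignInfringe)
```

A named function tends to stay readable as more categories are added, whereas a nested conditional lambda gets hard to follow quickly.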

You can use .str.contains, which returns a boolean Series with the same index as the DataFrame, and then use that Series as an indexer.

pSignInfringe.loc[
    pSignInfringe.Sign.str.contains('MTR'),
    'Parking Type'] = 'Meter'

Note that the pandas string accessor treats the pattern as a regular expression by default.
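For a plain token such as "MTR" the regex default makes no difference, but it matters once the pattern contains characters like "." that have a special meaning. A small sketch, using two Sign values from the question and an illustrative pattern "7.30" chosen only to show the effect:

```python
import pandas as pd

signs = pd.Series([
    "1P MTR M-SAT 7:30-19:30",
    "2P TKT A M-F 7:30-18:30",
])

# Default regex=True: "." matches any character, so "7.30" matches "7:30"
as_regex = signs.str.contains("7.30")

# regex=False: the pattern is taken literally, and no row contains "7.30"
as_literal = signs.str.contains("7.30", regex=False)

print(as_regex.tolist())
print(as_literal.tolist())
```

Passing regex=False is a reasonable habit whenever the search text is meant to be matched literally.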

Using .str.contains together with .loc avoids the generic apply call, which makes the code faster.
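The .loc assignment only fills the rows that match, so the remaining rows are left empty. One possible way to get all three labels from the question in a single vectorized step is numpy.select, sketched here on a small stand-in DataFrame (the sample rows and the variable names is_meter and is_ticket are assumptions for illustration):

```python
import numpy as np
import pandas as pd

# Stand-in data: three Sign values copied from the question's sample rows
pSignInfringe = pd.DataFrame({
    "Sign": [
        "1P MTR M-SAT 7:30-19:30",
        "2P TKT A M-F 7:30-18:30",
        "1/2P M-SAT 7:30-19:30",
    ]
})

# One boolean mask per category; regex=False matches the text literally
is_meter = pSignInfringe["Sign"].str.contains("MTR", regex=False)
is_ticket = pSignInfringe["Sign"].str.contains("TKT", regex=False)

# np.select takes the first matching condition per row, else the default
pSignInfringe["Parking Type"] = np.select(
    [is_meter, is_ticket],
    ["Meter", "Ticket"],
    default="Other",
)
print(pSignInfringe)
```

If the Sign column can contain missing values, str.contains also accepts na=False so that those rows fall through to "Other".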
