Replace digits in string values based on their digit count in Python?

Posted 2024-10-01 17:28:29


df:
    Col_A        Month
0 March 2020      Mar
1 March 20        Mar
2 Ebg 2020        Mar
3 17 GOFE         Mar
4 APR 17          Mar
5 16 HGN          Nov
6 2015 ref        May
7 18Jun           Jul

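How can I replace the digits in a string column of a DataFrame? For example, I need to replace the digits in Col_A with 2019 or 19: if the run of digits in Col_A has length 4, use 2019; otherwise use 19.

Expected output: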

    Col_A        Month
0 March 2019      Mar
1 March 19        Mar
2 Ebg 2019        Mar
3 19 GOFE         Mar
4 APR 19          Mar
5 19 HGN          Nov
6 2019 ref        May
7 19Jun           Jul

2 answers

Here is how to do it with re:

import pandas as pd
from re import sub, findall

df = pd.DataFrame(...)
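# Raw strings keep '\d' from being read as a string escape.
# A string with a four-digit run gets '2019'; any other string gets '19' for each two-digit run.
df['Col_A'] = [sub(r'\d\d\d\d', '2019', m)
               if findall(r'\d\d\d\d', m)
               else sub(r'\d\d', '19', m)
               for m in df['Col_A']]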

Update: another way:

import pandas as pd
from re import sub, findall

df = pd.DataFrame(...)

df['Col_A'] = df.Col_A.map(lambda m: sub('[0-9]{4}','2019',m)
                           if findall('[0-9]{4}',m)
                           else sub('[0-9]{2}','19',m))
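
The snippets above leave the DataFrame elided. A minimal self-contained sketch, assuming the frame is built from the sample rows in the question and using a hypothetical helper named `replace_year` for the same branching logic, could look like this:

```python
import re

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Col_A": ["March 2020", "March 20", "Ebg 2020", "17 GOFE",
              "APR 17", "16 HGN", "2015 ref", "18Jun"],
    "Month": ["Mar", "Mar", "Mar", "Mar", "Mar", "Nov", "May", "Jul"],
})


def replace_year(text):
    """Hypothetical helper: '2019' for a four-digit run, otherwise '19'."""
    if re.search(r"\d{4}", text):
        return re.sub(r"\d{4}", "2019", text)
    return re.sub(r"\d{2}", "19", text)


df["Col_A"] = df["Col_A"].map(replace_year)
print(df["Col_A"].tolist())
# ['March 2019', 'March 19', 'Ebg 2019', '19 GOFE',
#  'APR 19', '19 HGN', '2019 ref', '19Jun']
```

`re.search` is used here only to test for a four-digit run; it behaves like the `findall` check above for that purpose.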


Update: cs95 suggested the following shorter solution:

import pandas as pd

df = pd.DataFrame(...)

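# regex=True is needed in pandas 2.0+, where str.replace treats the pattern as a literal string by default
df['Col_A'] = df['Col_A'].str.replace(r'\d{2}(?!\d)', '19', regex=True)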
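
To see what the short pattern does, here is a small sketch on the question's sample values. The extra row `"567 GOFE"` is not from the question; it is added as an assumption to show that the pattern replaces the last two digits of any digit run, whatever its length.

```python
import pandas as pd

# Values from the question, plus one illustrative three-digit row (an assumption).
col = pd.Series(["March 2020", "March 20", "Ebg 2020", "17 GOFE",
                 "APR 17", "16 HGN", "2015 ref", "18Jun", "567 GOFE"])

# \d{2}(?!\d) matches two digits that are not followed by another digit,
# i.e. the last two digits of each run of digits.
result = col.str.replace(r"\d{2}(?!\d)", "19", regex=True)
print(result.tolist())
# ['March 2019', 'March 19', 'Ebg 2019', '19 GOFE', 'APR 19',
#  '19 HGN', '2019 ref', '19Jun', '519 GOFE']
```

For the two- and four-digit values in the question this gives the requested output; whether `'519 GOFE'` is acceptable for other lengths depends on the data.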

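Since the examples in the question contain two-digit and four-digit strings, I assume that the last two digits of a four-digit string are to be replaced with "19", and that a two-digit string is to be replaced with "19".

The following regular expression can be used with re.sub to make these replacements: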

r'(?<!\d)(?=\d{2}(?!\d))\d{2}|(?<=(?<!\d)\d{2})\d{2}(?!\d)'

The strings:

1 2 GOFE          Mar
2 23 GOFE         Mar
3 567 GOFE        Mar
4 5678 GOFE       Mar
5 3456789 GOFE    Mar

become, respectively:

1 2 GOFE          Mar
2 19 GOFE         Mar
3 567 GOFE        Mar
4 5619 GOFE       Mar
5 3456789 GOFE    Mar
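
The behaviour on these strings can be checked with a short sketch. The row indices and the Month column are dropped here for brevity, and the names `PATTERN`, `samples` and `results` are only illustrative:

```python
import re

# The regular expression from this answer, unchanged.
PATTERN = r'(?<!\d)(?=\d{2}(?!\d))\d{2}|(?<=(?<!\d)\d{2})\d{2}(?!\d)'

samples = ["2 GOFE", "23 GOFE", "567 GOFE", "5678 GOFE", "3456789 GOFE"]
results = [re.sub(PATTERN, "19", s) for s in samples]

for before, after in zip(samples, results):
    print(f"{before!r} -> {after!r}")
# '2 GOFE' -> '2 GOFE'
# '23 GOFE' -> '19 GOFE'
# '567 GOFE' -> '567 GOFE'
# '5678 GOFE' -> '5619 GOFE'
# '3456789 GOFE' -> '3456789 GOFE'
```

Only digit runs of exactly two or exactly four digits are changed; runs of any other length are left alone.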


Python's regex engine performs the following operations:

(?<!\d)   : use negative lookbehind to assert previous
            character is not a digit
(?=       : begin positive lookahead
  \d{2}   : match 2 digits
  (?!\d)  : use negative lookahead to assert next
            character is not a digit 
)         : end positive lookahead
\d{2}     : match 2 digits
|         : or
(?<=      : begin positive lookbehind
  (?<!\d) : use negative lookbehind to assert previous
            character is not a digit
  \d{2}   : match 2 digits
)         : end positive lookbehind
\d{2}     : match 2 digits
(?!\d)    : use negative lookahead to assert next
            character is not a digit 
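
The same breakdown can be kept next to the pattern itself by compiling it with `re.VERBOSE`, which ignores whitespace and allows `#` comments inside the pattern. This is a sketch of that idea, not part of the original answer; the name `VERBOSE_PATTERN` is an assumption.

```python
import re

VERBOSE_PATTERN = re.compile(r"""
    (?<!\d)          # previous character is not a digit
    (?=              # begin positive lookahead
        \d{2}        #   two digits ...
        (?!\d)       #   ... not followed by another digit
    )                # end positive lookahead
    \d{2}            # match the two digits
  |                  # or
    (?<=             # begin positive lookbehind
        (?<!\d)      #   previous character is not a digit
        \d{2}        #   two digits
    )                # end positive lookbehind
    \d{2}            # match the next two digits
    (?!\d)           # next character is not a digit
""", re.VERBOSE)

print(VERBOSE_PATTERN.sub("19", "Ebg 2020"))   # Ebg 2019
print(VERBOSE_PATTERN.sub("19", "18Jun"))      # 19Jun
print(VERBOSE_PATTERN.sub("19", "567 GOFE"))   # 567 GOFE
```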
