How can I use the phonenumbers Python library to get all phone numbers in each row of a DataFrame?

Posted 2024-06-26 01:45:35


I want to use Python's phonenumbers library to create a column that holds all the valid phone numbers found in each row of the text column of a DataFrame.

import pandas as pd
import phonenumbers

# Sample complaint texts; only some of them contain a valid phone number
complains = ['If you validate your data, your confirmation number is 1-23-456-789, for a teacher you will be debited on the 3rd of each month 41.99, you will pay for the remaining 3 services offered:n/a',
             'EMAIL VERIFYED, 12345 1st STUDENT 400 88888 2nd STUDENT 166.93 Your request has been submitted and your confirmation number is 1-234-567-777 speed is increased to 250MB $80.99 BILLING CYCLE 18',
             'ADJUSTMENT FROM NOVEMBER TO MAY $80.99 Appointment for equipment change 7878940142']

complainsdf = pd.DataFrame(complains, index=['1', '2', '3'], columns=['text'])

I tried the code below, but I did not get the result I expected.

complainsdf['tel'] = complainsdf.apply(lambda row: 
    phonenumbers.PhoneNumberMatcher(row['text'], "US"), axis=1)

complainsdf['tel'][0] gives the following output: <phonenumbers.phonenumbermatcher.PhoneNumberMatcher at 0x2623ebfddf0> instead of the expected phone numbers.


1 answer
User
#1 · Posted 2024-06-26 01:45:35

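Each row of tel can contain several phone numbers. Your code does not store the numbers themselves: it stores an object of type phonenumbers.PhoneNumberMatcher, which is an iterator that yields one match per phone number found in the text.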

To extract the actual phone numbers, you have to loop over that object. For example, you can do the following:

def get_phone_numbers(x):
    # Extract the phone numbers from the text
    nums = phonenumbers.PhoneNumberMatcher(x, "US")
    # Convert the phone number format
    return [phonenumbers.format_number(num.number, phonenumbers.PhoneNumberFormat.E164) for num in nums]

complainsdf['tel'] = complainsdf['text'].apply(get_phone_numbers)
complainsdf

                                                 text   tel
1   If you validate your data, your confirmation n...   []
2   EMAIL VERIFYED, 12345 1st STUDENT 400 88888 2n...   []
3   ADJUSTMENT FROM NOVEMBER TO MAY $80.99 Appoint...   [+17878940142]

I found the approach of converting the format with PhoneNumberFormat.E164 in the documentation. You may need to adapt it to your case.
