I have some rows in a data frame, and I want to select those that contain any of the numbers/numeric values found in the following string:
text = "The previous low was 27,523. The 1.35 trillion ($22.5 million ) program could start in October. The number of people who left the country plunged 99.8 percent from a year earlier to 2,750, according to the data from the agency."
The data frame:
Account Sentences
51343 The subsidies are expected to form part of a second budget.
6376 The subsidies, totalling 2.35tn, are expected to form part of a second budget. New plans to allocate $22.5 billion to a new reimbursement programme.
31 The subsidies, totalling 1.35tn, are expected to form part of a second budget. New plans to allocate $22.5 billion to a new reimbursement programme.
2624 The way to a sports fan’s heart? Behind-the-scenes content from their favourite teams.
613 The subsidies, totalling 1.43 tn, are expected to form part of a second budget. New plans to allocate $21.5 billion to a new reimbursement programme.
764 The subsidies, totalling 1.35tn, are expected to form part of a second budget. New plans to allocate $22.5 billion to a new reimbursement programme.
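The desired output would be to create three columns.
The first thing I tried was to replace all commas with periods, to avoid mixing up the numbers in text with those in the rows of the Sentences column.
Then I extract all the numbers from text so that they can be compared with each numeric value in the rows: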
import re

# Raw strings keep the backslashes in the pattern intact
numb = re.findall(r"\d+[,.\d]\d+", text)
for i in df['Sentences']:
    print(re.findall(r"\d+[,.\d]\d+", i))
The numbers to compare against each row of Sentences are: 27.523, 1.35, 22.5, 2.750, 99.8 (note that the commas should be converted to periods).
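The two steps described above (extract the numbers, then turn commas into periods) can be combined in one helper. This is only a minimal sketch: `extract_numbers` is a hypothetical name, the pattern is the one from the question, and the numbers are kept as strings so that values such as 2.750 keep their trailing zero.

```python
import re

# Same pattern as in the question, written as a raw string.
NUMBER_RE = re.compile(r"\d+[,.\d]\d+")

def extract_numbers(s):
    """Return the numbers found in s as strings, with commas turned into periods."""
    return [m.replace(",", ".") for m in NUMBER_RE.findall(s)]

text = ("The previous low was 27,523. The 1.35 trillion ($22.5 million ) program "
        "could start in October. The number of people who left the country plunged "
        "99.8 percent from a year earlier to 2,750, according to the data from the agency.")

numb = extract_numbers(text)
print(numb)  # ['27.523', '1.35', '22.5', '99.8', '2.750']
```

Note that this pattern needs at least three characters per match, so one- and two-digit integers are skipped.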
Now I need to create the new columns holding the results I want to obtain:
Account  Common        Difference                                       Match?
51343                  {27.523, 1.35, 22.5, 2.750, 99.8}                0
6376     22.5          2.35                                             0.5
31       {1.35, 22.5}  {27.523, 2.750, 99.8}                            0.5
2624                   {27.523, 1.35, 22.5, 2.750, 99.8}                0
613                    {27.523, 1.35, 22.5, 2.750, 99.8}, {1.43, 21.5}  0
764      {1.35, 22.5}  {27.523, 2.750, 99.8}                            0.5
Do you think this is feasible? Could you give me some advice on how to get these results?
You can do it like this:
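One possible approach, as a minimal sketch rather than a definitive solution, uses set operations on the extracted numbers. The names `extract_numbers` and `compare`, the split into Missing and Extra, and the definition of Match as the share of reference numbers found in the sentence are all assumptions, since the question does not define them. With this definition, account 31 gets 0.4 rather than the 0.5 shown in the question.

```python
import re

NUMBER_RE = re.compile(r"\d+[,.\d]\d+")  # pattern taken from the question

def extract_numbers(s):
    """Set of numbers in s, as strings, with commas turned into periods."""
    return {m.replace(",", ".") for m in NUMBER_RE.findall(s)}

def compare(sentence, reference):
    """Compare the numbers in one sentence against the reference set."""
    found = extract_numbers(sentence)
    common = found & reference
    return {
        "Common": sorted(common),
        "Missing": sorted(reference - found),  # in the text, not in the sentence
        "Extra": sorted(found - reference),    # in the sentence, not in the text
        # Assumed definition: share of reference numbers present in the sentence.
        "Match": len(common) / len(reference) if reference else 0.0,
    }

text = ("The previous low was 27,523. The 1.35 trillion ($22.5 million ) program "
        "could start in October. The number of people who left the country plunged "
        "99.8 percent from a year earlier to 2,750, according to the data from the agency.")
reference = extract_numbers(text)

# Three of the rows from the question, as (Account, Sentences) pairs.
rows = [
    (51343, "The subsidies are expected to form part of a second budget."),
    (31, "The subsidies, totalling 1.35tn, are expected to form part of a second "
         "budget. New plans to allocate $22.5 billion to a new reimbursement programme."),
    (613, "The subsidies, totalling 1.43 tn, are expected to form part of a second "
          "budget. New plans to allocate $21.5 billion to a new reimbursement programme."),
]

results = {account: compare(sentence, reference) for account, sentence in rows}
for account, result in results.items():
    print(account, result)
```

With pandas, the same function could be applied row by row, for example `df['Sentences'].apply(lambda s: compare(s, reference))`, which would yield one such dictionary per row. Because the numbers are compared as strings, "2.750" and "2.75" count as different values.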
As for the Difference column, I don't know how you got those values, so for now I'd like to know what would work for you.