How do I get only the file name, without the extension?

Posted 2024-06-26 08:30:05


Suppose you have the following file paths and want to get the file name (without the extension) from each of them:

                       relfilepath
0                  20210322636.pdf
12              factuur-f23622.pdf
14                ingram micro.pdf
19    upfront.nl domein - Copy.pdf
21           upfront.nl domein.pdf
Name: relfilepath, dtype: object

I came up with the following, but it has a problem: the first item comes out looking like a number, '20210322636.0'.

from pathlib import Path


for i, row in dffinalselection.iterrows():
    dffinalselection['xmlfilename'][i] = Path(dffinalselection['relfilepath'][i]).stem
    dffinalselection['xmlfilename'] = dffinalselection['xmlfilename'].astype(str)

This is wrong, because it should be '20210322636'.

Please help.


2 answers

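If the column values are always file names or file paths, split each one from the right on '.' with at most one split (n=1), and take the first element of the result: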

>>> df['relfilepath'].str.rsplit('.', n=1).str[0]

0                  20210322636
12              factuur-f23622
14                ingram micro
19    upfront.nl domein - Copy
21           upfront.nl domein
Name: relfilepath, dtype: object
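As a rough check of how this right-split compares with Path.stem, here is a minimal sketch in plain Python. The helper name strip_extension and the extra path 'dir.v2/report' are made up for illustration and are not part of the question's data. On the five names from the question the two approaches agree; they differ once a directory part contains a dot and the file itself has no extension.

```python
from pathlib import Path

# File names taken from the question's sample column.
names = [
    "20210322636.pdf",
    "factuur-f23622.pdf",
    "ingram micro.pdf",
    "upfront.nl domein - Copy.pdf",
    "upfront.nl domein.pdf",
]


def strip_extension(name: str) -> str:
    """Hypothetical helper: drop everything after the last dot."""
    return name.rsplit(".", 1)[0]


stems_rsplit = [strip_extension(n) for n in names]
stems_pathlib = [Path(n).stem for n in names]

# Both approaches give the same result for the sample names.
print(stems_rsplit == stems_pathlib)

# They diverge when the last dot belongs to a directory, not the file name:
# rsplit cuts at that dot, while Path.stem only looks at the final component.
print(strip_extension("dir.v2/report"))
print(Path("dir.v2/report").stem)
```

If the column can contain paths with directories, the pathlib-based answers are probably the safer choice.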

Your approach is right, but the way you operate on the DataFrame is wrong:

from pathlib import Path


for i, row in dffinalselection.iterrows():
    dffinalselection['xmlfilename'][i] = Path(dffinalselection['relfilepath'][i]).stem # THIS WILL NOT RELIABLY MUTATE THE DATAFRAME
    dffinalselection['xmlfilename'] = dffinalselection['xmlfilename'].astype(str) # THIS OVERWROTE EVERYTHING

Instead, just do:

from pathlib import Path

dffinalselection['xmlfilename'] = ''
for row in dffinalselection.itertuples():
    # itertuples() exposes the index label as row.Index (capital I)
    dffinalselection.at[row.Index, 'xmlfilename'] = Path(row.relfilepath).stem

Or:

dffinalselection['xmlfilename'] = dffinalselection['relfilepath'].apply(lambda value: Path(value).stem)
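To try the column-wise version without the original data, here is a small self-contained sketch. The frame df is a stand-in built from three of the question's sample values with the same non-contiguous index labels; it is an assumption for illustration, not the asker's real dffinalselection. It uses Series.map, which behaves like apply for an element-wise function.

```python
from pathlib import Path

import pandas as pd

# Stand-in for the asker's frame (assumed data), keeping the gappy index labels.
df = pd.DataFrame(
    {"relfilepath": ["20210322636.pdf", "factuur-f23622.pdf", "ingram micro.pdf"]},
    index=[0, 12, 14],
)

# Build the whole column in one assignment; no row loop and no chained indexing.
df["xmlfilename"] = df["relfilepath"].map(lambda value: Path(value).stem)

print(df)
```

Because every value is produced by Path(...).stem, the new column holds strings from the start, so no astype(str) call is needed afterwards.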
