Search PubMed for author affiliations given a list of PubMed IDs and DOIs

Detailed description of the pubmed-author-affiliation Python project


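A command-line tool that takes a single PubMed ID or DOI, or a list of them, searches PubMed for the corresponding author affiliations, and writes the information to a file.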

Command-line options

-h, --help
Show help text
-i PUBMEDID, --pubmedid PUBMEDID
Search for author affiliations for a single PubMed ID
-d doi, --doi doi
Search for author affiliations for a single DOI
-f file, --infile file
File with a list of PubMed IDs and DOIs (they can be mixed), one entry per line.
-x format, --format format
Output format. Choices=['json' (default), 'text']. The 'text' option produces a tab-separated table, denormalised in the sense that the PubMed ID/DOI is repeated on multiple rows if there are multiple authors with related affiliations.

Example runs

PubMed ID input and JSON output:

python pubmedAuthorAffiliation.py -i 27863242

Output:

{'articleTitle': 'Decoding Mammalian Ribosome-mRNA States by Translational GTPase Complexes.', 'journalTitle': 'Cell', 'authorList': [{'firstName': 'n/a', 'institute': 'MRC-LMB', 'lastName': 'Shao', 'affiliation': 'MRC-LMB, Francis Crick Avenue, Cambridge CB2 0QH, UK.', 'country': 'UK', 'initials': 'S'}, {'firstName': 'n/a', 'institute': 'MRC-LMB', 'lastName': 'Murray', 'affiliation': 'MRC-LMB, Francis Crick Avenue, Cambridge CB2 0QH, UK.', 'country': 'UK', 'initials': 'J'}, {'firstName': 'n/a', 'institute': 'MRC-LMB', 'lastName': 'Brown', 'affiliation': 'MRC-LMB, Francis Crick Avenue, Cambridge CB2 0QH, UK.', 'country': 'UK', 'initials': 'A'}, {'firstName': 'n/a', 'institute': 'University of California', 'lastName': 'Taunton', 'affiliation': 'Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA 94158, USA.', 'country': 'USA', 'initials': 'J'}, {'firstName': 'n/a', 'institute': 'MRC-LMB', 'lastName': 'Ramakrishnan', 'affiliation': 'MRC-LMB, Francis Crick Avenue, Cambridge CB2 0QH, UK. Electronic address: ramak@mrc-lmb.cam.ac.uk.', 'country': 'UK', 'initials': 'V'}, {'firstName': 'n/a', 'institute': 'MRC-LMB', 'lastName': 'Hegde', 'affiliation': 'MRC-LMB, Francis Crick Avenue, Cambridge CB2 0QH, UK. Electronic address: rhegde@mrc-lmb.cam.ac.uk.', 'country': 'UK', 'initials': 'RS'}], 'pubmedId': '27863242', 'error': False}
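
The sample output line uses single quotes and `False`, which is Python literal syntax rather than strict JSON, so `json.loads` would likely reject it as printed. A minimal sketch of reading such a line back, assuming each record is printed as one Python dict literal (the helper name `parse_record` is hypothetical and not part of the tool):

```python
import ast

# Shortened copy of the sample output line (one author kept for brevity).
line = (
    "{'articleTitle': 'Decoding Mammalian Ribosome-mRNA States by "
    "Translational GTPase Complexes.', 'journalTitle': 'Cell', "
    "'authorList': [{'firstName': 'n/a', 'institute': 'MRC-LMB', "
    "'lastName': 'Shao', 'affiliation': 'MRC-LMB, Francis Crick Avenue, "
    "Cambridge CB2 0QH, UK.', 'country': 'UK', 'initials': 'S'}], "
    "'pubmedId': '27863242', 'error': False}"
)


def parse_record(text):
    """Parse one output line, assumed to be a Python dict literal."""
    # ast.literal_eval only evaluates literals, so it is safe on untrusted text.
    return ast.literal_eval(text)


record = parse_record(line)
for author in record["authorList"]:
    # Print last name, initials and country for each author.
    print(author["lastName"], author["initials"], author["country"], sep="\t")
```

If the tool is changed to emit strict JSON, `json.loads` would be the natural replacement for `ast.literal_eval` here.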

DOI input and text output:

python pubmedAuthorAffiliation.py -d 10.1016/j.molcel.2016.11.013 -x text

Output:

27867008    Molecular cell  Molecular Structures of Transcribing RNA Polymerase I.  n/a     L       Tafur   European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany.  Germany European Molecular Biology Laboratory (EMBL)
27867008    Molecular cell  Molecular Structures of Transcribing RNA Polymerase I.  n/a     Y       Sadian  European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany.  Germany European Molecular Biology Laboratory (EMBL)
27867008    Molecular cell  Molecular Structures of Transcribing RNA Polymerase I.  n/a     NA      Hoffmann        European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany.  Germany European Molecular Biology Laboratory (EMBL)
27867008    Molecular cell  Molecular Structures of Transcribing RNA Polymerase I.  n/a     AJ      Jakobi  European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany; European Molecular Biology Laboratory (EMBL), Hamburg Unit, Notkestrasse 85, 22607 Hamburg, Germany.     Germany European Molecular Biology Laboratory (EMBL)
27867008    Molecular cell  Molecular Structures of Transcribing RNA Polymerase I.  n/a     R       Wetzel  European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany.  Germany European Molecular Biology Laboratory (EMBL)
27867008    Molecular cell  Molecular Structures of Transcribing RNA Polymerase I.  n/a     WJH     Hagen   European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany.  Germany European Molecular Biology Laboratory (EMBL)
27867008    Molecular cell  Molecular Structures of Transcribing RNA Polymerase I.  n/a     C       Sachse  European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany.  Germany European Molecular Biology Laboratory (EMBL)
27867008    Molecular cell  Molecular Structures of Transcribing RNA Polymerase I.  n/a     CW      Müller  European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany. Electronic address: cmueller@embl.de.    Germany European Molecular Biology Laboratory (EMBL)
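
To illustrate what "denormalised" means for the text format, here is a rough sketch that flattens one JSON-style record into tab-separated rows, one per author. The column order is inferred from the sample rows above, the record is rebuilt from the first two of those rows, and the names `COLUMNS` and `record_to_rows` are hypothetical; the tool's own implementation may differ.

```python
# Per-author columns, in the order they appear in the sample text output.
COLUMNS = ("firstName", "initials", "lastName", "affiliation", "country", "institute")


def record_to_rows(record):
    """Return one tab-separated row per author, repeating the article fields."""
    prefix = [record["pubmedId"], record["journalTitle"], record["articleTitle"]]
    return [
        "\t".join(prefix + [author[column] for column in COLUMNS])
        for author in record["authorList"]
    ]


AFFILIATION = (
    "European Molecular Biology Laboratory (EMBL), Structural and "
    "Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany."
)
INSTITUTE = "European Molecular Biology Laboratory (EMBL)"

# Record rebuilt from the first two sample rows.
record = {
    "pubmedId": "27867008",
    "journalTitle": "Molecular cell",
    "articleTitle": "Molecular Structures of Transcribing RNA Polymerase I.",
    "authorList": [
        {"firstName": "n/a", "initials": "L", "lastName": "Tafur",
         "affiliation": AFFILIATION, "country": "Germany", "institute": INSTITUTE},
        {"firstName": "n/a", "initials": "Y", "lastName": "Sadian",
         "affiliation": AFFILIATION, "country": "Germany", "institute": INSTITUTE},
    ],
}

rows = record_to_rows(record)
for row in rows:
    print(row)
```

Repeating the article fields on every row makes the table easy to load into a spreadsheet or filter with line-oriented tools, at the cost of some redundancy.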

Input file with mixed DOIs and PubMed IDs, with the text output written to a file:

python pubmedAuthorAffiliation.py -f emdb-2010.txt -x text > /tmp/out.txt

In this case, lines that are not recognized are ignored, with a warning such as:

WARNING:root:processList: id not recognized: id
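
The tool's actual recognition rules are not documented here. As a rough illustration only, the sketch below classifies each line of a mixed input list under two assumptions: a PubMed ID consists solely of digits, and a DOI starts with `10.`, followed by a numeric registrant code, a slash, and a suffix. The names `classify_identifier` and `filter_identifiers` are hypothetical.

```python
import logging
import re

# Assumed patterns; the real tool may use different rules.
PMID_RE = re.compile(r"^\d+$")
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")


def classify_identifier(line):
    """Return 'pmid', 'doi', or None for an unrecognized line."""
    value = line.strip()
    if PMID_RE.match(value):
        return "pmid"
    if DOI_RE.match(value):
        return "doi"
    return None


def filter_identifiers(lines):
    """Keep recognized entries as (kind, value) pairs; warn about and skip the rest."""
    kept = []
    for line in lines:
        value = line.strip()
        kind = classify_identifier(value)
        if kind is None:
            logging.warning("id not recognized: %s", value)
            continue
        kept.append((kind, value))
    return kept


entries = filter_identifiers(["id", "27863242", "10.1016/j.molcel.2016.11.013"])
print(entries)
```

A header line such as `id` matches neither pattern, so it is skipped with a warning rather than aborting the whole run.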

Testing the code

This runs through a selected list of PubMed IDs and DOIs that are known to work:

python test_pubmedAuthorAffiliation.py
