Subprocess Popen: opening a file and piping it to an awk command

Posted 2024-09-30 22:27:02

This is my input file format:

@SRR2056440.1 1 length=100
TGTAGGTCTGAGCAGCTTGTCCTGGCTGTGTCCATGTCAGAGCAACGGCCCAAGTCTGGGTCTGGGGGGGAAGGTGTCATGGAGCCCCCTACGATTCCCA
+SRR2056440.1 1 length=100
BCBFFFEFHHHHHJJJJJJIJJJJJJJJIJHHIJJIIJJJJJIJJIJJJJJJJJFHIJJJHHHHHHFDDDBDDD>>ACDEDDDDDDDDDDDDDDDDDEDD
@SRR2056440.2 2 length=100
CTGCCGCCACCGCAGCAGCCACAGGCAGAGGAGGACGAGGACGACTGGGAATCGTAGGGGGCTCCATGACACCTTCCCCCCCAGACCCAGACTTGGGCCA
+SRR2056440.2 2 length=100
CCCFFFFFHHHHHJJJJJJJJJJJIJIJIGJGGIGGJIJJEHFEDDDDDDDDDDABDDDDDDDDDDDDDDADDDDDDDDDDDCDDDDDDBBDDCDDBDD@
@SRR2056440.3 3 length=100
TCTGCCGCCACCGCAGCAGCCACAGGCAGAGGAGGACGAGGACGACTGGGAATCGTAGGGGGCTCCATGACACCTTCCCCCCCAGACCCAGACTTGGGCC
+SRR2056440.3 3 length=100
CCCFFFFFHGHHHJJJJJIJJJJJJIJJIJJJIJJIIIGIJ<CDBCDDDDDDDDDDDDDDDDDDDDDDDDDDDDDCDDDDDDDDDDDDDDDDDDCDCBDD

This is the command I want to run:

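cat input.fq | awk 'NR%4==2{sum+=length($0);nr++;sumsq+=length($0)*length($0)}END{printf"%.1f\t%.1f\n",sum/nr,sqrt(sumsq/nr-(sum/nr)**2)}'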

And the output of the command:

100.0 0.0

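I want to run this command from a Python script using subprocess. I have tried several times but still can't figure it out. This is my latest attempt: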

awk_comm = r"""'NR%4==2{sum+=length($0);nr++;sumsq+=length($0)*length($0)}END{printf"%.1f\t%.1f\n",sum/nr,sqrt(sumsq/nr-(sum/nr)**2)}'"""
cmd = ['cat', 'input.fq', '|', 'awk', awk_comm]
p2 = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
out1, err = p2.communicate()

Edit:

I don't see any error in the output. It just hangs and runs forever.


3 Answers

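By default, Python does not use a shell to run commands, but the pipe is interpreted by the shell, so you need to pass shell=True. With shell=True the command should also be a single string rather than a list: on POSIX, only the first list element ('cat') is run as the command, so cat gets no file argument and waits on stdin forever, which is why the script hangs.

cmd = 'cat input.fq | awk ' + awk_comm  # one string, so the shell sees the whole pipeline
p2 = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)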
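
To see how shell=True treats a list differently from a string, here is a minimal sketch for POSIX systems. The `echo` and `tr` commands are stand-ins chosen for illustration, not the commands from the question, and the variable names are hypothetical.

```python
import subprocess

# POSIX-only illustration; `echo` and `tr` are stand-ins, not the original commands.

# List + shell=True: only 'echo' is the command string. 'hello' is handed to
# the shell itself as a positional parameter, so echo runs with no arguments.
as_list = subprocess.run(["echo", "hello"], shell=True,
                         capture_output=True, text=True)

# String + shell=True: the shell parses the whole line, including the pipe.
as_string = subprocess.run("echo hello | tr a-z A-Z", shell=True,
                           capture_output=True, text=True)

print(repr(as_list.stdout))    # an empty line: 'hello' never reached echo
print(repr(as_string.stdout))  # the upper-cased output of the full pipeline
```

In the question's code the same thing happens to `cat`: it never receives `input.fq`, so it reads from the terminal's stdin and appears to hang.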

The following works for me.

>>> awk_comm = r"""cat input.fq | awk 'NR%4==2{sum+=length($0);nr++;sumsq+=length($0)*length($0)}END{printf"%.1f\t%.1f\n",sum/nr,sqrt(sumsq/nr-(sum/nr)**2)}'"""
>>> p2 = subprocess.Popen(awk_comm, stdout=subprocess.PIPE,shell=True)
>>> res = p2.communicate()
>>> res
('100.0\t0.0\n', None)

There is no need for shell=True here. Just set up your subprocess.Popen object to do everything you would otherwise rely on the shell for:

# the original awk code, with whitespace added for readability
awk_command = r"""
NR%4==2 {
  sum+=length($0);
  nr++;
  sumsq+=length($0)*length($0)
}
END {
  printf "%.1f\t%.1f\n", sum/nr, sqrt(sumsq/nr-(sum/nr)**2)
}
"""

p2 = subprocess.Popen(
  ['awk', awk_command],
  stdin=open('input.fq', 'r'),  # pass a file handle to input.fq directly on awk's stdin
  stdout=subprocess.PIPE,
  stderr=subprocess.PIPE)
out1, err = p2.communicate()
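
If calling awk is not a hard requirement, the same statistics could also be computed without any subprocess. The following is only a sketch under that assumption. The function name `read_length_stats` is hypothetical, and it mirrors the awk program's logic: every line where NR%4==2 is treated as a sequence line, and the population standard deviation is reported.

```python
import math

def read_length_stats(path):
    """Return (mean, population std dev) of the sequence-line lengths in a FASTQ file."""
    n = total = total_sq = 0
    with open(path) as handle:
        for line_number, line in enumerate(handle, start=1):
            if line_number % 4 == 2:              # same test as awk's NR%4==2
                length = len(line.rstrip("\n"))   # awk's length($0) excludes the newline
                n += 1
                total += length
                total_sq += length * length
    # Like the awk version, this divides by zero on a file with no sequence lines.
    mean = total / n
    # Clamp tiny negative values caused by floating-point rounding before sqrt.
    variance = max(total_sq / n - mean ** 2, 0.0)
    return mean, math.sqrt(variance)
```

Calling `print("%.1f\t%.1f" % read_length_stats("input.fq"))` would format the result the same way as the awk printf.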
