How to rewrite this code using parallel computing or Python

Posted 2024-09-30 14:18:51


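The files condition.txt and gene.txt each contain 8,000,000 lines, but the number of columns differs from line to line. The code below has been running for two weeks and still has not finished. How can I rewrite it using parallel computing in R or in Python? The problem is introduced in "R code runs too slow,how to rewrite this code". Thank you.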

library(compiler)
library(Matrix)
enableJIT(3)

# Read both files fully into memory, one string per line.
# Passing the file name lets readLines open and close the connection itself.
x1 <- readLines("condition.txt")
x2 <- readLines("gene.txt")


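# Convert a list of token vectors (one per line) into a sparse indicator
# matrix: one row per line, one column per distinct token.
str2mat <- function(s) {
  n <- length(s)
  ni <- lengths(s)          # number of tokens on each line
  s <- unlist(s)
  u <- unique(s)            # distinct tokens define the columns
  spMatrix(nrow=n, ncol=length(u), i=rep(1L:n, ni), j=match(s, u), x=rep(1, length(s)))
}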


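m1 <- str2mat(strsplit(x1, "|", fixed=TRUE))
m2 <- str2mat(strsplit(x2, "|", fixed=TRUE))
# Number of tokens per line in each file
n1 <- rowSums(m1)
n2 <- rowSums(m2)
# Entry [i, j]: (shared condition tokens) * (shared gene tokens) for lines i and j
num <- tcrossprod(m1)*tcrossprod(m2)
n12 <- n1*n2
# Entry [i, j]: the smaller of n12[i] and n12[j]; outer() returns a dense n-by-n matrix
den <- outer(n12, n12, pmin)
use <- num/den > 0.6
# Keep each pair once (row < column) and drop self-comparisons
diag(use) <- FALSE
use[lower.tri(use)] <- FALSE
out <- which(use, arr.ind=TRUE)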
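Since the question asks about Python, here is a minimal sketch of what the R code appears to compute, written with plain sets so the logic is easy to check on a small input. It is not a fast version: it still compares every pair of lines. The function names `parse` and `similar_pairs` are made up for illustration, and the sketch assumes that tokens are not repeated within a line and that lines contain no empty fields.

```python
from itertools import combinations


def parse(lines):
    """Split each '|'-separated line into a set of tokens (empty line -> empty set)."""
    return [set(line.split("|")) if line else set() for line in lines]


def similar_pairs(cond_lines, gene_lines, threshold=0.6):
    """Return 1-based (i, j) pairs with i < j whose score exceeds `threshold`.

    score(i, j) = shared_cond(i, j) * shared_gene(i, j) / min(size[i], size[j]),
    where size[k] = (#condition tokens on line k) * (#gene tokens on line k).
    """
    cond = parse(cond_lines)
    gene = parse(gene_lines)
    size = [len(c) * len(g) for c, g in zip(cond, gene)]
    out = []
    for i, j in combinations(range(len(cond)), 2):
        den = min(size[i], size[j])
        if den == 0:
            continue  # the R code yields NaN here, which which() ignores
        num = len(cond[i] & cond[j]) * len(gene[i] & gene[j])
        if num / den > threshold:
            out.append((i + 1, j + 1))  # 1-based, like which(..., arr.ind=TRUE)
    return out
```

For example, `similar_pairs(["a|b|c", "a|b|c", "x|y"], ["g1|g2", "g1|g2", "g1"])` reports only lines 1 and 2 as a pair, because line 3 shares no condition token with the others.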

The Rprof profiling results for a smaller input file (20 lines) are as follows:

^{pr2}$
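One point worth noting for the full-size input: `outer(n12, n12, pmin)` builds a dense matrix with one entry per pair of lines, and with 8,000,000 lines that is 6.4 × 10^13 entries, so it may be the all-pairs formulation rather than the lack of parallelism that stops the job from finishing. The sketch below is one possible way around that, under stated assumptions: it uses an inverted index so that a line is only compared with later lines sharing at least one condition token (any other pair has a score of 0). The names `parse`, `build_index` and `pairs_for_rows` are hypothetical, and tokens are assumed not to repeat within a line. Whether this is fast enough depends on the data, since very common tokens still produce many candidate pairs.

```python
from collections import defaultdict


def parse(lines):
    """Split each '|'-separated line into a set of tokens (empty line -> empty set)."""
    return [set(line.split("|")) if line else set() for line in lines]


def build_index(token_sets):
    """Map each token to the list of row numbers (0-based) that contain it."""
    index = defaultdict(list)
    for row, tokens in enumerate(token_sets):
        for token in tokens:
            index[token].append(row)
    return index


def pairs_for_rows(rows, cond, gene, cond_index, threshold=0.6):
    """Score rows in `rows` against later rows that share a condition token.

    Returns sorted 1-based (i, j) pairs with i < j and score > threshold.
    """
    out = []
    for i in rows:
        # Count shared condition tokens with every later row j.
        shared_cond = defaultdict(int)
        for token in cond[i]:
            for j in cond_index[token]:
                if j > i:
                    shared_cond[j] += 1
        size_i = len(cond[i]) * len(gene[i])
        for j, n_cond in shared_cond.items():
            n_gene = len(gene[i] & gene[j])
            if n_gene == 0:
                continue  # numerator is 0, so the pair cannot pass
            den = min(size_i, len(cond[j]) * len(gene[j]))
            if n_cond * n_gene / den > threshold:
                out.append((i + 1, j + 1))
    return sorted(out)


# Small usage example with made-up data.
cond = parse(["a|b|c", "a|b|c", "x|y", "a|b"])
gene = parse(["g1|g2", "g1|g2", "g1", "g1|g2"])
index = build_index(cond)
# Each block of rows is independent of the others, so the blocks could be
# handed to separate worker processes; here they simply run one after another.
blocks = [range(0, 2), range(2, 4)]
result = sorted(p for block in blocks
                for p in pairs_for_rows(block, cond, gene, index))
```

Because each block of rows only reads the shared data, this layout is also where parallelism could be added, for instance by mapping `pairs_for_rows` over the blocks with `concurrent.futures.ProcessPoolExecutor`. That step is left out here.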
