Reading data from a spreadsheet and building a matrix with Python

Posted 2024-09-28 20:57:06



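Is there a way to make Python "read" the document, exclude the unnecessary elements, and build an adjacency matrix of 1s and 0s? I have a spreadsheet with 500 visited pages, including links, outlinks, and dangling pages (which need to be excluded from the search).

I came up with some rough pseudocode that looks like this: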

for each visited page vp
    for each outlink of vp
        if link is relative
            resolve link
        if link points to a visited page
            write 1
        else if link is dangling
            ignore it
        else
            write 0
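The pseudocode above could be sketched in Python roughly as follows. This is only a minimal sketch under stated assumptions: the crawl data is assumed to be already loaded into a dict that maps each visited page URL to the list of its raw outlinks, and the names `build_adjacency` and `outlinks` are made up for illustration. Relative links are resolved with `urljoin` from the standard library, and any link whose target is not a visited page is treated as dangling and skipped, so its cell simply stays 0.

```python
from urllib.parse import urljoin, urldefrag


def build_adjacency(outlinks):
    """Build a 0/1 adjacency matrix from {visited_page_url: [raw outlinks]}.

    `outlinks` is a hypothetical input shape, not something given in the question.
    Returns (pages, matrix), where matrix[i][j] == 1 if pages[i] links to pages[j].
    """
    pages = sorted(outlinks)                          # fixed row/column order
    index = {url: i for i, url in enumerate(pages)}   # URL -> row/column number
    matrix = [[0] * len(pages) for _ in pages]        # start with all zeros

    for page, links in outlinks.items():
        for href in links:
            # Resolve relative links against the page they appear on;
            # absolute links pass through unchanged. Drop any #fragment.
            target = urldefrag(urljoin(page, href)).url
            if target in index:                       # link to a visited page
                matrix[index[page]][index[target]] = 1
            # Otherwise the link is dangling: ignore it.
    return pages, matrix
```

Calling `pages, matrix = build_adjacency(my_outlinks)` gives the row and column order in `pages`, so the matrix can be labelled or converted to a NumPy array afterwards if needed.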

Is it possible to implement this idea in Python? Or would Matlab or R be more useful?

Crawler result links: http://www.dcs.bbk.ac.uk/~martin/sewn/ls3/sewn_2016_labsheet_3_full_crawl.txt http://www.dcs.bbk.ac.uk/~martin/sewn/ls3/sewn_2016_labsheet_3_full_crawl.xlsx


1 answer

User

Answer 1 · Posted 2024-09-28 20:57:06

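Is there a way to make Python "read" the document, exclude the unnecessary elements, and build an adjacency matrix of 1s and 0s?

Yes.

See https://docs.python.org/2/tutorial/inputoutput.html

The simplest way to start opening and reading a document: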

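# 'workfile' is a placeholder for the path to your crawl file.
# The with-block closes the file automatically, even on errors.
with open('workfile', 'r') as f:
    file_lines = f.readlines()

# Do something with your lines:
# adapt your pseudocode to the extracted data.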
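Once the lines are read, they still have to be grouped by page. The sketch below shows one way that might look. It rests on an assumption about the file layout that should be checked against the actual crawl file: each visited page is taken to start with a line beginning `Visited:`, followed by one `Link:` line per outlink. The function name `parse_crawl` and both prefixes are hypothetical and exposed as parameters so they can be adjusted.

```python
def parse_crawl(lines, page_prefix="Visited:", link_prefix="Link:"):
    """Group crawl lines into {visited_page_url: [raw outlinks]}.

    The "Visited:" / "Link:" prefixes are an assumed format, not confirmed
    by the question; change them to match the real file.
    """
    outlinks = {}
    current = None                                   # page whose links we are collecting
    for raw in lines:
        line = raw.strip()                           # drop indentation and newline
        if line.startswith(page_prefix):
            current = line[len(page_prefix):].strip()
            outlinks.setdefault(current, [])         # keep pages with no links too
        elif line.startswith(link_prefix) and current is not None:
            outlinks[current].append(line[len(link_prefix):].strip())
        # Any other line (blank lines, headers) is skipped.
    return outlinks
```

The result keeps one entry per visited page, including pages with no outlinks, which is the information needed to decide which links are dangling.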

The rest of your question is out of scope.
