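I am trying to build a text file by applying diff patches. Starting from an empty text file, I need to apply more than 600 patches to arrive at the final document (a text I wrote, with its changes tracked in Mercurial). Extra information has to be added for every change in the file, so I cannot simply use diff and patch on the command line.
I have spent an entire day writing (and rewriting) a tool that parses the diff files and changes the text file accordingly, but one of the diff files makes my program behave in a way I cannot understand.
This function is called for every diff file: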
# filename = name of the diff file
# date = extra information to be added as a prefix to each added line
def process_diff(filename, date):
    # that's the file all the patches will be applied to
    merge_file = open("thesis_merged.txt", "r")
    # map its content to a list to manipulate it in memory
    merge_file_lines = []
    for line in merge_file:
        line = line.rstrip()
        merge_file_lines.append(line)
    merge_file.close()

    # open for writing:
    merge_file = open("thesis_merged.txt", "w")
    # that's the diff file, containing all the changes
    diff_file = open(filename, "r")
    print "-", filename, "-" * 20

    # also map it to a list
    diff_file_lines = []
    for line in diff_file:
        line = line.rstrip()
        if not line.startswith("\\ No newline at end of file"): # useless information ... or not?
            diff_file_lines.append(line)

    # ignore header:
    #--- thesis_words_0.txt 2010-12-04 18:16:26.020000000 +0100
    #+++ thesis_words_1.txt 2010-12-04 18:16:26.197000000 +0100
    diff_file_lines = diff_file_lines[2:]

    hunks = []
    for i, line in enumerate(diff_file_lines):
        if line.startswith("@@"):
            hunks.append( get_hunk(diff_file_lines, i) )

    for hunk in hunks:
        head = hunk[0]
        # @@ -252,10 +251,9 @@
        tmp = head[3:-3].split(" ") # [-252,10] [+251,9]
        line_nr_minus = tmp[0].split(",")[0]
        line_nr_minus = int(line_nr_minus[1:]) # 252
        line_nr_plus = tmp[1].split(",")[0]
        line_nr_plus = int(line_nr_plus[1:]) # 251

        for j, line in enumerate(hunk[1:]):
            if line.startswith("-"):
                # delete line from the file in memory
                del merge_file_lines[line_nr_minus-1]

        plus_counter = 0 # counts the number of added lines
        for k, line in enumerate(hunk[1:]):
            if line.startswith("+"):
                # insert line, one after another
                merge_file_lines.insert((line_nr_plus-1)+plus_counter, line[1:])
                plus_counter += 1

    for line in merge_file_lines:
        # write the updated file back to the disk
        merge_file.write(line.rstrip() + "\n")

    merge_file.close()
    diff_file.close()
    print "\n\n"

def get_hunk(lines, i):
    hunk = []
    hunk.append(lines[i])
    # @@ -252,10 +251,9 @@
    lines = lines[i+1:]
    for line in lines:
        if line.startswith("@@"):
            # next hunk begins, so stop here
            break
        else:
            hunk.append(line)
    return hunk
The diff file looks like this; here is the troublemaker:
[...]
Output:
[...]
Per
definition
the
"generative"
generative
means
"having
the
ability
to
originate,
produce,
or
procreate."
<http://www.thefreedictionary.com/generative>
that
[...]
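All the previous patches reproduce the text as expected. I have rewritten this many times already, but the wrong behaviour persists, so now I am clueless.
I would be very grateful for hints and tips on how to do this differently. Thank you in advance!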
Edit:
- In the end, every line should look like this: {date_and_time_of_text_change}word
It basically tracks the date and time at which each word was added to the text.
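Try using the parser from python-patch: at the very least you can apply the hunks manually, one by one, to see which one fails. The API is not stable, but the parser is, so you can just copy patch.py from trunk/ into your project. Still, it would be nice to get some suggestions about the desired API.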
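The idea of checking hunks one at a time can also be sketched with the standard library alone. The block below is a minimal sketch, not python-patch's API: `split_hunks` and `first_failing_hunk` are hypothetical names. It assumes a unified diff for a single file whose lines have had only the trailing newline removed (not the leading space of context lines).

```python
import re

# Matches e.g. "@@ -252,10 +251,9 @@"; both counts are optional in unified diffs.
HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def split_hunks(diff_lines):
    """Group unified-diff lines into (old_start, new_start, body) tuples."""
    hunks = []
    for line in diff_lines:
        match = HUNK_HEADER.match(line)
        if match:
            hunks.append((int(match.group(1)), int(match.group(3)), []))
        elif hunks and line[:1] in (" ", "+", "-"):
            # Lines before the first hunk (the ---/+++ header) are skipped,
            # as are "\ No newline at end of file" markers.
            hunks[-1][2].append(line)
    return hunks


def first_failing_hunk(file_lines, hunks):
    """Return the index of the first hunk that does not match, else None.

    Old-side line numbers refer to the unpatched file, so every hunk can be
    checked against the original lines without tracking any offset.
    """
    for index, (old_start, _new_start, body) in enumerate(hunks):
        # Context and removed lines together form the expected old text.
        expected = [line[1:] for line in body if line[0] in " -"]
        start = max(old_start - 1, 0)
        if file_lines[start:start + len(expected)] != expected:
            return index
    return None
```

Running `first_failing_hunk` on the file as it stands before a diff is applied points at the hunk whose context or removed lines disagree with the file. In this setup the comparison has to be made against the words without their date prefix.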
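There was indeed a bug in the code: I did not interpret the diff file correctly. I had not realized that a line offset is needed when a diff file contains more than one hunk.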
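A minimal sketch of that offset handling follows. It is not the code actually used for the thesis: `apply_unified_diff` and its `prefix` parameter are hypothetical names. It assumes a single-file unified diff and does not verify context lines, since the stored lines may already carry a date prefix. The key points are that old-side line numbers refer to the unpatched file, so each hunk's start is shifted by the net number of lines added so far, and that context lines advance the position within a hunk.

```python
import re

HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def apply_unified_diff(old_lines, diff_lines, prefix=""):
    """Return a patched copy of old_lines; every added line gets `prefix`."""
    result = list(old_lines)
    offset = 0       # lines added minus lines removed by the hunks applied so far
    position = None  # index into result; stays None until the first hunk header
    for line in diff_lines:
        match = HUNK_HEADER.match(line)
        if match:
            old_start = int(match.group(1))
            old_count = 1 if match.group(2) is None else int(match.group(2))
            # With an empty old side (e.g. "-0,0") the header names the line
            # *before* the insertion point, so no "- 1" is applied.
            position = old_start + offset - (1 if old_count else 0)
        elif position is None:
            continue  # ---/+++ header lines before the first hunk
        elif line.startswith(" ") or line == "":
            # Context line ("" covers an empty context line hit by rstrip()).
            position += 1
        elif line.startswith("-"):
            del result[position]
            offset -= 1
        elif line.startswith("+"):
            result.insert(position, prefix + line[1:])
            position += 1
            offset += 1
        # Anything else, such as "\ No newline at end of file", is ignored.
    return result
```

For example, `apply_unified_diff([], ["@@ -0,0 +1,2 @@", "+Per", "+definition"], prefix="{2010-12-04}")` yields both words with the date prefix attached, which matches the `{date_and_time_of_text_change}word` layout described in the question.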