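<p>This can be done with sed, but because neither the regex nor the file name is fixed, and sed does not handle shell variables well, awk is the better tool. The awk code we want to run could look like this:</p>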
<pre><code>{
  head = ""
  tail = $0
  # while there's a match in the part of the line we haven't yet inspected
  while (match(tail, re)) {
    # print the match to the file
    print substr(tail, RSTART, RLENGTH) > file
    # split off the parts before and after the match
    head = head substr(tail, 1, RSTART - 1)
    tail = substr(tail, RSTART + RLENGTH)
  }
  # print what's left in the end
  print head tail
}
</code></pre>
<p>with suitable values for <code>re</code> and <code>file</code>. <strong>Thanks to @EdMorton</strong>, who pointed out a problem with the original code and suggested this fix.</p>
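<p>To see what the loop relies on, here is a small sketch of a single <code>match()</code> call. The sample line <code>foo bar baz</code> and the pattern are illustrative assumptions, not part of the original question; <code>match()</code> sets <code>RSTART</code> to the position of the leftmost match and <code>RLENGTH</code> to its length, which is what the <code>substr()</code> calls above slice on.</p>

```shell
# Illustrative input and pattern (assumptions for this sketch).
result=$(printf '%s\n' 'foo bar baz' |
  awk -v re=' [ab][a-z]+' '{
    # match() returns 0 when nothing matches; otherwise it sets RSTART/RLENGTH
    if (match($0, re))
      printf "RSTART=%d RLENGTH=%d match=[%s]\n", RSTART, RLENGTH, substr($0, RSTART, RLENGTH)
  }')
echo "$result"
```

<p>Only the first match is reported here, which is why the full code keeps calling <code>match()</code> on the remaining <code>tail</code>.</p>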
<p>To make this callable, let's put a bit of shell boilerplate around it:</p>
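<pre><code>#!/bin/sh
# Expect exactly two arguments: the regex and the file that receives the matches.
if [ $# -ne 2 ]; then
  echo "Usage: $0 regex filename" >&2
  exit 1
fi
# Input is read from standard input; lines with the matches removed go to standard output.
awk -v re="$1" -v file="$2" '
{
  head = ""
  tail = $0
  while (match(tail, re)) {
    print substr(tail, RSTART, RLENGTH) > file
    head = head substr(tail, 1, RSTART - 1)
    tail = substr(tail, RSTART + RLENGTH)
  }
  print head tail
}'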
</code></pre>
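<p>Put this in a file <code>magic_script</code>, <code>chmod +x</code> it, and you're done; the script reads its input from standard input. Of course, you can also call awk directly:</p>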
<pre><code>awk -v re=' [ab][a-z]+' -v file=removed.txt '{ head = ""; tail = $0; while(match(tail, re)) { print substr(tail, RSTART, RLENGTH) > file; head = head substr(tail, 1, RSTART - 1); tail = substr(tail, RSTART + RLENGTH); } print head tail }'
</code></pre>
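<p>As a rough end-to-end check, the sketch below feeds two sample lines through the same awk program and collects both outputs. The sample lines and the temporary directory are assumptions made for this illustration, and <code>mktemp -d</code> is widely available but not guaranteed by POSIX.</p>

```shell
# Scratch directory for the file that receives the removed matches (assumption).
tmpdir=$(mktemp -d)
removed="$tmpdir/removed.txt"

# Lines with the matches stripped arrive on stdout and are captured in $kept.
kept=$(printf '%s\n' 'foo bar baz qux' 'alpha beta' |
  awk -v re=' [ab][a-z]+' -v file="$removed" '{
    head = ""
    tail = $0
    while (match(tail, re)) {
      print substr(tail, RSTART, RLENGTH) > file
      head = head substr(tail, 1, RSTART - 1)
      tail = substr(tail, RSTART + RLENGTH)
    }
    print head tail
  }')

# Read back what awk wrote to the file, then clean up.
removed_content=$(cat "$removed")
rm -rf "$tmpdir"

echo "kept:";    echo "$kept"
echo "removed:"; echo "$removed_content"
```

<p>With this input, the kept lines should be <code>foo qux</code> and <code>alpha</code>, and the file should contain <code> bar</code>, <code> baz</code> and <code> beta</code>, one per line. Note that <code>alpha</code> is not removed because the pattern requires a leading space.</p>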