<p>Use <code>str.translate</code> to strip the unwanted characters and <code>set.update</code> to add the results to a set:</p>
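<pre><code>set_d = set()
with open(file, 'r') as f:
    for line in f:
        # Take the second "|"-separated field, delete quotes and brackets
        # (Python 2 form of str.translate), then split on commas.
        lst = (x.strip() for x in line.split("|")[1].translate(None, "\"'[]").split(","))
        set_d.update(lst)
</code></pre>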
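<p>The two-argument <code>translate(None, ...)</code> call above only works on Python 2. As a rough sketch of the same idea on Python 3, a deletion table can be built with <code>str.maketrans</code>. The helper name <code>domains_from_line</code> and the sample line layout are assumptions made for illustration, not taken from your file:</p>
<pre><code># Python 3 sketch: str.translate takes a table built with str.maketrans.
# The third argument to maketrans lists the characters to delete.
DROP = str.maketrans("", "", "\"'[]")

def domains_from_line(line):
    # Hypothetical line layout: "id|['a.com', 'b.com']"
    field = line.split("|")[1]
    return {x.strip() for x in field.translate(DROP).split(",")}

set_d = set()
set_d.update(domains_from_line("1|['vmit.it', 'telestreet.it']\n"))
print(sorted(set_d))  # ['telestreet.it', 'vmit.it']
</code></pre>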
<p>This produces a set of unique individual domains:</p>
<pre><code>set(['vmit.it', 'tcmpraktijk-jingshen.nl', 'umbertominnella.it', 'studioguizzardi.it', 'telestreet.it', 'watec-peru.com', 'bsacimeeting.org', 'webdesignhostingindia.com', 'wsava2015.com', 'iipmstudents.in', 'maurominnella.com', 'ellen-siemer.nl', 'picsmeeting.com', 'iipmalumni.com', 'iipmclubs.in', 'israelinnovation.co.il'])
</code></pre>
<p>You can write the result to a new file:</p>
<pre><code>set_d = set()
with open(file, 'r') as f, open("out.txt", "w") as out:
    for line in f:
        lst = (x.strip() for x in line.split("|")[1].translate(None, "\"'[]").split(","))
        set_d.update(lst)
    # Write one domain per line once every input line has been processed.
    for domain in set_d:
        out.write("{}\n".format(domain))
</code></pre>
<p>Output:</p>
<pre><code>$ cat out.txt
vmit.it
tcmpraktijk-jingshen.nl
umbertominnella.it
studioguizzardi.it
telestreet.it
watec-peru.com
bsacimeeting.org
webdesignhostingindia.com
wsava2015.com
iipmstudents.in
maurominnella.com
ellen-siemer.nl
picsmeeting.com
iipmalumni.com
iipmclubs.in
israelinnovation.co.il
</code></pre>
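<p>Your code does not split the field into individual domains, and your json call is not actually helping. Simply changing your code to use <code>update</code> would output something like this, with the quotes and brackets still attached:</p>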
<pre><code>{" 'maurominnella.com']", " 'wsava2015.com'", "'webdesignhostingindia.com'", " 'iipmclubs.in']", " 'ellen-siemer.nl'']", " 'umbertominnella.it'", " 'picsmeeting.com']", "['israelinnovation.co.il'", "['vmit.it'", " 'iipmstudents.in'", "['tcmpraktijk-jingshen.nl'", " 'studioguizzardi.it'", "['iipmalumni.com'", " 'watec-peru.com'", " 'bsacimeeting.org'", " 'telestreet.it'"}
</code></pre>
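<p>To see where those stray characters come from, here is a small sketch that splits one field on commas without cleaning it first. The sample <code>field</code> value is made up for illustration:</p>
<pre><code># Hypothetical sample of the second "|"-separated field.
field = "['vmit.it', 'telestreet.it']"

raw = set()
raw.update(field.split(","))  # no translate/strip, so the debris stays

# Each piece keeps its quotes, leading space, and a bracket.
print(sorted(raw))
</code></pre>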
<p>Also, don't use <code>list</code> as a variable name, because it shadows the Python built-in <code>list</code>.</p>
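<p>A minimal illustration of the shadowing problem (the function name <code>shadow_demo</code> is just for this example):</p>
<pre><code>def shadow_demo():
    list = ["vmit.it"]  # rebinding the name hides the built-in in this scope
    try:
        list("abc")  # now calls the list object, not the built-in type
    except TypeError as exc:
        return str(exc)

print(shadow_demo())  # 'list' object is not callable
</code></pre>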