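I'm trying to strip the tags and create a new file, but I don't know how to do it. I've been given a file containing XML tags, and I want to use strip and split to produce a list/string. I can't use an XML parser or any other library.
Here is the text file: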
<team> <name>Denver Broncos</name> <players> <player> <jno>50</jno> <fname>Zaire</fname> <lname>Anderson</lname> <height>5-11</height> <weight>220</weight> <age>24</age> <position>ILB</position> <school>Nebraska</school> </player> <player> <jno>48</jno> <fname>Shaquil</fname> <lname>Barrett</lname> <height>6-2</height> <weight>250</weight> <age>23</age> <position>OLB</position> <school>Colorado State</school> </player> <player> <jno>35</jno> <fname>Kapri</fname> <lname>Bibbs</lname> <height>5-11</height> <weight>203</weight> <age>23</age> <position>RB</position> <school>Colorado State</school> </player> </players> </team>
I want to use a string/list to generate sentences like the following:
Here is the roster for the Denver Broncos. There are 3 players on the team. Zaire Anderson, ILB, wears #50. He is 5 foot 11 inches tall, and weighs 220 pounds. He is 24 years old. He went to Nebraska. Shaquil Barrett, OLB, wears #48. He is 6 foot 2 inches tall, and weighs 250 pounds. He is 23 years old. He went to Colorado State. Kapri Bibbs, RB, wears #35. He is 5 foot 11 inches tall, and weighs 203 pounds. He is 23 years old. He went to Colorado State.
def test(filename):
    f=open(filename,"r")
    line = f.readline()
    f2 = open("BearsRoster.txt", "w")
    print line
    myList = []
    stringl = ""
    for i in line:
        if i == ("<"):
            while i != ">":
                line.remove(i)
            else:
                stringl = stringl + i
                myList.append(stringl)
                stringl = ""
        else:
            stringl = stringl + i
    print myList
    for i in myList:
        print i
        print myList
        if i[0] == "<" or " ":
            myList.remove(i)
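Obviously this code is incorrect. My idea was to iterate over the string and strip out the <xxxxx> tags. I just don't know how to go about it. After that, I want to build the sentences I posted above.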
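To remove the tags, use a boolean variable, skip=True/False, to control when a char is copied to the new string. When you find < set skip=True, and when you find > set skip=False. Copy a char only while skip is False.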
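The skip-flag idea can be sketched roughly as follows. This is a minimal illustration, not a full solution: the function name strip_tags and the sample string are assumptions made for the example, and it uses only built-in string and list operations.

```python
def strip_tags(text):
    """Return text with everything between '<' and '>' removed."""
    result = []
    skip = False              # True while we are inside a tag
    for ch in text:
        if ch == "<":
            skip = True       # a tag starts: stop copying
        elif ch == ">":
            skip = False      # the tag ended: resume copying
        elif not skip:
            result.append(ch)
    return "".join(result)

# Hypothetical sample, shortened from the file in the question.
sample = "<player> <jno>50</jno> <fname>Zaire</fname> </player>"
print(strip_tags(sample).split())
```

Note that calling split() on the result breaks on every space, so a value such as "Colorado State" would become two list items, and the tag names that say which value is which are lost.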
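If you need the data from the tags, then you have to build a parser: recognize the start and end tags, remember the tag names, and possibly build a tree from the tags. So that takes more work.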
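For a file as regular as the one in the question, a full parser may not be needed. One possible shortcut, shown here as a sketch only, is to use split to cut the text at known tag names and strip to clean up each value. The helper names get_field and roster_sentences are hypothetical, and the code assumes every player has all eight fields, that no tag carries attributes, and that height always has the form feet-inches.

```python
def get_field(chunk, tag):
    """Return the text between the first <tag> and </tag> in chunk."""
    after_open = chunk.split("<" + tag + ">", 1)[1]
    return after_open.split("</" + tag + ">", 1)[0].strip()

def roster_sentences(xml_text):
    # The piece before the first <player> holds the team name;
    # each later piece holds the fields of one player.
    pieces = xml_text.split("<player>")
    team = get_field(pieces[0], "name")
    players = pieces[1:]
    lines = ["Here is the roster for the %s. There are %d players on the team."
             % (team, len(players))]
    for p in players:
        feet, inches = get_field(p, "height").split("-")
        lines.append(
            "%s %s, %s, wears #%s. He is %s foot %s inches tall, "
            "and weighs %s pounds. He is %s years old. He went to %s."
            % (get_field(p, "fname"), get_field(p, "lname"),
               get_field(p, "position"), get_field(p, "jno"),
               feet, inches, get_field(p, "weight"),
               get_field(p, "age"), get_field(p, "school")))
    return " ".join(lines)

# Sample data: the first two players from the file in the question.
sample = ("<team> <name>Denver Broncos</name> <players> "
          "<player> <jno>50</jno> <fname>Zaire</fname> <lname>Anderson</lname> "
          "<height>5-11</height> <weight>220</weight> <age>24</age> "
          "<position>ILB</position> <school>Nebraska</school> </player> "
          "<player> <jno>48</jno> <fname>Shaquil</fname> <lname>Barrett</lname> "
          "<height>6-2</height> <weight>250</weight> <age>23</age> "
          "<position>OLB</position> <school>Colorado State</school> </player> "
          "</players> </team>")
print(roster_sentences(sample))
```

Splitting on "<player>" does not match "<players>" or "</player>", which is why the team name ends up alone in the first piece. Because each value is taken from between its own tags, multi-word values such as "Colorado State" stay intact.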