Python: printing combined counts from two dictionaries

Posted 2024-09-27 04:29:23


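I have split a dialogue into two dictionaries, each holding the words spoken by one person (there are two speakers). I need to print 4 columns: the key, the number from the first dictionary (how many times the first person used the word), the number from the second dictionary, and the sum of the two, sorted by key. Can anyone help? The output must look like this: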

african   1  0  1
air-speed 1  0  0
an        1  1  2
arthur    1  0  1
...

As you can see, I have some text:

text = """Bridgekeeper: Hee hee heh. Stop. What... is your name?
King Arthur: It is 'Arthur', King of the Britons.
Bridgekeeper: What... is your quest?
King Arthur: To seek the Holy Grail.
Bridgekeeper: What... is the air-speed velocity of an unladen swallow?
King Arthur: What do you mean? An African or European swallow?"""

Output of bridgekeeper_w and arthur_w:

print (bridgekeeper_w) 

{'hee': 2, 'heh': 1, 'stop': 1, 'what': 3, 'is': 3, 'your': 2, 'name': 1, 'quest': 1, 'the': 1, 'air-speed': 1, 'velocity': 1, 'of': 1, 'an': 1, 'unladen': 1, 'swallow': 1}

print (arthur_w)
{'king': 4, 'it': 1, 'is': 1, 'arthur': 1, 'of': 1, 'the': 2, 'britons': 1, 'to': 1, 'seek': 1, 'holy': 1, 'grail': 1, 'what': 1, 'do': 1, 'you': 1, 'mean': 1, 'an': 1, 'african': 1, 'or': 1, 'european': 1, 'swallow': 1}

Now I need this (key, count from the first dictionary, count from the second dictionary, and the total):

african   1  0  1
air-speed 1  0  0
an        1  1  2
arthur    1  0  1
...
3 answers

If you already have the two dictionaries, the main problem is how to loop over the keys of both of them. That is not hard:

for key in sorted(set(list(bridgekeeper_w.keys()) + list(arthur_w.keys()))):
    b_count = 0 if key not in bridgekeeper_w else bridgekeeper_w[key]
    a_count = 0 if key not in arthur_w else arthur_w[key]
    print('%-20s %3i %3i %3i' % (key, b_count, a_count, b_count+a_count))

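If it does not matter that one of the dictionaries gets modified, a more elegant solution might be to add the missing keys to one dictionary and then simply loop over all of its keys: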

for key in arthur_w.keys():
    if key not in bridgekeeper_w:
        bridgekeeper_w[key] = 0

for key, b_count in sorted(bridgekeeper_w.items()):
    a_count = 0 if key not in arthur_w else arthur_w[key]
    print('%-20s %3i %3i %3i' % (key, b_count, a_count, b_count+a_count))

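This gets rid of the rather cumbersome and slightly complicated set(list(keys()...)) from the first solution, at the cost of an extra pass over one of the dictionaries.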
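The key union can also be written without the list conversions and without modifying either dictionary. The following is only a sketch: `merged_rows` is a hypothetical helper name, and it relies on two standard features, set operations on dict key views and `dict.get` with a default of 0.

```python
bridgekeeper_w = {'hee': 2, 'heh': 1, 'stop': 1, 'what': 3, 'is': 3, 'your': 2,
                  'name': 1, 'quest': 1, 'the': 1, 'air-speed': 1, 'velocity': 1,
                  'of': 1, 'an': 1, 'unladen': 1, 'swallow': 1}
arthur_w = {'king': 4, 'it': 1, 'is': 1, 'arthur': 1, 'of': 1, 'the': 2,
            'britons': 1, 'to': 1, 'seek': 1, 'holy': 1, 'grail': 1, 'what': 1,
            'do': 1, 'you': 1, 'mean': 1, 'an': 1, 'african': 1, 'or': 1,
            'european': 1, 'swallow': 1}


def merged_rows(first, second):
    """Return one formatted row per key, sorted by key (hypothetical helper)."""
    rows = []
    # Key views behave like sets, so "|" yields the union of both key sets.
    for key in sorted(first.keys() | second.keys()):
        f_count = first.get(key, 0)   # 0 when the first speaker never used the word
        s_count = second.get(key, 0)  # 0 when the second speaker never used the word
        rows.append('%-20s %3i %3i %3i' % (key, f_count, s_count, f_count + s_count))
    return rows


rows = merged_rows(bridgekeeper_w, arthur_w)
print('\n'.join(rows))
```

Neither input dictionary is changed, so both can still be reused afterwards.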

It takes only a few steps to get to the DataFrame below:

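  1. Split the string on the newline character "\n"
  2. Initialize the result as a defaultdict(list), then split each line on ":" and use the value at index 0 as the key and the value at index 1 as the value
  3. Turn each key's list of values back into a single string with join
  4. Strip the punctuation
  5. Use Counter to count the occurrences of each word in the string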

In the end we have a JSON-like dictionary such as this:

{'Bridgekeeper': Counter({'Hee': 1,
          'hee': 1,
          'heh': 1,
          'Stop': 1,
          'What': 3,
          'is': 3,
          'your': 2,
          'name': 1,
          'quest': 1,
          'the': 1,
          'airspeed': 1,
          'velocity': 1,
          'of': 1,
          'an': 1,
          'unladen': 1,
          'swallow': 1}),

If we load that dictionary into a DataFrame, it can easily be turned into the desired output:

from collections import Counter, defaultdict
import string

import pandas as pd

# Steps 1-2: split into lines and group each speaker's lines under their name
result = defaultdict(list)
for row in text.split('\n'):
    speaker, spoken = row.split(':', 1)
    result[speaker.strip()].append(spoken.strip())

# Steps 3-4: join each speaker's lines into one string and strip punctuation
result = {key:(' '.join(value)).translate(str.maketrans('', '', string.punctuation)) for key,value in result.items()}
# Step 5: count the words per speaker
result = {key:Counter(value.split(' ')) for key,value in result.items()}
# Words used by only one speaker come out as NaN, so fill them with 0
df = pd.DataFrame(result).fillna(0).astype(int)
df['sum'] = df['Bridgekeeper'] + df['King Arthur']
df.to_csv('out.csv', sep='\t')

Output DataFrame:

          Bridgekeeper  King Arthur  sum
Hee                  1            0    1
hee                  1            0    1
heh                  1            0    1
Stop                 1            0    1
What                 3            1    4
is                   3            1    4
your                 2            0    2
name                 1            0    1
quest                1            0    1
the                  1            2    3
airspeed             1            0    1
velocity             1            0    1
of                   1            1    2
an                   1            0    1
unladen              1            0    1
swallow              1            1    2
It                   0            1    1
Arthur               0            1    1
King                 0            1    1
Britons              0            1    1
To                   0            1    1
seek                 0            1    1
Holy                 0            1    1
Grail                0            1    1
do                   0            1    1
you                  0            1    1
mean                 0            1    1
An                   0            1    1
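The table above is case-sensitive ("What" and "what" are separate rows), loses the hyphen in "air-speed", and is not sorted by key. As a rough sketch of the same steps with only the standard library, assuming lowercase folding and hyphenated words kept whole are what is wanted, something like the following could be used. The helper names `count_words_by_speaker` and `format_rows` and the tokenizing regular expression are illustrative assumptions, not part of the original answer.

```python
import re
from collections import Counter, defaultdict

text = """Bridgekeeper: Hee hee heh. Stop. What... is your name?
King Arthur: It is 'Arthur', King of the Britons.
Bridgekeeper: What... is your quest?
King Arthur: To seek the Holy Grail.
Bridgekeeper: What... is the air-speed velocity of an unladen swallow?
King Arthur: What do you mean? An African or European swallow?"""


def count_words_by_speaker(dialogue):
    """Return {speaker: Counter of lowercased words} for 'Speaker: line' text."""
    counts = defaultdict(Counter)
    for row in dialogue.splitlines():
        if not row.strip():
            continue  # skip blank lines
        speaker, _, spoken = row.partition(':')
        # Assumed tokenizer: runs of letters, optionally joined by hyphens.
        words = re.findall(r'[a-z]+(?:-[a-z]+)*', spoken.lower())
        counts[speaker.strip()].update(words)
    return dict(counts)


def format_rows(first, second):
    """Return rows 'key first second total', sorted by key."""
    rows = []
    for key in sorted(first.keys() | second.keys()):
        a, b = first[key], second[key]  # a Counter returns 0 for missing keys
        rows.append('%-12s %2d %2d %2d' % (key, a, b, a + b))
    return rows


counts = count_words_by_speaker(text)
rows = format_rows(counts['Bridgekeeper'], counts['King Arthur'])
print('\n'.join(rows))
```

Because only the spoken part of each line is counted, the speaker names themselves do not inflate the counts.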

Or a solution without third-party libraries:

bridgekeeper_d = {'hee': 2, 'heh': 1, 'stop': 1, 'what': 3, 'is': 3, 'your': 2, 'name': 1, 'quest': 1, 'the': 1, 'air-speed': 1, 'velocity': 1, 'of': 1, 'an': 1, 'unladen': 1, 'swallow': 1}
arthur_d = {'king': 4, 'it': 1, 'is': 1, 'arthur': 1, 'of': 1, 'the': 2, 'britons': 1, 'to': 1, 'seek': 1, 'holy': 1, 'grail': 1, 'what': 1, 'do': 1, 'you': 1, 'mean': 1, 'an': 1, 'african': 1, 'or': 1, 'european': 1, 'swallow': 1}
# Give every key its own inner dict, with both counts defaulting to 0.
# (dict.fromkeys(keys, {}) would make all keys share one and the same dict.)
joined = {key: {"bridgekeeper": 0, "arthur": 0}
          for key in list(bridgekeeper_d.keys()) + list(arthur_d.keys())}

for key, value in bridgekeeper_d.items():
    joined[key]["bridgekeeper"] = value

for key, value in arthur_d.items():
    joined[key]["arthur"] = value
# At this point, joined looks like this:
# {
#     'hee': {'bridgekeeper': 2, 'arthur': 0},
#     'heh': {'bridgekeeper': 1, 'arthur': 0},
#     'stop': {'bridgekeeper': 1, 'arthur': 0},
#     'what': {'bridgekeeper': 3, 'arthur': 1}
#     ...
# }

for key, dic in joined.items():
    print("%-15s %d %d %d" % (key, dic["bridgekeeper"], dic["arthur"], dic["bridgekeeper"] + dic["arthur"]))

Output:

hee             2 0 2
heh             1 0 1
stop            1 0 1
what            3 1 4
is              3 1 4
your            2 0 2
name            1 0 1
quest           1 0 1
the             1 2 3
air-speed       1 0 1
velocity        1 0 1
of              1 1 2
an              1 1 2
unladen         1 0 1
swallow         1 1 2
king            0 4 4
it              0 1 1
arthur          0 1 1
britons         0 1 1
to              0 1 1
seek            0 1 1
holy            0 1 1
grail           0 1 1
do              0 1 1
you             0 1 1
mean            0 1 1
african         0 1 1
or              0 1 1
european        0 1 1
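One detail worth checking when building a dictionary of dictionaries like `joined`: `dict.fromkeys(keys, {})` evaluates `{}` only once, so every key ends up pointing at the same inner dict, and a write through one key shows up under all of them. A minimal sketch of the difference, using a hypothetical three-key list:

```python
keys = ['hee', 'heh', 'stop']  # hypothetical sample keys

# One inner dict object shared by all three keys.
shared = dict.fromkeys(keys, {})
shared['hee']['bridgekeeper'] = 2

# A dict comprehension evaluates {} once per key, so each key gets its own dict.
separate = {key: {} for key in keys}
separate['hee']['bridgekeeper'] = 2

print(shared)    # the write is visible under every key
print(separate)  # only 'hee' was changed
```

With immutable defaults such as `0` or `None`, `dict.fromkeys` does not have this problem, because the shared value is never modified in place.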
