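Open-domain conversational dataset from the BYU PCC lab
chitchat-dataset: Python project description
Chit-Chat Dataset
Open-domain conversational dataset from the BYU Perception, Control & Cognition lab's Chit-Chat Challenge.
Installation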
pip3 install chitchat_dataset
Or simply download the raw dataset directly.
Usage
import chitchat_dataset as ccc

dataset = ccc.Dataset()

# Dataset is a subclass of dict()
for convo_id, convo in dataset.items():
    print(convo_id, convo)
For other languages, see the project's own documentation.
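If the raw dataset file has been downloaded instead of installing the package, it can probably be read with the standard json module. The sketch below is a minimal illustration, not part of the chitchat_dataset package: load_raw_dataset is a hypothetical helper, and it assumes the file is a single JSON object keyed by conversation ID, mirroring the dict-like Dataset class shown above.

```python
import json
from pathlib import Path


def load_raw_dataset(path):
    """Load a raw dataset file into a plain dict (hypothetical helper)."""
    with Path(path).open(encoding="utf-8") as f:
        data = json.load(f)
    # Assumption: the top level is a JSON object keyed by conversation ID.
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object keyed by conversation ID")
    return data
```

Called with the path of the downloaded file, the result can be iterated with .items() in the same way as ccc.Dataset().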
Statistics
- 7,168 conversations
- 258,145 utterances
- 1,315 unique participants
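Counts like these can be recomputed from any dict of conversations that follows the record layout documented in this README. The following is a rough sketch on made-up toy data: summarize is a hypothetical helper, and it assumes an "utterance" is a single message object. How the figures above were actually derived is not stated here.

```python
def summarize(dataset):
    """Count conversations, messages, and distinct senders in a dict of conversations."""
    n_messages = 0
    senders = set()
    for convo in dataset.values():
        for turn in convo["messages"]:  # each turn is a list of messages
            for message in turn:
                n_messages += 1
                senders.add(message["sender"])
    return {
        "conversations": len(dataset),
        "utterances": n_messages,
        "participants": len(senders),
    }


# Toy data in the documented shape; IDs and texts are made up.
toy = {
    "c1": {"messages": [[{"text": "Hi", "sender": "a"}],
                        [{"text": "Hello", "sender": "b"},
                         {"text": "How are you?", "sender": "b"}]]},
    "c2": {"messages": [[{"text": "Hey", "sender": "a"}]]},
}
print(summarize(toy))
# {'conversations': 2, 'utterances': 4, 'participants': 2}
```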
Format
{
  "prompt": "What's the most interesting thing you've learned recently?",
  "ratings": {"witty": "1", "int": 5, "upbeat": 5},
  "start": "2018-04-20T01:57:41",
  "messages": [
    [
      {"text": "Hello", "timestamp": "2018-04-19T19:57:51", "sender": "22578ac2-6317-44d5-8052-0a59076e0b96"}
    ],
    [
      {"text": "I learned that the Queen of England's last corgi died", "timestamp": "2018-04-19T19:58:14", "sender": "bebad07e-15df-48c3-a04f-67db828503e3"}
    ],
    [
      {"text": "Wow that sounds so sad", "timestamp": "2018-04-19T19:58:18", "sender": "22578ac2-6317-44d5-8052-0a59076e0b96"},
      {"text": "was it a cardigan welsh corgi", "timestamp": "2018-04-19T19:58:22", "sender": "22578ac2-6317-44d5-8052-0a59076e0b96"},
      {"text": "?", "timestamp": "2018-04-19T19:58:24", "sender": "22578ac2-6317-44d5-8052-0a59076e0b96"}
    ]
  ]
}
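In the sample record, "messages" is a list of turns, and each turn is a list of consecutive messages that share a sender. The sketch below shows one possible way to flatten a record for downstream use. turns_as_text and ratings_as_int are hypothetical helpers, and the code assumes, based on the sample only, that every message in a turn has the same sender and that rating values may arrive as either strings or numbers.

```python
import json

# Copy of the sample record above, with timestamps omitted for brevity.
record = json.loads("""
{"prompt": "What's the most interesting thing you've learned recently?",
 "ratings": {"witty": "1", "int": 5, "upbeat": 5},
 "messages": [
   [{"text": "Hello", "sender": "22578ac2-6317-44d5-8052-0a59076e0b96"}],
   [{"text": "I learned that the Queen of England's last corgi died",
     "sender": "bebad07e-15df-48c3-a04f-67db828503e3"}],
   [{"text": "Wow that sounds so sad", "sender": "22578ac2-6317-44d5-8052-0a59076e0b96"},
    {"text": "was it a cardigan welsh corgi", "sender": "22578ac2-6317-44d5-8052-0a59076e0b96"},
    {"text": "?", "sender": "22578ac2-6317-44d5-8052-0a59076e0b96"}]
 ]}
""")


def turns_as_text(convo):
    """Return one (sender, text) pair per turn, joining a turn's messages with spaces."""
    return [
        (turn[0]["sender"], " ".join(m["text"] for m in turn))
        for turn in convo["messages"]
        if turn  # skip empty turns defensively
    ]


def ratings_as_int(convo):
    """Coerce rating values to int; the sample mixes "1" (string) with 5 (number)."""
    return {name: int(value) for name, value in convo["ratings"].items()}


for sender, text in turns_as_text(record):
    print(sender[:8], text)
print(ratings_as_int(record))
```

Joining with a space is only one choice; keeping the messages of a turn separate may be preferable when message boundaries carry meaning, as with the trailing "?" here.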
How to cite
If you extend or use this work, please cite the paper in which it was introduced:
@article{myers2020conversational,
title={Conversational Scaffolding: An Analogy-Based Approach to Response Prioritization in Open-Domain Dialogs},
author={Myers, Will and Etchart, Tyler and Fulda, Nancy},
year={2020}
}