Splitting JSON data into train and test sets

[{"category": "CRIME", "headline": "There Were 2 Mass Shootings In Texas Last Week, But Only 1 On TV", "authors": "Melissa Jeltsen", "link": "https://www.huffingtonpost.com/entry/texas-amanda-painter-mass-shooting_us_5b081ab4e4b0802d69caad89", "short_description": "She left her husband. He killed their children. Just another day in America.", "date": "2018-05-26"} , {"category": "ENTERTAINMENT", "headline": "Will Smith Joins Diplo And Nicky Jam For The 2018 World Cup's Official Song", "authors": "Andy McDonald", "link": "https://www.huffingtonpost.com/entry/will-smith-joins-diplo-and-nicky-jam-for-the-official-2018-world-cup-song_us_5b09726fe4b0fdb2aa541201", "short_description": "Of course it has a song.", "date": "2018-05-26"} ]

1 Answer

User

#1 · Posted 2024-06-29 00:58:01

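This is how I did it: I first use train_test_split to get the train (70%) and test (30%) sets, then run the same command on the test set to split it into test (50%) and validation (50%) sets. The overall result is roughly 70% train, 15% validation, and 15% test.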

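from sklearn.model_selection import train_test_split

# Assumes one JSON record per line; readlines() returns raw strings, not parsed objects
with open('file_name') as f:
    lines = f.readlines()

# 70% train, 30% held out
train, test = train_test_split(lines, test_size=0.3)
# Split the held-out 30% in half: 15% validation, 15% test
val, test = train_test_split(test, test_size=0.5)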
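
If the file instead holds a single JSON array, as the sample in the question suggests, reading it line by line would return the whole array as one string. A minimal sketch for that case follows. It assumes the array has already been read into a string, and it swaps in a plain standard-library shuffle in place of train_test_split. The helper split_records, its parameters, and the placeholder records are hypothetical and only for illustration.

```python
import json
import random


def split_records(records, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle a copy of records and cut it into train/val/test lists."""
    shuffled = list(records)               # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # whatever remains after train and val
    return train, val, test


# Placeholder data standing in for the contents of the real file (assumption).
raw = json.dumps([
    {"category": "CRIME" if i % 2 else "ENTERTAINMENT", "headline": f"headline {i}"}
    for i in range(20)
])

records = json.loads(raw)                  # parse the array into a list of dicts
train, val, test = split_records(records)
print(len(train), len(val), len(test))
```

With a real file, json.load(f) on the open file object would take the place of json.loads(raw).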

Hope this helps.
