处理大量的线程和数据库连接（Python）我能做些什么来节省资源？

from streamscrobbler import streamscrobbler from concurrent.futures import ThreadPoolExecutor import re from sqlalchemy import * #get song name from station def manageStation(station_id, station_link): current_song = getCurrentSong(station_link) current_song = current_song.replace("'", "") current_song = current_song.replace("\"", "") current_song = current_song.replace("/", "") current_song = current_song.replace("\\", "") current_song = current_song.replace("%", "") if current_song: with db.connect() as con: rs = con.execute("INSERT INTO station_songs VALUES('" + str(station_id) + "', '" + current_song + "', '') ON DUPLICATE KEY UPDATE song_name = '" + current_song + "';") return "" def getCurrentSong(stream_url): streamscrobblerobj = streamscrobbler() stationinfo = streamscrobblerobj.getServerInfo(stream_url) metadata = stationinfo.get("metadata") regex = re.search('\'song\': \'(.*?)\'' , str(metadata)) if regex: return regex.group(1) return "" def update() : print 'update starting' global db db = create_engine('mysql://root:pass@localhost:3306/radio') global threadExecutor threadExecutor = ThreadPoolExecutor(max_workers=20000) with db.connect() as con: rs = con.execute("SELECT id, link FROM station_table") for row in rs.fetchall(): threadExecutor.submit(manageStation, row[0], row[1])

1条回答

网友

1楼 · 发布于 2024-10-01 17:32:52

您不需要为每个任务提供一个真正的线程，因为大多数时候，线程将等待来自套接字（web请求）的IO。在

您可以尝试使用green threads使用类似于gevent的方法，使用如下架构：

from gevent import monkey; monkey.patch_socket()

NUM_GLETS = 20    
STATION_URLS = (
   'http://station1.com',
   ...
)

pool = gevent.Pool(NUM_GLETS)
tasks = [pool.spawn(analyze_station, url) for url in STATION_URLS]
pool.join(tasks)

其中analyze_station是获取和分析特定站点的代码。在

结果应该是一个单线程程序，但是不是阻塞每个web请求，而是在套接字等待数据时运行另一个绿色线程。这比为大多数空闲工作生成真正的线程要高效得多。在

相关问题更多 >

编程相关推荐

热门问题

热门文章