Python: How can I make machine learning predictions run faster in production?


I built a machine learning model in scikit-learn and need to deploy it in production on live data. The features look like this, for example:
<pre><code>date        event_id  user_id  feature1  feature2  featureX...
2017-01-27  100       5555     1.23      2         2.99
2017-01-27  100       4444     2.55      5         3.16
2017-01-27  100       3333     0.45      3         1.69
2017-01-27  105       1212     3.96      4         0.0
2017-01-27  105       2424     1.55      2         5.56
2017-01-27  105       3636     0.87      4         10.28
</code></pre>
So there are different events every day. I pull the data to predict on from the database and then compute the predictions:

^{pr2}$

I then match the predictions back to df and send them to an API or write them to a file, as required.

Once an event starts, <code>featureX</code> is updated continuously; I get the new values from an API. To apply the updates I loop over each <code>event_id</code> and <code>user_id</code>, update {<cd4>} with the new <code>featureX</code> value, recompute the predictions and send them to the output again.

For this I do something like the following:
<pre><code>import time
import requests

# get list of unique event ids
events = set(df['event_id'].tolist())

try:
    while True:
        start = time.time()
        for event in events:
            # fetch the latest featureX values for this event
            featureX = requests.get(API_URL + str(event))
            featureX_json = featureX.json()
            # update the matching rows one user at a time
            for user in featureX_json['users']:
                df.loc[df.user_id == user['user_id'], 'featureX'] = user['featureX']
        df_X = df.drop(['date', 'event_id', 'user_id'], axis=1)
        df['prediction'] = loaded_model.predict_proba(df_X)
        # send to API or write to file
        end = time.time()
        print('recalculation time {} secs'.format(end - start))
except KeyboardInterrupt:
    print('exiting !')
</code></pre>
This works fine for me, but the whole prediction update takes about 4 seconds on the server and I need it to be under 1 second. What could I change inside the <code>while</code> loop to get the speed-up I need?

As requested, here is a sample of the JSON returned for <code>event_id = 100</code> from the URL <code>http://myapi/api/event_users/<event_id></code>:
<pre><code>{
  "count": 3,
  "users": [
    {
      "user_id": 4444,
      "featureY": 34,
      "featureX": 4.49,
      "created": "2017-01-17T13:00:09.065498Z"
    },
    {
      "user_id": 3333,
      "featureY": 22,
      "featureX": 1.09,
      "created": "2017-01-17T13:00:09.065498Z"
    },
    {
      "user_id": 5555,
      "featureY": 58,
      "featureX": 9.54,
      "created": "2017-01-17T13:00:09.065498Z"
    }
  ]
}
</code></pre>
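One thing that may be worth trying in the loop described above: each `df.loc[df.user_id == ...]` assignment scans the whole `user_id` column, so the update cost grows with (number of users) x (number of rows). A possible alternative, sketched below, is to gather all fresh values into one small table and apply them with a single merge. The helper names `collect_updates` and `apply_updates` and the temporary column `featureX_new` are made up for this illustration, and the sketch assumes each (`event_id`, `user_id`) pair appears at most once in the API responses. It has not been timed against the original setup, so profile before relying on it.

```python
import pandas as pd


def collect_updates(payloads):
    """Flatten {event_id: json_payload} into one row per (event_id, user_id).

    `payloads` is a hypothetical dict holding the already-decoded JSON
    response for each event, in the shape shown in the question.
    """
    rows = [
        (event_id, user["user_id"], user["featureX"])
        for event_id, payload in payloads.items()
        for user in payload["users"]
    ]
    return pd.DataFrame(rows, columns=["event_id", "user_id", "featureX_new"])


def apply_updates(df, updates):
    """Overwrite featureX wherever a fresh value exists, in one pass."""
    if updates.empty:
        return df.copy()
    # A left merge keeps every row of df, in its original order.
    merged = df.merge(updates, on=["event_id", "user_id"], how="left")
    # Rows with no fresh value keep their previous featureX.
    merged["featureX"] = merged["featureX_new"].fillna(merged["featureX"])
    return merged.drop(columns="featureX_new")


# Small sample built from the data in the question.
df = pd.DataFrame({
    "event_id": [100, 100, 100, 105],
    "user_id": [5555, 4444, 3333, 1212],
    "feature1": [1.23, 2.55, 0.45, 3.96],
    "feature2": [2, 5, 3, 4],
    "featureX": [2.99, 3.16, 1.69, 0.0],
})
payloads = {
    100: {
        "count": 3,
        "users": [
            {"user_id": 4444, "featureY": 34, "featureX": 4.49},
            {"user_id": 3333, "featureY": 22, "featureX": 1.09},
            {"user_id": 5555, "featureY": 58, "featureX": 9.54},
        ],
    }
}

updated = apply_updates(df, collect_updates(payloads))
print(updated)
```

Unlike the loop in the question, this matches on both `event_id` and `user_id`, so a user attending two events would not have one event's value copied to the other. If the HTTP requests turn out to dominate the 4 seconds, they could also be issued concurrently (for example with `concurrent.futures.ThreadPoolExecutor`) before the merge, since the sequential calls each wait on the network.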


1 answer

Anonymous, 1 day ago

