如何将这个SQL代码转换为涉及延迟函数的等价pandas代码？问题的回答

如何将这个SQL代码转换为涉及延迟函数的等价pandas代码？

回答此问题可获得 20 贡献值，回答如果被采纳可获得 50 分。

我有一个包含病人ID和住院时间的数据框。我想过滤出患者在前一次入院（但包括第一次入院）后30天内入院的行。有了SQL，我可以使用<code>lag</code>函数来实现这一点： <pre><code>case -- mark the first hospital adm when dense_rank() over (partition by adm.subject_id order by adm.admittime) = 1 then true -- mark subsequent hospital adms if its been atleast a month since previous admission. when round((cast(extract(epoch from adm.admittime - lag(adm.admittime, 1) over (partition by adm.subject_id order by adm.admittime))/(60*60*24) as numeric)), 2) >= 30.0 then true else false end as include_adm </code></pre> 我该怎么处理熊猫呢？基本上，我想从以下数据帧中筛选出一行患者ID 30： <pre><code>id admit_time 0 30 2018-10-03 1 30 2018-10-29 2 13 2017-11-01 3 13 2018-02-27 </code></pre> 得到 <pre><code>id admit_time 0 30 2018-10-03 1 13 2017-11-01 2 13 2018-02-27 </code></pre> 因为患者第二次入院是在第一次入院后30天内。但13号病人的入院日期相差超过30天，因此两次入院都保留了下来。你知道吗 我上面展示的是一个示例数据帧。真正的数据帧由更多的列和行组成。更具体地说，其中一列是该患者在特定时间的临床记录。因此，行中有许多重复的信息，除了临床注释。例如，上面的数据帧： <pre><code>id admit_time note 0 30 2018-10-03 note_content1 1 30 2018-10-03 note_content2 2 30 2018-10-29 note_content1 3 30 2018-10-29 note_content2 4 13 2017-11-01 note_content1 5 13 2018-02-27 note_content2 6 13 2018-02-27 note_content2 </code></pre> 过滤后应产生以下数据帧： <pre><code>id admit_time note 0 30 2018-10-03 note_content1 1 30 2018-10-03 note_content2 2 13 2017-11-01 note_content1 3 13 2018-02-27 note_content1 4 13 2018-02-27 note_content2 </code></pre>

0 条评论
分类：Python问答

默认排序时间排序

1 个回答

匿名 1天前

　擅长：python、mysql、java

如何将这个SQL代码转换为涉及延迟函数的等价pandas代码？

1 个回答

相关Python问题