
Published 2024-09-28 05:42:21









pattern = re.compile(r'(\d\d\d\d\-*\s*\&*\d+\-*\d*:[A-Za-z0-9\s\,\(\)\;\"\-]*\.*)')  



2013-04-13: BS-440: 10 egg masses observed in vernal pool habitat. Observer noted 3 of the AMJE masses had firm jelly, akin to a 3-wk old AMMA mass, but "bumpier" on outside (membrane and embryo-spacing in the masses were AMJE-like). BS-443: 3 egg masses observed in vernal pool habitat. A few egg masses may have been missed due to poor light conditions. Smith-019: 250 egg masses observed in vernal pool habitat. Observer searched only portions abutting the road (SW margin of pool). Many AMJE masses observed attached to herbaceous vegetation and difficult to differentiate from one another. AMJE egg-mass count is a rough estimate within area searched. 2017-01-01: 23 individuals observed. Egg masses were not present. 2018-07-04: BS-440: All individuals took a break from breeding for the long holiday weekend.


2013-04-13: BS-440: 10 egg masses observed in vernal pool habitat. Observer noted 3 of the AMJE masses had firm jelly, akin to a 3-wk old AMMA mass, but "bumpier" on outside (membrane and embryo-spacing in the masses were AMJE-like). BS-443: 3 egg masses observed in vernal pool habitat. A few egg masses may have been missed due to poor light conditions. Smith-019: 250 egg masses observed in vernal pool habitat. Observer searched only portions abutting the road (SW margin of pool). Many AMJE masses observed attached to herbaceous vegetation and difficult to differentiate from one another. AMJE egg-mass count is a rough estimate within area searched.

2017-01-01: 23 individuals observed. Egg masses were not present.

2018-07-04: BS-440: All individuals took a break from breeding for the long holiday weekend.



s = re.sub(r'\s+(?=\d{4}-*\s*&*\d+-*\d*:)', "\n\n", s)





output = re.compile(r" (?=\d{4}-\d{2}-\d{2})").split(text)
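The split approach can be checked with a small self-contained sketch. The `text` value below is a shortened stand-in for the sample data, and the names `date_boundary` and `entries` are illustrative assumptions, not part of the original answer. The pattern is slightly stricter than the one above: it requires the trailing colon after the date and accepts any run of whitespace as the separator.

```python
import re

# Shortened stand-in for the sample data (assumption: real records look similar).
text = (
    "2013-04-13: BS-440: 10 egg masses observed. Smith-019: 250 egg masses observed. "
    "2017-01-01: 23 individuals observed. "
    "2018-07-04: BS-440: All individuals took a break."
)

# Split on whitespace that is immediately followed by a date and a colon.
# The lookahead keeps the date itself out of the separator, so it stays
# at the start of the next entry.
date_boundary = re.compile(r"\s+(?=\d{4}-\d{2}-\d{2}:)")
entries = date_boundary.split(text)

for entry in entries:
    print(entry)
```

Because the separator whitespace is consumed by the split, the entries come back without trailing spaces, and site codes such as `Smith-019:` do not trigger a split since they do not match the full date shape.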



\d{4}-\d\d-\d\d:           # date with colon
.*?                        # the minimal amount of any characters required to match
(?=                        # positive lookahead (match text but don't consume it)
   \d{4}-\d\d-\d\d:        # date with colon
  |                        # or
   $                       # end of text
)                          # end lookahead
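The annotated pattern above can be used almost verbatim with `re.VERBOSE`, which lets the whitespace and `#` comments live inside the pattern. The following is a minimal sketch under that assumption; `record_pattern`, `records`, and the shortened `text` are hypothetical names and data chosen for illustration. `re.DOTALL` is added in case the real text contains newlines inside a record.

```python
import re

# Shortened stand-in for the sample data (assumption: real records look similar).
text = (
    "2013-04-13: BS-440: 10 egg masses observed. BS-443: 3 egg masses observed. "
    "2017-01-01: 23 individuals observed. Egg masses were not present. "
    "2018-07-04: BS-440: All individuals took a break."
)

record_pattern = re.compile(
    r"""
    \d{4}-\d\d-\d\d:        # date with colon
    .*?                     # the minimal amount of any characters required to match
    (?=                     # positive lookahead (match text but don't consume it)
        \d{4}-\d\d-\d\d:    # date with colon
      |                     # or
        $                   # end of text
    )                       # end lookahead
    """,
    re.VERBOSE | re.DOTALL,
)

# No capture groups are used, so findall returns each whole match as a string.
records = record_pattern.findall(text)

for record in records:
    print(repr(record))
```

Each match runs up to, but does not include, the next date, so every record except the last keeps the space that preceded the following date. Calling `str.strip()` on each record removes it if that matters.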




['2013-04-13: BS-440: 10 egg masses observed in vernal pool habitat.
  Observer noted 3 of the AMJE masses had firm jelly, akin to a 3-wk
  old AMMA mass, but "bumpier" on outside (membrane and embryo-spacing
  in the masses were AMJE-like). BS-443: 3 egg masses observed in
  vernal pool habitat. A few egg masses may have been missed due to
  poor light conditions. Smith-019: 250 egg masses observed in
  vernal pool habitat. Observer searched only portions abutting the 
  road (SW margin of pool). Many AMJE masses observed attached
  to herbaceous vegetation and difficult to differentiate from
  one another. AMJE egg-mass count is a rough estimate within
  area searched. ',
 '2017-01-01: 23 individuals observed. Egg masses were not present. ',
 '2018-07-04: BS-440: All individuals took a break from breeding for
  the long holiday weekend.']

