Print the first occurrence of a word in a file beginning with each letter of the alphabet

Published 2024-09-22 10:21:09

I have a file containing a list of adjectives, A-Z.

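How can I print the first word that starts with A, then the first word that starts with B, and so on through Z?
I thought grep might be a good way to do it, but I'm open to others: awk, Python, or anything else.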

Some sample output:

$ cat adjectives.txt | head
Adamant: unyielding; a very hard substance
Adroit: clever, resourceful
Amatory: sexual
Animistic: quality of recurrence or reversion to earlier form
Antic: clownish, frolicsome
Arcadian: serene
Baleful: deadly, foreboding
Bellicose: quarrelsome (its synonym belligerent can also be a noun)
Bilious: unpleasant, peevish
Boorish: crude, insensitive

$ cat adjectives.txt | grep '^[ABCDE]' | head
Adamant: unyielding; a very hard substance
Adroit: clever, resourceful
Amatory: sexual
Animistic: quality of recurrence or reversion to earlier form
Antic: clownish, frolicsome
Arcadian: serene
Baleful: deadly, foreboding
Bellicose: quarrelsome (its synonym belligerent can also be a noun)
Bilious: unpleasant, peevish
Boorish: crude, insensitive

So my desired output would be:

^{pr2}$

Here is the complete file:

$ cat adjectives.txt
Adamant: unyielding; a very hard substance
Adroit: clever, resourceful
Amatory: sexual
Animistic: quality of recurrence or reversion to earlier form
Antic: clownish, frolicsome
Arcadian: serene
Baleful: deadly, foreboding
Bellicose: quarrelsome (its synonym belligerent can also be a noun)
Bilious: unpleasant, peevish
Boorish: crude, insensitive
Calamitous: disastrous
Caustic: corrosive, sarcastic; a corrosive substance
Cerulean: sky blue
Comely: attractive
Concomitant: accompanying
Contumacious: rebellious
Corpulent: obese
Crapulous: immoderate in appetite
Defamatory: maliciously misrepresenting
Didactic: conveying information or moral instruction
Dilatory: causing delay, tardy
Dowdy: shabby, old-fashioned; an unkempt woman
Efficacious: producing a desired effect
Effulgent: brilliantly radiant
Egregious: conspicuous, flagrant
Endemic: prevalent, native, peculiar to an area
Equanimous: even, balanced
Execrable: wretched, detestable
Fastidious: meticulous, overly delicate
Feckless: weak, irresponsible
Fecund: prolific, inventive
Friable: brittle
Fulsome: abundant, overdone, effusive
Garrulous: wordy, talkative
Guileless: naive
Gustatory: having to do with taste or eating
Heuristic: learning through trial-and-error or problem solving
Histrionic: affected, theatrical
Hubristic: proud, excessively self-confident
Incendiary: inflammatory, spontaneously combustible, hot
Insidious: subtle, seductive, treacherous
Insolent: impudent, contemptuous
Intransigent: uncompromising
Inveterate: habitual, persistent
Invidious: resentful, envious, obnoxious
Irksome: annoying
Jejune: dull, puerile
Jocular: jesting, playful
Judicious: discreet
Lachrymose: tearful
Limpid: simple, transparent, serene
Loquacious: talkative
Luminous: clear, shining
Mannered: artificial, stilted
Mendacious: deceptive
Meretricious: whorish, superficially appealing, pretentious
Minatory: menacing
Mordant: biting, incisive, pungent
Munificent: lavish, generous
Nefarious: wicked
Noxious: harmful, corrupting
Obtuse: blunt, stupid
Parsimonious: frugal, restrained
Pendulous: suspended, indecisive
Pernicious: injurious, deadly
Pervasive: widespread
Petulant: rude, ill humored
Platitudinous: resembling or full of dull or banal comments
Precipitate: steep, speedy
Propitious: auspicious, advantageous, benevolent
Puckish: impish
Querulous: cranky, whining
Quiescent: inactive, untroublesome
Rebarbative: irritating, repellent
Recalcitrant: resistant, obstinate
Redolent: aromatic, evocative
Rhadamanthine: harshly strict
Risible: laughable
Ruminative: contemplative
Sagacious: wise, discerning
Salubrious: healthful
Sartorial: relating to attire, especially tailored fashions
Sclerotic: hardening
Serpentine: snake-like, winding, tempting or wily
Spasmodic: having to do with or resembling a spasm, excitable,
intermittent
Strident: harsh, discordant; obtrusively loud
Taciturn: closemouthed, reticent
Tenacious: persistent, cohesive,
Tremulous: nervous, trembling, timid, sensitive
Trenchant: sharp, penetrating, distinct
Turbulent: restless, tempestuous
Turgid: swollen, pompous
Ubiquitous: pervasive, widespread
Uxorious: inordinately affectionate or compliant with a wife
Verdant: green, unripe
Voluble: glib, given to speaking
Voracious: ravenous, insatiable
Wheedling: flattering
Withering: devastating
Zealous: eager, devoted

3 answers

awk to the rescue!

$ awk '!a[tolower(substr($0,1,1))]++' file

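This keeps a counter for each initial character and prints a line only while that counter is still zero (i.e., the first instance). tolower() makes the match case-insensitive; remove it if you don't need that. substr($0,1,1) extracts the first character of the line. awk's implicit loop repeats this for every line of the input file.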
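For readers more at home in Python, the same single-pass idea can be sketched as below. This is an illustrative equivalent, not part of the original answer: the function name `first_per_letter` and the sample lines are made up for the example, and unlike the awk one-liner this sketch skips blank lines.

```python
def first_per_letter(lines):
    """Yield the first line seen for each initial character, ignoring case."""
    seen = set()  # initial characters already printed (plays the role of awk's a[])
    for line in lines:
        if not line.strip():
            continue  # assumption: blank lines are skipped (the awk version does not skip them)
        key = line[0].lower()  # like tolower(substr($0,1,1))
        if key not in seen:
            seen.add(key)
            yield line


# Hypothetical sample input; with a real file, pass the open file object instead.
sample = [
    "Adamant: unyielding; a very hard substance\n",
    "Adroit: clever, resourceful\n",
    "Baleful: deadly, foreboding\n",
    "bilious: unpleasant, peevish\n",
]
print("".join(first_per_letter(sample)), end="")
```

With the question's file, this would be called as `first_per_letter(open('adjectives.txt'))`. The input does not need to be sorted, because membership is tracked in a set rather than by comparing adjacent lines.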

With a slight change to the script

^{pr2}$

you can get the second record for each letter (if one exists), or use <3 instead of ==2 to get the first two records.
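The modified script itself did not survive on this page, so the following Python sketch is only one reading of the description: count lines per initial letter and select on the counter value. The names `nth_per_letter` and `first_n_per_letter` are hypothetical, and blank lines are skipped as an added assumption.

```python
from collections import Counter


def nth_per_letter(lines, n=2):
    """Yield the n-th line for each initial character (the '== 2' case when n=2)."""
    counts = Counter()
    for line in lines:
        if not line.strip():
            continue  # assumption: ignore blank lines
        key = line[0].lower()
        counts[key] += 1
        if counts[key] == n:
            yield line


def first_n_per_letter(lines, n=2):
    """Yield up to the first n lines for each initial character (the '< 3' case when n=2)."""
    counts = Counter()
    for line in lines:
        if not line.strip():
            continue  # assumption: ignore blank lines
        key = line[0].lower()
        counts[key] += 1
        if counts[key] <= n:
            yield line


# Hypothetical sample input for illustration.
sample = ["Adamant\n", "Adroit\n", "Amatory\n", "Baleful\n", "Zealous\n"]
print("".join(nth_per_letter(sample)), end="")
print("".join(first_n_per_letter(sample)), end="")
```

Letters with fewer than `n` entries simply produce nothing from `nth_per_letter`, which matches the "if it exists" caveat above.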

If the file is already sorted and its case is consistent, a simpler command is an option

$ uniq -w1 file

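The uniq command keeps the first instance of each compared value, and -w1 limits the comparison to the first character. It therefore picks out the first line for every letter in one pass. If the case is not consistent, add the -i flag to ignore case.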

Scanning the file once is enough; there is no need to scan it multiple times...

Perhaps, with bash:

for i in {A..Z}; do grep -m1 "^$i" adjectives.txt; done

Python version:

import itertools

with open('adjectives.txt') as fp:
    # Group lines by first letter. If the lines weren't already sorted, 
    # you could replace fp with sorted(fp).
    groups = itertools.groupby(fp, key=lambda line: line[0])

    for first_letter, group in groups:
        print(next(group), end='')
