Python: "Gibberish sentence generator" misbehaving

Posted 2024-09-27 00:21:28


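I'm trying to write a simple "gibberish generator" program in Python that prints a random gibberish string made of characters, spaces, and a punctuation mark at the end (in other words, a complete sentence). It mostly works, but I've run into a strange problem that I just can't wrap my head around.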

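For some reason, the last "word" in my gibberish string is always longer than it should be, even though my code explicitly prevents any word from exceeding 11 characters. I've looked over the code God knows how many times and I still can't figure out what's causing it. Interestingly, it only becomes noticeable with long strings; short sentences (up to 50 characters) look mostly fine.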

Here are two sample outputs I got when running it in Windows PowerShell:

First, with 50 characters:

How many gibberish characters would you like to print out? 50

Uxlouasieyt uoygigjas eayouiumza gyfejmu th egkyaulheeb.

Second, with 300 characters:

How many gibberish characters would you like to print out? 300

Yhiaztexj ekkexe iiuiyx itozlyui zao cegyeuyiml aofzyyreet cofi owzycwobla rreyblioca rla tpocnelavj ytpa x eefra gnyoe yfxyhnivme miert ywy ykhi ee gup eui ttuoi oeoyaf uenyecb apluo yli xmy uiyaoneewe jyxymxal y dzaiglu uo eqkiyeiz ke oxayuiayzf yyi iqoezu ekuioyotly viyslaybiiwvymitoeagrejvavihigpyoxawefunodgu!

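Notice how the last word in the sentence gets progressively longer as the string gets longer, while all the other words stay within 11 characters. It's as if the part of the code that adds spaces to the gibberish list gets ignored after a certain point. But why?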

Here's the full code:

import random

def gibberishgen():
    alphabet_vowels = ['a','e','i','o','u','y',]
    alphabet_consonants = ['b','c','d','f','g','h','j','k','l','m','n','p','q','r','s','t','v','w','x','z']
    gibberish_list = []

    while True:
        gibberishamount = raw_input("How many gibberish characters would you like to print out? ")
        if gibberishamount.isdigit():
            break
        else:
            print "Please give me a number!"

    # fill the gibberish_list with characters
    lasttwochars = ['','']
    for char in range(1, int(gibberishamount)+1):
        nextcharvowel = random.choice(alphabet_vowels)
        nextcharconsonant = random.choice(alphabet_consonants)
        if lasttwochars[0] in alphabet_consonants and lasttwochars[1] in alphabet_consonants:   # because I don't want more than 2 consonants in a row
            nextchar = nextcharvowel
        else:
            roll = random.randint(1,10)
            if roll > 5:
                nextchar = nextcharvowel
            else:   
                nextchar = nextcharconsonant
        gibberish_list.append(nextchar)
        lasttwochars.append(nextchar)
        lasttwochars.pop(0)

    # insert spaces at randomized intervals to separate the "words" from each other
    last_whitespace = 0
    for index in range(0, len(gibberish_list)+1):
        randspace = random.randint(1,10)
        if index >= last_whitespace + 3 and randspace <= 2:     # make sure words don't get too short on average
            gibberish_list.insert(index, ' ')
            last_whitespace = index
        elif index > last_whitespace + 10:                      # ...or too long
            gibberish_list.insert(index, ' ')
            last_whitespace = index

    punctlist = ['.', '!', '?']

    gibberishstring = ''.join(gibberish_list)
    finalstring = gibberishstring.capitalize() + random.choice(punctlist)
    print "\n", finalstring, "\n"

gibberishgen()

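I'd be very grateful if someone could explain what's going on here. I've only been learning Python for two months, so it's quite possible I'm missing something obvious.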

Also, feel free to point out any bad syntax or practices you spot.


2 Answers

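When you insert spaces into gibberish_list, the list gets longer, but the range your loop iterates over was computed once, when the loop started, from the list's original length. The loop therefore stops at the index that corresponded to the last character before any spaces were inserted, and it never reaches the end of the grown list. The more spaces are inserted (that is, the longer the string), the more characters are left unprocessed at the end, which is why the effect becomes more noticeable.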
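To make this concrete, here is a minimal sketch of the effect, not the asker's exact code. It keeps only the "too long" rule from the question so the result is deterministic, and the names insert_spaces_buggy, insert_spaces_fixed and longest_word are made up for illustration. The fixed variant uses a while loop that re-reads len() on every pass, which is only one of several possible fixes.

```python
def insert_spaces_buggy(chars, max_word=10):
    """Mimics the question's loop: the range is computed once, up front."""
    chars = list(chars)
    last_space = 0
    for index in range(len(chars) + 1):      # bound fixed before any insert
        if index > last_space + max_word:
            chars.insert(index, ' ')          # list grows, loop bound does not
            last_space = index
    return ''.join(chars)


def insert_spaces_fixed(chars, max_word=10):
    """Same rule, but the bound is re-read on every iteration."""
    chars = list(chars)
    last_space = 0
    index = 0
    while index < len(chars):                 # len() re-evaluated each pass
        if index > last_space + max_word:
            chars.insert(index, ' ')
            last_space = index
        index += 1
    return ''.join(chars)


def longest_word(text):
    """Length of the longest space-separated word in text."""
    return max(len(word) for word in text.split())


sample = 'a' * 300
print(longest_word(insert_spaces_buggy(sample)))   # 29: the unprocessed tail
print(longest_word(insert_spaces_fixed(sample)))   # 11: limit holds throughout
```

With 300 input characters the buggy loop inserts 27 spaces and then stops, leaving a 29-character tail, which matches the overlong final word in the question's second sample.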

Here is a slightly expanded version. It works in Python 2.x and 3.x and uses realistic letter and word-length frequencies:

from itertools import islice
from random import choice, randint
import sys

if sys.hexversion < 0x3000000:
    inp = raw_input
    rng = xrange
else:
    inp = input
    rng = range


LETTERS = (    # relative character frequencies
    "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabb"
    "bbbbbbbbbbcccccccccccccccccccccdddddddddddddddddddddddddddddddde"
    "eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee"
    "eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeefffffffffffffffffgggggggggggggggg"
    "hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhiiiiiiiiiiiiiiiiii"
    "iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiijjkkkkkklllllllllllllllllllll"
    "llllllllllmmmmmmmmmmmmmmmmmmmnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn"
    "nnnnnnnnnnnnnnnnoooooooooooooooooooooooooooooooooooooooooooooooo"
    "ooooooooopppppppppppppppqrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr"
    "rrrrrrsssssssssssssssssssssssssssssssssssssssssssssssstttttttttt"
    "ttttttttttttttttttttttttttttttttttttttttttttttttttttttttttuuuuuu"
    "uuuuuuuuuuuuuuuvvvvvvvvwwwwwwwwwwwwwwwwwwxxxyyyyyyyyyyyyyyyzz"
)

CONSONANTS  = ''.join(ch for ch in LETTERS if ch not in "aeiouy")
VOWELS      = ''.join(ch for ch in LETTERS if ch     in "aeiouy")
PUNCTUATION = "....??!"

is_cons     = set(CONSONANTS).__contains__    # is_cons(x) == x in set(CONSONANTS)

WORDLEN = [     # relative word-length frequencies
    2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,
    2,  2,  2,  2,  2,  3,  3,  3,  3,  3,  3,  3,
    3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,
    3,  3,  3,  3,  4,  4,  4,  4,  4,  4,  4,  4,
    4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4,
    5,  5,  5,  5,  5,  5,  5,  5,  5,  5,  5,  5,
    5,  5,  6,  6,  6,  6,  6,  6,  6,  6,  6,  6,
    7,  7,  7,  7,  7,  7,  7,  7,  8,  8,  8,  8,
    8,  9,  9,  9, 10, 10, 10, 11, 11, 12
]

wordlen = lambda: choice(WORDLEN)

def get_int(prompt):
    while True:
        try:
            return int(inp(prompt))
        except ValueError:
            pass

def gibberish():
    """
    Generate an infinite sequence of random letters,
      allowing no more than two consecutive consonants
    """
    a = choice(LETTERS); yield a
    b = choice(LETTERS); yield b
    while True:
        c = choice(VOWELS if is_cons(a) and is_cons(b) else LETTERS)
        yield c
        a, b = b, c

def take_n(iterable, n):
    return list(islice(iterable, n))

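def add_spaces(iterable, make_word_length):
    iterable = iter(iterable)
    while True:
        for i in rng(make_word_length()):
            try:
                ch = next(iterable)
            except StopIteration:
                return    # input exhausted; a leaked StopIteration is a RuntimeError on Python 3.7+ (PEP 479)
            yield ch
        yield ' '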

def gibberish_sentence():
    length   = get_int("How many characters of gibberish would you like? ")
    chars    = take_n(gibberish(), length)              # make that many chars
    chars    = add_spaces(chars, wordlen)               # add spaces to make "words"
    sentence = ''.join(chars).rsplit(' ', 1)[0]         # crop at last space (don't leave a part-word at the end)
    return sentence.capitalize() + choice(PUNCTUATION)  # capitalize and add punctuation

def main():
    print(gibberish_sentence())

if __name__=="__main__":
    main()

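The LETTERS string and WORDLEN list above encode relative frequencies by repetition: the more often an item appears, the more likely choice() is to pick it. As a rough sketch of the same idea, random.choices (Python 3.6+ only) accepts explicit weights instead. The weights below are made-up numbers for illustration, not the figures used in the answer, and weighted_word_length and expand are hypothetical helper names.

```python
import random

# Hypothetical weights for illustration only (not the answer's actual table).
WORD_LENGTHS = [2, 3, 4, 5, 6]
WEIGHTS      = [5, 7, 6, 4, 2]


def weighted_word_length(rng=random):
    """Pick one word length; larger weights are picked more often."""
    # random.choices returns a list of k items, so take the first one.
    return rng.choices(WORD_LENGTHS, weights=WEIGHTS, k=1)[0]


def expand(values, weights):
    """Build the equivalent repetition-based table for random.choice()."""
    return [v for v, w in zip(values, weights) for _ in range(w)]


rng = random.Random(0)           # seeded only so runs are repeatable
lengths = [weighted_word_length(rng) for _ in range(1000)]
print(sorted(set(lengths)))      # only values from WORD_LENGTHS can appear
print(len(expand(WORD_LENGTHS, WEIGHTS)))   # 24 entries, the sum of the weights
```

The repetition table has the advantage of working unchanged on Python 2, which is presumably why the answer uses it; explicit weights are easier to read and edit when only Python 3 matters.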
