Splitting a string by byte length in Python

    text = data[key]
    index = 1
    while text:
        length = 4000
        while len(text[0:length].encode('utf-8')) > 4000:
            length -= 1
        data['{}{}'.format(key, index)] = text[0:length]
        text = text[length:]
        index += 1
    del data[key]

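2 answers

User

Answer #1 · edited 2024-09-30 01:18:55

In the end I combined G. Anderson's link with my code. It is more efficient because it encodes the text once instead of re-encoding it for every length check.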

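    # Encode once, then cut the byte string into pieces of at most 4000 bytes.
    encoded_text = data[key].encode('utf-8')
    index = 1
    while encoded_text:
        length = min(4000, len(encoded_text))
        if len(encoded_text) > 4000:
            # A UTF-8 continuation byte looks like 10xxxxxx. If the byte just
            # past the cut is one, the cut is inside a character: back up.
            while (encoded_text[length] & 0xc0) == 0x80:
                length -= 1
        data['{}{}'.format(key, index)] = encoded_text[:length].decode('utf-8')
        encoded_text = encoded_text[length:]
        index += 1
    del data[key]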
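For reuse outside the `data` dictionary, the same idea can be wrapped in a generator. This is only a sketch under a few assumptions: the name `split_utf8` and the `max_bytes` parameter are hypothetical, Python 3 is assumed (indexing `bytes` yields integers), and the limit must be at least 4 bytes so that any single UTF-8 character fits in a piece.

```python
def split_utf8(text, max_bytes=4000):
    """Yield pieces of text whose UTF-8 encoding is at most max_bytes long.

    Hypothetical helper, not part of any library.
    """
    if max_bytes < 4:
        # A UTF-8 character can take up to 4 bytes; a smaller limit
        # could make it impossible to advance.
        raise ValueError("max_bytes must be at least 4")
    encoded = text.encode('utf-8')
    start = 0
    while start < len(encoded):
        end = min(start + max_bytes, len(encoded))
        if end < len(encoded):
            # Back up while the byte just past the cut is a continuation
            # byte (10xxxxxx), so no character is split in two.
            while (encoded[end] & 0xC0) == 0x80:
                end -= 1
        yield encoded[start:end].decode('utf-8')
        start = end


# Usage: rebuild numbered keys the same way as the loop above.
data = {'note': 'a€' * 5}
key = 'note'
for index, piece in enumerate(split_utf8(data[key], max_bytes=5), start=1):
    data['{}{}'.format(key, index)] = piece
del data[key]
```

Tracking `start` and `end` offsets avoids re-slicing the remaining bytes on every iteration, which may matter for very long values.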

I also considered using encode('unicode-escape') to work around the Unicode problem, but that could more than double the length of my strings.
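A quick way to see the size difference is to compare both encodings on a few sample strings. The helper name `encoded_sizes` and the samples are made up for illustration; the exact ratio depends on which characters the real data contains.

```python
def encoded_sizes(s):
    """Return (UTF-8 byte length, unicode_escape byte length) for s."""
    return len(s.encode('utf-8')), len(s.encode('unicode_escape'))


# Plain ASCII is unchanged by either encoding.
ascii_sizes = encoded_sizes('abc')
# Each CJK character takes 3 bytes in UTF-8 but 6 as a \uXXXX escape.
cjk_sizes = encoded_sizes('字节')
# 'ñ' takes 2 bytes in UTF-8 but 4 as a \xXX escape.
latin_sizes = encoded_sizes('ñ')

print(ascii_sizes, cjk_sizes, latin_sizes)
```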

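User

Answer #2 · edited 2024-09-30 01:18:55

Check whether the advice you were given about CLOBs is current, or whether it is based on old information about accessing LOBs through locators.

In cx_Oracle, the best practice for "small" CLOBs is to represent them as strings: your code will be simple and efficient. See the example at https://github.com/oracle/python-cx_Oracle/blob/master/samples/ReturnLobsAsStrings.py

Another solution is to use a recent version of Oracle DB that supports 32K VARCHAR2.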
