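<p>A <a href="/questions/tagged/unicode" class="post-tag" title="show questions tagged 'unicode'" rel="tag">unicode</a>-aware solution is to match text according to the Unicode Collation Algorithm (UTS #10). You can do this in the database or in Python (possibly with PyICU). Here is a short Perl prototype that matches at collation level 1, where only base letters count and diacritics are ignored:</p>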
<pre class="lang-perl prettyprint-override"><code>#!/usr/bin/env perl
use 5.010;
use utf8;
use open qw(:std :encoding(UTF-8));
use Unicode::Collate qw();
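# Level 1 compares base letters only, so accents (and case) are ignored.
my $c = Unicode::Collate->new(normalization => undef, level => 1);
my @g = qw(Gursu Gürsu Gursü Gürsü);
# Test every name against every name as a collation-aware substring.
for my $o (@g) {
    for my $i (@g) {
        # In scalar context, index() returns -1 when $i does not occur in $o.
        say "$i matches $o" if -1 != $c->index($o, $i, 0);
    }
}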
__END__
Gursu matches Gursu
Gürsu matches Gursu
Gursü matches Gursu
Gürsü matches Gursu
Gursu matches Gürsu
Gürsu matches Gürsu
Gursü matches Gürsu
Gürsü matches Gürsu
Gursu matches Gursü
Gürsu matches Gursü
Gursü matches Gursü
Gürsü matches Gursü
Gursu matches Gürsü
Gürsu matches Gürsü
Gursü matches Gürsü
Gürsü matches Gürsü
</code></pre>
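<p>For the Python side, here is a rough standard-library sketch of the same idea. Note that it is <em>not</em> UTS #10 collation: it swaps in a simpler technique (NFD decomposition, dropping combining marks, then case folding) that happens to give the same answers for these four names. The helper names <code>primary_key</code> and <code>loosely_contains</code> are made up for illustration. For real collation-based matching you would still want ICU (for example via PyICU) or the database's collation support.</p>

```python
import unicodedata


def primary_key(text):
    """Approximate a 'level 1' key: base letters only, case-insensitive.

    Assumption: stripping combining marks after NFD decomposition is good
    enough for the data at hand. This is NOT the Unicode Collation Algorithm.
    """
    decomposed = unicodedata.normalize("NFD", text)
    # combining() is non-zero for combining marks such as U+0308 (diaeresis).
    base_only = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base_only.casefold()


def loosely_contains(haystack, needle):
    """Return True if needle occurs in haystack, ignoring accents and case."""
    return primary_key(needle) in primary_key(haystack)


names = ["Gursu", "Gürsu", "Gursü", "Gürsü"]

# Same nested comparison as the Perl prototype: every name against every name.
matches = [(i, o) for o in names for i in names if loosely_contains(o, i)]

for i, o in matches:
    print(f"{i} matches {o}")
```

<p>As with the Perl prototype, every one of the four spellings matches every other one. Letters that have no canonical decomposition (for example <code>ø</code> or <code>ı</code>) are left untouched by this approach, which is one reason a proper collator is preferable.</p>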