<p>Try this:</p>
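<pre><code>-- Load the comma-separated address file with an explicit schema
A = LOAD 'addr_input/addr.dat' USING PigStorage(',') AS (A:chararray, B:chararray, C:chararray, ID:chararray, ID_TYPE:chararray);
-- Drop exact duplicate rows
B = DISTINCT A;
-- Group the remaining rows on the first three fields
Z = GROUP B BY (A, B, C);
DESCRIBE Z;
-- Unpack each group's bag back into one row per record
O = FOREACH Z { f1 = FOREACH B GENERATE $0, $1, $2, $3, $4; GENERATE FLATTEN(f1); }
DUMP O;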
</code></pre>
<p>For the input you provided, this is the output:</p>
<pre><code>(aa,bb,cc,1,zip)
(aa,bb,cc,2,street)
(lll,ccc,ddd,6,city)
(lll,ccc,xxx,7,country)
(mmm,nnn,cc,3,county)
(mmm,nnn,cc,4,zip)
(mmm,nnn,cc,5,state)
</code></pre>
<p>Is this what you are looking for?</p>
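<p>If it helps to see what the script does step by step, here is a rough Python sketch of the same DISTINCT, GROUP and FLATTEN sequence. It is only an illustration, not Pig itself: the <code>rows</code> list is made-up sample data (the output rows above plus one duplicate) standing in for <code>addr.dat</code>, and <code>distinct_group_flatten</code> is a hypothetical helper name. The rows are sorted only to make the result deterministic; Pig does not promise any order inside a bag.</p>

```python
from itertools import groupby

# Hypothetical sample input: the output rows from above plus one duplicate.
# The real contents of addr.dat are not shown, so this is an assumption.
rows = [
    ("mmm", "nnn", "cc", "3", "county"),
    ("aa", "bb", "cc", "1", "zip"),
    ("lll", "ccc", "xxx", "7", "country"),
    ("aa", "bb", "cc", "2", "street"),
    ("mmm", "nnn", "cc", "5", "state"),
    ("aa", "bb", "cc", "1", "zip"),  # exact duplicate, removed by DISTINCT
    ("lll", "ccc", "ddd", "6", "city"),
    ("mmm", "nnn", "cc", "4", "zip"),
]


def group_key(row):
    """Return the (A, B, C) fields used as the grouping key."""
    return row[:3]


def distinct_group_flatten(records):
    """Mimic B = DISTINCT A; Z = GROUP B BY (A, B, C); O = FLATTEN of each bag."""
    # DISTINCT: a set keeps one copy of each identical row.
    distinct = sorted(set(records))
    # GROUP: groupby needs its input sorted by the grouping key.
    grouped = groupby(sorted(distinct, key=group_key), key=group_key)
    # FLATTEN: emit every row of every group's bag as its own record.
    return [row for _, bag in grouped for row in bag]


result = distinct_group_flatten(rows)
for row in result:
    print(row)
```

<p>With this sample data the sketch prints the same seven rows as the Pig output. It also shows that grouping and then flattening returns the distinct rows themselves, just arranged so that rows sharing (A, B, C) sit next to each other.</p>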