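Retrieving HTML encoded in KOI8-R with Java
I want to retrieve some HTML that is encoded in KOI8-R. How can I retrieve it without corrupting the characters?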
import java.io.*;
import java.net.URL;
import java.net.URLConnection;

public class htmlget {
    public static void main(String[] args) throws Exception {
        String test = "http://koi8.pp.ru/";
        URL website = new URL(test);
        URLConnection yc = website.openConnection();
        StringBuilder fileData = new StringBuilder(1000);
        BufferedReader in = new BufferedReader(
                new InputStreamReader(
                        yc.getInputStream(), "KOI8_R"));
        char[] buf = new char[1024];
        int numRead = 0;
        while ((numRead = in.read(buf)) != -1) {
            fileData.append(buf, 0, numRead);
        }
        in.close();
        String text = fileData.toString();
        BufferedWriter out = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream("foo.txt"), "KOI8_R"));
        out.write(text);
        OutputStreamWriter wrt = new OutputStreamWriter(System.out, "KOI8_R");
        wrt.write(text);
        wrt.close();
        out.close();
    }
}
Both the console and the file show the Russian characters as “ÓÅÏÎÎÎΔ.
# Answer 1
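
Characters such as “ÓÅÏÎ” are what KOI8-R bytes look like when a viewer interprets them as ISO-8859-1, so one plausible explanation (an assumption, not something confirmed in the question) is that the page is decoded correctly but then written back out as KOI8-R to a console and an editor that expect a different encoding. The sketch below decodes with KOI8-R and re-encodes as UTF-8. The class name `Koi8Demo`, the helpers `readKoi8` and `misreadAsLatin1`, and the file name `foo-utf8.txt` are hypothetical, and an in-memory byte array stands in for `yc.getInputStream()` so that the example runs without network access.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

class Koi8Demo {
    // Charset the page is served in.
    static final Charset KOI8_R = Charset.forName("KOI8-R");

    // Hypothetical helper: decode a KOI8-R byte stream into a Java String.
    static String readKoi8(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, KOI8_R)) {
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    // Hypothetical helper: what a viewer shows if it treats KOI8-R bytes as ISO-8859-1.
    static String misreadAsLatin1(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the bytes of the page: a Russian word encoded as KOI8-R.
        byte[] raw = "\u041f\u0440\u0438\u0432\u0435\u0442".getBytes(KOI8_R);

        // Decode once, using the encoding the server actually uses.
        String text = readKoi8(new ByteArrayInputStream(raw));

        // Re-encode as UTF-8 for the file, assuming the editor opens it as UTF-8.
        Files.write(Paths.get("foo-utf8.txt"), text.getBytes(StandardCharsets.UTF_8));

        // Print as UTF-8, assuming the console is configured for UTF-8.
        PrintStream console = new PrintStream(System.out, true, "UTF-8");
        console.println(text);

        // For comparison: the same bytes misread as ISO-8859-1 give mojibake
        // of the same kind as in the question.
        console.println(misreadAsLatin1(raw));
    }
}
```

With a real connection, `yc.getInputStream()` would be passed to `readKoi8` in place of the byte array. If the console or editor is set to something other than UTF-8, the output charset should match that setting instead.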