Java: retrieving HTML encoded in KOI8-R

I want to retrieve some HTML that is encoded in KOI8_R. How can I retrieve it without corrupting the characters?

import java.io.*;
import java.net.URL;
import java.net.URLConnection;

public class htmlget {

    public static void main(String[] args) throws Exception {
        String test = "http://koi8.pp.ru/";
        URL website = new URL(test);
        URLConnection yc = website.openConnection();
        StringBuilder fileData = new StringBuilder(1000);
        BufferedReader in = new BufferedReader(
                new InputStreamReader(yc.getInputStream(), "KOI8_R"));

        char[] buf = new char[1024];
        int numRead = 0;
        while ((numRead = in.read(buf)) != -1) {
            fileData.append(buf, 0, numRead);
        }
        in.close();

        String text = fileData.toString();
        BufferedWriter out = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream("foo.txt"), "KOI8_R"));
        out.write(text);
        OutputStreamWriter wrt = new OutputStreamWriter(System.out, "KOI8_R");
        wrt.write(text);
        wrt.close();
        out.close();
    }

}

Both the console and the file show the Russian characters as "ÓÅÏÎÎÎÎ".


1 answer

  1. # Answer 1

    (...)
            in.close();
    
            String text = new String(fileData.toString().getBytes(), "KOI8_R");
            BufferedWriter out = new BufferedWriter(new OutputStreamWriter(
                    new FileOutputStream("foo.txt"), "KOI8_R"));
            out.write(text);
    (...)
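
A side note on the symptom: "ÓÅÏÎÎÎÎ" is what KOI8-R bytes look like when something displays them as ISO-8859-1 / Windows-1252 (0xD3, 0xC5, 0xCF, 0xCE are the Cyrillic letters с, е, о, н in KOI8-R). So one possibility is that the page is being decoded correctly and the damage only appears because the console or editor does not expect KOI8-R output. Below is a minimal sketch of that idea, not a confirmed fix for this exact setup: decode the input as KOI8-R, then write it out in an encoding the viewer understands (UTF-8 is assumed here). The class name `Koi8Transcode` and the helpers `readAll` / `writeAll` are made up for illustration, and the demo uses in-memory bytes instead of the real URL so it runs without a network connection.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Koi8Transcode {

    // "KOI8-R" is the canonical charset name; "KOI8_R" is accepted as an alias.
    static final Charset KOI8_R = Charset.forName("KOI8-R");

    // Hypothetical helper: decode the whole stream with the given source charset.
    static String readAll(InputStream in, Charset source) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, source)) {
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    // Hypothetical helper: encode the text with the target charset.
    // Note: this closes the stream it is given.
    static void writeAll(String text, OutputStream out, Charset target) throws IOException {
        try (Writer writer = new OutputStreamWriter(out, target)) {
            writer.write(text);
        }
    }

    public static void main(String[] args) throws IOException {
        // The Russian word for "test" encoded in KOI8-R.
        byte[] koi8Bytes = {(byte) 0xD4, (byte) 0xC5, (byte) 0xD3, (byte) 0xD4};

        // Correct decoding: the bytes are interpreted as KOI8-R.
        String text = readAll(new ByteArrayInputStream(koi8Bytes), KOI8_R);
        System.out.println(text.equals("\u0442\u0435\u0441\u0442")); // true

        // What a Latin-1 viewer makes of the same bytes (the mojibake from the question).
        String misread = new String(koi8Bytes, StandardCharsets.ISO_8859_1);
        System.out.println(misread.equals("\u00D4\u00C5\u00D3\u00D4")); // true

        // Re-encode for a viewer that expects UTF-8 (assumption about the target).
        ByteArrayOutputStream utf8 = new ByteArrayOutputStream();
        writeAll(text, utf8, StandardCharsets.UTF_8);
        System.out.println(utf8.size()); // 8, two bytes per Cyrillic letter
    }
}
```

With the code from the question, this would mean passing `yc.getInputStream()` to `readAll` and a `FileOutputStream("foo.txt")` to `writeAll`. Which target charset is right depends on what the console and the editor actually use, so UTF-8 is only a guess here.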