Tags: java, file I/O, buffering, random-access file

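RandomAccessFile is very slow for random access to a file. You often read about implementing a buffering layer on top of it, but I could not find any code online that actually does this.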

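So my question is: do you know of any open-source implementation of such a class? Could you share a pointer to it, or share your own implementation?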

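It would be great if this question turned into a collection of useful links and code for this problem. I am sure many people share it, and Sun never addressed it properly.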

Please do not point me to memory mapping, since the files can be much larger than Integer.MAX_VALUE.


6 answers

  1. # Answer 1

    You can make a BufferedInputStream from a RandomAccessFile with code like this:

     RandomAccessFile raf = ...
     FileInputStream fis = new FileInputStream(raf.getFD());
     BufferedInputStream bis = new BufferedInputStream(fis);
    

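    Some things to note:

    1. Closing the FileInputStream will close the RandomAccessFile, and vice versa.
    2. The RandomAccessFile and the FileInputStream point to the same position, so reading from the FileInputStream advances the RandomAccessFile's file pointer, and vice versa.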
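
    The shared file pointer from note 2 can be observed directly. The following is a minimal sketch, not part of the original answer; the class name, method name, and temporary file are made up for illustration:

    ```java
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.RandomAccessFile;

    // Hypothetical demo class; the names are illustrative only.
    class SharedPointerDemo {
        /** Returns {bytes read through the stream, resulting RandomAccessFile pointer}. */
        static long[] readThenCheckPointer() throws IOException {
            File tmp = File.createTempFile("shared-pointer", ".bin");
            tmp.deleteOnExit();
            try (RandomAccessFile raf = new RandomAccessFile(tmp, "rw")) {
                raf.write(new byte[16]);   // give the file some content
                raf.seek(0);
                // Both objects wrap the same file descriptor.
                FileInputStream fis = new FileInputStream(raf.getFD());
                int n = fis.read(new byte[5]);   // unbuffered read through the stream
                // The RandomAccessFile pointer has moved by the same amount.
                return new long[] { n, raf.getFilePointer() };
            }
        }
    }
    ```

    Note that once a BufferedInputStream is layered on top, it reads ahead, so the RandomAccessFile pointer will typically be further along than the number of bytes you have consumed.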

    You probably want to use it like this:

    RandomAccessFile raf = ...
    FileInputStream fis = new FileInputStream(raf.getFD());
    BufferedInputStream bis = new BufferedInputStream(fis);
    
    //do some reads with buffer
    bis.read(...);
    bis.read(...);
    
    //seek to a different section of the file, so discard the previous buffer
    raf.seek(...);
    bis = new BufferedInputStream(fis);
    bis.read(...);
    bis.read(...);
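
    For reference, here is a self-contained variant of the same pattern that can be run as-is. It is only a sketch: the helper `readAt`, the class name, and the 100-byte temporary file are assumptions made for the example.

    ```java
    import java.io.BufferedInputStream;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.util.Arrays;

    // Hypothetical demo class; readAt is an illustrative helper, not a library method.
    class SeekThenBufferDemo {
        /** Seeks to offset, then reads up to len bytes through a fresh buffer. */
        static byte[] readAt(RandomAccessFile raf, FileInputStream fis,
                             long offset, int len) throws IOException {
            raf.seek(offset);
            // The old buffer (if any) holds bytes from the previous position, so start a new one.
            BufferedInputStream bis = new BufferedInputStream(fis);
            byte[] out = new byte[len];
            int n = 0;
            while (n < len) {
                int r = bis.read(out, n, len - n);
                if (r < 0) break;          // end of file
                n += r;
            }
            return Arrays.copyOf(out, n);
        }

        static byte[] demo() throws IOException {
            File tmp = File.createTempFile("raf-demo", ".bin");
            tmp.deleteOnExit();
            try (RandomAccessFile raf = new RandomAccessFile(tmp, "rw")) {
                for (int i = 0; i < 100; i++) raf.write(i);   // bytes 0..99
                FileInputStream fis = new FileInputStream(raf.getFD());
                readAt(raf, fis, 0, 4);          // first section
                return readAt(raf, fis, 50, 3);  // jump elsewhere: bytes 50, 51, 52
            }
        }
    }
    ```

    Creating a new BufferedInputStream per seek is cheap compared with the disk access, but it only pays off when several reads follow each seek.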
    
  2. # Answer 2

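    If you are running on a 64-bit machine, memory-mapped files are your best approach. Simply map the entire file into an array of equal-sized buffers, then pick a buffer for each record as needed (that is, edalorzo's answer, except that you want overlapping buffers so that no record spans a buffer boundary).

    If you are running on a 32-bit JVM, you are stuck with RandomAccessFile. However, you can use it to read a byte[] that contains an entire record, then use a ByteBuffer to retrieve individual values from that array. In the worst case you need two file accesses: one to retrieve the position and size of the record, and one to retrieve the record itself.

    Be aware, however, that creating lots of byte[] arrays can start to put pressure on the garbage collector, and that you will remain IO-bound if you jump all over the file.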
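
    The two-access approach could look roughly like the sketch below. The file layout is purely an assumption for illustration: an index entry made of a long position and an int size, and a record made of an int id followed by a double value. The class and method names are hypothetical.

    ```java
    import java.io.File;
    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.nio.ByteBuffer;

    // Hypothetical record reader; the layout is assumed, not taken from the question.
    class RecordReader {
        /** Reads the index entry at indexOffset, then the record it points to. */
        static ByteBuffer readRecord(RandomAccessFile raf, long indexOffset) throws IOException {
            byte[] entry = new byte[12];          // long position + int size
            raf.seek(indexOffset);
            raf.readFully(entry);                 // file access 1: the index entry
            ByteBuffer idx = ByteBuffer.wrap(entry);
            long position = idx.getLong();
            int size = idx.getInt();

            byte[] record = new byte[size];
            raf.seek(position);
            raf.readFully(record);                // file access 2: the whole record
            return ByteBuffer.wrap(record);       // individual values are decoded in memory
        }

        static double demo() throws IOException {
            File tmp = File.createTempFile("records", ".bin");
            tmp.deleteOnExit();
            try (RandomAccessFile raf = new RandomAccessFile(tmp, "rw")) {
                raf.writeLong(12L);    // index entry: record starts at byte 12...
                raf.writeInt(12);      // ...and is 12 bytes long
                raf.writeInt(7);       // record: id
                raf.writeDouble(2.5);  // record: value
                ByteBuffer rec = readRecord(raf, 0);
                int id = rec.getInt();
                double value = rec.getDouble();
                return id + value;
            }
        }
    }
    ```

    Both RandomAccessFile and ByteBuffer default to big-endian byte order, so no conversion is needed here. Reusing one byte[] per thread instead of allocating per record is one way to reduce the garbage-collector pressure mentioned above.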

  3. # Answer 3

    import java.io.File;
    import java.io.FileNotFoundException;
    import java.io.IOException;
    import java.io.RandomAccessFile;
    
    /**
     * Adds caching to a random access file.
     * 
     * Rather than writing directly to disk or to the system, which seems to be
     * what RandomAccessFile/FileChannel do, add a small buffer and read/write from
     * it when possible. A single buffer is created, which means reads or writes near
     * each other will be sped up. Reads/writes that are not within the cached block
     * will not be sped up.
     * 
     *
     */
    public class BufferedRandomAccessFile implements AutoCloseable {
    
        private static final int DEFAULT_BUFSIZE = 4096;
    
        /**
         * The wrapped random access file, we will hold a cache around it.
         */
        private final RandomAccessFile raf;
    
        /**
         * The size of the buffer
         */
        private final int bufsize;
    
        /**
         * The buffer.
         */
        private final byte buf[];
    
    
        /**
         * Current position in the file.
         */
        private long pos = 0;
    
        /**
         * When the buffer has been read, this tells us where in the file the buffer
         * starts at.
         */
        private long bufBlockStart = Long.MAX_VALUE;
    
    
        // Must be updated on write to the file
        private long actualFileLength = -1;
    
        boolean changeMadeToBuffer = false;
    
        // Must be updated as we write to the buffer.
        private long virtualFileLength = -1;
    
        public BufferedRandomAccessFile(File name, String mode) throws FileNotFoundException {
            this(name, mode, DEFAULT_BUFSIZE);
        }
    
        /**
         * 
         * @param file
         * @param mode how to open the random access file.
         * @param b size of the buffer
         * @throws FileNotFoundException
         */
        public BufferedRandomAccessFile(File file, String mode, int b) throws FileNotFoundException {
            this(new RandomAccessFile(file, mode), b);
        }
    
        public BufferedRandomAccessFile(RandomAccessFile raf) throws FileNotFoundException {
            this(raf, DEFAULT_BUFSIZE);
        }
    
        public BufferedRandomAccessFile(RandomAccessFile raf, int b) {
            this.raf = raf;
            try {
                this.actualFileLength = raf.length();
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
            this.virtualFileLength = actualFileLength;
            this.bufsize = b;
            this.buf = new byte[bufsize];
        }
    
        /**
         * Sets the position of the byte at which the next read/write should occur.
         * 
         * @param pos
         * @throws IOException
         */
        public void seek(long pos) throws IOException{
            this.pos = pos;
        }
    
        /**
         * Sets the length of the file.
         */
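        public void setLength(long fileLength) throws IOException {
            // Flush pending writes first, then track the new length whether the
            // file shrinks or grows (the original only handled shrinking).
            writeBufferToDisk();
            this.raf.setLength(fileLength);
            this.actualFileLength = fileLength;
            this.virtualFileLength = fileLength;
            // Invalidate the buffer so stale bytes are not served after the resize.
            bufBlockStart = Long.MAX_VALUE;
        }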
    
        /**
         * Writes the entire buffer to disk, if needed.
         */
        private void writeBufferToDisk() throws IOException {
            if(!changeMadeToBuffer) return;
            int amountOfBufferToWrite = (int) Math.min((long) bufsize, virtualFileLength - bufBlockStart);
            if(amountOfBufferToWrite > 0) {
                raf.seek(bufBlockStart);
                raf.write(buf, 0, amountOfBufferToWrite);
                this.actualFileLength = virtualFileLength;
            }
            changeMadeToBuffer = false;
        }
    
        /**
         * Flush the buffer to disk and force a sync.
         */
        public void flush() throws IOException {
            writeBufferToDisk();
            this.raf.getChannel().force(false);
        }
    
        /**
         * Based on pos, ensures that the buffer is one that contains pos
         * 
         * After this call it will be safe to write to the buffer to update the byte at pos,
         * if this returns true reading of the byte at pos will be valid as a previous write
         * or set length has caused the file to be large enough to have a byte at pos.
         * 
         * @return true if the buffer contains any data that may be read. Data may be read as long as
         * a write or a setLength call has made the file longer than the current position.
         */
        private boolean readyBuffer() throws IOException {
            boolean isPosOutSideOfBuffer = pos < bufBlockStart || bufBlockStart + bufsize <= pos;
    
            if (isPosOutSideOfBuffer) {
    
                writeBufferToDisk();
    
                // The buffer is always positioned to start at a multiple of a bufsize offset.
                // e.g. for a buf size of 4 the starting positions of buffers can be at 0, 4, 8, 12..
                // Work out where the buffer block should start for the given position. 
                long bufferBlockStart = (pos / bufsize) * bufsize;
    
                assert bufferBlockStart >= 0;
    
                // If the file is large enough, read it into the buffer.
                // if the file is not large enough we have nothing to read into the buffer,
                // In both cases the buffer will be ready to have writes made to it.
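                if(bufferBlockStart < actualFileLength) {
                    raf.seek(bufferBlockStart);
                    // A single read() may return fewer bytes than requested, so loop
                    // until the buffer is full or the end of the file is reached.
                    int filled = 0;
                    while(filled < bufsize) {
                        int n = raf.read(buf, filled, bufsize - filled);
                        if(n < 0) break;
                        filled += n;
                    }
                }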
    
                bufBlockStart = bufferBlockStart;
            }
    
            return pos < virtualFileLength;
        }
    
        /**
         * Reads a byte from the file, returning an integer of 0-255, or -1 if it has reached the end of the file.
         * 
         * @return
         * @throws IOException 
         */
        public int read() throws IOException {
            if(readyBuffer() == false) {
                return -1;
            }
            try {
                return (buf[(int)(pos - bufBlockStart)]) & 0x000000ff ; 
            } finally {
                pos++;
            }
        }
    
        /**
         * Write a single byte to the file.
         * 
         * @param b
         * @throws IOException
         */
        public void write(byte b) throws IOException {
            readyBuffer(); // ignore result we don't care.
            buf[(int)(pos - bufBlockStart)] = b;
            changeMadeToBuffer = true;
            pos++;
            if(pos > virtualFileLength) {
                virtualFileLength = pos;
            }
        }
    
        /**
         * Write all given bytes to the random access file at the current position.
         * 
         */
        public void write(byte[] bytes) throws IOException {
            int writen = 0;
            int bytesToWrite = bytes.length;
            {
                readyBuffer();
                int startPositionInBuffer = (int)(pos - bufBlockStart);
                int lengthToWriteToBuffer = Math.min(bytesToWrite - writen, bufsize - startPositionInBuffer);
                assert  startPositionInBuffer + lengthToWriteToBuffer <= bufsize;
    
                System.arraycopy(bytes, writen,
                                buf, startPositionInBuffer,
                                lengthToWriteToBuffer);
                pos += lengthToWriteToBuffer;
                if(pos > virtualFileLength) {
                    virtualFileLength = pos;
                }
                writen += lengthToWriteToBuffer;
                this.changeMadeToBuffer = true;
            }
    
            // Just write the rest to the random access file
            if(writen < bytesToWrite) {
                writeBufferToDisk();
                int toWrite = bytesToWrite - writen;
                raf.write(bytes, writen, toWrite);
                pos += toWrite;
                if(pos > virtualFileLength) {
                    virtualFileLength = pos;
                    actualFileLength = virtualFileLength;
                }
            }
        }
    
        /**
         * Read up to bytes.length bytes into the given array.
         * 
         * @return the number of bytes read.
         */
        public int read(byte[] bytes) throws IOException {
            int read = 0;
            int bytesToRead = bytes.length;
            while(read < bytesToRead) {
    
                //First see if we need to fill the cache
                if(readyBuffer() == false) {
                    //No more to read;
                    return read;
                }
    
                //Now read as much as we can (or need from cache and place it
                //in the given byte[]
                int startPositionInBuffer = (int)(pos - bufBlockStart);
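                // Never copy past the logical end of the file, even if the buffer extends beyond it.
                int lengthToReadFromBuffer = (int) Math.min(
                        Math.min(bytesToRead - read, bufsize - startPositionInBuffer),
                        virtualFileLength - pos);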
    
                System.arraycopy(buf, startPositionInBuffer, bytes, read, lengthToReadFromBuffer);
    
                pos += lengthToReadFromBuffer;
                read += lengthToReadFromBuffer;
            }
    
            return read;
        }
    
        public void close() throws IOException {
            try {
                this.writeBufferToDisk();
            } finally {
                raf.close();
            }
        }
    
        /**
         * Gets the length of the file.
         * 
         * @return
         * @throws IOException
         */
        public long length() throws IOException{
            return virtualFileLength;
        }
    
    }
    
  4. # Answer 6

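    I don't see any reason not to use java.nio.MappedByteBuffer, even if the files are bigger than Integer.MAX_VALUE.

    Evidently you are not allowed to define a single MappedByteBuffer for the whole file, but you can have several MappedByteBuffers accessing different regions of the file.

    The position and size parameters of FileChannel.map are of type long, which means you can pass values larger than Integer.MAX_VALUE. The only thing to watch is that the size of each buffer cannot exceed Integer.MAX_VALUE.

    Therefore, you could define several maps like this: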

    buffer[0] = fileChannel.map(FileChannel.MapMode.READ_WRITE,0,2147483647L);
    buffer[1] = fileChannel.map(FileChannel.MapMode.READ_WRITE,2147483647L, Integer.MAX_VALUE);
    buffer[2] = fileChannel.map(FileChannel.MapMode.READ_WRITE, 4294967294L, Integer.MAX_VALUE);
    ...
    

    In summary, the size cannot be bigger than Integer.MAX_VALUE, but the start position can be anywhere in the file.
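
    To make the idea concrete, here is a minimal sketch of a wrapper that maps a file in fixed-size chunks and translates a long position into a chunk and an offset. The class name and its methods are hypothetical, and the tiny chunk size in the demo is only there to exercise the boundary logic; in practice you would use something close to Integer.MAX_VALUE.

    ```java
    import java.io.File;
    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;

    // Hypothetical wrapper; not an existing library class.
    class ChunkedMappedFile {
        private final MappedByteBuffer[] chunks;
        private final long chunkSize;   // must not exceed Integer.MAX_VALUE
        private final long length;

        ChunkedMappedFile(FileChannel ch, long chunkSize) throws IOException {
            this.chunkSize = chunkSize;
            this.length = ch.size();
            int n = (int) ((length + chunkSize - 1) / chunkSize);   // round up
            chunks = new MappedByteBuffer[n];
            for (int i = 0; i < n; i++) {
                long start = i * chunkSize;
                long size = Math.min(chunkSize, length - start);    // last chunk may be shorter
                chunks[i] = ch.map(FileChannel.MapMode.READ_ONLY, start, size);
            }
        }

        /** Reads the byte at an absolute file position. */
        byte get(long pos) {
            if (pos < 0 || pos >= length) {
                throw new IndexOutOfBoundsException("pos=" + pos);
            }
            return chunks[(int) (pos / chunkSize)].get((int) (pos % chunkSize));
        }

        static byte demo() throws IOException {
            File tmp = File.createTempFile("mapped", ".bin");
            tmp.deleteOnExit();
            try (RandomAccessFile raf = new RandomAccessFile(tmp, "rw")) {
                for (int i = 0; i < 10; i++) raf.write(i);   // bytes 0..9
                ChunkedMappedFile f = new ChunkedMappedFile(raf.getChannel(), 4);
                return f.get(9);   // lives in the third chunk, offset 1
            }
        }
    }
    ```

    This sketch only handles single bytes. Multi-byte values that straddle a chunk boundary need extra handling, for example by making the chunks overlap by at least the largest record size.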

    In the book Java NIO, the author Ron Hitchens states:

    Accessing a file through the memory-mapping mechanism can be far more efficient than reading or writing data by conventional means, even when using channels. No explicit system calls need to be made, which can be time-consuming. More importantly, the virtual memory system of the operating system automatically caches memory pages. These pages will be cached using system memory and will not consume space from the JVM's memory heap.

    Once a memory page has been made valid (brought in from disk), it can be accessed again at full hardware speed without the need to make another system call to get the data. Large, structured files that contain indexes or other sections that are referenced or updated frequently can benefit tremendously from memory mapping. When combined with file locking to protect critical sections and control transactional atomicity, you begin to see how memory mapped buffers can be put to good use.

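    I really doubt that you will find a third-party API that does better than this. Perhaps you can find an API built on top of this architecture that simplifies the work.

    Don't you think this approach ought to work for you?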