mmap: Memory-mapped file support


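Memory-mapped file objects behave like both bytearray and file objects. You can use mmap objects in most places where a bytearray is expected; for example, you can use the re module to search through a memory-mapped file. You can also change a single byte by doing obj[index] = 97, or change a subsequence by assigning to a slice: obj[i1:i2] = b'…'. You can also read and write data starting at the current file position, and use seek() to move to a different position.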
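The bytearray-like behaviour described above can be sketched as follows. This is a minimal illustration that assumes an anonymous mapping (so no file is needed); the buffer contents and the search pattern are made up for the example.

```python
import mmap
import re

# Anonymous 32-byte mapping (no backing file); it starts out as zero bytes.
mm = mmap.mmap(-1, 32)

# Slice assignment: the replacement must have exactly the size of the slice.
mm[:16] = b"id=42;name=mmap;"

# The re module can search the mapping directly with a bytes pattern.
match = re.search(rb"name=(\w+);", mm)
name = match.group(1)

# Single-byte assignment takes an integer: 73 is ord("I").
mm[0] = 73
first_three = mm[:3]

mm.close()
```

Here name ends up as b"mmap" and first_three as b"Id=".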

A memory-mapped file is created by the mmap constructor, which is different on Unix and on Windows. In either case you must provide a file descriptor for a file opened for update. If you wish to map an existing Python file object, use its fileno() method to obtain the correct value for the fileno parameter. Otherwise, you can open the file using the os.open() function, which returns a file descriptor directly (the file still needs to be closed when done).
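As a rough sketch of the os.open() route, the following maps a scratch file through a raw file descriptor. The file name example.bin and its contents are hypothetical, and the temporary directory is only there to keep the example self-contained.

```python
import mmap
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.bin")  # hypothetical file name

    # os.open() returns a file descriptor directly; open it for update.
    fd = os.open(path, os.O_RDWR | os.O_CREAT)
    try:
        os.write(fd, b"0123456789")
        mm = mmap.mmap(fd, 0)      # length 0 maps the whole file
        head = mm[:4]
        mm.close()
    finally:
        os.close(fd)               # the descriptor must still be closed
```

Closing the mapping does not close the descriptor, which is why os.close() is called separately.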

Note

If you want to create a memory-mapping for a writable, buffered file, you should flush() the file first. This is necessary to ensure that local modifications to the buffers are actually available to the mapping.
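The note above can be illustrated with a small sketch that writes through a buffered file object and flushes it before mapping. The file name buffered.bin is hypothetical.

```python
import mmap
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "buffered.bin")  # hypothetical file name
    with open(path, "w+b") as f:
        f.write(b"buffered data")   # may still sit in Python's own buffer
        f.flush()                   # hand the bytes to the OS before mapping
        with mmap.mmap(f.fileno(), 0) as mm:
            mapped = mm[:]
```

Without the flush() call the file would most likely still be empty when it is mapped, and mapping an empty file with length 0 fails.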

For both the Unix and Windows versions of the constructor, access may be specified as an optional keyword parameter. access accepts one of four values: ACCESS_READ, ACCESS_WRITE, or ACCESS_COPY to specify read-only, write-through or copy-on-write memory respectively, or ACCESS_DEFAULT to defer to prot. access can be used on both Unix and Windows. If access is not specified, Windows mmap returns a write-through mapping. The initial memory values for all three access types are taken from the specified file. Assignment to an ACCESS_READ memory map raises a TypeError exception. Assignment to an ACCESS_WRITE memory map affects both memory and the underlying file. Assignment to an ACCESS_COPY memory map affects memory but does not update the underlying file.
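The difference between the access types can be sketched as follows. This is an illustrative example only: the scratch file created with tempfile and its contents are assumptions, not part of the module.

```python
import mmap
import os
import tempfile

# Create a small scratch file to map (contents are illustrative).
fd, path = tempfile.mkstemp()
os.write(fd, b"Hello Python!\n")
os.close(fd)

with open(path, "r+b") as f:
    # ACCESS_READ: read-only mapping, so assignment raises TypeError.
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as ro:
        try:
            ro[0] = 104
            read_only_error = None
        except TypeError as exc:
            read_only_error = exc

    # ACCESS_COPY: changes are visible in memory but never reach the file.
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY) as cow:
        cow[:5] = b"HELLO"
        in_memory = cow[:5]

with open(path, "rb") as f:
    on_disk = f.read()   # still the original contents

os.remove(path)
```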

Changed in version 3.7: Added ACCESS_DEFAULT constant.

To map anonymous memory, -1 should be passed as the fileno along with the length.

  • class mmap.mmap(fileno, length, tagname=None, access=ACCESS_DEFAULT[, offset])
  • (Windows version) Maps length bytes from the file specified by the file handle fileno, and creates a mmap object. If length is larger than the current size of the file, the file is extended to contain length bytes. If length is 0, the maximum length of the map is the current size of the file, except that if the file is empty Windows raises an exception (you cannot create an empty mapping on Windows).

tagname, if specified and not None, is a string giving a tag name for the mapping. Windows allows you to have many different mappings against the same file. If you specify the name of an existing tag, that tag is opened, otherwise a new tag of this name is created. If this parameter is omitted or None, the mapping is created without a name. Avoiding the use of the tag parameter will assist in keeping your code portable between Unix and Windows.

offset may be specified as a non-negative integer offset. mmap references will be relative to the offset from the beginning of the file. offset defaults to 0. offset must be a multiple of the ALLOCATIONGRANULARITY.

  • class mmap.mmap(fileno, length, flags=MAP_SHARED, prot=PROT_WRITE|PROT_READ, access=ACCESS_DEFAULT[, offset])
  • (Unix version) Maps length bytes from the file specified by the file descriptor fileno, and returns a mmap object. If length is 0, the maximum length of the map will be the current size of the file when mmap is called.

flags specifies the nature of the mapping. MAP_PRIVATE creates a private copy-on-write mapping, so changes to the contents of the mmap object will be private to this process, and MAP_SHARED creates a mapping that's shared with all other processes mapping the same areas of the file. The default value is MAP_SHARED.

prot, if specified, gives the desired memory protection; the two most useful values are PROT_READ and PROT_WRITE, to specify that the pages may be read or written. prot defaults to PROT_READ | PROT_WRITE.

access may be specified in lieu of flags and prot as an optional keyword parameter. It is an error to specify flags or prot together with access. See the description of access above for information on how to use this parameter.

offset may be specified as a non-negative integer offset. mmap references will be relative to the offset from the beginning of the file. offset defaults to 0. offset must be a multiple of ALLOCATIONGRANULARITY, which is equal to PAGESIZE on Unix systems.

To ensure validity of the created memory mapping the file specified by the descriptor fileno is internally automatically synchronized with physical backing store on Mac OS X and OpenVMS.
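The offset parameter can be sketched with a scratch file that is slightly larger than one allocation unit. The file layout (a run of "A" bytes followed by 16 "B" bytes) is invented for the example; index 0 of the mapping then refers to the byte at offset in the file.

```python
import mmap
import os
import tempfile

gran = mmap.ALLOCATIONGRANULARITY

# Scratch file: one allocation unit of b"A", then 16 bytes of b"B".
fd, path = tempfile.mkstemp()
os.write(fd, b"A" * gran + b"B" * 16)
os.close(fd)

with open(path, "r+b") as f:
    # Map 16 bytes starting at an offset that is a multiple of the granularity.
    with mmap.mmap(f.fileno(), 16, access=mmap.ACCESS_READ, offset=gran) as mm:
        window = mm[:]
        first_byte = mm[0]   # indexing is relative to the offset

os.remove(path)
```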

This example shows a simple way of using mmap:

    import mmap

    # write a simple example file
    with open("hello.txt", "wb") as f:
        f.write(b"Hello Python!\n")

    with open("hello.txt", "r+b") as f:
        # memory-map the file, size 0 means whole file
        mm = mmap.mmap(f.fileno(), 0)
        # read content via standard file methods
        print(mm.readline())  # prints b"Hello Python!\n"
        # read content via slice notation
        print(mm[:5])  # prints b"Hello"
        # update content using slice notation;
        # note that new content must have same size
        mm[6:] = b" world!\n"
        # ... and read again using standard file methods
        mm.seek(0)
        print(mm.readline())  # prints b"Hello  world!\n"
        # close the map
        mm.close()

mmap can also be used as a context manager in a with statement:

    import mmap

    with mmap.mmap(-1, 13) as mm:
        mm.write(b"Hello world!")

New in version 3.2: Context manager support.

The next example demonstrates how to create an anonymous map and exchange data between the parent and child processes:

    import mmap
    import os

    mm = mmap.mmap(-1, 13)
    mm.write(b"Hello world!")

    pid = os.fork()

    if pid == 0:  # In a child process
        mm.seek(0)
        print(mm.readline())

        mm.close()

Memory-mapped file objects support the following methods:

  • close()
  • Closes the mmap. Subsequent calls to other methods of the object will result in a ValueError exception being raised. This will not close the open file.

  • closed

  • True if the file is closed.

New in version 3.2.
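A short sketch of close() and closed, assuming an anonymous mapping for simplicity:

```python
import mmap

mm = mmap.mmap(-1, 16)
closed_before = mm.closed      # False while the mapping is open

mm.close()
closed_after = mm.closed       # True once close() has been called

# Other methods now raise ValueError.
try:
    mm.read(1)
    error = None
except ValueError as exc:
    error = exc
```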

  • find(sub[, start[, end]])
  • Returns the lowest index in the object where the subsequence sub is found, such that sub is contained in the range [start, end]. Optional arguments start and end are interpreted as in slice notation. Returns -1 on failure.

Changed in version 3.5: Writable bytes-like object is now accepted.

  • flush([offset[, size]])
  • Flushes changes made to the in-memory copy of a file back to disk. Without use of this call there is no guarantee that changes are written back before the object is destroyed. If offset and size are specified, only changes to the given range of bytes will be flushed to disk; otherwise, the whole extent of the mapping is flushed. offset must be a multiple of the PAGESIZE or ALLOCATIONGRANULARITY.

(Windows version) A nonzero value returned indicates success; zero indicates failure.

(Unix version) A zero value is returned to indicate success. An exception is raised when the call failed.
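A minimal sketch of flush() on a file-backed mapping follows. The scratch file is an assumption made for the example, and the return value is deliberately ignored because it differs between platforms.

```python
import mmap
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"Hello Python!\n")
os.close(fd)

with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 0) as mm:
        mm[:5] = b"HELLO"
        mm.flush()   # request write-back of the whole mapping

with open(path, "rb") as f:
    on_disk = f.read()

os.remove(path)
```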

  • move(dest, src, count)
  • Copy the count bytes starting at offset src to the destination index dest. If the mmap was created with ACCESS_READ, then calls to move will raise a TypeError exception.
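For illustration, move() can be used to duplicate a block inside an anonymous mapping; the buffer contents here are arbitrary.

```python
import mmap

mm = mmap.mmap(-1, 12)
mm[:] = b"abcdef------"

# Copy 6 bytes starting at offset 0 to the destination index 6.
mm.move(6, 0, 6)
result = mm[:]

mm.close()
```

After the call, result is b"abcdefabcdef".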

  • read([n])

  • Return a bytes object containing up to n bytes starting from the current file position. If the argument is omitted, None or negative, return all bytes from the current file position to the end of the mapping. The file position is updated to point after the bytes that were returned.

Changed in version 3.3: Argument can be omitted or None.

  • read_byte()
  • Returns a byte at the current file position as an integer, and advances the file position by 1.

  • readline()

  • Returns a single line, starting at the current file position and up to the next newline.
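The three read methods above share one file position, which the following sketch tries to make visible. It assumes an anonymous mapping filled with arbitrary sample bytes.

```python
import mmap

mm = mmap.mmap(-1, 12)
mm[:] = b"one\ntwo\nend!"

line = mm.readline()    # reads up to and including the first newline
byte = mm.read_byte()   # next byte as an integer: ord("t") == 116
chunk = mm.read(2)      # the following two bytes
rest = mm.read()        # everything up to the end of the mapping
at_end = mm.read()      # nothing left to read

mm.close()
```

The values are b"one\n", 116, b"wo", b"\nend!" and b"" respectively.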

  • resize(newsize)

  • Resizes the map and the underlying file, if any. If the mmap was created with ACCESS_READ or ACCESS_COPY, resizing the map will raise a TypeError exception.

  • rfind(sub[, start[, end]])

  • Returns the highest index in the object where the subsequence sub is found, such that sub is contained in the range [start, end]. Optional arguments start and end are interpreted as in slice notation. Returns -1 on failure.

Changed in version 3.5: Writable bytes-like object is now accepted.
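A small sketch contrasting rfind() with find() on the same data. The sample bytes are arbitrary, and the mapping is filled by slice assignment so that the file position stays at 0.

```python
import mmap

mm = mmap.mmap(-1, 16)
mm[:11] = b"abracadabra"

first = mm.find(b"abra")           # lowest index of the subsequence
last = mm.rfind(b"abra")           # highest index of the subsequence
after_first = mm.find(b"abra", 1)  # start searching at index 1
bounded = mm.find(b"abra", 0, 3)   # the match does not fit before index 3
missing = mm.rfind(b"xyz")         # not present at all

mm.close()
```

This gives 0, 7, 7, -1 and -1 respectively.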

  • seek(pos[, whence])
  • Set the file's current position. The whence argument is optional and defaults to os.SEEK_SET or 0 (absolute file positioning); other values are os.SEEK_CUR or 1 (seek relative to the current position) and os.SEEK_END or 2 (seek relative to the file's end).

  • size()

  • Return the length of the file, which can be larger than the size of the memory-mapped area.
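The difference between the file length and the mapped length can be sketched by mapping only the first few bytes of a scratch file. The file and its contents are assumptions made for the example.

```python
import mmap
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"0123456789")
os.close(fd)

with open(path, "r+b") as f:
    # Map only the first 4 bytes of the 10-byte file.
    with mmap.mmap(f.fileno(), 4) as mm:
        mapped_length = len(mm)   # size of the memory-mapped area
        file_length = mm.size()   # length of the underlying file

os.remove(path)
```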

  • tell()

  • Returns the current position of the file pointer.
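The three whence modes of seek() can be checked with tell(), as in this sketch over an anonymous mapping with arbitrary contents. The return value of seek() is not used here.

```python
import mmap
import os

mm = mmap.mmap(-1, 10)
mm[:] = b"0123456789"

mm.seek(3)                  # absolute positioning (os.SEEK_SET)
absolute = mm.tell()

mm.seek(2, os.SEEK_CUR)     # relative to the current position
relative = mm.tell()

mm.seek(-1, os.SEEK_END)    # relative to the end
last_byte = mm.read(1)

mm.close()
```

The positions are 3 and 5, and last_byte is b"9".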

  • write(bytes)

  • Write the bytes in bytes into memory at the current position of the file pointer and return the number of bytes written (never less than len(bytes), since if the write fails, a ValueError will be raised). The file position is updated to point after the bytes that were written. If the mmap was created with ACCESS_READ, then writing to it will raise a TypeError exception.

Changed in version 3.5: Writable bytes-like object is now accepted.

Changed in version 3.6: The number of bytes written is now returned.

  • write_byte(byte)
  • Write the integer byte into memory at the current position of the file pointer; the file position is advanced by 1. If the mmap was created with ACCESS_READ, then writing to it will raise a TypeError exception.
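A brief sketch of write() and write_byte() on an anonymous 8-byte mapping. It assumes Python 3.6 or later, where write() returns the number of bytes written.

```python
import mmap

mm = mmap.mmap(-1, 8)

written = mm.write(b"Hello")   # number of bytes written
mm.write_byte(33)              # 33 is ord("!")
position = mm.tell()

# Only 2 bytes remain, so an 8-byte write cannot fit and raises ValueError.
try:
    mm.write(b"too long")
    overflow = None
except ValueError as exc:
    overflow = exc

contents = mm[:6]
mm.close()
```

Here written is 5, position is 6 and contents is b"Hello!".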