gzip, zipfile, tarfile 模块:处理压缩文件
In [1]:
- import os, shutil, glob
- import zlib, gzip, bz2, zipfile, tarfile
gzip
zilb 模块
zlib
提供了对字符串进行压缩和解压缩的功能:
In [2]:
- orginal = "this is a test string"
- compressed = zlib.compress(orginal)
- print compressed
- print zlib.decompress(compressed)
- x�+��,V�D�����⒢̼tS��
- this is a test string
同时提供了两种校验和的计算方法:
In [3]:
- print zlib.adler32(orginal) & 0xffffffff
- 1407780813
In [4]:
- print zlib.crc32(orginal) & 0xffffffff
- 4236695221
gzip 模块
gzip
模块可以产生 .gz
格式的文件,其压缩方式由 zlib
模块提供。
我们可以通过 gzip.open
方法来读写 .gz
格式的文件:
In [5]:
- content = "Lots of content here"
- with gzip.open('file.txt.gz', 'wb') as f:
- f.write(content)
读:
In [6]:
- with gzip.open('file.txt.gz', 'rb') as f:
- file_content = f.read()
- print file_content
- Lots of content here
将压缩文件内容解压出来:
In [7]:
- with gzip.open('file.txt.gz', 'rb') as f_in, open('file.txt', 'wb') as f_out:
- shutil.copyfileobj(f_in, f_out)
此时,目录下应有 file.txt
文件,内容为:
In [8]:
- with open("file.txt") as f:
- print f.read()
- Lots of content here
In [9]:
- os.remove("file.txt.gz")
bz2 模块
bz2
模块提供了另一种压缩文件的方法:
In [10]:
- orginal = "this is a test string"
- compressed = bz2.compress(orginal)
- print compressed
- print bz2.decompress(compressed)
- BZh91AY&SY*�v ��@"� 10"zi��FLT`�軒)�P�˰
- this is a test string
zipfile 模块
产生一些 file.txt
的复制:
In [11]:
- for i in range(10):
- shutil.copy("file.txt", "file.txt." + str(i))
将这些复制全部压缩到一个 .zip
文件中:
In [12]:
- f = zipfile.ZipFile('files.zip','w')
- for name in glob.glob("*.txt.[0-9]"):
- f.write(name)
- os.remove(name)
- f.close()
解压这个 .zip
文件,用 namelist
方法查看压缩文件中的子文件名:
In [13]:
- f = zipfile.ZipFile('files.zip','r')
- print f.namelist()
- ['file.txt.9', 'file.txt.6', 'file.txt.2', 'file.txt.1', 'file.txt.5', 'file.txt.4', 'file.txt.3', 'file.txt.7', 'file.txt.8', 'file.txt.0']
使用 f.read(name)
方法来读取 name
文件中的内容:
In [14]:
- for name in f.namelist():
- print name, "content:", f.read(name)
- f.close()
- file.txt.9 content: Lots of content here
- file.txt.6 content: Lots of content here
- file.txt.2 content: Lots of content here
- file.txt.1 content: Lots of content here
- file.txt.5 content: Lots of content here
- file.txt.4 content: Lots of content here
- file.txt.3 content: Lots of content here
- file.txt.7 content: Lots of content here
- file.txt.8 content: Lots of content here
- file.txt.0 content: Lots of content here
可以用 extract(name)
或者 extractall()
解压单个或者全部文件。
tarfile 模块
支持 .tar
格式文件的读写:
例如可以这样将 file.txt
写入:
In [15]:
- f = tarfile.open("file.txt.tar", "w")
- f.add("file.txt")
- f.close()
清理生成的文件:
In [16]:
- os.remove("file.txt")
- os.remove("file.txt.tar")
- os.remove("files.zip")