3.3 Files and the Operating System



    In [207]: path = 'examples/segismundo.txt'

    In [208]: f = open(path)


    for line in f:
        pass


    In [209]: lines = [x.rstrip() for x in open(path)]

    In [210]: lines
    Out[210]:
    ['Sueña el rico en su riqueza,',
     'que más cuidados le ofrece;',
     '',
     'sueña el pobre que padece',
     'su miseria y su pobreza;',
     '',
     'sueña el que a medrar empieza,',
     'sueña el que afana y pretende,',
     'sueña el que agravia y ofende,',
     '',
     'y en el mundo, en conclusión,',
     'todos sueñan lo que son,',
     'aunque ninguno lo entiende.',
     '']


    In [211]: f.close()


    In [212]: with open(path) as f:
       .....:     lines = [x.rstrip() for x in f]


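If we had typed f = open(path, 'w'), a new file would have been created at examples/segismundo.txt, overwriting any data already at that location. There is also the 'x' file mode, which creates a writable file but fails if the file path already exists. Table 3-3 lists all of the read/write modes.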
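The difference between the 'w' and 'x' modes can be sketched as follows. This is a minimal illustration, not part of the original session: it writes to a temporary directory, and the file name demo.txt is a hypothetical stand-in so that no real file is overwritten.

```python
import os
import tempfile

# Work in a throwaway directory so no real file is overwritten.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "demo.txt")  # hypothetical file name

# 'w' creates the file, or truncates it if it already exists.
with open(demo_path, "w", encoding="utf-8") as f:
    f.write("first version\n")
with open(demo_path, "w", encoding="utf-8") as f:
    f.write("second version\n")

with open(demo_path, encoding="utf-8") as f:
    content_after_w = f.read()

# 'x' refuses to touch an existing path and raises FileExistsError.
try:
    with open(demo_path, "x", encoding="utf-8") as f:
        f.write("never written\n")
    x_mode_failed = False
except FileExistsError:
    x_mode_failed = True

# The earlier content is untouched by the failed 'x' attempt.
with open(demo_path, encoding="utf-8") as f:
    content_after_x = f.read()

print(repr(content_after_w), x_mode_failed)
```

Mode 'x' is a reasonable default when a script should never silently replace an existing output file.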

Table 3-3. Python file modes


    In [213]: f = open(path)

    In [214]: f.read(10)
    Out[214]: 'Sueña el r'

    In [215]: f2 = open(path, 'rb')  # Binary mode

    In [216]: f2.read(10)
    Out[216]: b'Sue\xc3\xb1a el '


    In [217]: f.tell()
    Out[217]: 11

    In [218]: f2.tell()
    Out[218]: 10


    In [219]: import sys

    In [220]: sys.getdefaultencoding()
    Out[220]: 'utf-8'


    In [221]: f.seek(3)
    Out[221]: 3

    In [222]: f.read(1)
    Out[222]: 'ñ'


    In [223]: f.close()

    In [224]: f2.close()


    In [225]: with open('tmp.txt', 'w') as handle:
       .....:     handle.writelines(x for x in open(path) if len(x) > 1)

    In [226]: with open('tmp.txt') as f:
       .....:     lines = f.readlines()

    In [227]: lines
    Out[227]:
    ['Sueña el rico en su riqueza,\n',
     'que más cuidados le ofrece;\n',
     'sueña el pobre que padece\n',
     'su miseria y su pobreza;\n',
     'sueña el que a medrar empieza,\n',
     'sueña el que afana y pretende,\n',
     'sueña el que agravia y ofende,\n',
     'y en el mundo, en conclusión,\n',
     'todos sueñan lo que son,\n',
     'aunque ninguno lo entiende.\n']


Table 3-4. Important Python file methods or attributes



    In [230]: with open(path) as f:
       .....:     chars = f.read(10)

    In [231]: chars
    Out[231]: 'Sueña el r'


    In [232]: with open(path, 'rb') as f:
       .....:     data = f.read(10)

    In [233]: data
    Out[233]: b'Sue\xc3\xb1a el '


    In [234]: data.decode('utf8')
    Out[234]: 'Sueña el '

    In [235]: data[:4].decode('utf8')
    ---------------------------------------------------------------------------
    UnicodeDecodeError                        Traceback (most recent call last)
    <ipython-input-235-300e0af10bb7> in <module>()
    ----> 1 data[:4].decode('utf8')
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc3 in position 3: unexpected end of data
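The failure above happens because 'ñ' occupies two bytes in UTF-8 (0xc3 0xb1), and slicing after four bytes cuts that character in half. The sketch below reproduces this on an in-memory string rather than the example file, and shows one possible way to cope with byte chunks that end mid-character: the standard library's incremental decoder. The use of an incremental decoder is an illustrative addition here, not something the original session does.

```python
import codecs

text = "Sueña el rico"
data = text.encode("utf-8")

# 'ñ' takes two bytes, so the first 10 bytes hold only 9 characters.
decoded = data[:10].decode("utf-8")

# Cutting after 4 bytes splits 'ñ' in half, so strict decoding fails.
try:
    data[:4].decode("utf-8")
    split_failed = False
except UnicodeDecodeError:
    split_failed = True

# An incremental decoder holds back the incomplete byte until the
# remainder of the character arrives in a later chunk.
decoder = codecs.getincrementaldecoder("utf-8")()
part1 = decoder.decode(data[:4])    # 0xc3 is buffered, not emitted
part2 = decoder.decode(data[4:10])  # 0xb1 completes the 'ñ'

print(repr(decoded), split_failed, repr(part1), repr(part2))
```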


    In [236]: sink_path = 'sink.txt'

    In [237]: with open(path) as source:
       .....:     with open(sink_path, 'xt', encoding='iso-8859-1') as sink:
       .....:         sink.write(source.read())

    In [238]: with open(sink_path, encoding='iso-8859-1') as f:
       .....:     print(f.read(10))
    Sueña el r


    In [240]: f = open(path)

    In [241]: f.read(5)
    Out[241]: 'Sueña'

    In [242]: f.seek(4)
    Out[242]: 4

    In [243]: f.read(1)
    ---------------------------------------------------------------------------
    UnicodeDecodeError                        Traceback (most recent call last)
    <ipython-input-243-7841103e33f5> in <module>()
    ----> 1 f.read(1)
    /miniconda/envs/book-env/lib/python3.6/codecs.py in decode(self, input, final)
        319         # decode input (taking the buffer into account)
        320         data = self.buffer + input
    --> 321         (result, consumed) = self._buffer_decode(data, self.errors, final)
        322         # keep undecoded input until the next call
        323         self.buffer = data[consumed:]
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb1 in position 0: invalid start byte

    In [244]: f.close()
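The error above arises because seek(4) lands on the second byte of 'ñ'. One way to avoid this in text mode, sketched below under the assumption of a small temporary UTF-8 file (the name demo.txt is hypothetical), is to seek only to positions previously returned by tell() on the same file object, since those always fall on a character boundary.

```python
import os
import tempfile

# Build a small UTF-8 file in a throwaway directory.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "demo.txt")  # hypothetical file name
with open(demo_path, "w", encoding="utf-8") as f:
    f.write("Sueña el rico")

with open(demo_path, encoding="utf-8") as f:
    head = f.read(3)     # 'Sue'
    mark = f.tell()      # position just before 'ñ'; treat it as opaque
    rest = f.read(2)     # 'ña'
    f.seek(mark)         # safe: the value came from tell()
    again = f.read(1)    # 'ñ' again, read whole

# In binary mode any offset is valid, because no decoding takes place.
with open(demo_path, "rb") as f:
    f.seek(4)
    second_byte = f.read(1)  # the trailing byte of 'ñ'

print(repr(head), repr(rest), repr(again), second_byte)
```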
