数组类型

数组类型

In [1]:

from numpy import *

之前已经看过整数数组和布尔数组，除此之外还有浮点数数组和复数数组。

复数数组

产生一个复数数组：

In [2]:

a = array([1 + 1j, 2, 3, 4])

Python会自动判断数组的类型：

In [3]:

a.dtype

Out[3]:

dtype('complex128')

对于复数我们可以查看它的实部和虚部：

In [4]:

a.real

Out[4]:

array([ 1.,  2.,  3.,  4.])

In [5]:

a.imag

Out[5]:

array([ 1.,  0.,  0.,  0.])

还可以设置它们的值：

In [6]:

a.imag = [1,2,3,4]

查看 a：

In [7]:

Out[7]:

array([ 1.+1.j,  2.+2.j,  3.+3.j,  4.+4.j])

查看复共轭：

In [8]:

a.conj()

Out[8]:

array([ 1.-1.j,  2.-2.j,  3.-3.j,  4.-4.j])

事实上，这些属性方法可以用在浮点数或者整数数组上：

In [9]:

a = array([0.,1,2,3])
a.dtype

Out[9]:

dtype('float64')

In [10]:

a.real

Out[10]:

array([ 0.,  1.,  2.,  3.])

In [11]:

a.imag

Out[11]:

array([ 0.,  0.,  0.,  0.])

In [12]:

a.conj()

Out[12]:

array([ 0.,  1.,  2.,  3.])

但这里，虚部是只读的，并不能修改它的值：

In [13]:

# 会报错
a.imag = [1,2,3,4]

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-13-3db28f506ec9> in <module>()
      1 # 会报错
----> 2 a.imag = [1,2,3,4]
 
TypeError: array does not have imaginary part to set

指定数组类型

之前已经知道，构建数组的时候，数组会根据传入的内容自动判断类型：

In [14]:

a = array([0,1.0,2,3])

对于浮点数，默认为双精度：

In [15]:

a.dtype

Out[15]:

dtype('float64')

查看所用字节（8 bytes * 4）：

In [16]:

a.nbytes

Out[16]:

当然，我们也可以在构建的时候指定类型：

In [17]:

a = array([0,1.0,2,3],
         dtype=float32)

此时类型为单精度浮点数：

In [18]:

a.dtype

Out[18]:

dtype('float32')

查看所用字节（4 bytes * 4）：

In [19]:

a.nbytes

Out[19]:

除此之外，还可以指定有无符号，例如无符号整数：

In [20]:

a = array([0,1,2,3],
         dtype=uint8)
a.dtype

Out[20]:

dtype('uint8')

uint8 只使用一个字节，表示 0 到 255 的整数。

还可以从二进制数据中读取。

先写入二进制数据：

In [21]:

a = array([102,111,212], 
          dtype=uint8)
a.tofile('foo.dat')

从数据中读入，要指定类型：

In [22]:

b = frombuffer('foo', 
               dtype=uint8)
b

Out[22]:

array([102, 111, 111], dtype=uint8)

清理数据文件：

In [23]:

import os
os.remove('foo.dat')

0-255 的数字可以表示ASCⅡ码，我们可以用 ord 函数来查看字符的ASCⅡ码值：

In [24]:

ord('f')

Out[24]:

In [25]:

ord('S')

Out[25]:

Numpy 类型

具体如下：

基本类型	可用的Numpy类型	备注
布尔型	`bool`	占1个字节
整型	`int8, int16, int32, int64, int128, int`	`int` 跟C语言中的 `long` 一样大
无符号整型	`uint8, uint16, uint32, uint64, uint128, uint`	`uint` 跟C语言中的 `unsigned long` 一样大
浮点数	`float16, float32, float64, float, longfloat`	默认为双精度 `float64` ，`longfloat` 精度大小与系统有关
复数	`complex64, complex128, complex, longcomplex`	默认为 `complex128` ，即实部虚部都为双精度
字符串	`string, unicode`	可以使用 `dtype=S4` 表示一个4字节字符串的数组
对象	`object`	数组中可以使用任意值
Records	`void`
时间	`datetime64, timedelta64`

任意类型的数组：

In [26]:

a = array([1,1.2,'hello', [10,20,30]], 
          dtype=object)

乘法：

In [27]:

a * 2

Out[27]:

array([2, 2.4, 'hellohello', [10, 20, 30, 10, 20, 30]], dtype=object)

类型转换

转换数组的类型：

In [28]:

a = array([1.5, -3], 
         dtype=float32)
a

Out[28]:

array([ 1.5, -3. ], dtype=float32)

asarray 函数

使用 asarray 函数：

In [29]:

asarray(a, dtype=float64)

Out[29]:

array([ 1.5, -3. ])

In [30]:

asarray(a, dtype=uint8)

Out[30]:

array([  1, 253], dtype=uint8)

asarray 不会修改原来数组的值：

In [31]:

Out[31]:

array([ 1.5, -3. ], dtype=float32)

但当类型相同的时候，asarray 并不会产生新的对象，而是使用同一个引用：

In [32]:

b = asarray(a, dtype=float32)

In [33]:

b is a

Out[33]:

True

这么做的好处在与，asarray 不仅可以作用于数组，还可以将其他类型转化为数组。

有些时候为了保证我们的输入值是数组，我们需要将其使用 asarray 转化，当它已经是数组的时候，并不会产生新的对象，这样保证了效率。

In [34]:

asarray([1,2,3,4])

Out[34]:

array([1, 2, 3, 4])

astype 方法

astype 方法返回一个新数组：

In [35]:

a.astype(float64)

Out[35]:

array([ 1.5, -3. ])

In [36]:

a.astype(uint8)

Out[36]:

array([  1, 253], dtype=uint8)

astype也不会改变原来数组的值：

In [37]:

Out[37]:

array([ 1.5, -3. ], dtype=float32)

另外，astype 总是返回原来数组的一份复制，即使转换的类型是相同的：

In [38]:

b = a.astype(float32)
print a
print b

[ 1.5 -3. ]
[ 1.5 -3. ]

In [39]:

a is b

Out[39]:

False

view 方法

In [40]:

a = array((1,2,3,4), dtype=int32)
a

Out[40]:

array([1, 2, 3, 4])

view 会将 a 在内存中的表示看成是 uint8 进行解析：

In [41]:

b = a.view(uint8)
b

Out[41]:

array([1, 0, 0, 0, 2, 0, 0, 0, 3, 0, 0, 0, 4, 0, 0, 0], dtype=uint8)

In [42]:

a[0] = 2**30
a

Out[42]:

array([1073741824,          2,          3,          4])

修改 a 会修改 b 的值，因为共用一块内存：

In [43]:

Out[43]:

array([ 0,  0,  0, 64,  2,  0,  0,  0,  3,  0,  0,  0,  4,  0,  0,  0], dtype=uint8)

原文: https://nbviewer.jupyter.org/github/lijin-THU/notes-python/blob/master/03-numpy/03.04-array-types.ipynb

03.04 数组类型