Multi-dimensional Arrays

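Julia, like most technical computing languages, provides a first-class array implementation. Most technical computing languages pay a lot of attention to their array implementation at the expense of other containers. Julia does not treat arrays in any special way. The array library is implemented almost completely in Julia itself, and derives its performance from the compiler, just like any other code written in Julia. As such, it is also possible to define custom array types by inheriting from AbstractArray. For more details on implementing a custom array type, see the manual section on the AbstractArray interface.

An array is a collection of objects stored in a multi-dimensional grid. In the most general case, an array may contain objects of type Any. For most computational purposes, arrays should contain objects of a more specific type, such as Float64 or Int32.

In general, unlike many other technical computing languages, Julia does not expect programs to be written in a vectorized style for performance. Julia's compiler uses type inference and generates optimized code for scalar array indexing, allowing programs to be written in a style that is convenient and readable, without sacrificing performance, and using less memory at times.

In Julia, all arguments to functions are passed by sharing. Some technical computing languages pass arrays by value; while this prevents accidental modification of an array by the callee, it also causes unnecessary copying of arrays. By convention, a function name ending with a ! indicates that it will mutate or destroy the value of one or more of its arguments (compare, for example, sort and sort!). Callees must make explicit copies to ensure that they do not modify inputs that they do not intend to change. Many non-mutating functions are implemented by making an explicit copy of the input, calling a function of the same name with an added ! on that copy, and returning the copy.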
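The convention described above can be sketched as follows. The function names clamp_negatives! and clamp_negatives are hypothetical and exist only for illustration; the pattern (a mutating function plus a copying wrapper) is the point.

```julia
# Hypothetical mutating function: replaces negative entries with zero in place.
function clamp_negatives!(v::AbstractVector)
    for i in eachindex(v)
        if v[i] < 0
            v[i] = zero(eltype(v))
        end
    end
    return v
end

# Non-mutating variant: copy explicitly, then call the mutating version.
clamp_negatives(v::AbstractVector) = clamp_negatives!(copy(v))

data = [3, -1, 2, -5]
cleaned = clamp_negatives(data)   # `data` is left untouched
untouched = copy(data)
clamp_negatives!(data)            # `data` itself is modified
```

Because arguments are passed by sharing, the call to clamp_negatives! changes the caller's array, while clamp_negatives only changes its private copy.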

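Basic Functions

Function | Description
--- | ---
eltype(A) | the type of the elements contained in A
length(A) | the number of elements in A
ndims(A) | the number of dimensions of A
size(A) | a tuple containing the dimensions of A
size(A,n) | the size of A along dimension n
axes(A) | a tuple containing the valid indices of A
axes(A,n) | a range expressing the valid indices along dimension n
eachindex(A) | an efficient iterator for visiting each position in A
stride(A,k) | the stride (linear index distance between adjacent elements) along dimension k
strides(A) | a tuple of the strides in each dimension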
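As a quick illustration of the functions in the table, the following sketch queries a small 3×4 array. The array and the variable names are chosen only for this example.

```julia
A = zeros(Float32, 3, 4)

T  = eltype(A)      # Float32
n  = length(A)      # 12
nd = ndims(A)       # 2
sz = size(A)        # (3, 4)
s2 = size(A, 2)     # 4
ax = axes(A, 1)     # Base.OneTo(3)
st = strides(A)     # (1, 3): column-major storage
s  = stride(A, 2)   # 3
```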

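Construction and Initialization

Many functions for constructing and initializing arrays are provided. In the following list of such functions, calls with a dims... argument can either take a single tuple of dimension sizes or a series of dimension sizes passed as a variable number of arguments. Most of these functions also accept a first input T, which is the element type of the array. If the type T is omitted it will default to Float64.

Function | Description
--- | ---
Array{T}(undef, dims...) | an uninitialized dense Array
zeros(T, dims...) | an Array of all zeros
ones(T, dims...) | an Array of all ones
trues(dims...) | a BitArray with all values true
falses(dims...) | a BitArray with all values false
reshape(A, dims...) | an array containing the same data as A, but with different dimensions
copy(A) | copy A
deepcopy(A) | copy A, recursively copying its elements
similar(A, T, dims...) | an uninitialized array of the same type as A (dense, sparse, etc.), but with the specified element type and dimensions. The second and third arguments are both optional, defaulting to the element type and dimensions of A if omitted.
reinterpret(T, A) | an array with the same binary data as A, but with element type T
rand(T, dims...) | an Array with random, iid [1] and uniformly distributed values in the half-open interval $[0, 1)$
randn(T, dims...) | an Array with random, iid and standard normally distributed values
Matrix{T}(I, m, n) | m-by-n identity matrix. Requires using LinearAlgebra for I.
range(start, stop=stop, length=n) | range of n linearly spaced elements from start to stop
fill!(A, x) | fill the array A with the value x
fill(x, dims...) | an Array filled with the value x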

To see the various ways we can pass dimensions to these functions, consider the following examples:

    julia> zeros(Int8, 2, 3)
    2×3 Array{Int8,2}:
     0  0  0
     0  0  0

    julia> zeros(Int8, (2, 3))
    2×3 Array{Int8,2}:
     0  0  0
     0  0  0

    julia> zeros((2, 3))
    2×3 Array{Float64,2}:
     0.0  0.0  0.0
     0.0  0.0  0.0

Here, (2, 3) is a Tuple, and the first argument (the element type) is optional, defaulting to Float64.
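A few of the other constructors from the table can be exercised in the same way. This is a minimal sketch; the variable names are illustrative only.

```julia
using LinearAlgebra  # provides the identity object `I`

A = fill(7, 2, 3)                 # 2×3 Array{Int64,2} filled with 7
B = similar(A, Float64)           # uninitialized, same shape as A, eltype Float64
E = Matrix{Int}(I, 2, 3)          # 2×3 "identity": ones on the main diagonal
r = range(0, stop=1, length=5)    # 5 linearly spaced values from 0 to 1

C = reshape(A, 3, 2)              # same data as A, different shape
C[1] = 0                          # also changes A[1], since the data is shared
```

Note that reshape on an Array shares memory with its argument, whereas copy and similar allocate new storage.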

Array literals

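Arrays can also be directly constructed with square brackets; the syntax [A, B, C, ...] creates a one-dimensional array (i.e., a vector) containing the comma-separated arguments as its elements. The element type (eltype) of the resulting array is automatically determined by the types of the arguments inside the brackets. If all the arguments are the same type, then that is its eltype. If they all have a common promotion type, then they are converted to that type using convert and that type is the array's eltype. Otherwise, a heterogeneous array that can hold anything (a Vector{Any}) is constructed; this includes the literal [], where no arguments are given.

    julia> [1,2,3] # An array of `Int`s
    3-element Array{Int64,1}:
     1
     2
     3

    julia> promote(1, 2.3, 4//5) # This combination of Int, Float64 and Rational promotes to Float64
    (1.0, 2.3, 0.8)

    julia> [1, 2.3, 4//5] # Thus that's the element type of this Array
    3-element Array{Float64,1}:
     1.0
     2.3
     0.8

    julia> []
    Any[]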

Concatenation

If the arguments inside the square brackets are separated by semicolons (;) or newlines instead of commas, then their contents are vertically concatenated together instead of the arguments being used as elements themselves.

    julia> [1:2, 4:5] # Has a comma, so no concatenation occurs. The ranges are themselves the elements
    2-element Array{UnitRange{Int64},1}:
     1:2
     4:5

    julia> [1:2; 4:5]
    4-element Array{Int64,1}:
     1
     2
     4
     5

    julia> [1:2
            4:5
            6]
    5-element Array{Int64,1}:
     1
     2
     4
     5
     6

Similarly, if the arguments are separated by tabs or spaces, then their contents are horizontally concatenated together.

    julia> [1:2 4:5 7:8]
    2×3 Array{Int64,2}:
     1  4  7
     2  5  8

    julia> [[1,2] [4,5] [7,8]]
    2×3 Array{Int64,2}:
     1  4  7
     2  5  8

    julia> [1 2 3] # Numbers can also be horizontally concatenated
    1×3 Array{Int64,2}:
     1  2  3

Using semicolons (or newlines) and spaces (or tabs) can be combined to concatenate both horizontally and vertically at the same time.

    julia> [1 2
            3 4]
    2×2 Array{Int64,2}:
     1  2
     3  4

    julia> [zeros(Int, 2, 2) [1; 2]
            [3 4]            5]
    3×3 Array{Int64,2}:
     0  0  1
     0  0  2
     3  4  5

More generally, concatenation can be accomplished through the cat function. These syntaxes are shorthands for function calls that themselves are convenience functions:

Syntax | Function | Description
--- | --- | ---
 | cat | concatenate input arrays along dimension(s) k
[A; B; C; ...] | vcat | shorthand for cat(A...; dims=1)
[A B C ...] | hcat | shorthand for cat(A...; dims=2)
[A B; C D; ...] | hvcat | simultaneous vertical and horizontal concatenation
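The correspondence between the bracket syntax and the functions in the table can be checked directly. The following is a small sketch with two arbitrary 2×2 matrices; the names A, B and C are illustrative.

```julia
A = [1 2; 3 4]
B = [5 6; 7 8]

v = [A; B]                 # vertical concatenation, 4×2
h = [A B]                  # horizontal concatenation, 2×4
C = cat(A, B; dims=3)      # stacking along a new third dimension, 2×2×2
blocks = [A B; B A]        # block matrix, 4×4

# The same results via explicit function calls.
same_v = v == vcat(A, B) == cat(A, B; dims=1)
same_h = h == hcat(A, B) == cat(A, B; dims=2)
same_b = blocks == hvcat((2, 2), A, B, B, A)   # (2, 2): two blocks in each row
```

cat is the only form that can concatenate along dimensions beyond the second, as in the dims=3 call above.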

Typed array literals

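An array with a specific element type can be constructed using the syntax T[A, B, C, ...]. This will construct a 1-d array with element type T, initialized to contain elements A, B, C, etc. For example, Any[x, y, z] constructs a heterogeneous array that can contain any values.

Concatenation syntax can similarly be prefixed with a type to specify the element type of the result.

    julia> [[1 2] [3 4]]
    1×4 Array{Int64,2}:
     1  2  3  4

    julia> Int8[[1 2] [3 4]]
    1×4 Array{Int8,2}:
     1  2  3  4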

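Comprehensions

Comprehensions provide a general and powerful way to construct arrays. Comprehension syntax is similar to set construction notation in mathematics:

    A = [ F(x,y,...) for x=rx, y=ry, ... ]

The meaning of this form is that F(x,y,...) is evaluated with the variables x, y, etc. taking on each value in their given list of values. Values can be specified as any iterable object, but will commonly be ranges like 1:n or 2:(n-1), or explicit arrays of values like [1.2, 3.4, 5.7]. The result is an N-d dense array with dimensions that are the concatenation of the dimensions of the variable ranges rx, ry, etc., and each F(x,y,...) evaluation returns a scalar.

The following example computes a weighted average of the current element and its left and right neighbor along a 1-d grid:

    julia> x = rand(8)
    8-element Array{Float64,1}:
     0.843025
     0.869052
     0.365105
     0.699456
     0.977653
     0.994953
     0.41084
     0.809411

    julia> [ 0.25*x[i-1] + 0.5*x[i] + 0.25*x[i+1] for i=2:length(x)-1 ]
    6-element Array{Float64,1}:
     0.736559
     0.57468
     0.685417
     0.912429
     0.8446
     0.656511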

The resulting array type depends on the types of the computed elements just like array literals do. In order to control the type explicitly, a type can be prepended to the comprehension. For example, we could have requested the result in single precision by writing:

    Float32[ 0.25*x[i-1] + 0.5*x[i] + 0.25*x[i+1] for i=2:length(x)-1 ]
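To illustrate how the dimensions of the iteration variables are concatenated to give the shape of the result, here is a small sketch with two and three variables. The expressions being evaluated are arbitrary.

```julia
# Two ranges of lengths 2 and 3 give a 2×3 matrix.
M = [10i + j for i = 1:2, j = 1:3]

# Three ranges give a three-dimensional array.
T3 = [i + j + k for i = 1:2, j = 1:3, k = 1:4]

# A type prefix controls the element type of the result.
F = Float32[i / j for i = 1:2, j = 1:2]
```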

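Generator Expressions

Comprehensions can also be written without the enclosing square brackets, producing an object known as a generator. This object can be iterated to produce values on demand, instead of allocating an array and storing them in advance (see Iteration). For example, the following expression sums a series without allocating memory:

    julia> sum(1/n^2 for n=1:1000)
    1.6439345666815615

When writing a generator expression with multiple dimensions inside an argument list, parentheses are needed to separate the generator from subsequent arguments:

    julia> map(tuple, 1/(i+j) for i=1:2, j=1:2, [1:4;])
    ERROR: syntax: invalid iteration specification

All comma-separated expressions after for are interpreted as ranges. Adding parentheses lets us add a third argument to map:

    julia> map(tuple, (1/(i+j) for i=1:2, j=1:2), [1 3; 2 4])
    2×2 Array{Tuple{Float64,Int64},2}:
     (0.5, 1)       (0.333333, 3)
     (0.333333, 2)  (0.25, 4)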

Generators are implemented via inner functions. Just like inner functions used elsewhere in the language, variables from the enclosing scope can be “captured” in the inner function. For example, sum(p[i] - q[i] for i=1:n) captures the three variables p, q and n from the enclosing scope. Captured variables can present performance challenges; see performance tips.

Ranges in generators and comprehensions can depend on previous ranges by writing multiple for keywords:

    julia> [(i,j) for i=1:3 for j=1:i]
    6-element Array{Tuple{Int64,Int64},1}:
     (1, 1)
     (2, 1)
     (2, 2)
     (3, 1)
     (3, 2)
     (3, 3)

In such cases, the result is always 1-d.

Generated values can be filtered using the if keyword:

    julia> [(i,j) for i=1:3 for j=1:i if i+j == 4]
    2-element Array{Tuple{Int64,Int64},1}:
     (2, 2)
     (3, 1)
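The same generator forms can be passed directly to reducing functions or materialized with collect. The following sketch uses arbitrary ranges and predicates chosen for illustration.

```julia
# A filtered generator consumed by `sum`; no intermediate array is stored.
even_squares = sum(x^2 for x in 1:10 if iseven(x))

# A multidimensional generator keeps its shape when collected.
grid = collect((i, j) for i = 1:2, j = 1:3)

# A filtered generator always collects to a 1-d array.
odds = collect(x for x in 1:5 if isodd(x))
```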

Indexing

The general syntax for indexing into an n-dimensional array A is:

    X = A[I_1, I_2, ..., I_n]

where each I_k may be a scalar integer, an array of integers, or any other supported index. This includes Colon (:) to select all indices within the entire dimension, ranges of the form a:c or a:b:c to select contiguous or strided subsections, and arrays of booleans to select elements at their true indices.

If all the indices are scalars, then the result X is a single element from the array A. Otherwise, X is an array with the same number of dimensions as the sum of the dimensionalities of all the indices.

If all indices I_k are vectors, for example, then the shape of X would be (length(I_1), length(I_2), ..., length(I_n)), with location i_1, i_2, ..., i_n of X containing the value A[I_1[i_1], I_2[i_2], ..., I_n[i_n]].

Example:

    julia> A = reshape(collect(1:16), (2, 2, 2, 2))
    2×2×2×2 Array{Int64,4}:
    [:, :, 1, 1] =
     1  3
     2  4

    [:, :, 2, 1] =
     5  7
     6  8

    [:, :, 1, 2] =
      9  11
     10  12

    [:, :, 2, 2] =
     13  15
     14  16

    julia> A[1, 2, 1, 1] # all scalar indices
    3

    julia> A[[1, 2], [1], [1, 2], [1]] # all vector indices
    2×1×2×1 Array{Int64,4}:
    [:, :, 1, 1] =
     1
     2

    [:, :, 2, 1] =
     5
     6

    julia> A[[1, 2], [1], [1, 2], 1] # a mix of index types
    2×1×2 Array{Int64,3}:
    [:, :, 1] =
     1
     2

    [:, :, 2] =
     5
     6

Note how the size of the resulting array is different in the last two cases.

If I_1 is changed to a two-dimensional matrix, then X becomes an n+1-dimensional array of shape (size(I_1, 1), size(I_1, 2), length(I_2), ..., length(I_n)). The matrix adds a dimension.

Example:

    julia> A = reshape(collect(1:16), (2, 2, 2, 2));

    julia> A[[1 2; 1 2]]
    2×2 Array{Int64,2}:
     1  2
     1  2

    julia> A[[1 2; 1 2], 1, 2, 1]
    2×2 Array{Int64,2}:
     5  6
     5  6

The location i_1, i_2, i_3, ..., i_{n+1} contains the value at A[I_1[i_1, i_2], I_2[i_3], ..., I_n[i_{n+1}]]. All dimensions indexed with scalars are dropped. For example, if J is an array of indices, then the result of A[2, J, 3] is an array with size size(J). Its jth element is populated by A[2, J[j], 3].

As a special part of this syntax, the end keyword may be used to represent the last index of each dimension within the indexing brackets, as determined by the size of the innermost array being indexed. Indexing syntax without the end keyword is equivalent to a call to getindex:

    X = getindex(A, I_1, I_2, ..., I_n)

Example:

    julia> x = reshape(1:16, 4, 4)
    4×4 reshape(::UnitRange{Int64}, 4, 4) with eltype Int64:
     1  5   9  13
     2  6  10  14
     3  7  11  15
     4  8  12  16

    julia> x[2:3, 2:end-1]
    2×2 Array{Int64,2}:
     6  10
     7  11

    julia> x[1, [2 3; 4 1]]
    2×2 Array{Int64,2}:
      5  9
     13  1
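The rule that the dimensionality of the result is the sum of the dimensionalities of the indices can be checked on a small array. This is a sketch on an arbitrary 2×3×4 array; the variable names are illustrative.

```julia
A = reshape(collect(1:24), 2, 3, 4)

scalar  = A[1, 2, 3]                  # all scalar indices: a single element
slab    = A[:, [1, 3], 2]             # 1-d + 1-d + scalar: a 2-d result
lifted  = A[1, [1 2; 3 1], :]         # scalar + 2-d + 1-d: a 3-d result
kept    = A[2, 1:2, [4]]              # a one-element vector index keeps its dimension

last_el = A[end, end, end]            # `end` is the last index of each dimension
tail    = A[1, 2:end, 1]              # here `end` is 3, the size of dimension 2
```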

Indexed Assignment

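The general syntax for assigning values in an n-dimensional array A is:

    A[I_1, I_2, ..., I_n] = X

where each I_k may be a scalar integer, an array of integers, or any other supported index. This includes Colon (:) to select all indices within the entire dimension, ranges of the form a:c or a:b:c to select contiguous or strided subsections, and arrays of booleans to select elements at their true indices.

If all indices I_k are integers, then the value in location I_1, I_2, ..., I_n of A is overwritten with the value of X, converting to the eltype of A if necessary.

If any index I_k selects more than one location, then the right hand side X must be an array with the same shape as the result of indexing A[I_1, I_2, ..., I_n] or a vector with the same number of elements. The value in location I_1[i_1], I_2[i_2], ..., I_n[i_n] of A is overwritten with the value X[i_1, i_2, ..., i_n], converting if necessary. The element-wise assignment operator .= may be used to broadcast X across the selected locations:

    A[I_1, I_2, ..., I_n] .= X

Just as in Indexing, the end keyword may be used to represent the last index of each dimension within the indexing brackets, as determined by the size of the array being assigned into. Indexed assignment syntax without the end keyword is equivalent to a call to setindex!:

    setindex!(A, X, I_1, I_2, ..., I_n)

Example:

    julia> x = collect(reshape(1:9, 3, 3))
    3×3 Array{Int64,2}:
     1  4  7
     2  5  8
     3  6  9

    julia> x[3, 3] = -9;

    julia> x[1:2, 1:2] = [-1 -4; -2 -5];

    julia> x
    3×3 Array{Int64,2}:
     -1  -4   7
     -2  -5   8
      3   6  -9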
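The different assignment forms can be combined on one array. The sketch below uses an arbitrary 3×3 integer matrix and shows scalar assignment with conversion, broadcast assignment with .=, and the equivalent setindex! call.

```julia
x = zeros(Int, 3, 3)

x[2, :] .= 7            # broadcast the scalar 7 across the whole second row
x[end, end] = 2.0       # the Float64 value is converted to the eltype Int
setindex!(x, 5, 1, 1)   # equivalent to x[1, 1] = 5
x[1:2, 3] = [8, 9]      # right-hand side shape matches the selected locations
```

Assigning a value that cannot be converted exactly (for instance 2.5 into an Int array) raises an error rather than silently truncating.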

Supported index types

In the expression A[I_1, I_2, ..., I_n], each I_k may be a scalar index, an array of scalar indices, or an object that represents an array of scalar indices and can be converted to such by to_indices:

  1. A scalar index. By default this includes:
    • Non-boolean integers
    • CartesianIndex{N}s, which behave like an N-tuple of integers spanning multiple dimensions (see below for more details)
  2. An array of scalar indices. This includes:
    • Vectors and multidimensional arrays of integers
    • Empty arrays like [], which select no elements
    • Ranges like a:c or a:b:c, which select contiguous or strided subsections from a to c (inclusive)
    • Any custom array of scalar indices that is a subtype of AbstractArray
    • Arrays of CartesianIndex{N} (see below for more details)
  3. An object that represents an array of scalar indices and can be converted to such by to_indices. By default this includes:
    • Colon() (:), which represents all indices within an entire dimension or across the entire array
    • Arrays of booleans, which select elements at their true indices (see below for more details)

Some examples:

    julia> A = reshape(collect(1:2:18), (3, 3))
    3×3 Array{Int64,2}:
     1   7  13
     3   9  15
     5  11  17

    julia> A[4]
    7

    julia> A[[2, 5, 8]]
    3-element Array{Int64,1}:
      3
      9
     15

    julia> A[[1 4; 3 8]]
    2×2 Array{Int64,2}:
     1   7
     5  15

    julia> A[[]]
    Int64[]

    julia> A[1:2:5]
    3-element Array{Int64,1}:
     1
     5
     9

    julia> A[2, :]
    3-element Array{Int64,1}:
      3
      9
     15

    julia> A[:, 3]
    3-element Array{Int64,1}:
     13
     15
     17

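Cartesian indices

The special CartesianIndex{N} object represents a scalar index that behaves like an N-tuple of integers spanning multiple dimensions. For example:

    julia> A = reshape(1:32, 4, 4, 2);

    julia> A[3, 2, 1]
    7

    julia> A[CartesianIndex(3, 2, 1)] == A[3, 2, 1] == 7
    true

Considered alone, this may seem relatively trivial; CartesianIndex simply gathers multiple integers together into one object that represents a single multidimensional index. When combined with other indexing forms and iterators that yield CartesianIndexes, however, this can produce very elegant and efficient code. See Iteration below, and for some more advanced examples, see the blog post on multidimensional algorithms and iteration.

Arrays of CartesianIndex{N} are also supported. They represent a collection of scalar indices that each span N dimensions, enabling a form of indexing that is sometimes referred to as pointwise indexing. For example, it enables accessing the diagonal elements from the first "page" of A from above:

    julia> page = A[:,:,1]
    4×4 Array{Int64,2}:
     1  5   9  13
     2  6  10  14
     3  7  11  15
     4  8  12  16

    julia> page[[CartesianIndex(1,1),
                 CartesianIndex(2,2),
                 CartesianIndex(3,3),
                 CartesianIndex(4,4)]]
    4-element Array{Int64,1}:
      1
      6
     11
     16

This can be expressed much more simply with dot broadcasting and by combining it with a normal integer index (instead of extracting the first "page" from A as a separate step). It can even be combined with a : to extract both diagonals from the two pages at the same time:

    julia> A[CartesianIndex.(axes(A, 1), axes(A, 2)), 1]
    4-element Array{Int64,1}:
      1
      6
     11
     16

    julia> A[CartesianIndex.(axes(A, 1), axes(A, 2)), :]
    4×2 Array{Int64,2}:
      1  17
      6  22
     11  27
     16  32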

Warning

CartesianIndex and arrays of CartesianIndex are not compatible with the end keyword to represent the last index of a dimension. Do not use end in indexing expressions that may contain either CartesianIndex or arrays thereof.
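One common source of CartesianIndex arrays is findall applied to a multidimensional array. The following sketch, on arbitrary data, shows such indices being produced and then reused for indexing, as well as a CartesianIndex mixed with an ordinary integer index.

```julia
A = [1 8; 9 2; 3 7]

# `findall` on a matrix returns a vector of CartesianIndex{2}, in column-major order.
idx = findall(x -> x > 5, A)
selected = A[idx]                 # pointwise indexing with those positions

# A CartesianIndex can be unpacked into a plain tuple of integers.
first_pos = Tuple(idx[1])

# A CartesianIndex{2} may cover the first two dimensions, with an integer for the third.
B = reshape(1:24, 2, 3, 4)
corner = B[CartesianIndex(2, 3), 4]
```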

Logical indexing

Often referred to as logical indexing or indexing with a logical mask, indexing by a boolean array selects elements at the indices where its values are true. Indexing by a boolean vector B is effectively the same as indexing by the vector of integers that is returned by findall(B). Similarly, indexing by an N-dimensional boolean array is effectively the same as indexing by the vector of CartesianIndex{N}s where its values are true. A logical index must be a vector of the same length as the dimension it indexes into, or it must be the only index provided and match the size and dimensionality of the array it indexes into. It is generally more efficient to use boolean arrays as indices directly instead of first calling findall.

    julia> x = reshape(1:16, 4, 4)
    4×4 reshape(::UnitRange{Int64}, 4, 4) with eltype Int64:
     1  5   9  13
     2  6  10  14
     3  7  11  15
     4  8  12  16

    julia> x[[false, true, true, false], :]
    2×4 Array{Int64,2}:
     2  6  10  14
     3  7  11  15

    julia> mask = map(ispow2, x)
    4×4 Array{Bool,2}:
     1  0  0  0
     1  0  0  0
     0  0  0  0
     1  1  0  1

    julia> x[mask]
    5-element Array{Int64,1}:
      1
      2
      4
      8
     16
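The stated equivalence between a boolean mask and findall can be verified directly, and a mask can also be used on the left-hand side of an assignment. This is a minimal sketch on arbitrary data.

```julia
x = collect(reshape(1:16, 4, 4))

mask = x .% 3 .== 0               # true where the element is a multiple of 3

by_mask    = x[mask]              # logical indexing
by_findall = x[findall(mask)]     # equivalent, but builds the index vector first

x[mask] .= 0                      # masks also work for indexed assignment
```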

Number of indices

Cartesian indexing

The ordinary way to index into an N-dimensional array is to use exactly N indices; each index selects the position(s) in its particular dimension. For example, in the three-dimensional array A = rand(4, 3, 2), A[2, 3, 1] will select the number in the second row of the third column in the first “page” of the array. This is often referred to as cartesian indexing.

Linear indexing

When exactly one index i is provided, that index no longer represents a location in a particular dimension of the array. Instead, it selects the ith element using the column-major iteration order that linearly spans the entire array. This is known as linear indexing. It essentially treats the array as though it had been reshaped into a one-dimensional vector with vec.

    julia> A = [2 6; 4 7; 3 1]
    3×2 Array{Int64,2}:
     2  6
     4  7
     3  1

    julia> A[5]
    7

    julia> vec(A)[5]
    7

A linear index into the array A can be converted to a CartesianIndex for cartesian indexing with CartesianIndices(A)[i] (see CartesianIndices), and a set of N cartesian indices can be converted to a linear index with LinearIndices(A)[i_1, i_2, ..., i_N] (see LinearIndices).

    julia> CartesianIndices(A)[5]
    CartesianIndex(2, 2)

    julia> LinearIndices(A)[2, 2]
    5

It's important to note that there's a very large asymmetry in the performance of these conversions. Converting a linear index to a set of cartesian indices requires dividing and taking the remainder, whereas going the other way is just multiplies and adds. In modern processors, integer division can be 10-50 times slower than multiplication. While some arrays (like Array itself) are implemented using a linear chunk of memory and directly use a linear index in their implementations, other arrays (like Diagonal) need the full set of cartesian indices to do their lookup (see IndexStyle to introspect which is which). As such, when iterating over an entire array, it's much better to iterate over eachindex(A) instead of 1:length(A). Not only will the former be much faster in cases where A is IndexCartesian, but it will also support OffsetArrays, too.
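As a sketch of this advice, the generic function below (generic_sum is a hypothetical name used only here) iterates with eachindex, so each array type receives the index kind it handles most efficiently.

```julia
using LinearAlgebra  # for Diagonal

# Hypothetical helper: sums all elements using whatever index type
# `eachindex` chooses for the given array.
function generic_sum(A::AbstractArray)
    s = zero(eltype(A))
    for i in eachindex(A)
        s += A[i]
    end
    return s
end

dense = [1 2; 3 4]
diag  = Diagonal([1, 2, 3])

style_dense = IndexStyle(dense)   # IndexLinear(): integer indices
style_diag  = IndexStyle(diag)    # IndexCartesian(): CartesianIndex indices

total_dense = generic_sum(dense)
total_diag  = generic_sum(diag)
```

The loop body is identical for both arrays; only the type of i differs.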

Omitted and extra indices

In addition to linear indexing, an N-dimensional array may be indexed with fewer or more than N indices in certain situations.

Indices may be omitted if the trailing dimensions that are not indexed into are all length one. In other words, trailing indices can be omitted only if there is only one possible value that those omitted indices could be for an in-bounds indexing expression. For example, a four-dimensional array with size (3, 4, 2, 1) may be indexed with only three indices as the dimension that gets skipped (the fourth dimension) has length one. Note that linear indexing takes precedence over this rule.

    julia> A = reshape(1:24, 3, 4, 2, 1)
    3×4×2×1 reshape(::UnitRange{Int64}, 3, 4, 2, 1) with eltype Int64:
    [:, :, 1, 1] =
     1  4  7  10
     2  5  8  11
     3  6  9  12

    [:, :, 2, 1] =
     13  16  19  22
     14  17  20  23
     15  18  21  24

    julia> A[1, 3, 2] # Omits the fourth dimension (length 1)
    19

    julia> A[1, 3] # Attempts to omit dimensions 3 & 4 (lengths 2 and 1)
    ERROR: BoundsError: attempt to access 3×4×2×1 reshape(::UnitRange{Int64}, 3, 4, 2, 1) with eltype Int64 at index [1, 3]

    julia> A[19] # Linear indexing
    19

When omitting all indices with A[], this semantic provides a simple idiom to retrieve the only element in an array and simultaneously ensure that there was only one element.

Similarly, more than N indices may be provided if all the indices beyond the dimensionality of the array are 1 (or more generally are the first and only element of axes(A, d) where d is that particular dimension number). This allows vectors to be indexed like one-column matrices, for example:

    julia> A = [8,6,7]
    3-element Array{Int64,1}:
     8
     6
     7

    julia> A[2,1]
    6
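The A[] idiom mentioned above, which omits all indices, can be sketched as follows. The values are arbitrary; the try block only demonstrates that the idiom fails when the array holds more than one element.

```julia
zero_dim = fill(42)       # a zero-dimensional array holding a single value
only_a   = zero_dim[]     # retrieves that value

one_el   = [7]
only_b   = one_el[]       # works: every omitted dimension has length one

# With more than one element, omitting all indices is out of bounds.
failed = try
    [1, 2][]
    false
catch err
    err isa BoundsError
end
```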

Iteration

The recommended ways to iterate over a whole array are

    for a in A
        # Do something with the element a
    end

    for i in eachindex(A)
        # Do something with i and/or A[i]
    end

The first construct is used when you need the value, but not the index, of each element. In the second construct, i will be an Int if A is an array type with fast linear indexing; otherwise, it will be a CartesianIndex:

    julia> A = rand(4,3);

    julia> B = view(A, 1:3, 2:3);

    julia> for i in eachindex(B)
               @show i
           end
    i = CartesianIndex(1, 1)
    i = CartesianIndex(2, 1)
    i = CartesianIndex(3, 1)
    i = CartesianIndex(1, 2)
    i = CartesianIndex(2, 2)
    i = CartesianIndex(3, 2)

In contrast with for i = 1:length(A), iterating with eachindex provides an efficient way to iterate over any array type.

Array traits

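If you write a custom AbstractArray type, you can specify that it has fast linear indexing using

    Base.IndexStyle(::Type{<:MyArray}) = IndexLinear()

This setting will cause eachindex iteration over a MyArray to use integers. If you don't specify this trait, the default value IndexCartesian() is used.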
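To make the trait concrete, here is a minimal sketch of a custom array type that declares IndexLinear. The type ScaledMatrix is hypothetical and exists only for this illustration; it wraps a Matrix and multiplies every element by a fixed factor on access.

```julia
# Hypothetical read-only wrapper type, for illustration only.
struct ScaledMatrix{T} <: AbstractMatrix{T}
    data::Matrix{T}
    scale::T
end

Base.size(A::ScaledMatrix) = size(A.data)

# Declare fast linear indexing, then implement only the single-integer getindex.
Base.IndexStyle(::Type{<:ScaledMatrix}) = IndexLinear()
function Base.getindex(A::ScaledMatrix, i::Int)
    @boundscheck checkbounds(A, i)
    return A.scale * A.data[i]
end

S = ScaledMatrix([1 2; 3 4], 10)

inds  = eachindex(S)   # integer indices, because of the IndexLinear trait
elem  = S[2, 1]        # multi-index access falls back to the linear method
total = sum(S)         # generic array functions work through the interface
```

With IndexLinear declared, only the single-index getindex method needs to be written; cartesian accesses such as S[2, 1] are converted to linear indices by the generic fallbacks.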

Array and Vectorized Operators and Functions

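The following operators are supported for arrays:

  1. Unary arithmetic – -, +
  2. Binary arithmetic – -, +, *, /, \, ^
  3. Comparison – ==, !=, ≈ (isapprox), ≉

To enable convenient vectorization of mathematical and other operations, Julia provides the dot syntax f.(args...), e.g. sin.(x) or min.(x,y), for elementwise operations over arrays or mixtures of arrays and scalars (a broadcasting operation); these have the additional advantage of "fusing" into a single loop when combined with other dot calls, e.g. sin.(cos.(x)).

Also, every binary operator supports a dot version that can be applied to arrays (and combinations of arrays and scalars) in such fused broadcasting operations, e.g. z .== sin.(x .* y).

Note that comparisons such as == operate on whole arrays, giving a single boolean answer. Use dot operators like .== for elementwise comparisons. (For comparison operations like <, only the elementwise .< version is applicable to arrays.)

Also notice the difference between max.(a,b), which broadcasts max elementwise over a and b, and maximum(a), which finds the largest value within a. The same relationship holds for min.(a,b) and minimum(a).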
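The distinctions drawn above between whole-array and elementwise operations can be seen on two small vectors. The data is arbitrary and chosen only for illustration.

```julia
a = [1, 5, 3]
b = [4, 2, 6]

whole   = a == b            # one Bool for the whole comparison
elemeq  = a .== b           # one Bool per element
elemlt  = a .< b            # `<` on arrays is only available in dotted form

pairmax = max.(a, b)        # elementwise maximum of two arrays
largest = maximum(a)        # largest value within a single array

fused = sin.(cos.(a))       # nested dot calls fuse into a single loop
```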

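Broadcasting

It is sometimes useful to perform element-by-element binary operations on arrays of different sizes, such as adding a vector to each column of a matrix. An inefficient way to do this would be to replicate the vector to the size of the matrix:

    julia> a = rand(2,1); A = rand(2,3);

    julia> repeat(a,1,3)+A
    2×3 Array{Float64,2}:
     1.20813  1.82068  1.25387
     1.56851  1.86401  1.67846

This is wasteful when dimensions get large, so Julia provides broadcast, which expands singleton dimensions in array arguments to match the corresponding dimension in the other array without using extra memory, and applies the given function elementwise:

    julia> broadcast(+, a, A)
    2×3 Array{Float64,2}:
     1.20813  1.82068  1.25387
     1.56851  1.86401  1.67846

    julia> b = rand(1,2)
    1×2 Array{Float64,2}:
     0.867535  0.00457906

    julia> broadcast(+, a, b)
    2×2 Array{Float64,2}:
     1.71056  0.847604
     1.73659  0.873631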

Dotted operators such as .+ and .* are equivalent to broadcast calls (except that they fuse, as described above). There is also a broadcast! function to specify an explicit destination (which can also be accessed in a fusing fashion by .= assignment). In fact, f.(args...) is equivalent to broadcast(f, args...), providing a convenient syntax to broadcast any function (dot syntax). Nested “dot calls” f.(...) (including calls to .+ etcetera) automatically fuse into a single broadcast call.
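The equivalences described in this paragraph can be sketched with fixed values in place of random ones, so the results are easy to check by hand. The variable names are illustrative.

```julia
a = [1.0, 2.0]                          # treated as a 2×1 column
A = [10.0 20.0 30.0; 40.0 50.0 60.0]    # 2×3 matrix

via_function = broadcast(+, a, A)
via_dots     = a .+ A                   # same result, written with a dotted operator

# Writing into an explicit, preallocated destination.
dest = similar(A)
broadcast!(+, dest, a, A)

dest2 = similar(A)
dest2 .= a .+ A                         # fused, in-place form of the same operation

# A column combined with a row expands along both singleton dimensions.
b = [100.0 200.0]
outer = a .+ b
```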

Additionally, broadcast is not limited to arrays (see the function documentation); it also handles scalars, tuples and other collections. By default, only some argument types are considered scalars, including (but not limited to) Numbers, Strings, Symbols, Types, Functions and some common singletons like missing and nothing. All other arguments are iterated over or indexed into elementwise.

    julia> convert.(Float32, [1, 2])
    2-element Array{Float32,1}:
     1.0
     2.0

    julia> ceil.(UInt8, [1.2 3.4; 5.6 6.7])
    2×2 Array{UInt8,2}:
     0x02  0x04
     0x06  0x07

    julia> string.(1:3, ". ", ["First", "Second", "Third"])
    3-element Array{String,1}:
     "1. First"
     "2. Second"
     "3. Third"

Sometimes, you want a container (like an array) that would normally participate in broadcast to be “protected” from broadcast’s behavior of iterating over all of its elements. By placing it inside another container (like a single element Tuple) broadcast will treat it as a single value.

    julia> ([1, 2, 3], [4, 5, 6]) .+ ([1, 2, 3],)
    ([2, 4, 6], [5, 7, 9])

    julia> ([1, 2, 3], [4, 5, 6]) .+ tuple([1, 2, 3])
    ([2, 4, 6], [5, 7, 9])

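Implementation

The base array type in Julia is the abstract type AbstractArray{T,N}. It is parameterized by the number of dimensions N and the element type T. AbstractVector and AbstractMatrix are aliases for the 1-d and 2-d cases. Operations on AbstractArray objects are defined using higher level operators and functions, in a way that is independent of the underlying storage. These operations generally work correctly as a fallback for any specific array implementation.

The AbstractArray type includes anything vaguely array-like, and implementations of it might be quite different from conventional arrays. For example, elements might be computed on request rather than stored. However, any concrete AbstractArray{T,N} type should generally implement at least size(A) (returning an Int tuple), getindex(A,i) and getindex(A,i1,...,iN); mutable arrays should also implement setindex!. It is recommended that these operations have nearly constant time complexity, or technically Õ(1) complexity, as otherwise some array functions may be unexpectedly slow. Concrete types should also typically provide a similar(A,T=eltype(A),dims=size(A)) method, which is used to allocate a similar array for copy and other out-of-place operations. No matter how an AbstractArray{T,N} is represented internally, T is the type of object returned by integer indexing (A[1, ..., 1], when A is not empty) and N should be the length of the tuple returned by size. For more details on defining custom AbstractArray implementations, see the array interface guide in the interfaces chapter.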

DenseArray is an abstract subtype of AbstractArray intended to include all arrays where elements are stored contiguously in column-major order (see additional notes in Performance Tips). The Array type is a specific instance of DenseArray; Vector and Matrix are aliases for the 1-d and 2-d cases. Very few operations are implemented specifically for Array beyond those that are required for all AbstractArrays; much of the array library is implemented in a generic manner that allows all custom arrays to behave similarly.

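SubArray is a specialization of AbstractArray that performs indexing by sharing memory with the original array rather than by copying it. A SubArray is created with the view function, which is called the same way as getindex (with an array and a series of index arguments). The result of view looks the same as the result of getindex, except the data is left in place. view stores the input index vectors in a SubArray object, which can later be used to index the original array indirectly. By putting the @views macro in front of an expression or block of code, any array[...] slice in that expression will be converted to create a SubArray view instead.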
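The difference between a view and an ordinary indexed copy shows up as soon as one of them is modified. The following sketch uses an arbitrary 3×4 matrix; the variable names are illustrative.

```julia
A = collect(reshape(1:12, 3, 4))

V = view(A, 2:3, [1, 4])    # SubArray sharing memory with A
C = A[2:3, [1, 4]]          # ordinary indexing makes an independent copy

same_contents = V == C      # both initially look identical

V[1, 1] = -1                # writes through to A[2, 1]; C is unaffected

# Inside `@views`, the slice A[:, 1] becomes a view rather than a copy.
col_total = @views sum(A[:, 1])
```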

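A BitArray is a space-efficient "packed" boolean array, which stores one bit per boolean value. It can be used similarly to an Array{Bool} array (which stores one byte per boolean value), and can be converted to/from the latter via Array(bitarray) and BitArray(array), respectively.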
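A short sketch of the round trip between the two representations, using arbitrary values:

```julia
bits  = BitArray([true, false, true])   # packed: one bit per value
bytes = Array(bits)                     # unpacked: an Array{Bool,1}
again = BitArray(bytes)                 # and back to the packed form

# Elementwise comparisons written with dot syntax also produce a BitArray.
cmp = [1, 2, 3] .> 1
```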

An array is “strided” if it is stored in memory with well-defined spacings (strides) between its elements. A strided array with a supported element type may be passed to an external (non-Julia) library like BLAS or LAPACK by simply passing its pointer and the stride for each dimension. The stride(A, d) is the distance between elements along dimension d. For example, the builtin Array returned by rand(5,7,2) has its elements arranged contiguously in column major order. This means that the stride of the first dimension — the spacing between elements in the same column — is 1:

    julia> A = rand(5,7,2);

    julia> stride(A,1)
    1

The stride of the second dimension is the spacing between elements in the same row, skipping as many elements as there are in a single column (5). Similarly, jumping between the two "pages" (in the third dimension) requires skipping 5*7 == 35 elements. The strides of this array are these three numbers collected in a tuple:

    julia> strides(A)
    (1, 5, 35)

In this particular case, the number of elements skipped in memory matches the number of linear indices skipped. This is only the case for contiguous arrays like Array (and other DenseArray subtypes) and is not true in general. Views with range indices are a good example of non-contiguous strided arrays; consider V = @view A[1:3:4, 2:2:6, 2:-1:1]. This view V refers to the same memory as A but is skipping and re-arranging some of its elements. The stride of the first dimension of V is 3 because we’re only selecting every third row from our original array:

    julia> V = @view A[1:3:4, 2:2:6, 2:-1:1];

    julia> stride(V, 1)
    3

This view is similarly selecting every other column from our original A — and thus it needs to skip the equivalent of two five-element columns when moving between indices in the second dimension:

    julia> stride(V, 2)
    10

The third dimension is interesting because its order is reversed! Thus to get from the first “page” to the second one it must go backwards in memory, and so its stride in this dimension is negative!

    julia> stride(V, 3)
    -35

This means that the pointer for V is actually pointing into the middle of A's memory block, and it refers to elements both backwards and forwards in memory. See the interface guide for strided arrays for more details on defining your own strided arrays. StridedVector and StridedMatrix are convenient aliases for many of the builtin array types that are considered strided arrays, allowing them to dispatch to select specialized implementations that call highly tuned and optimized BLAS and LAPACK functions using just the pointer and strides.

It is worth emphasizing that strides are about offsets in memory rather than indexing. If you are looking to convert between linear (single-index) indexing and cartesian (multi-index) indexing, see LinearIndices and CartesianIndices.
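To make the "offsets in memory" reading concrete, the sketch below recomputes the position of one element of the view V from the running example using only the strides. It relies on the fact, stated above, that for a contiguous Array the memory offset coincides with the linear index; the chosen element (2, 3, 2) is arbitrary.

```julia
A = rand(5, 7, 2)
V = @view A[1:3:4, 2:2:6, 2:-1:1]

# V[1, 1, 1] is A[1, 2, 2]; find where that element sits in A's memory.
first_offset = LinearIndices(A)[1, 2, 2]

# Moving to V[i, j, k] adds (index - 1) * stride along each dimension.
i, j, k = 2, 3, 2
offset = first_offset + sum(((i, j, k) .- 1) .* strides(V))

same_element = A[offset] == V[i, j, k]
```

The negative stride in the third dimension is what moves the offset backwards in memory, from page 2 of A to page 1.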

  • [1] iid, independently and identically distributed.