Guidelines for shell commands in the GitLab codebase

Guidelines for shell commands in the GitLab codebase
Guidelines for shell commands in the GitLab codebase

Guidelines for shell commands in the GitLab codebase

原文：https://docs.gitlab.com/ee/development/shell_commands.html

References
Use File and FileUtils instead of shell commands
Always use the configurable Git binary path for Git commands
Bypass the shell by splitting commands into separate tokens
Separate options from arguments with –
Do not use the backticks
Avoid user input at the start of path strings
Guard against path traversal
Properly anchor regular expressions to the start and end of strings

Guidelines for shell commands in the GitLab codebase

本文档包含使用 GitLab 代码库中的进程和文件的准则. 这些准则旨在使您的代码更加可靠和安全.

References

Use File and FileUtils instead of shell commands

有时，当还有 Ruby API 可以通过外壳调用基本的 Unix 命令时. 使用 Ruby API（如果存在）. http://www.ruby-doc.org/stdlib-2.0.0/libdoc/fileutils/rdoc/FileUtils.html#module-FileUtils-label-Module+Functions

# Wrong
system "mkdir -p tmp/special/directory"
# Better (separate tokens)
system *%W(mkdir -p tmp/special/directory)
# Best (do not use a shell command)
FileUtils.mkdir_p "tmp/special/directory"
# Wrong
contents = `cat #{filename}`
# Correct
contents = File.read(filename)
# Sometimes a shell command is just the best solution. The example below has no
# user input, and is hard to implement correctly in Ruby: delete all files and
# directories older than 120 minutes under /some/path, but not /some/path
# itself.
Gitlab::Popen.popen(%W(find /some/path -not -path /some/path -mmin +120 -delete))

这种编码风格可能阻止了 CVE-2013-4490.

Always use the configurable Git binary path for Git commands

# Wrong
system(*%W(git branch -d -- #{branch_name}))
# Correct
system(*%W(#{Gitlab.config.git.bin_path} branch -d -- #{branch_name}))

Bypass the shell by splitting commands into separate tokens

当我们将 shell 命令作为单个字符串传递给 Ruby 时，Ruby 将让/bin/sh评估整个字符串. 本质上，我们要求外壳程序评估单行脚本. 这会造成外壳注入攻击的风险. 最好自己将 shell 命令拆分为令牌. 有时，我们使用外壳程序的脚本功能来更改工作目录或设置环境变量. 所有这些都可以直接从 Ruby 安全地实现

# Wrong
system "cd /home/git/gitlab && bundle exec rake db:#{something} RAILS_ENV=production"
# Correct
system({'RAILS_ENV' => 'production'}, *%W(bundle exec rake db:#{something}), chdir: '/home/git/gitlab')
# Wrong
system "touch #{myfile}"
# Better
system "touch", myfile
# Best (do not run a shell command at all)
FileUtils.touch myfile

这种编码风格可能阻止了 CVE-2013-4546.

Separate options from arguments with –

使用--使系统命令的参数解析器可以清楚了解选项和参数之间的区别. 许多但并非所有 Unix 命令都支持此功能.

要了解什么--不，请考虑以下问题.

# Example
$ echo hello > -l
$ cat -l
cat: illegal option -- l
usage: cat [-benstuv] [file ...]

在上面的示例中， cat的参数解析器假定-l是一个选项. 在上面的例子中的解决方案是明确告诉cat那-l实在是一个论点，不是一种选择. 许多 Unix 命令行工具都遵循用--分隔选项和参数的约定.

# Example (continued)
$ cat -- -l
hello

在 GitLab 代码库中，我们总是通过对支持它的命令使用--来避免选项/参数的歧义.

# Wrong
system(*%W(#{Gitlab.config.git.bin_path} branch -d #{branch_name}))
# Correct
system(*%W(#{Gitlab.config.git.bin_path} branch -d -- #{branch_name}))

这种编码风格可能阻止了 CVE-2013-4582.

Do not use the backticks

用反引号捕获 shell 命令的输出很不错，但是您不得不将命令作为一个字符串传递给 shell. 上面我们解释了这是不安全的. 在主要的 GitLab 代码库中，解决方案是改用Gitlab::Popen.popen .

# Wrong
logs = `cd #{repo_dir} && #{Gitlab.config.git.bin_path} log`
# Correct
logs, exit_status = Gitlab::Popen.popen(%W(#{Gitlab.config.git.bin_path} log), repo_dir)
# Wrong
user = `whoami`
# Correct
user, exit_status = Gitlab::Popen.popen(%W(whoami))

在其他存储库（如 GitLab Shell）中，您也可以使用IO.popen .

# Safe IO.popen example
logs = IO.popen(%W(#{Gitlab.config.git.bin_path} log), chdir: repo_dir) { |p| p.read }

请注意，与Gitlab::Popen.popen不同， IO.popen不会捕获标准错误.

Avoid user input at the start of path strings

可以使用各种在 Ruby 中打开和读取文件的方法来读取进程的标准输出而不是文件. 以下两个命令大致相同：

`touch /tmp/pawned-by-backticks`
File.read('|touch /tmp/pawned-by-file-read')

关键是打开一个以” | “开头的”文件” | . 受影响的方法包括 Kernel＃open，File :: read，File :: open，IO :: open 和 IO :: read.

您可以通过确保攻击者无法控制要打开的文件名字符串的开头来防止”打开”和”读取”这种行为. 例如，下面的内容足以防止意外地使用|启动 shell 命令| ：

# we assume repo_path is not controlled by the attacker (user)
path = File.join(repo_path, user_input)
# path cannot start with '|' now.
File.read(path)

如果必须使用用户输入的相对路径，请在路径前添加./ .

前缀用户提供的路径还提供了针对以-开头的路径的额外保护（请参阅上面有关使用--的讨论）.

Guard against path traversal

路径遍历是一种安全措施，程序（GitLab）试图限制用户对磁盘上某个目录的访问，但用户设法利用../路径符号来打开该目录之外的文件.

# Suppose the user gave us a path and they are trying to trick us
user_input = '../other-repo.git/other-file'
# We look up the repo path somewhere
repo_path = 'repositories/user-repo.git'
# The intention of the code below is to open a file under repo_path, but
# because the user used '..' they can 'break out' into
# 'repositories/other-repo.git'
full_path = File.join(repo_path, user_input)
File.open(full_path) do # Oops!

防止这种情况发生的好方法是根据 Ruby 的File.absolute_path将完整路径与其”绝对路径”进行File.absolute_path .

full_path = File.join(repo_path, user_input)
if full_path != File.absolute_path(full_path)
  raise "Invalid path: #{full_path.inspect}"
end
File.open(full_path) do # Etc.

这样的检查可以避免 CVE-2013-4583.

Properly anchor regular expressions to the start and end of strings

当使用正则表达式来验证作为参数传递给 shell 命令的用户输入时，请确保使用\A和\z定位符来指定字符串的开头和结尾，而不是^和$ ，或者不要使用定位符所有.

如果您不这样做，攻击者可能会使用它来执行具有潜在有害影响的命令.

例如，当如下所示验证项目的import_url时，用户可以诱使 GitLab 从本地文件系统上的 Git 存储import_url .

validates :import_url, format: { with: URI.regexp(%w(ssh git http https)) }
# URI.regexp(%w(ssh git http https)) roughly evaluates to /(ssh|git|http|https):(something_that_looks_like_a_url)/

假设用户提交以下内容作为其导入 URL：

file://git:/tmp/lol

由于使用的正则表达式中没有锚，因此值中的git:/tmp/lol将匹配，并且验证将通过.

导入时，GitLab 将执行以下命令，将import_url作为参数传递：

git clone file://git:/tmp/lol

Git 只会忽略git:部分，将路径解释为file:///tmp/lol ，然后将存储库导入到新项目中. 此操作可能使攻击者可以访问系统中的任何存储库（无论是否私有）.