The json Module: Working with JSON Data

JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for humans to read and write, and easy for machines to parse and generate.

JSON Basics

JSON is built on two basic structures: collections of name/value pairs and arrays.

JSON is made up of the following forms:

  • object - an unordered collection, written in curly braces:
    • { pair_1, pair_2, …, pair_n }
  • pair - a name/value pair, written as:
    • string : value
  • array - an ordered sequence, written in square brackets:
    • [ value_1, value_2, …, value_n ]
  • value - a value, which can be any of:
    • string
    • number
    • object
    • array
    • true / false / null (special values)
  • string - text enclosed in double quotes

Example:

    {
        "name": "echo",
        "age": 24,
        "coding skills": ["python", "matlab", "java", "c", "c++", "ruby", "scala"],
        "ages for school": {
            "primary school": 6,
            "middle school": 9,
            "high school": 15,
            "university": 18
        },
        "hobby": ["sports", "reading"],
        "married": false
    }
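The grammar above is stricter than Python's literal syntax, which is easy to overlook. The following is a minimal sketch, assuming a hypothetical helper named is_valid_json and made-up sample strings. It relies on json.loads, covered in the next section, which raises ValueError for text that is not valid JSON.

```python
import json


def is_valid_json(text):
    """Return True if text parses as JSON (hypothetical helper for this example)."""
    try:
        json.loads(text)
    except ValueError:  # json.JSONDecodeError subclasses ValueError in Python 3
        return False
    return True


candidates = [
    '{"name": "echo", "married": false}',  # valid object
    '["sports", "reading"]',               # valid array
    "{'name': 'echo'}",                    # invalid: single-quoted strings
    '{"married": False}',                  # invalid: Python spelling of false
    '["sports", "reading",]',              # invalid: trailing comma
]

results = [is_valid_json(text) for text in candidates]
print(results)
```

Only the first two candidates follow the forms listed above, so only they are accepted.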

Converting Between JSON and Python

Suppose the JSON object above has been stored in a string:

In [1]:

    import json
    from pprint import pprint

    info_string = """
    {
        "name": "echo",
        "age": 24,
        "coding skills": ["python", "matlab", "java", "c", "c++", "ruby", "scala"],
        "ages for school": {
            "primary school": 6,
            "middle school": 9,
            "high school": 15,
            "university": 18
        },
        "hobby": ["sports", "reading"],
        "married": false
    }
    """

We can use json.loads() (load string) to read JSON data from a string:

In [2]:

    info = json.loads(info_string)

    pprint(info)

    {u'age': 24,
     u'ages for school': {u'high school': 15,
                          u'middle school': 9,
                          u'primary school': 6,
                          u'university': 18},
     u'coding skills': [u'python',
                        u'matlab',
                        u'java',
                        u'c',
                        u'c++',
                        u'ruby',
                        u'scala'],
     u'hobby': [u'sports', u'reading'],
     u'married': False,
     u'name': u'echo'}

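The original JSON data has now become a Python object. In our example it is a dictionary, but depending on the JSON text it could be another type, such as a list. The u prefix in the output marks Python 2 unicode strings.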

In [3]:

    type(info)

Out[3]:

    dict
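Which Python type comes back depends on the top-level JSON value. The following is a minimal sketch of that mapping, assuming made-up sample values that are not part of the example above:

```python
import json

# One sample per JSON value type; the sample values are made up for illustration.
samples = [
    '{"age": 24}',            # object
    '["sports", "reading"]',  # array
    '"echo"',                 # string
    '24',                     # number without a fraction
    '1.5',                    # number with a fraction
    'true',
    'false',
    'null',
]

decoded = [json.loads(text) for text in samples]

for text, value in zip(samples, decoded):
    print("%-24s -> %s" % (text, type(value).__name__))
```

An object becomes a dict, an array a list, numbers an int or a float, true and false the booleans True and False, and null becomes None.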

We can use json.dumps() to turn a Python object into a JSON string:

In [4]:

    info_json = json.dumps(info)

    print(info_json)

    {"name": "echo", "age": 24, "married": false, "ages for school": {"middle school": 9, "university": 18, "high school": 15, "primary school": 6}, "coding skills": ["python", "matlab", "java", "c", "c++", "ruby", "scala"], "hobby": ["sports", "reading"]}

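As the output shows, the array in the generated JSON string keeps its element order (it is always ["python", "matlab", "java", "c", "c++", "ruby", "scala"]), whereas the order of the members of an object is not guaranteed.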
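When a predictable member order matters, for example when comparing two dumps, json.dumps() accepts the sort_keys and indent keyword arguments. The following is a minimal sketch, assuming a shortened, made-up version of the info dictionary:

```python
import json

# A shortened stand-in for the info dictionary used in this section.
info = {"name": "echo", "age": 24, "hobby": ["sports", "reading"], "married": False}

# sort_keys=True emits object members in sorted key order.
compact = json.dumps(info, sort_keys=True)

# indent=2 additionally spreads the output over several lines.
pretty = json.dumps(info, sort_keys=True, indent=2)

print(compact)
print(pretty)
```

Sorting changes only the order of object members; the array still keeps the order of its elements.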

Writing and Reading JSON Files

As with pickle, we can read JSON data directly from a file and save an object to a file in JSON format.

  • json.dump(obj, file) saves an object to a file in JSON format
  • json.load(file) reads data from a JSON file

In [5]:

    with open("info.json", "w") as f:
        json.dump(info, f)

We can look at the contents of info.json:

In [6]:

    with open("info.json") as f:
        print(f.read())

    {"name": "echo", "age": 24, "married": false, "ages for school": {"middle school": 9, "university": 18, "high school": 15, "primary school": 6}, "coding skills": ["python", "matlab", "java", "c", "c++", "ruby", "scala"], "hobby": ["sports", "reading"]}

Reading the data back from the file:

In [7]:

    with open("info.json") as f:
        info_from_file = json.load(f)

    pprint(info_from_file)

    {u'age': 24,
     u'ages for school': {u'high school': 15,
                          u'middle school': 9,
                          u'primary school': 6,
                          u'university': 18},
     u'coding skills': [u'python',
                        u'matlab',
                        u'java',
                        u'c',
                        u'c++',
                        u'ruby',
                        u'scala'],
     u'hobby': [u'sports', u'reading'],
     u'married': False,
     u'name': u'echo'}
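The write and read steps can be combined into a single round-trip check. The following is a minimal sketch, assuming a made-up record and a temporary file from the standard tempfile module in place of info.json:

```python
import json
import os
import tempfile

# A made-up record standing in for the info dictionary.
record = {"name": "echo", "hobby": ["sports", "reading"], "married": False}

# mkstemp returns an open OS-level handle plus the file path; close the
# handle right away and reopen the path with open() below.
fd, path = tempfile.mkstemp(suffix=".json")
os.close(fd)

try:
    with open(path, "w") as f:
        json.dump(record, f)
    with open(path) as f:
        restored = json.load(f)
finally:
    # Remove the temporary file even if dumping or loading fails.
    os.remove(path)

print(restored == record)
```

For data made only of dictionaries with string keys, lists, strings, numbers, booleans and None, the loaded object compares equal to the one that was saved.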

Delete the generated file:

In [8]:

    import os
    os.remove("info.json")

Source: https://nbviewer.jupyter.org/github/lijin-THU/notes-python/blob/master/11-useful-tools/11.03-json.ipynb