9. Improving the readability of boolean indexing with the query method

  # Read the employee data, then choose the departments and columns to select
  In[65]: employee = pd.read_csv('data/employee.csv')
          depts = ['Houston Police Department-HPD', 'Houston Fire Department (HFD)']
          select_columns = ['UNIQUE_ID', 'DEPARTMENT', 'GENDER', 'BASE_SALARY']
  # Build the query string and pass it to the query method
  In[66]: qs = "DEPARTMENT in @depts " \
               "and GENDER == 'Female' " \
               "and 80000 <= BASE_SALARY <= 120000"
          emp_filtered = employee.query(qs)
          emp_filtered[select_columns].head()
  Out[66]:

Figure 1: Out[66], the first five rows of emp_filtered[select_columns]
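
For comparison, the same filter can be written with ordinary boolean indexing. The sketch below is only an illustration: the five rows are invented (they are not taken from data/employee.csv), and the names by_query, by_mask and mask are introduced here. Inside the query string, the @ prefix refers to a Python variable in the surrounding scope, and the chained comparison on BASE_SALARY is inclusive at both ends, which matches the default behaviour of Series.between.

```python
import pandas as pd

# Hypothetical sample rows standing in for data/employee.csv
employee = pd.DataFrame({
    'UNIQUE_ID': [0, 1, 2, 3, 4],
    'DEPARTMENT': ['Houston Police Department-HPD',
                   'Houston Fire Department (HFD)',
                   'Library',
                   'Houston Police Department-HPD',
                   'Houston Fire Department (HFD)'],
    'GENDER': ['Female', 'Male', 'Female', 'Female', 'Female'],
    'BASE_SALARY': [90000.0, 95000.0, 100000.0, 60000.0, 120000.0],
})
depts = ['Houston Police Department-HPD', 'Houston Fire Department (HFD)']

# Version 1: the query method, as in the recipe
qs = ("DEPARTMENT in @depts "
      "and GENDER == 'Female' "
      "and 80000 <= BASE_SALARY <= 120000")
by_query = employee.query(qs)

# Version 2: the equivalent boolean mask
mask = (employee['DEPARTMENT'].isin(depts)
        & (employee['GENDER'] == 'Female')
        & employee['BASE_SALARY'].between(80000, 120000))
by_mask = employee[mask]

# Both forms should select exactly the same rows
print(by_query.equals(by_mask))
```

The query string reads close to plain English and avoids repeating the DataFrame name and the parentheses that the mask version needs around each condition.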

There's more

  # Instead of typing the department list by hand, build it from the data:
  # take the 10 most frequent departments, then select female employees
  # who are not in any of them
  In[67]: top10_depts = employee.DEPARTMENT.value_counts().index[:10].tolist()
          qs = "DEPARTMENT not in @top10_depts and GENDER == 'Female'"
          employee_filtered2 = employee.query(qs)
          employee_filtered2[['DEPARTMENT', 'GENDER']].head()
  Out[67]:

Figure 2: Out[67], the first five rows of employee_filtered2[['DEPARTMENT', 'GENDER']]
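
The `not in` form can be tried on a small scale as follows. This is a minimal sketch on invented data: the department names A to D and the variable top2_depts are hypothetical, and the top 2 departments are used instead of the top 10 so that the result is easy to check by eye.

```python
import pandas as pd

# Hypothetical data: A appears 3 times, B twice, C and D once each
employee = pd.DataFrame({
    'DEPARTMENT': ['A', 'A', 'A', 'B', 'B', 'C', 'D'],
    'GENDER': ['Female', 'Male', 'Female', 'Female', 'Male', 'Female', 'Male'],
})

# value_counts sorts by frequency (descending), so the first two
# index labels are the two most common departments
top2_depts = employee['DEPARTMENT'].value_counts().index[:2].tolist()

# Female employees outside the most common departments
qs = "DEPARTMENT not in @top2_depts and GENDER == 'Female'"
filtered = employee.query(qs)

# The same selection as a boolean mask, for comparison
mask = ~employee['DEPARTMENT'].isin(top2_depts) & (employee['GENDER'] == 'Female')

print(filtered)
```

Only the single female employee in department C should remain, since A and B are excluded and the employee in D is male.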