5. Selecting scalars quickly
# Assign the row label to a variable, then select the value with .loc
In[37]: college = pd.read_csv('data/college.csv', index_col='INSTNM')
cn = 'Texas A & M University-College Station'
college.loc[cn, 'UGDS_WHITE']
Out[37]: 0.66099999999999992
# .at does the same thing
In[38]: college.at[cn, 'UGDS_WHITE']
Out[38]: 0.66099999999999992
# Compare the speed of the two with the %timeit magic command
In[39]: %timeit college.loc[cn, 'UGDS_WHITE']
Out[39]: 9.93 µs ± 274 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In[40]: %timeit college.at[cn, 'UGDS_WHITE']
Out[40]: 6.69 µs ± 223 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
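The .iat and .at indexers accept only scalar values. They are meant as drop-in replacements for .iloc and .loc when selecting a single scalar, and they save a few microseconds per lookup (roughly 3 to 4 µs in the timings shown in this section).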
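The scalar-only restriction can be illustrated with a small sketch. The toy DataFrame below is hypothetical and merely stands in for college.csv. The school names and the two extra values are made up for illustration. As far as I know, .iat raises a ValueError when given a list, but the exact exception type may differ between pandas versions, so the sketch catches TypeError as well.

```python
import pandas as pd

# Hypothetical toy data standing in for college.csv (made up for illustration)
df = pd.DataFrame(
    {"STABBR": ["TX", "CA", "IL"], "UGDS_WHITE": [0.661, 0.3615, 0.4552]},
    index=["School A", "School B", "School C"],
)

# Scalar lookups: .at works by label, .iat by integer position
by_label = df.at["School B", "UGDS_WHITE"]
by_position = df.iat[1, 1]

# They return the same values as .loc and .iloc for a single cell
same_as_loc = df.loc["School B", "UGDS_WHITE"] == by_label
same_as_iloc = df.iloc[1, 1] == by_position

# .loc also accepts a list of labels and returns a Series ...
subset = df.loc[["School A", "School B"], "UGDS_WHITE"]

# ... whereas .iat refuses anything that is not a single integer per axis
try:
    df.iat[[0, 1], 1]
    iat_rejected_list = False
except (ValueError, TypeError):
    iat_rejected_list = True

print(by_label, by_position, len(subset), iat_rejected_list)
```

In other words, .at and .iat trade flexibility for speed: use them only when exactly one cell is needed.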
# Use get_loc to find the integer positions, then compare the speed again
In[41]: row_num = college.index.get_loc(cn)
col_num = college.columns.get_loc('UGDS_WHITE')
In[42]: row_num, col_num
Out[42]: (3765, 10)
In[43]: %timeit college.iloc[row_num, col_num]
Out[43]: 11.1 µs ± 426 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In[44]: %timeit college.iat[row_num, col_num]
Out[44]: 7.47 µs ± 109 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In[45]: %timeit college.iloc[5, col_num]
Out[45]: 10.8 µs ± 467 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In[46]: %timeit college.iat[5, col_num]
Out[46]: 7.12 µs ± 297 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
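The get_loc-then-.iat pattern above can be wrapped in a small helper. This is only a sketch: the function name scalar_via_iat and the toy data are hypothetical, and it assumes both labels are unique. With duplicate labels, get_loc returns a slice or a boolean mask instead of an integer, so the helper checks for that and raises.

```python
import numbers

import pandas as pd


def scalar_via_iat(frame, row_label, col_label):
    """Look up one cell by label, going through integer positions and .iat.

    Hypothetical helper for illustration; assumes unique labels.
    """
    row_pos = frame.index.get_loc(row_label)
    col_pos = frame.columns.get_loc(col_label)
    # With duplicate labels, get_loc returns a slice or boolean mask, not an int
    if not isinstance(row_pos, numbers.Integral) or not isinstance(
        col_pos, numbers.Integral
    ):
        raise ValueError("labels must be unique to map to a single position")
    return frame.iat[row_pos, col_pos]


# Hypothetical toy data standing in for college.csv (made up for illustration)
df = pd.DataFrame(
    {"STABBR": ["TX", "CA", "IL"], "UGDS_WHITE": [0.661, 0.3615, 0.4552]},
    index=["School A", "School B", "School C"],
)

value = scalar_via_iat(df, "School C", "UGDS_WHITE")
print(value == df.at["School C", "UGDS_WHITE"])
```

Note that the two get_loc calls have a cost of their own, so this route only pays off when the positions are computed once and reused, as in the timings above.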
There's more
# Series objects can also use .iat and .at to select scalars
In[47]: state = college['STABBR']
In[48]: state.iat[1000]
Out[48]: 'IL'
In[49]: state.at['Stanford University']
Out[49]: 'CA'