8. Reshaping

For details, see the sections on Hierarchical Indexing and Reshaping.

Stack

    In [95]: tuples = list(zip(*[['bar', 'bar', 'baz', 'baz',
       ....:                      'foo', 'foo', 'qux', 'qux'],
       ....:                     ['one', 'two', 'one', 'two',
       ....:                      'one', 'two', 'one', 'two']]))
       ....:

    In [96]: index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])

    In [97]: df = pd.DataFrame(np.random.randn(8, 2), index=index, columns=['A', 'B'])

    In [98]: df2 = df[:4]

    In [99]: df2
    Out[99]:
                         A         B
    first second
    bar   one     0.029399 -0.542108
          two     0.282696 -0.087302
    baz   one    -1.575170  1.771208
          two     0.816482  1.100230

The stack() method compresses a level of the DataFrame's columns into the index:

    In [100]: stacked = df2.stack()

    In [101]: stacked
    Out[101]:
    first  second
    bar    one     A    0.029399
                   B   -0.542108
           two     A    0.282696
                   B   -0.087302
    baz    one     A   -1.575170
                   B    1.771208
           two     A    0.816482
                   B    1.100230
    dtype: float64

For a stacked DataFrame or Series (one with a MultiIndex as its index), the inverse of stack() is unstack(), which by default unstacks the last level:

    In [102]: stacked.unstack()
    Out[102]:
                         A         B
    first second
    bar   one     0.029399 -0.542108
          two     0.282696 -0.087302
    baz   one    -1.575170  1.771208
          two     0.816482  1.100230

    In [103]: stacked.unstack(1)
    Out[103]:
    second        one       two
    first
    bar   A  0.029399  0.282696
          B -0.542108 -0.087302
    baz   A -1.575170  0.816482
          B  1.771208  1.100230

    In [104]: stacked.unstack(0)
    Out[104]:
    first          bar       baz
    second
    one    A  0.029399 -1.575170
           B -0.542108  1.771208
    two    A  0.282696  0.816482
           B -0.087302  1.100230
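
Because the session above uses random values, the relationship between stack() and unstack() can be easier to check on fixed data. The following is a minimal sketch, not part of the original tutorial: the frame `demo` and its values are made up for illustration, and the code assumes a reasonably recent pandas release.

```python
import pandas as pd

# Hypothetical fixed-value frame with the same index layout as df2 above.
index = pd.MultiIndex.from_tuples(
    [("bar", "one"), ("bar", "two"), ("baz", "one"), ("baz", "two")],
    names=["first", "second"],
)
demo = pd.DataFrame(
    {"A": [1.0, 2.0, 3.0, 4.0], "B": [5.0, 6.0, 7.0, 8.0]},
    index=index,
)

# stack() moves the column labels into a new innermost index level,
# so the two-level index becomes a three-level index on a Series.
stacked_demo = demo.stack()
print(stacked_demo.index.nlevels)  # 3

# unstack() with no argument pivots the innermost level back into columns.
round_trip = stacked_demo.unstack()
print(round_trip)

# A level can be chosen by position or by name; both calls pivot 'second'.
by_position = stacked_demo.unstack(1)
by_name = stacked_demo.unstack("second")
print(by_name)
```

Selecting a level by name rather than by position tends to be more robust when the index layout might change later.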

Pivot Tables

For details, see the section on Pivot Tables.

    In [105]: df = pd.DataFrame({'A': ['one', 'one', 'two', 'three'] * 3,
       .....:                    'B': ['A', 'B', 'C'] * 4,
       .....:                    'C': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'] * 2,
       .....:                    'D': np.random.randn(12),
       .....:                    'E': np.random.randn(12)})
       .....:

    In [106]: df
    Out[106]:
            A  B    C         D         E
    0     one  A  foo  1.418757 -0.179666
    1     one  B  foo -1.879024  1.291836
    2     two  C  foo  0.536826 -0.009614
    3   three  A  bar  1.006160  0.392149
    4     one  B  bar -0.029716  0.264599
    5     one  C  bar -1.146178 -0.057409
    6     two  A  foo  0.100900 -1.425638
    7   three  B  foo -1.035018  1.024098
    8     one  C  foo  0.314665 -0.106062
    9     one  A  bar -0.773723  1.824375
    10    two  B  bar -1.170653  0.595974
    11  three  C  bar  0.648740  1.167115

We can easily produce a pivot table from this data:

    In [107]: pd.pivot_table(df, values='D', index=['A', 'B'], columns=['C'])
    Out[107]:
    C             bar       foo
    A     B
    one   A -0.773723  1.418757
          B -0.029716 -1.879024
          C -1.146178  0.314665
    three A  1.006160       NaN
          B       NaN -1.035018
          C  0.648740       NaN
    two   A       NaN  0.100900
          B -1.170653       NaN
          C       NaN  0.536826
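
In the table above every (A, B, C) combination occurs at most once in df, so each cell simply repeats a value from column D, and combinations that never occur show up as NaN. The sketch below is an illustration on made-up data (the frame `demo` and its values are assumptions, not part of the original tutorial). It shows what happens when a combination occurs more than once, and how the `aggfunc` and `fill_value` parameters of pd.pivot_table change the result.

```python
import pandas as pd

# Hypothetical data: the first two rows share the same (A, B, C) combination.
demo = pd.DataFrame({
    "A": ["one", "one", "one", "two"],
    "B": ["x", "x", "y", "x"],
    "C": ["foo", "foo", "bar", "bar"],
    "D": [1.0, 3.0, 5.0, 7.0],
})

# By default pivot_table aggregates duplicates with the mean,
# so the ('one', 'x') / 'foo' cell holds (1.0 + 3.0) / 2.
table = pd.pivot_table(demo, values="D", index=["A", "B"], columns=["C"])
print(table)

# A different aggregation can be requested with aggfunc, and
# fill_value replaces the NaN cells for combinations with no rows.
totals = pd.pivot_table(
    demo,
    values="D",
    index=["A", "B"],
    columns=["C"],
    aggfunc="sum",
    fill_value=0,
)
print(totals)
```

Whether a filled-in 0 is appropriate depends on the data: for sums it usually reads naturally, while for means it could be mistaken for a real observation.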