6. Merge

pandas provides many methods for easily combining Series, DataFrame, and Panel objects according to various kinds of set logic. For details, see: Merging

Concat

    In [73]: df = pd.DataFrame(np.random.randn(10, 4))

    In [74]: df
    Out[74]:
              0         1         2         3
    0 -0.548702  1.467327 -1.015962 -0.483075
    1  1.637550 -1.217659 -0.291519 -1.745505
    2 -0.263952  0.991460 -0.919069  0.266046
    3 -0.709661  1.669052  1.037882 -1.705775
    4 -0.919854 -0.042379  1.247642 -0.009920
    5  0.290213  0.495767  0.362949  1.548106
    6 -1.131345 -0.089329  0.337863 -0.945867
    7 -0.932132  1.956030  0.017587 -0.016692
    8 -0.575247  0.254161 -1.143704  0.215897
    9  1.193555 -0.077118 -0.408530 -0.862495

    # break it into pieces
    In [75]: pieces = [df[:3], df[3:7], df[7:]]

    In [76]: pd.concat(pieces)
    Out[76]:
              0         1         2         3
    0 -0.548702  1.467327 -1.015962 -0.483075
    1  1.637550 -1.217659 -0.291519 -1.745505
    2 -0.263952  0.991460 -0.919069  0.266046
    3 -0.709661  1.669052  1.037882 -1.705775
    4 -0.919854 -0.042379  1.247642 -0.009920
    5  0.290213  0.495767  0.362949  1.548106
    6 -1.131345 -0.089329  0.337863 -0.945867
    7 -0.932132  1.956030  0.017587 -0.016692
    8 -0.575247  0.254161 -1.143704  0.215897
    9  1.193555 -0.077118 -0.408530 -0.862495
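
The Concat session above can also be run as a standalone script. The following is a minimal sketch, not part of the original tutorial: the fixed seed and the name `rebuilt` are assumptions added for reproducibility and illustration.

```python
import numpy as np
import pandas as pd

# Fixed seed so the example is reproducible (the seed value is arbitrary).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((10, 4)))

# Break the frame into three row-wise pieces, then stitch them back together.
pieces = [df[:3], df[3:7], df[7:]]
rebuilt = pd.concat(pieces)

# Each piece keeps its original index labels, so the rows come back
# in the same order and under the same labels as in df.
print(rebuilt.shape)
print(list(rebuilt.index))
```

Because `pd.concat` stacks the pieces along the row axis by default, the result has the same 10 rows and 4 columns as the original frame.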

Join

SQL-style merges. For details, see: Database-style joining

    In [77]: left = pd.DataFrame({'key': ['foo', 'foo'], 'lval': [1, 2]})

    In [78]: right = pd.DataFrame({'key': ['foo', 'foo'], 'rval': [4, 5]})

    In [79]: left
    Out[79]:
       key  lval
    0  foo     1
    1  foo     2

    In [80]: right
    Out[80]:
       key  rval
    0  foo     4
    1  foo     5

    In [81]: pd.merge(left, right, on='key')
    Out[81]:
       key  lval  rval
    0  foo     1     4
    1  foo     1     5
    2  foo     2     4
    3  foo     2     5
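
The merge above returns four rows because the key `foo` appears twice on each side, so every left row is paired with every matching right row. A small sketch of that row count, using the same data (the names `merged` and `pairs` are illustrative, not from the original):

```python
import pandas as pd

left = pd.DataFrame({'key': ['foo', 'foo'], 'lval': [1, 2]})
right = pd.DataFrame({'key': ['foo', 'foo'], 'rval': [4, 5]})

merged = pd.merge(left, right, on='key')

# Each of the 2 left rows matches each of the 2 right rows: 2 * 2 = 4 rows.
pairs = list(zip(merged['lval'], merged['rval']))
print(len(merged))
print(pairs)
```

This is the same behavior as a SQL inner join on a non-unique key.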

Another example:

    In [82]: left = pd.DataFrame({'key': ['foo', 'bar'], 'lval': [1, 2]})

    In [83]: right = pd.DataFrame({'key': ['foo', 'bar'], 'rval': [4, 5]})

    In [84]: left
    Out[84]:
       key  lval
    0  foo     1
    1  bar     2

    In [85]: right
    Out[85]:
       key  rval
    0  foo     4
    1  bar     5

    In [86]: pd.merge(left, right, on='key')
    Out[86]:
       key  lval  rval
    0  foo     1     4
    1  bar     2     5
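
When the keys are unique on both sides, as in this second example, each left row matches exactly one right row. One possible way to make that expectation explicit is the `validate` argument of `pd.merge`; the sketch below is an addition to the original tutorial, and the frame `dup` and the flag `raised` are hypothetical names used only for illustration.

```python
import pandas as pd

left = pd.DataFrame({'key': ['foo', 'bar'], 'lval': [1, 2]})
right = pd.DataFrame({'key': ['foo', 'bar'], 'rval': [4, 5]})

# validate='one_to_one' asks pandas to check that the merge keys
# are unique in both the left and the right frame.
merged = pd.merge(left, right, on='key', validate='one_to_one')
print(len(merged))

# With a duplicated key on the right, the same check raises MergeError.
dup = pd.DataFrame({'key': ['foo', 'foo'], 'rval': [4, 5]})
try:
    pd.merge(left, dup, on='key', validate='one_to_one')
    raised = False
except pd.errors.MergeError:
    raised = True
print(raised)
```

Without `validate`, the merge with duplicated keys would succeed silently and produce extra rows, as in the first Join example.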

Append

Append a row to a DataFrame. For details, see: Appending

    In [87]: df = pd.DataFrame(np.random.randn(8, 4), columns=['A','B','C','D'])

    In [88]: df
    Out[88]:
              A         B         C         D
    0  1.346061  1.511763  1.627081 -0.990582
    1 -0.441652  1.211526  0.268520  0.024580
    2 -1.577585  0.396823 -0.105381 -0.532532
    3  1.453749  1.208843 -0.080952 -0.264610
    4 -0.727965 -0.589346  0.339969 -0.693205
    5 -0.339355  0.593616  0.884345  1.591431
    6  0.141809  0.220390  0.435589  0.192451
    7 -0.096701  0.803351  1.715071 -0.708758

    In [89]: s = df.iloc[3]

    In [90]: df.append(s, ignore_index=True)
    Out[90]:
              A         B         C         D
    0  1.346061  1.511763  1.627081 -0.990582
    1 -0.441652  1.211526  0.268520  0.024580
    2 -1.577585  0.396823 -0.105381 -0.532532
    3  1.453749  1.208843 -0.080952 -0.264610
    4 -0.727965 -0.589346  0.339969 -0.693205
    5 -0.339355  0.593616  0.884345  1.591431
    6  0.141809  0.220390  0.435589  0.192451
    7 -0.096701  0.803351  1.715071 -0.708758
    8  1.453749  1.208843 -0.080952 -0.264610
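
Note that `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0, so the session above only runs on older versions. On current versions, one way to get the same result is `pd.concat`. The following is a sketch under that assumption; the fixed seed and the name `appended` are illustrative and not part of the original.

```python
import numpy as np
import pandas as pd

# Fixed seed so the example is reproducible (the seed value is arbitrary).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((8, 4)), columns=['A', 'B', 'C', 'D'])

s = df.iloc[3]

# s.to_frame().T turns the Series into a one-row DataFrame with columns A..D,
# which pd.concat can then stack under df. ignore_index=True renumbers the rows.
appended = pd.concat([df, s.to_frame().T], ignore_index=True)
print(appended.shape)
print(appended.tail(2))
```

The last row of `appended` is a copy of row 3, matching what `df.append(s, ignore_index=True)` produced in older pandas versions.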