Pandas assert_frame_equal error

Posted 2024-10-01 09:17:36

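I am building test cases and want to compare two DataFrames. Even though the DataFrames have the same columns and values, assert_frame_equal reports that they are not equal. The column order differs; I tried reordering the columns, without success.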

In my test case, I'm using the following function:

testing.assert_frame_equal(expected, tested, check_dtype=False)

The first DataFrame is declared as follows:

^{pr2}$

Pandas DataFrame pd2:

    artista artista_sugerido  busqueda media_sugerido   mid_sugerido  \
0   Beyoncé          Beyoncé   Beyoncé          album  /g/11bz0dg4b_   
1  Radiolab         Radiolab  Radiolab          album  /g/11bt_6j9dk   
2      Xmas             None      Xmas          track  /g/11c2nz8jc2   
3   Beyonce          Beyonce   Beyonce          album  /g/11bt_6jXXX   

                      texto            texto_sugerido  
0                  Lemonade                  Lemonade  
1                  Radiolab                  Radiolab  
2  Merry Christmas Lil Mama  Merry Christmas Lil Mama  
3                   Beyonce                   Beyonce  

The second DataFrame is the one returned by the function (result).

    artista  busqueda   mid_sugerido                     texto  \
0   Beyoncé   Beyoncé  /g/11bz0dg4b_                  Lemonade   
1  Radiolab  Radiolab  /g/11bt_6j9dk                  Radiolab   
2      Xmas      Xmas  /g/11c2nz8jc2  Merry Christmas Lil Mama   
3   Beyonce   Beyonce  /g/11bt_6jXXX                   Beyonce   

             texto_sugerido artista_sugerido media_sugerido  
0                  Lemonade          Beyoncé          album  
1                  Radiolab         Radiolab          album  
2  Merry Christmas Lil Mama             None          track  
3                   Beyonce          Beyonce          album 

Running assert_frame_equal(df2, result) produces the following error:

Traceback (most recent call last):
  File "/Users/spicyramen/Documents/Development/parzee/python/coverage/experimental/pandas_creation.py", line 158, in <module>
    assert_frame_equal(df6, _Normalize(df5, test_dict))
  File "/Users/spicyramen/Documents/Development/parzee/python/coverage/experimental/pandas_creation.py", line 16, in assert_frame_equal
    testing.assert_frame_equal(expected, tested, check_dtype=False)
  File "/Library/Python/2.7/site-packages/pandas/util/testing.py", line 1142, in assert_frame_equal
    obj='{0}.columns'.format(obj))
  File "/Library/Python/2.7/site-packages/pandas/util/testing.py", line 761, in assert_index_equal
    obj=obj, lobj=left, robj=right)
  File "pandas/src/testing.pyx", line 58, in pandas._testing.assert_almost_equal (pandas/src/testing.c:3887)
  File "pandas/src/testing.pyx", line 147, in pandas._testing.assert_almost_equal (pandas/src/testing.c:2769)
  File "/Library/Python/2.7/site-packages/pandas/util/testing.py", line 915, in raise_assert_detail
    raise AssertionError(msg)
AssertionError: DataFrame.columns are different

DataFrame.columns values are different (85.71429 %)
[left]:  Index([u'artista', u'artista_sugerido', u'busqueda', u'media_sugerido',
       u'mid_sugerido', u'texto', u'texto_sugerido'],
      dtype='object')
[right]: Index([u'artista', u'busqueda', u'mid_sugerido', u'texto', u'texto_sugerido',
       u'artista_sugerido', u'media_sugerido'],
      dtype='object')
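
The two Index listings above contain the same seven labels in a different order. As a minimal sketch, assuming a recent pandas where the helper is importable from pandas.testing (the traceback uses the older pandas.util.testing path), passing check_like=True asks assert_frame_equal to ignore the order of index and columns. The small frames and the frames_match wrapper below are hypothetical stand-ins, not the data or code from this question.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

# Hypothetical stand-in data: same columns and values, different column order.
expected = pd.DataFrame(
    {"artista": ["Beyoncé", "Radiolab"], "texto": ["Lemonade", "Radiolab"]}
)
tested = expected[["texto", "artista"]]


def frames_match(left, right, **kwargs):
    """Return True if assert_frame_equal passes, False if it raises."""
    try:
        assert_frame_equal(left, right, check_dtype=False, **kwargs)
        return True
    except AssertionError:
        return False


# Default comparison is order-sensitive, so the reordered frame fails.
strict = frames_match(expected, tested)
# check_like=True ignores the order of index and columns.
relaxed = frames_match(expected, tested, check_like=True)
```

With check_like=True the column order no longer matters, so only genuine value differences would make the assertion fail.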

The columns are the same but in a different order. If I use df.sort_index(axis=1) to reorder the columns, I get:

Traceback (most recent call last):
  File "/Users/spicyramen/Documents/Development/parzee/python/coverage/experimental/pandas_creation.py", line 154, in <module>
    assert_frame_equal(df6.sort_index(axis=1), _Normalize(df5, test_dict).sort_index(axis=1))
  File "/Users/spicyramen/Documents/Development/parzee/python/coverage/experimental/pandas_creation.py", line 16, in assert_frame_equal
    testing.assert_frame_equal(expected, tested, check_dtype=False, check_like=False)
  File "/Library/Python/2.7/site-packages/pandas/util/testing.py", line 1166, in assert_frame_equal
    obj='DataFrame.iloc[:, {0}]'.format(i))
  File "/Library/Python/2.7/site-packages/pandas/util/testing.py", line 1049, in assert_series_equal
    check_less_precise, obj='{0}'.format(obj))
  File "pandas/src/testing.pyx", line 58, in pandas._testing.assert_almost_equal (pandas/src/testing.c:3887)
  File "pandas/src/testing.pyx", line 147, in pandas._testing.assert_almost_equal (pandas/src/testing.c:2769)
  File "/Library/Python/2.7/site-packages/pandas/util/testing.py", line 914, in raise_assert_detail
    [right]: {3}""".format(obj, message, left, right)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 7: ordinal not in range(128)
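
In this second traceback the failure appears to come from raise_assert_detail while it formats the mismatch message, which suggests that some values still differ after the columns are sorted and that Python 2 could not encode the 'é' in the report. One way to find the offending column without depending on that message is sketched below. differing_columns is a hypothetical helper and the sample frames are made up; note that Series.equals also compares dtypes, unlike a call with check_dtype=False.

```python
import pandas as pd


def differing_columns(left, right):
    """Return the names of columns whose values differ, ignoring column order."""
    left = left.sort_index(axis=1)
    right = right.sort_index(axis=1)
    if list(left.columns) != list(right.columns):
        raise ValueError("the two frames do not have the same set of columns")
    # Series.equals treats missing values in matching positions as equal,
    # but it is strict about dtype and index.
    return [col for col in left.columns if not left[col].equals(right[col])]


# Hypothetical stand-in data: one cell in "texto" differs.
left = pd.DataFrame({"artista": ["Beyoncé", "Xmas"], "texto": ["Lemonade", "Merry"]})
right = pd.DataFrame({"texto": ["Lemonade", "Other"], "artista": ["Beyoncé", "Xmas"]})

mismatched = differing_columns(left, right)
```

Printing the two versions of each column named in mismatched side by side should show which cells disagree.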
