Sunday, March 6, 2011

How do I concisely implement multiple similar unit tests in the Python unittest framework?

I'm implementing unit tests for a family of functions that all share a number of invariants. For example, calling the function with two matrices produce a matrix of known shape.

I would like to write unit tests to test the entire family of functions for this property, without having to write an individual test case for each function (particularly since more functions might be added later).

One way to do this would be to iterate over a list of these functions:

import unittest
import numpy

from somewhere import the_functions
from somewhere.else import TheClass

class Test_the_functions(unittest.TestCase):
  def setUp(self):
    self.matrix1 = numpy.ones((5,10))
    self.matrix2 = numpy.identity(5)

  def testOutputShape(unittest.TestCase):
     """Output of functions be of a certain shape"""
     for function in all_functions:
       output = function(self.matrix1, self.matrix2)
       fail_message = "%s produces output of the wrong shape" % str(function)
       self.assertEqual(self.matrix1.shape, output.shape, fail_message)

if __name__ == "__main__":
  unittest.main()

I got the idea for this from Dive Into Python. There, it's not a list of functions being tested but a list of known input-output pairs. The problem with this approach is that if any element of the list fails the test, the later elements don't get tested.

I looked at subclassing unittest.TestCase and somehow providing the specific function to test as an argument, but as far as I can tell that prevents us from using unittest.main() because there would be no way to pass the argument to the testcase.

I also looked at dynamically attaching "testSomething" functions to the testcase, by using setattr with a lamdba, but the testcase did not recognize them.

How can I rewrite this so it remains trivial to expand the list of tests, while still ensuring every test is run?

From stackoverflow
  • You could use a metaclass to dynamically insert the tests. This works fine for me:

    import unittest
    
    class UnderTest(object):
    
        def f1(self, i):
            return i + 1
    
        def f2(self, i):
            return i + 2
    
    class TestMeta(type):
    
        def __new__(cls, name, bases, attrs):
            funcs = [t for t in dir(UnderTest) if t[0] == 'f']
    
            def doTest(t):
                def f(slf):
                    ut=UnderTest()
                    getattr(ut, t)(3)
                return f
    
            for f in funcs:
                attrs['test_gen_' + f] = doTest(f)
            return type.__new__(cls, name, bases, attrs)
    
    class T(unittest.TestCase):
    
        __metaclass__ = TestMeta
    
        def testOne(self):
            self.assertTrue(True)
    
    if __name__ == '__main__':
        unittest.main()
    
    saffsd : Thanks, this works. Only one slight quirk, nose is unable to see the testcases added by the metaclass. Any suggestions?
    Dustin : I'm not familiar with nose. This adds the methods to the class, so I'm not sure what nose could be doing to miss them. I'd be interesting to find out what its magic is, though.
    Matt Joiner : cutting it bit fine with the use of `f` in the __new__ there, it's a bit obscure
  • Metaclasses is one option. Another option is to use a TestSuite:

    import unittest
    import numpy
    import funcs
    
    # get references to functions
    # only the functions and if their names start with "matrixOp"
    functions_to_test = [v for k,v in funcs.__dict__ if v.func_name.startswith('matrixOp')]
    
    # suplly an optional setup function
    def setUp(self):
        self.matrix1 = numpy.ones((5,10))
        self.matrix2 = numpy.identity(5)
    
    # create tests from functions directly and store those TestCases in a TestSuite
    test_suite = unittest.TestSuite([unittest.FunctionTestCase(f, setUp=setUp) for f in functions_to_test])
    
    
    if __name__ == "__main__":
    unittest.main()
    

    Haven't tested. But it should work fine.

    saffsd : unittest.main() doesn't automatically pick this up, and neither does nose. Also, FunctionTestCase calls setUp without arguments, and the functions_to_test need to be wrapped in something that asserts the test.
  • Here's my favorite approach to the "family of related tests". I like explicit subclasses of a TestCase that expresses the common features.

    class MyTestF1( unittest.TestCase ):
        theFunction= staticmethod( f1 )
        def setUp(self):
            self.matrix1 = numpy.ones((5,10))
            self.matrix2 = numpy.identity(5)
        def testOutputShape( self ):
            """Output of functions be of a certain shape"""
            output = self.theFunction(self.matrix1, self.matrix2)
            fail_message = "%s produces output of the wrong shape" % (self.theFunction.__name__,)
            self.assertEqual(self.matrix1.shape, output.shape, fail_message)
    
    class TestF2( MyTestF1 ):
        """Includes ALL of TestF1 tests, plus a new test."""
        theFunction= staticmethod( f2 )
        def testUniqueFeature( self ):
             # blah blah blah
             pass
    
    class TestF3( MyTestF1 ):
        """Includes ALL of TestF1 tests with no additional code."""
        theFunction= staticmethod( f3 )
    

    Add a function, add a subclass of MyTestF1. Each subclass of MyTestF1 includes all of the tests in MyTestF1 with no duplicated code of any kind.

    Unique features are handled in an obvious way. New methods are added to the subclass.

    It's completely compatible with unittest.main()

    muhuk : I like this object-oriented solution. "Explicit is better than implicit"
    saffsd : I don't like this because it introduces a whole heap of duplicated code. Since each of the functions is meant to observe the same invariant being tested, I'd like a way to express exactly this without having to lump them into a single testcase. Perhaps I should? Thanks for the suggestion though.
    S.Lott : Refactor common code up into a superclass. That's what superclasses are for. Your "common test" is precisely why we have superclasses and subclasses.
    saffsd : The issue is that it doesn't refactor. I'm working with classification algorithms, and I'm modelling each as a function with two input matrices and one output matrix. Most of these functions are not even python, they're thin wrappers. Perhaps I should assert inside rather than test outside?
    S.Lott : @saffsd: "it doesn't refactor"? What is "it? I'm talking about refactoring the tests into a single common superclass so each function has a subclass that assures that common features of the function have common methods in a test.
  • The problem with this approach is that if any element of the list fails the test, the later elements don't get tested.

    If you look at it from the point of view that, if a test fails, that is critical and your entire package is invalid, then it doesn't matter that other elements won't get tested, because 'hey, you have an error to fix'.

    Once that test passes, the other tests will then run.

    Admittedly there is information to be gained from knowledge of which other tests are failing, and that can help with debugging, but apart from that, assume any test failure is an entire application failure.

    saffsd : I recognize that, but another counter-argument is that if you are running tests in a batch, say overnight, you want to know where all of the failures are, not just the first one.
  • If you're already using nose (and some of your comments suggest you are), why don't you just use Test Generators, which are the most straightforward way to implement parametric tests I've come across:

    For example:

    from binary_search import search1 as search
    
    def test_binary_search():
        data = (
            (-1, 3, []),
         (-1, 3, [1]),
         (0,  1, [1]),
         (0,  1, [1, 3, 5]),
         (1,  3, [1, 3, 5]),
         (2,  5, [1, 3, 5]),
         (-1, 0, [1, 3, 5]),
         (-1, 2, [1, 3, 5]),
         (-1, 4, [1, 3, 5]),
         (-1, 6, [1, 3, 5]),
         (0,  1, [1, 3, 5, 7]),
         (1,  3, [1, 3, 5, 7]),
         (2,  5, [1, 3, 5, 7]),
         (3,  7, [1, 3, 5, 7]),
         (-1, 0, [1, 3, 5, 7]),
         (-1, 2, [1, 3, 5, 7]),
         (-1, 4, [1, 3, 5, 7]),
         (-1, 6, [1, 3, 5, 7]),
         (-1, 8, [1, 3, 5, 7]),
        )
    
        for result, n, ns in data:
         yield check_binary_search, result, n, ns
    
    def check_binary_search(expected, n, ns):
        actual = search(n, ns)
        assert expected == actual
    

    Produces:

    $ nosetests -d
    ...................
    ----------------------------------------------------------------------
    Ran 19 tests in 0.009s
    
    OK
    
  • The above metaclass code has trouble with nose because nose's wantMethod in its selector.py is looking at a given test method's __name__, not the attribute dict key.

    To use a metaclass defined test method with nose, the method name and dictionary key must be the same, and prefixed to be detected by nose (ie with 'test_').

    # test class that uses a metaclass
    class TCType(type):
        def __new__(cls, name, bases, dct):
            def generate_test_method():
                def test_method(self):
                    pass
                return test_method
    
            dct['test_method'] = generate_test_method()
            return type.__new__(cls, name, bases, dct)
    
    class TestMetaclassed(object):
        __metaclass__ = TCType
    
        def test_one(self):
            pass
        def test_two(self):
            pass
    
  • You don't have to use Meta Classes here. A simple loop fits just fine. Take a look at the example below:

    import unittest
    class TestCase1(unittest.TestCase):
        def check_something(self, param1):
            self.assertTrue(param1)
    
    def _add_test(name, param1):
        def test_method(self):
            self.check_something(param1)
        setattr(TestCase1, 'test_'+name, test_method)
        test_method.__name__ = 'test_'+name
    
    for i in range(0, 3):
        _add_test(str(i), False)
    

    Once the for is executed the TestCase1 has 3 test methods that are supported by both the nose and the unittest.

    Matt Joiner : yeah i find metaclasses for purposes of "instrumenting" single-use classes never flies well, this is a much better approach.

0 comments:

Post a Comment