FND edited this page Oct 18, 2015 · 10 revisions

Gabbi's purpose is to turn textual descriptions of HTTP interactions into executable tests in order to verify the behavior of HTTP server applications. YAML files are used to express requests along with the expected response.

Gabbi can be invoked either via a programmatic Python API (build_tests) or as a command-line (CLI) application (gabbi-run). The target host may be any HTTP server. Alternatively, when using the Python API in combination with a Python WSGI application, the network connection can be short-circuited by using wsgi-intercept: requests are then handed directly to the application in-process instead of travelling over a socket, which makes tests faster and keeps them isolated from the network.

Test Creation

Upon invocation, one or more YAML files are parsed and turned into test suites, each containing a sequence of tests:

   +-----------+                                  Legend:
   | YAML file |
   +-----------+                                  +--------+  /-----------\
         |                                        | entity |  | operation |
         |                                        +--------+  \-----------/
         v
     /-------\
     | parse |
     \-------/
         |
         |
         v
 +---------------+
 | suite dict    |
 | * fixtures - -|- - - - - - - - - - - - - - - - - - - - - - - - +
 | * defaults    |                                                :
 | * test dicts¹ |                                                :
 +---------------+                                                :
         |                                                        :
         |                                                        :
         v                                                        v
/-----------------\                                        +--------------+
| merge defaults² |                                        | `GabbiSuite` |
\-----------------/                                        +--------------+
         |                                                        ^
         |                                                        |
         v                                                        |
  +-------------+     /----------\     +------------------+       |
  | test dicts¹ |---->| validate |---->| `HTTPTestCase`s¹ |-------+
  +-------------+     \----------/     +------------------+

¹ ordered sequence

² includes both gabbi and per-suite defaults
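
The "merge defaults" step in the diagram can be pictured roughly as follows. This is a minimal sketch, not gabbi's actual code: the keys in `BASE_DEFAULTS`, the `merge_defaults` helper and the sample suite dict are all assumptions made for illustration, and the merge is deliberately shallow.

```python
import copy

# Assumed built-in defaults, for illustration only.
BASE_DEFAULTS = {"method": "GET", "status": "200", "request_headers": {}}


def merge_defaults(suite_dict):
    """Return the suite's test dicts with built-in and per-suite defaults applied."""
    suite_defaults = suite_dict.get("defaults", {})
    merged = []
    for test in suite_dict["tests"]:  # the original order is preserved
        combined = copy.deepcopy(BASE_DEFAULTS)
        combined.update(copy.deepcopy(suite_defaults))  # per-suite defaults win over built-ins
        combined.update(copy.deepcopy(test))  # the test's own values win over both
        merged.append(combined)
    return merged


# A suite dict as it might look after parsing a YAML file (hypothetical content).
suite_dict = {
    "fixtures": ["ExampleFixture"],
    "defaults": {"request_headers": {"accept": "application/json"}},
    "tests": [
        {"name": "list items", "url": "/items"},
        {"name": "create item", "url": "/items", "method": "POST", "status": "201"},
    ],
}

test_dicts = merge_defaults(suite_dict)
```

Each resulting test dict is complete on its own, which is what allows validation to look at one test dict at a time.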

[!] validation includes response handlers - replacers too?

[!] @cdent: "HttpTestCase operates as a metaclass …"

Test dicts describe the actual HTTP interactions in terms of request and response properties[?]. The set of response properties is largely determined by whatever response handlers are registered on HTTPTestCase.

Fixtures are used to establish the testing environment for the respective suite, e.g. setting up (and later tearing down) the back end and/or database. Suite dicts merely reference the respective class names which contain the actual implementation.
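
To illustrate how a suite dict can reference fixtures by class name only, here is a small sketch in which fixtures are plain context managers looked up by name. The `ExampleFixture` class, the `FIXTURE_CLASSES` mapping and `run_with_fixtures` are hypothetical stand-ins, not gabbi's real fixture API.

```python
import contextlib

events = []


class ExampleFixture:
    """Hypothetical fixture: set up on entry, tear down on exit."""

    def __enter__(self):
        events.append("set up database")
        return self

    def __exit__(self, exc_type, exc, tb):
        events.append("tear down database")
        return False  # do not swallow exceptions raised by the tests


# Stand-in for the module that holds the fixture implementations.
FIXTURE_CLASSES = {"ExampleFixture": ExampleFixture}


def run_with_fixtures(fixture_names, body):
    """Enter the named fixtures in order, run body, then exit them in reverse."""
    with contextlib.ExitStack() as stack:
        for name in fixture_names:
            stack.enter_context(FIXTURE_CLASSES[name]())
        body()


# The suite dict only carries the class name, not the implementation.
run_with_fixtures(["ExampleFixture"], lambda: events.append("run tests"))
```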

Test cases and suites are based on Python's unittest framework (see below). The order of test cases is materialized as a linked list, with each test case referencing its predecessor.

Test Discovery and Selection

Once a test suite has been constructed, it can be executed either directly or via a test runner. The latter is responsible for orchestrating test execution and reporting results.

The CLI application uses a simple runner to execute the respective suite's tests and provide output to a terminal. However, many existing test harnesses use a more complex custom runner, e.g. to distribute test execution across CPUs or machines (cf. testr). In such a scenario, an arbitrary test case might serve as the entry point - to retain the original order, each test case is responsible for ensuring that its predecessor has been executed.

Test Execution

When a test is executed, the corresponding request is sent to the target host and the response is checked against the expectations. These checks are largely delegated to individual response handlers.
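
The delegation to response handlers might look roughly like the sketch below. The handler classes, their `check` method and the dict-shaped response are assumptions for illustration. Only the idea of one handler per response property is taken from the text above.

```python
class ResponseStringsHandler:
    """Hypothetical handler: every expected string must occur in the body."""

    key = "response_strings"

    def check(self, expected, response):
        return [
            f"{text!r} not found in response body"
            for text in expected
            if text not in response["body"]
        ]


class ResponseHeadersHandler:
    """Hypothetical handler: every expected header must match exactly."""

    key = "response_headers"

    def check(self, expected, response):
        actual = {name.lower(): value for name, value in response["headers"].items()}
        return [
            f"header {name!r}: expected {value!r}, got {actual.get(name.lower())!r}"
            for name, value in expected.items()
            if actual.get(name.lower()) != value
        ]


HANDLERS = [ResponseStringsHandler(), ResponseHeadersHandler()]


def check_response(test_dict, response, handlers=HANDLERS):
    """Run each handler whose key appears in the test dict; collect failures."""
    failures = []
    for handler in handlers:
        if handler.key in test_dict:
            failures.extend(handler.check(test_dict[handler.key], response))
    return failures


test_dict = {
    "name": "list items",
    "url": "/items",
    "response_headers": {"content-type": "application/json"},
    "response_strings": ["widget", "gadget"],
}
# A canned response standing in for what the target host returned.
response = {
    "status": 200,
    "headers": {"Content-Type": "application/json"},
    "body": '{"items": ["widget"]}',
}
failures = check_response(test_dict, response)
```

Because handlers are looked up by key, registering an additional handler extends the set of response properties a test dict may use.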

TBD: replacers

Result reporting is determined by the respective runner.

unittest Foundations

unittest was chosen originally to ensure compatibility with OpenStack's infrastructure.

A unittest TestCase is a class that can contain multiple test methods, each of which can have multiple assertions. These are aggregated in a sequence on the TestCase. In gabbi, each HTTPTestCase has only one test method, as that simplifies managing the state of HTTP requests (e.g. a test's prior can reference an instance rather than a method).

In the unittest world, both a case and a suite have the same interface to make them "run" (i.e. perform the tests within). This is effectively what allows a suite to contain other suites and a suite to contain cases and cases to contain tests. That interface is usually run(). In old-school unittests if you run a suite, it does run() on its contents. This is effectively what happens under the covers of gabbi-run: The suite is created and given to ConciseTestRunner to run.
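
The shared run() interface can be seen with nothing but the standard library. The sketch below nests a suite inside another suite and runs the whole thing with a single call. `ExampleCase` is a made-up test case, not part of gabbi.

```python
import unittest


class ExampleCase(unittest.TestCase):
    def test_request(self):
        self.assertEqual(1 + 1, 2)


# A suite may contain cases as well as other suites.
inner = unittest.TestSuite([ExampleCase("test_request")])
outer = unittest.TestSuite([inner, ExampleCase("test_request")])

result = unittest.TestResult()
outer.run(result)  # the suite calls run() on each of its members in turn

# A single case accepts the very same call.
single_result = unittest.TestResult()
ExampleCase("test_request").run(single_result)
```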

When using build_tests a different process happens after creation: It creates a suite and then returns it to whatever calls it. In all extant uses the caller is an implementation of the load_tests protocol. The test harness/runner then chooses how it will actually run() the tests. This can vary quite a bit based on the runner involved (cf. testr).
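
For readers unfamiliar with the load_tests protocol: a module defines a `load_tests(loader, tests, pattern)` function, and unittest-compatible loaders call it to obtain the module's suite. The sketch below uses a hypothetical `build_example_suite` as a stand-in for build_tests, so it runs without gabbi installed. It does not reproduce the real build_tests signature.

```python
import unittest


class ExampleCase(unittest.TestCase):
    def test_request(self):
        self.assertTrue(True)


def build_example_suite(loader):
    """Hypothetical stand-in for build_tests: create a suite and hand it back."""
    return loader.loadTestsFromTestCase(ExampleCase)


def load_tests(loader, tests, pattern):
    # unittest loaders call this hook when they load the module; whatever
    # suite it returns is what the harness ends up running.
    return build_example_suite(loader)


# What a harness does, in miniature: obtain the suite, then decide how to run it.
suite = load_tests(unittest.TestLoader(), None, None)
result = unittest.TestResult()
suite.run(result)
```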

Ideally the cost of instantiating an HTTPTestCase is low and the effort is in its run()ing. This is part of what drives the structure of fixtures and the need for lots of data to live on the HTTPTestCase: We want any single instantiated test case to know how to establish its required context so that the independent calling described above will work.

Whatever the runner is, it will use a results class in which it accumulates the results. You can see this in gabbi.reporter.

testr

testr will call build_tests multiple times. It:

  • creates lists of tests by running load_tests in each module that has it (two in the case of gabbi); each of these calls build_tests
  • splits those lists according to concurrency plans
  • calls load_tests again (usually once for each CPU) to match the lists against the available tests and run those selected

.testr.conf contains a group_regex which ensures that tests from the same file end up in the same process. This avoids duplication between individual test runs: if two tests from the same file were split across two different processes, each process would also run all the tests preceding its own test, causing duplicated results across the two processes.
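
The effect of grouping can be sketched as follows. The pattern, the test ids and the round-robin `partition` function are all assumptions chosen for illustration. They are not gabbi's actual group_regex or testr's scheduling algorithm.

```python
import re
from collections import defaultdict

# Hypothetical pattern: everything up to and including the file part of the id.
GROUP_REGEX = re.compile(r"example\.[^.]+\.")


def partition(test_ids, workers):
    """Distribute tests over workers without ever splitting a group."""
    groups = defaultdict(list)
    for test_id in test_ids:
        match = GROUP_REGEX.match(test_id)
        key = match.group(0) if match else test_id  # ungrouped tests stand alone
        groups[key].append(test_id)
    buckets = [[] for _ in range(workers)]
    for index, key in enumerate(sorted(groups)):
        buckets[index % workers].extend(groups[key])
    return buckets


test_ids = [
    "example.users.001_list",
    "example.items.001_list",
    "example.users.002_create",
    "example.items.002_delete",
]
buckets = partition(test_ids, 2)
```

With two workers, all `users` tests land in one bucket and all `items` tests in the other, so no worker has to re-run another worker's predecessors.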

Within the same process it is possible for testr (or any other runner) to ask for tests out of order (testr, for example, randomizes the order of test cases). To preserve ordering and protect against duplication an HTTPTestCase has two tools at its disposal:

  • if the current test has already run, it will not run again
  • if it has not run, it will check to see if its prior has run and run it if not

So while it is possible for the tests to be asked to run weirdly, they will not. Getting this part right revealed a lot about testrunners.
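
The two rules above are enough to restore the original order no matter how a runner asks for the tests. The sketch below demonstrates this with a hypothetical `ChainedTest` class. It only models the ordering logic, not HTTPTestCase itself.

```python
class ChainedTest:
    """Hypothetical test that knows its predecessor (its "prior")."""

    def __init__(self, name, prior=None):
        self.name = name
        self.prior = prior
        self.has_run = False

    def run(self, log):
        if self.has_run:
            return  # rule 1: a test that has already run does not run again
        if self.prior is not None and not self.prior.has_run:
            self.prior.run(log)  # rule 2: make sure the prior has run first
        self.has_run = True
        log.append(self.name)


def build_chain(names):
    """Link each test to the one before it, as in a singly linked list."""
    tests = []
    prior = None
    for name in names:
        test = ChainedTest(name, prior)
        tests.append(test)
        prior = test
    return tests


tests = build_chain(["create", "read", "delete"])
log = []
for index in (2, 0, 1):  # a runner asking for the tests out of order
    tests[index].run(log)
```

Asking for the last test first pulls in the whole chain before it, and the later requests for the first two tests are no-ops.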
