Plugin Testing¶
Prerequisite Reading¶
Familiarity with the concepts covered in the following sections is highly recommended:
Introduction¶
This document provides recommendations for testing plugins.
This is not a comprehensive guide to testing in general.
We’ll cover ways to test the different plugin parts.
Testing Abstract Types, Concrete Types, Wrappers¶
Testing types is difficult to do in isolation outside of trivially testing methods and attributes.
There are no Metagraph-specific testing practices we recommend in particular here.
Much of the functionality provided by types will be rigorously tested indirectly in the algorithm tests and translator tests.
Testing Translators¶
It is not recommended to write tests for every translator included in a module.
The exception to this advice is if a plugin defines a new AbstractType.
Testing of translators is typically done using the metagraph.core.roundtrip.RoundTripper class, which tests all possible translation paths for a given abstract type.
Here’s an example:
import numpy as np
import metagraph as mg
from metagraph.core.roundtrip import RoundTripper
from metagraph.plugins.numpy.types import NumpyNodeMap

def test_nodemap_roundtrip(resolver):
    rt = RoundTripper(resolver)
    nodes = np.array([1, 2, 42, 99])
    vals = np.array([12.5, 33.4, -1.2, 0.0])
    rt.verify_round_trip(NumpyNodeMap(vals, nodes=nodes))
Here we test translation from a NumpyNodeMap to all other reachable NodeMaps and back again.
We use the Metagraph resolver’s translate method to translate, and NumpyNodeMap’s assert_equal method to verify that the translations yield identical objects.
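The round-trip idea can be illustrated without Metagraph at all. The sketch below is a toy, plain-Python analogue (the `to_dict`/`from_dict` translators are hypothetical, not part of the Metagraph API): translate a value to another representation and back, then verify nothing was lost.

```python
def to_dict(nodes, vals):
    # hypothetical "translator" to a dict representation
    return dict(zip(nodes, vals))

def from_dict(d):
    # hypothetical "translator" back to parallel lists, ordered by node id
    nodes = sorted(d)
    return nodes, [d[n] for n in nodes]

# round-trip: list form -> dict form -> list form should be lossless
nodes, vals = [1, 2, 42, 99], [12.5, 33.4, -1.2, 0.0]
round_tripped = from_dict(to_dict(nodes, vals))
assert round_tripped == (nodes, vals)
```

RoundTripper automates this kind of check across every registered translation path for the abstract type, rather than a single hand-written pair.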
Testing Algorithms¶
Since abstract algorithms are merely specifications, we can only test concrete algorithms.
When testing concrete algorithms, simply checking that outputs match expectations for given inputs is not enough; the outputs must also match the results of concrete algorithms from other plugins that implement the same abstract algorithm.
We highly recommend using the utility metagraph.core.multiverify.MultiVerify, as it verifies that all concrete algorithms for a given abstract algorithm give the same result.
It does this by finding all the concrete algorithms for the given abstract algorithm, using the resolver’s translators to translate the input types to the appropriate types for each concrete algorithm, and using the assert_equal method of the concrete types to verify that results from all the concrete algorithms are the same.
This also indirectly tests translators.
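The core idea is worth seeing in miniature. Below is a plain-Python sketch of the concept (not the Metagraph API; the two implementations and the `verify_all` helper are hypothetical stand-ins): run every implementation of the same abstract operation on the same input and assert that all results agree.

```python
import math

def normalize_by_sum_a(values):
    # hypothetical implementation A of the same abstract operation
    total = sum(values)
    return [v / total for v in values]

def normalize_by_sum_b(values):
    # hypothetical implementation B: same math, different loop style
    total = 0.0
    for v in values:
        total += v
    return [v / total for v in values]

def verify_all(implementations, args, rel_tol=1e-9):
    # run each implementation and check all outputs agree with the first
    results = [impl(*args) for impl in implementations]
    baseline = results[0]
    for result in results[1:]:
        assert len(result) == len(baseline)
        for a, b in zip(baseline, result):
            assert math.isclose(a, b, rel_tol=rel_tol)
    return baseline

out = verify_all([normalize_by_sum_a, normalize_by_sum_b], ([1.0, 2.0, 3.0],))
```

MultiVerify does this across plugins, additionally handling the type translation so each concrete algorithm receives inputs in its native representation.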
MultiVerify and MultiResults¶
MultiVerify(r) creates a MultiVerify instance bound to the Resolver r.
Following creation, .compute() is called to exercise all implementations of an abstract algorithm.
The signature for compute is:
r = mg.resolver
mv = MultiVerify(r)
results = mv.compute("path.to.abstract.algorithm", *args, **kwargs)
The abstract algorithm path is identical to the path used to call the algorithm from the resolver.
For example, “centrality.pagerank” and resolver.algos.centrality.pagerank are equivalent ways to point to the PageRank abstract algorithm.
The args and kwargs of the algorithm are passed in following the algorithm name, in a manner similar to functools.partial.
The result of calling compute is a MultiResult instance.
MultiResult¶
The MultiResult contains a map of all concrete algorithms and their associated return values.
MultiResults can be used in several ways:
.normalize(type)
Converts all results to a consistent type
.assert_equal(expected_value)
Compares all results against an expected value
.custom_compare(cmp_func)
Passes each result to a comparison function
MultiVerify.transform(exact_algo, *args, **kwargs)
The MultiResult is passed as an argument to transform, allowing further refinement of the result using additional Metagraph algorithms
[index]
If the concrete algorithms return tuples, slicing the MultiResult will return a new MultiResult with each tuple sliced accordingly
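The slicing behavior can be sketched in plain Python (the `slice_results` helper is a hypothetical stand-in for `MultiResult[index]`, not the real API): when every implementation returns a tuple, indexing selects that element from each implementation's tuple, yielding a new result map.

```python
# map of concrete algorithm name -> (flow_rate, flow_values) tuples
results = {
    "plugin_a.max_flow": (6, {2: 6, 4: 6}),
    "plugin_b.max_flow": (6, {2: 6, 4: 6}),
}

def slice_results(results, index):
    # hypothetical helper mirroring MultiResult[index]:
    # pick the index-th element out of each implementation's tuple
    return {name: value[index] for name, value in results.items()}

flow_rates = slice_results(results, 0)   # first tuple element from each
flow_values = slice_results(results, 1)  # second tuple element from each
```

This is what makes per-element comparisons like `results[0].assert_equal(...)` possible on tuple-returning algorithms.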
normalize¶
Calling results.normalize(type) will perform the translation of those results to the indicated type.
This is a prerequisite step for several other operations, although it is not required for assert_equal, as the type of the comparison is known and normalize is called under the hood.
The result of calling normalize is a new MultiResult with the same concrete algorithm list, but all results are of a uniform type.
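Conceptually, normalization just brings heterogeneous return types into one representation before comparing. A minimal sketch of the idea (the `normalize_result` helper is hypothetical, assuming results are either dicts or sequences indexed by node id):

```python
def normalize_result(result):
    # hypothetical normalizer: represent any node -> value mapping as a dict
    if isinstance(result, dict):
        return dict(result)
    # assume a sequence whose position is the node id
    return {i: v for i, v in enumerate(result)}

# two backends returning the "same" answer in different container types
as_list = normalize_result([0.5, 0.5])
as_dict = normalize_result({0: 0.5, 1: 0.5})
assert as_list == as_dict
```

In Metagraph the translation is driven by the registered translators rather than ad-hoc type checks, but the goal is the same: one uniform type for downstream comparison.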
assert_equal¶
assert_equal compares an expected result with each value of the MultiResult.
The output of each algorithm is translated to the same type as the expected result before calling the type’s assert_equal method.
Here’s an example:
import networkx as nx
import numpy as np
import metagraph as mg
from metagraph.core.multiverify import MultiVerify
networkx_graph_data = [(0, 1), (0, 2), (2, 0), (1, 2), (3, 2)]
networkx_graph = nx.DiGraph()
networkx_graph.add_edges_from(networkx_graph_data)
expected_val = {
0: 0.37252685132844066,
1: 0.19582391181458728,
2: 0.3941492368569718,
3: 0.037500000000000006,
}
graph = mg.wrappers.Graph.NetworkXGraph(networkx_graph)
MultiVerify(mg.resolver).compute(
"centrality.pagerank",
graph,
tolerance=1e-7
).assert_equal(expected_val, rel_tol=1e-5)
Calling .assert_equal() with the expected value and any parameters affecting the comparison accuracy will perform the assertions.
If no AssertionError is raised, then the results from all concrete algorithms match the expected value.
If any fail, an error is raised with additional information about which algorithm produced the failing results.
custom_compare¶
Sometimes, assert_equal is insufficient for verifying that multiple concrete algorithms have the same behavior.
custom_compare gives the user full flexibility over how to compare results which are by nature non-deterministic.
Consider the Louvain community detection algorithm. This algorithm attempts to find communities in a graph that maximize a modularity metric, but includes elements of randomness in the solution. Thus, simply checking for the same community label assignments for each node in a node map is insufficient.
Here’s an example of how custom_compare might be used to verify reasonable correctness:
import networkx as nx
import metagraph as mg
from metagraph.core.multiverify import MultiVerify
ebunch = [
(0, 3, 1),
(1, 0, 2),
(1, 4, 3),
(2, 5, 5),
(2, 7, 6),
(3, 1, 7),
(3, 4, 8),
(5, 6, 10),
(6, 2, 11),
]
nx_graph = nx.Graph()
nx_graph.add_weighted_edges_from(ebunch)
graph = mg.wrappers.Graph.NetworkXGraph(nx_graph)
def cmp_func(x):
    x_graph, modularity_score = x
    assert len(x_graph) == 8, len(x_graph)
    assert modularity_score > 0.45
results = MultiVerify(mg.resolver).compute("clustering.louvain_community", graph)
results.normalize(mg.types.NodeMap.PythonNodeMapType).custom_compare(cmp_func)
custom_compare takes a comparison function (in this example cmp_func). The comparison function is passed the output of each concrete algorithm and verifies expected behavior.
To ensure that the comparison function only has to deal with a single type, normalize is typically called prior to calling custom_compare. In this case, the normalization is not strictly necessary, as all NodeMap objects have a __len__ method.
In this example, cmp_func checks the number of nodes and verifies that the modularity score is above a selected threshold.
transform¶
While assert_equal is for exact matches and custom_compare gives full flexibility, transform provides a hybrid solution to the problem of non-deterministic results.
Often, while the solution is non-deterministic, there are elements of the solution which will be deterministic.
Consider a max-flow problem. At the bottlenecks, the flow into a node will be consistent for any correct solution.
Thus, if we can remove all other nodes and simply compare values for the bottleneck nodes, we can use the simpler assert_equal method. That is where transform comes into play.
transform behaves nearly identically to compute, with two key differences:
At least one arg or kwarg must be a normalized MultiResult.
The first argument is not the path to an abstract algorithm; it must be an exact concrete algorithm call.
The reason for these differences is that we are not trying to exercise all concrete algorithms to check for correctness.
Instead, we simply want to run all the results through the same algorithm in order to simplify the result in preparation for a call to assert_equal or custom_compare.
Multiple transforms can be performed on the results, chained together sequentially.
Here is an example for max flow:
import networkx as nx
import metagraph as mg
from metagraph.core.multiverify import MultiVerify

nx_graph = nx.DiGraph()
nx_graph.add_weighted_edges_from([
(0, 1, 9),
(0, 3, 10),
(1, 4, 3),
(2, 7, 6),
(3, 1, 2),
(3, 4, 8),
(4, 5, 7),
(4, 2, 4),
(5, 2, 5),
(5, 6, 1),
(6, 2, 11),
])
graph = mg.wrappers.Graph.NetworkXGraph(nx_graph)
# These are the elements of the result which *are* deterministic
expected_flow_value = 6
bottleneck_nodes = {2, 4}
expected_nodemap = {2: 6, 4: 6}
mv = MultiVerify(mg.resolver)
results = mv.compute("flow.max_flow", graph, source_node=0, target_node=7)
# Note: each algorithm returns a tuple of (flow_rate, graph_of_flow_values)
# Compare flow rate
results[0].assert_equal(expected_flow_value)
# Normalize actual flow to prepare to transform
actual_flow = results[1].normalize(mg.wrappers.Graph.NetworkXGraph)
# Compare sum of out edges for bottleneck nodes
out_edges = mv.transform(
mg.plugins.core_networkx.algos.util.graph.aggregate_edges,
actual_flow,
lambda x, y: x + y,
initial_value=0,
)
out_bottleneck = mv.transform(
mg.algos.util.nodemap.select.core_python, out_edges, bottleneck_nodes
)
out_bottleneck.assert_equal(expected_nodemap)
The result from max flow is a tuple of (flow_rate, graph_of_flow_values).
The flow rate is compared first by indexing into the results and using assert_equal with the expected flow value.
The graph of flow values is first normalized, then transformed by aggregating edges, and transformed again to keep only the bottleneck nodes. At this point, the results are deterministic and can be compared using assert_equal.
The workflow for each concrete algorithm was:
Compute max flow
Compare the flow rate against the expected value
Normalize the flow graph to a NetworkX graph
Convert the flow graph to a node map using the core_networkx plugin’s version of “graph.aggregate_edges”
Filter the node map using the core_python plugin’s version of “nodemap.select”
Compare the smaller node map against the expected result
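The steps above can be sketched in plain Python on a single hypothetical flow solution (the `aggregate_out_edges` and `select_nodes` helpers are stand-ins for the Metagraph utility algorithms, and the flow values are one valid assignment assumed for illustration): aggregating per-node out-flow and filtering to the bottleneck nodes yields a deterministic value suitable for exact comparison.

```python
# one hypothetical (non-unique) max-flow solution as edge -> flow value
flow = {
    (0, 1): 2, (0, 3): 4,
    (1, 4): 2, (3, 4): 4,
    (4, 5): 2, (4, 2): 4,
    (5, 2): 2, (2, 7): 6,
}

def aggregate_out_edges(flow):
    # stand-in for "graph.aggregate_edges": sum the flow leaving each node
    out = {}
    for (src, _dst), value in flow.items():
        out[src] = out.get(src, 0) + value
    return out

def select_nodes(nodemap, nodes):
    # stand-in for "nodemap.select": keep only the requested nodes
    return {n: v for n, v in nodemap.items() if n in nodes}

# any correct solution routes 6 units through the bottleneck nodes {2, 4}
bottleneck = select_nodes(aggregate_out_edges(flow), {2, 4})
assert bottleneck == {2: 6, 4: 6}
```

Different concrete algorithms may route flow along different paths, but after this reduction every correct solution collapses to the same node map, which is exactly what makes assert_equal applicable.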
This comparison could have been done using custom_compare and manually calculating the flow rate out of the bottleneck nodes, but using transform allows easy access to existing utility algorithms, which are often adequate to extract the deterministic portions and compare using assert_equal.