Bond and site percolation

The family of percolation processes are common in physics as a way to study the gradual accretion of nodes (site percolation) or edges (bond percolation) in a network. The process itself is easy to describe.

Bond percolation: The network has all its edges labelled as “unoccupied”. For a given probability \(\phi\) each edge is marked as occupied. At the end of this process the occupied edges will determine a sub-graph of the original network, consisting of one or more connected components.

Site percolation: The network’s nodes are marked an unoccupied, and each is then occupied with probability \(\phi\). At the end of the process the edges between occupied nodes are marked as occupied, and again define a sub-graph.

Note

To perform bond (edge) percolation in conjuction with epidemic processes see Percolate.

The reason percolation is of interest is that it provides a model of a wide range of interesting processes. (Originally it was used to study crack formation in crystals and waterflow in soils.) It also bears a striking similarity to the ways in which epidemics form, with the final occupied sub-graph being mathematically related to the contact tree of an epidemic. This lets us study epidemics using percolation, an approach pioneered by Newman [New02].

The Newman-Ziff algorithm

While percolation can also be used in conjunction with epidemic processes, it is often used standalone to study the size of the giant connected component (GCC) formed by the occupied sub-graph. An algorithm due to Newman and Ziff [NZ00] can generate instances of percolated sub-graphs for every value of \(\phi\) in a single sweep through the network. The algorithm is very fast, performing (for example) a site percolation experiment on a million-node network in under 30s.

Newman-Ziff has variants for both bond (occupying edges) and site (occupying nodes) percolation. Both derive from the same base class, which isn’t used directly.

class epydemic.NewmanZiff(g=None, samples=None)

Bases: NetworkExperiment

Base class for the Newman-Ziff site and bond percolation algorithm, as described in [NZ00].

Parameters:

g (Graph) – (optional) the underlying network or generator
samples (Union[int, Iterable[float]]) – (optional) number of samples or list of sample points (defaults to 100)

Two methods control the lifecycle of the percolation experiment.

NewmanZiff.setUp(params)

Set up the process.

Parameters:: params (Dict[str, Any]) – the experimental parameters

NewmanZiff.tearDown(): Throw away the components data structure at tear-down.

Two other methods control sampling and reporting.

NewmanZiff.sample(p)

Take a sample. The default does nothing, and it overridden by sub-classes.

Parameters:: p (float) – the current occupation probability
Return type:: Dict[str, Any]
Returns:: an empty dict

NewmanZiff.report(params, meta, res)

Return a properly-structured dict of results. The default returns a dict with results keyed by Experiment.RESULTS, the data point in the parameter space keyed by Experiment.PARAMETERS, and timing and other metadata keyed by Experiment.METADATA. Overriding this method can be used to record extra metadata values, but be sure to call the base method as well.

If the experimental results are a list of results dicts, then we report a results dict whose results are a list of results dicts. This is used by RepeatedExperiment amd other experiments that want to report multiple sets of results.

Parameters:

params (Dict[str, Any]) – the parameters we ran under
meta (Dict[str, Any]) – the metadata for this run
res (Union[Dict[str, Any], List[Dict[str, Dict[str, Any]]]]) – the direct experimental results from do()

Return type:

Dict[str, Dict[str, Any]]

Returns:

a results dict

There are several methods that can be used to query the progress of a percolation process or to create new percolation process variants.

NewmanZiff.components()

Return the number of components in the network.

Return type:: int
Returns:: the number of components

NewmanZiff.componentSize(n)

Return the size of the component containing the node.

Parameters:: n (Any) – the node
Return type:: int
Returns:: the size of the component of which this node is part

NewmanZiff.largestComponentSize()

Return the size of the largest component.

Return type:: int
Returns:: the size of largest component

NewmanZiff.inLargestComponent(n)

Return True if the given node is part of the largest component. Note that this is non-deterministic, in the sense that before the percolation threshold there will be more than one “largest” cluster (and mightr indeed be several even afterwards): in a largest cluster might be a better interpretation.

Parameters:: n (Any) – the node
Return type:: bool
Returns:: True if the node is part of the largest component

Results

Percolation is a very detailed process, occupying edges or nodes one at a time. This makes it both time-consuming and data-rich: too data-rich in the main. For this reason the constructors take either a number or a sequence of sample points at which to return data about the progress of the process.

Both bond and site percolation processes result in two time series: one consisting of the percolation probabilities at which the process was sampled; and the other holding the sizes of the largest connected component at these probabilities.

Warning

Versions of epydemic prior to 1.9.1 laid out the results of percolation processes slightly differently.

Bond percolation

class epydemic.BondPercolation(g=None, samples=None)

Bases: NewmanZiff

A bond (edge) percolation experiment. This experiment computes the size of the giant connected component (GCC) as edges are “occupied” within the underlying network. It samples the GCC at a given sequence of occupation probabilities, returning a time series of the growth of the GCC.

Each occupation of a bond is treated as an event. The event will be sent to the event tap.

The working copy of the network will initially have no edges. As the process evolves, the working network will grow as edges are occupied. This means that you can observe the percolating network directly from sample() by calling NetworkExperiment.network(), for example to observe its other properties.

Parameters:: g (Graph) – (optional) the underlying network or generator :param

samples: (optional) number of samples or list of sample points (defaults to 100)

BondPercolation.setUp(params)

Set up the process, creating the initial components data structure from the underlying network.

Parameters:: params (Dict[str, Any]) – the experimental parameters

BondPercolation.do(params)

Perform the percolation process. This passes a permuted list of edges to NewmanZiff.percolate(), which performs the actual operation.

Parameters:: params (Dict[str, Any]) – experimental parameters
Return type:: List[Dict[str, Any]]
Returns:: a list of dicts of experimental results

The percolation process returns two time series:

BondPercolation.P: Final[str] = 'epydemic.bondpercolation.pOccupied': Result holding series of percolation values.

BondPercolation.GCC: Final[str] = 'epydemic.bondpercolation.gcc': Result holding sizes of largest component.

The process has one event:

BondPercolation.OCCUPY: Final[str] = 'epydemic.bondpercolation.occupy': Name for bond-occupation event.

BondPercolation.occupy(n, m)

Occupy an edge. If this causes two components to be joined, update the GCC and return the size of the new component; otherwise return None.

To collect statistics as the network grows, override sample() in preference to this method, as that allows the sample rate to be controlled.

Parameters:

n (Any) – one node
m (Any) – the other node

Return type:

Optional[int]

Returns:

the size of any newly-combined component, or None

Note

See the cookbook recipe Finding the percolation threshold for an example of using this class.

Site percolation

class epydemic.SitePercolation(g=None, samples=None)

Bases: NewmanZiff

A site (node) percolation experiment. This experiment computes the size of the giant connected component (GCC) as nodes are “occupied” within the underlying network. It samples the GCC at a given sequence of occupation probabilities, returning a time series of the growth of the GCC.

The working copy of the network will initially have no nodes. As the process evolves, the working network will grow as nodes are occupied, and edges between occupied nodes will also be added. This means that you can observe the percolating network directly from sample() by calling NetworkExperiment.network(), for example to observe its other properties.

This mthod doesn’t maintain the sense of edges as they are occupied, and wouldn’t work as expected with directed networks.

Parameters:

g (Graph) – (optional) the underlying network or generator
samples (Union[int, Iterable[float]]) – (optional) number of samples or list of sample points (defaults to 100)

SitePercolation.OCCUPY: Final[str] = 'epydemic.sitepercolation.occupy': Name for site-occupation event.

SitePercolation.setUp(params)

Set up the process, creating the initial components data structure from the underlying network.

Parameters:: params (Dict[str, Any]) – the experimental parameters

SitePercolation.do(params)

Perform the percolation process. This passes a permuted list of nodes to NewmanZiff.percolate(), which performs the actual operation.

Parameters:: params (Dict[str, Any]) – experimental parameters
Return type:: List[Dict[str, Any]]
Returns:: a list of dicts of experimental results

The percolation process returns two time series:

SitePercolation.P: Final[str] = 'epydemic.sitepercolation.pOccupied': Result holding sequence of percolation values.

SitePercolation.GCC: Final[str] = 'epydemic.sitepercolation.gcc': Result holding sizes of largest component.

The process has one events:

SitePercolation.OCCUPY: Final[str] = 'epydemic.sitepercolation.occupy': Name for site-occupation event.

SitePercolation.occupy(nr)

Occupy a node, which is then joined to all adjacent occupied nodes. This will update the GCC and return the size of the new component.

To collect statistics as the network grows, override sample() in preference to this method, as that allows the sample rate to be controlled.

Parameters:: nr (Any) – the node
Return type:: int
Returns:: the size of any newly-combined component