Bond and site percolation

The family of percolation processes are common in physics as a way to study the gradual accretion of nodes (site percolation) or edges (bond percolation) in a network. The process itself is easy to describe.

Bond percolation: The network has all its edges labelled as “unoccupied”. For a given probability \(\phi\) each edge is marked as occupied. At the end of this process the occupied edges will determine a sub-graph of the original network, consisting of one or more connected components.

Site percolation: The network’s nodes are marked an unoccupied, and each is then occupied with probability \(\phi\). At the end of the process the edges between occupied nodes are marked as occupied, and again define a sub-graph.

Note

To perform bond (edge) percolation in conjuction with epidemic processes see Percolate.

The reason percolation is of interest is that it provides a model of a wide range of interesting processes. (Originally it was used to study crack formation in crystals and waterflow in soils.) It also bears a striking similarity to the ways in which epidemics form, with the final occupied sub-graph being mathematically related to the contact tree of an epidemic. This lets us study epidemics using percolation, an approach pioneered by Newman [New02].

The Newman-Ziff algorithm

While percolation can also be used in conjunction with epidemic processes, it is often used standalone to study the size of the giant connected component (GCC) formed by the occupied sub-graph. An algorithm due to Newman and Ziff [NZ00] can generate instances of percolated sub-graphs for every value of \(\phi\) in a single sweep through the network. The algorithm is very fast, performing (for example) a site percolation experiment on a million-node network in under 30s.

Newman-Ziff has variants for both bond (occupying edges) and site (occupying nodes) percolation. Both derive from the same base class, which isn’t used directly.

class epydemic.NewmanZiff(g=None, samples=None)

Bases: NetworkExperiment

Base class for the Newman-Ziff site and bond percolation algorithm, as described in [NZ00].

Parameters:
  • g (Optional[Graph]) – (optional) the underlying network or generator

  • samples (Union[int, Iterable[float], None]) – (optional) number of samples or list of sample points (defaults to 100)

Two methods control the lifecycle of the percolation experiment.

NewmanZiff.setUp(params)

Set up the process.

Parameters:

params (Dict[str, Any]) – the experimental parameters

NewmanZiff.tearDown()

Throw away the components data structure at tear-down.

Two other methods control sampling and reporting.

NewmanZiff.sample(p)

Take a sample. The default does nothing, and it overridden by sub-classes.

Parameters:

p (float) – the current occupation probability

Return type:

Dict[str, Any]

Returns:

an empty dict

NewmanZiff.report(params, meta, res)

Return a properly-structured dict of results. The default returns a dict with results keyed by Experiment.RESULTS, the data point in the parameter space keyed by Experiment.PARAMETERS, and timing and other metadata keyed by Experiment.METADATA. Overriding this method can be used to record extra metadata values, but be sure to call the base method as well.

If the experimental results are a list of results dicts, then we report a results dict whose results are a list of results dicts. This is used by RepeatedExperiment amd other experiments that want to report multiple sets of results.

Parameters:
  • params (Dict[str, Any]) – the parameters we ran under

  • meta (Dict[str, Any]) – the metadata for this run

  • res (Union[Dict[str, Any], List[Dict[str, Dict[str, Any]]]]) – the direct experimental results from do()

Return type:

Dict[str, Dict[str, Any]]

Returns:

a results dict

There are several methods that can be used to query the progress of a percolation process or to create new percolation process variants.

NewmanZiff.components()

Return the number of components in the network.

Return type:

int

Returns:

the number of components

NewmanZiff.componentSize(n)

Return the size of the component containing the node.

Parameters:

n (Any) – the node

Return type:

int

Returns:

the size of the component of which this node is part

NewmanZiff.largestComponentSize()

Return the size of the largest component.

Return type:

int

Returns:

the size of largest component

NewmanZiff.inLargestComponent(n)

Return True if the given node is part of the largest component. Note that this is non-deterministic, in the sense that before the percolation threshold there will be more than one “largest” cluster (and mightr indeed be several even afterwards): in a largest cluster might be a better interpretation.

Parameters:

n (Any) – the node

Return type:

bool

Returns:

True if the node is part of the largest component

These are mainly used either for sub-classing when creating new percolation experiments, or in defining events for use with event taps

Results

Percolation is a very detailed process, occupying edges or nodes one at a time. This makes it both time-consuming and data-rich: too data-rich in the main. For this reason the constructors take either a number or a sequence of sample points at which to return data about the progress of the process.

Both bond and site percolation processes result in two time series: one consisting of the percolation probabilities at which the process was sampled; and the other holding the sizes of the largest connected component at these probabilities.

Warning

Versions of epydemic prior to 1.9.1 laid out the results of percolation processes slightly differently.

Bond percolation

class epydemic.BondPercolation(g=None, samples=None)

Bases: NewmanZiff

A bond (edge) percolation experiment. This experiment computes the size of the giant connected component (GCC) as edges are “occupied” within the underlying network. It samples the GCC at a given sequence of occupation probabilities, returning a time series of the growth of the GCC.

Each occupation of a bond is treated as an event.

Parameters:
  • g (Optional[Graph]) – (optional) the underlying network or generator

  • samples (Union[int, Iterable[float], None]) – (optional) number of samples or list of sample points (defaults to 100)

BondPercolation.setUp(params)

Set up the process, creating the initial components data structure from the underlying network.

Parameters:

params (Dict[str, Any]) – the experimental parameters

BondPercolation.do(params)

Perform the percolation process. This passes a permuted list of edges to NewmanZiff.percolate(), which performs the actual operation.

Parameters:

params (Dict[str, Any]) – experimental parameters

Return type:

List[Dict[str, Any]]

Returns:

a list of dicts of experimental results

The percolation process returns two time series:

BondPercolation.P: Final[str] = 'epydemic.bondpercolation.pOccupied'

Result holding series of percolation values.

BondPercolation.GCC: Final[str] = 'epydemic.bondpercolation.gcc'

Result holding sizes of largest component.

The process has two events:

BondPercolation.OCCUPY: Final[str] = 'epydemic.bondpercolation.occupy'

Name for bond-occupation event.

BondPercolation.SAMPLE: Final[str] = 'epydemic.bondpercolation.sample'

Name for sampling event.

Note

See the cookbook recipe Finding the percolation threshold for an example of using this class.

Site percolation

class epydemic.SitePercolation(g=None, samples=None)

Bases: NewmanZiff

A site (node) percolation experiment. This experiment computes the size of the giant connected component (GCC) as nodes are “occupied” within the underlying network. It samples the GCC at a given sequence of occupation probabilities, returning a time series of the growth of the GCC.

Parameters:
  • g (Optional[Graph]) – (optional) the underlying network or generator

  • samples (Union[int, Iterable[float], None]) – (optional) number of samples or list of sample points (defaults to 100)

SitePercolation.OCCUPY: Final[str] = 'epydemic.sitepercolation.occupy'

Name for site-occupation event.

SitePercolation.setUp(params)

Set up the process, creating the initial components data structure from the underlying network.

Parameters:

params (Dict[str, Any]) – the experimental parameters

SitePercolation.do(params)

Perform the percolation process. This passes a permuted list of nodes to NewmanZiff.percolate(), which performs the actual operation.

Parameters:

params (Dict[str, Any]) – experimental parameters

Return type:

List[Dict[str, Any]]

Returns:

a list of dicts of experimental results

The percolation process returns two time series:

SitePercolation.P: Final[str] = 'epydemic.sitepercolation.pOccupied'

Result holding sequence of percolation values.

SitePercolation.GCC: Final[str] = 'epydemic.sitepercolation.gcc'

Result holding sizes of largest component.

The process has two events:

SitePercolation.OCCUPY: Final[str] = 'epydemic.sitepercolation.occupy'

Name for site-occupation event.

SitePercolation.SAMPLE: Final[str] = 'epydemic.sitepercolation.sample'

Name for sampling event.

Events

Both bond and site percolation have two types of event:

  • A “occupation” event, occurring whenever an edge or node is occupied as part of the percolation; and

  • A “sampling” event, which occurs every time the largest component is sampled according to the schedule set in the constructor.

Each event is fed to the event taps interface. The occupation event passes whichever element was occupied, the sampling event always passes None.