Bond and site percolation
The family of percolation processes are common in physics as a way to study the gradual accretion of nodes (site percolation) or edges (bond percolation) in a network. The process itself is easy to describe.
Bond percolation: The network has all its edges labelled as “unoccupied”. For a given probability \(\phi\) each edge is marked as occupied. At the end of this process the occupied edges will determine a sub-graph of the original network, consisting of one or more connected components.
Site percolation: The network’s nodes are marked an unoccupied, and each is then occupied with probability \(\phi\). At the end of the process the edges between occupied nodes are marked as occupied, and again define a sub-graph.
Note
To perform bond (edge) percolation in conjuction with epidemic
processes see Percolate
.
The reason percolation is of interest is that it provides a model of a wide range of interesting processes. (Originally it was used to study crack formation in crystals and waterflow in soils.) It also bears a striking similarity to the ways in which epidemics form, with the final occupied sub-graph being mathematically related to the contact tree of an epidemic. This lets us study epidemics using percolation, an approach pioneered by Newman [New02].
The Newman-Ziff algorithm
While percolation can also be used in conjunction with epidemic processes, it is often used standalone to study the size of the giant connected component (GCC) formed by the occupied sub-graph. An algorithm due to Newman and Ziff [NZ00] can generate instances of percolated sub-graphs for every value of \(\phi\) in a single sweep through the network. The algorithm is very fast, performing (for example) a site percolation experiment on a million-node network in under 30s.
Newman-Ziff has variants for both bond (occupying edges) and site (occupying nodes) percolation. Both derive from the same base class, which isn’t used directly.
- class epydemic.NewmanZiff(g=None, samples=None)
Bases:
NetworkExperiment
Base class for the Newman-Ziff site and bond percolation algorithm, as described in [NZ00].
- Parameters:
g (
Optional
[Graph
]) – (optional) the underlying network or generatorsamples (
Union
[int
,Iterable
[float
],None
]) – (optional) number of samples or list of sample points (defaults to 100)
Two methods control the lifecycle of the percolation experiment.
- NewmanZiff.setUp(params)
Set up the process.
- Parameters:
params (
Dict
[str
,Any
]) – the experimental parameters
- NewmanZiff.tearDown()
Throw away the components data structure at tear-down.
Two other methods control sampling and reporting.
- NewmanZiff.sample(p)
Take a sample. The default does nothing, and it overridden by sub-classes.
- Parameters:
p (
float
) – the current occupation probability- Return type:
Dict
[str
,Any
]- Returns:
an empty dict
- NewmanZiff.report(params, meta, res)
Return a properly-structured dict of results. The default returns a dict with results keyed by
Experiment.RESULTS
, the data point in the parameter space keyed byExperiment.PARAMETERS
, and timing and other metadata keyed byExperiment.METADATA
. Overriding this method can be used to record extra metadata values, but be sure to call the base method as well.If the experimental results are a list of results dicts, then we report a results dict whose results are a list of results dicts. This is used by
RepeatedExperiment
amd other experiments that want to report multiple sets of results.- Parameters:
params (
Dict
[str
,Any
]) – the parameters we ran undermeta (
Dict
[str
,Any
]) – the metadata for this runres (
Union
[Dict
[str
,Any
],List
[Dict
[str
,Dict
[str
,Any
]]]]) – the direct experimental results from do()
- Return type:
Dict
[str
,Dict
[str
,Any
]]- Returns:
There are several methods that can be used to query the progress of a percolation process or to create new percolation process variants.
- NewmanZiff.components()
Return the number of components in the network.
- Return type:
int
- Returns:
the number of components
- NewmanZiff.componentSize(n)
Return the size of the component containing the node.
- Parameters:
n (
Any
) – the node- Return type:
int
- Returns:
the size of the component of which this node is part
- NewmanZiff.largestComponentSize()
Return the size of the largest component.
- Return type:
int
- Returns:
the size of largest component
- NewmanZiff.inLargestComponent(n)
Return True if the given node is part of the largest component. Note that this is non-deterministic, in the sense that before the percolation threshold there will be more than one “largest” cluster (and mightr indeed be several even afterwards): in a largest cluster might be a better interpretation.
- Parameters:
n (
Any
) – the node- Return type:
bool
- Returns:
True if the node is part of the largest component
These are mainly used either for sub-classing when creating new percolation experiments, or in defining events for use with event taps
Results
Percolation is a very detailed process, occupying edges or nodes one at a time. This makes it both time-consuming and data-rich: too data-rich in the main. For this reason the constructors take either a number or a sequence of sample points at which to return data about the progress of the process.
Both bond and site percolation processes result in two time series: one consisting of the percolation probabilities at which the process was sampled; and the other holding the sizes of the largest connected component at these probabilities.
Warning
Versions of epydemic
prior to 1.9.1 laid out the results of
percolation processes slightly differently.
Bond percolation
- class epydemic.BondPercolation(g=None, samples=None)
Bases:
NewmanZiff
A bond (edge) percolation experiment. This experiment computes the size of the giant connected component (GCC) as edges are “occupied” within the underlying network. It samples the GCC at a given sequence of occupation probabilities, returning a time series of the growth of the GCC.
Each occupation of a bond is treated as an event.
- Parameters:
g (
Optional
[Graph
]) – (optional) the underlying network or generatorsamples (
Union
[int
,Iterable
[float
],None
]) – (optional) number of samples or list of sample points (defaults to 100)
- BondPercolation.setUp(params)
Set up the process, creating the initial components data structure from the underlying network.
- Parameters:
params (
Dict
[str
,Any
]) – the experimental parameters
- BondPercolation.do(params)
Perform the percolation process. This passes a permuted list of edges to
NewmanZiff.percolate()
, which performs the actual operation.- Parameters:
params (
Dict
[str
,Any
]) – experimental parameters- Return type:
List
[Dict
[str
,Any
]]- Returns:
a list of dicts of experimental results
The percolation process returns two time series:
-
BondPercolation.P:
Final
[str
] = 'epydemic.bondpercolation.pOccupied' Result holding series of percolation values.
-
BondPercolation.GCC:
Final
[str
] = 'epydemic.bondpercolation.gcc' Result holding sizes of largest component.
The process has two events:
-
BondPercolation.OCCUPY:
Final
[str
] = 'epydemic.bondpercolation.occupy' Name for bond-occupation event.
-
BondPercolation.SAMPLE:
Final
[str
] = 'epydemic.bondpercolation.sample' Name for sampling event.
Note
See the cookbook recipe Finding the percolation threshold for an example of using this class.
Site percolation
- class epydemic.SitePercolation(g=None, samples=None)
Bases:
NewmanZiff
A site (node) percolation experiment. This experiment computes the size of the giant connected component (GCC) as nodes are “occupied” within the underlying network. It samples the GCC at a given sequence of occupation probabilities, returning a time series of the growth of the GCC.
- Parameters:
g (
Optional
[Graph
]) – (optional) the underlying network or generatorsamples (
Union
[int
,Iterable
[float
],None
]) – (optional) number of samples or list of sample points (defaults to 100)
-
SitePercolation.OCCUPY:
Final
[str
] = 'epydemic.sitepercolation.occupy' Name for site-occupation event.
- SitePercolation.setUp(params)
Set up the process, creating the initial components data structure from the underlying network.
- Parameters:
params (
Dict
[str
,Any
]) – the experimental parameters
- SitePercolation.do(params)
Perform the percolation process. This passes a permuted list of nodes to
NewmanZiff.percolate()
, which performs the actual operation.- Parameters:
params (
Dict
[str
,Any
]) – experimental parameters- Return type:
List
[Dict
[str
,Any
]]- Returns:
a list of dicts of experimental results
The percolation process returns two time series:
-
SitePercolation.P:
Final
[str
] = 'epydemic.sitepercolation.pOccupied' Result holding sequence of percolation values.
-
SitePercolation.GCC:
Final
[str
] = 'epydemic.sitepercolation.gcc' Result holding sizes of largest component.
The process has two events:
-
SitePercolation.OCCUPY:
Final
[str
] = 'epydemic.sitepercolation.occupy' Name for site-occupation event.
-
SitePercolation.SAMPLE:
Final
[str
] = 'epydemic.sitepercolation.sample' Name for sampling event.
Events
Both bond and site percolation have two types of event:
A “occupation” event, occurring whenever an edge or node is occupied as part of the percolation; and
A “sampling” event, which occurs every time the largest component is sampled according to the schedule set in the constructor.
Each event is fed to the event taps interface. The occupation event passes whichever element was occupied, the sampling event always passes None.