pathcensus 0.1
Welcome to the documentation of the pathcensus package.
It is a Python (3.8+) implementation of structural similarity and
complementarity coefficients for undirected (un)weighted networks based
on efficient counting of 2- and 3-paths (triples and quadruples)
and 3- and 4-cycles (triangles and quadrangles).
Structural coefficients are graph-theoretic
measures of the extent to which relations at different levels
(of edges, nodes or entire networks) are driven by similarity or
complementarity between different nodes. Even though they are defined
in a purely combinatorial manner, they are motivated by geometric arguments
which link them to the family of latent space/random geometric graph models.
In particular, the geometric view allows the identification of network motifs
characteristic of similarity (triangles) and complementarity (quadrangles).
They can be seen as a generalization of the well-known
local and global clustering coefficients which summarize the structure
of a network in terms of density of ego subgraph(s).
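To make the link to clustering coefficients concrete, here is a minimal pure-Python sketch of the classical local and global clustering coefficients, computed from counts of triples (2-paths) and triangles. It is a brute-force illustration only: the function name `clustering` and the dict-of-sets graph representation are assumptions made for this example and are not part of the pathcensus API.

```python
from itertools import combinations


def clustering(adj):
    """Return (local, global) clustering coefficients of an undirected graph.

    `adj` maps every node to the set of its neighbours.
    """
    local = {}
    total_triangles = 0  # each triangle is counted once per member node
    total_triples = 0    # 2-paths, counted at their centre node
    for node, nbrs in adj.items():
        k = len(nbrs)
        triples = k * (k - 1) // 2
        # A triple centred on `node` is closed when its endpoints are linked
        triangles = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        local[node] = triangles / triples if triples else 0.0
        total_triangles += triangles
        total_triples += triples
    global_cc = total_triangles / total_triples if total_triples else 0.0
    return local, global_cc


# Triangle 0-1-2 with a pendant node 3 attached to node 2
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
local, global_cc = clustering(adj)
print(local, global_cc)
```

Nodes with fewer than two neighbours have no triples, so the sketch assigns them a coefficient of 0.0; this is one common convention, chosen here only for simplicity.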
Although it is a Python package, pathcensus is performant because its main
workhorse functions are just-in-time (JIT) compiled to efficient machine code
by the numba library. It is compatible with numpy
arrays and scipy sparse matrices, making it easy to use in practice.
Moreover, it allows registering graph classes implemented by different
third-party packages such as networkx so they can be converted
automatically to sparse matrices. Conversion methods for networkx,
igraph and graph-tool are registered automatically
provided the packages are installed.
NOTE
pathcensus uses the A_{ij} = 1 convention to indicate
that a node i sends a tie to a node j. Functions converting
graph-like objects to arrays / sparse matrices need to be aware
of that.
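The convention can be illustrated with a small, hypothetical converter from an edge list to a dense adjacency matrix. The helper `to_adjacency` is an assumption made for this sketch; it is not a pathcensus function and uses plain nested lists rather than numpy or scipy objects.

```python
def to_adjacency(n, edges, directed=True):
    """Build a dense 0/1 adjacency matrix where A[i][j] = 1 means i -> j."""
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = 1      # row index = sender, column index = receiver
        if not directed:
            A[j][i] = 1  # undirected ties are stored symmetrically
    return A


# Node 0 sends a tie to node 1, and node 1 sends a tie to node 2
A = to_adjacency(3, [(0, 1), (1, 2)])
# The same ties treated as undirected yield a symmetric matrix
U = to_adjacency(3, [(0, 1), (1, 2)], directed=False)
```

A converter that used the transposed convention (A[j][i] = 1 for a tie from i to j) would silently reverse edge directions, which is why custom conversion functions need to follow the rule above.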
NOTE
pathcensus is compatible only with Python versions supported
by numba. In practice it means that it is compatible with all
versions (starting from 3.8) except for the latest one, which usually
starts to be supported by numba with some (often significant)
delay.
For the sake of convenience pathcensus also provides implementations
of the most appropriate null models for statistical calibration of structural
coefficients, which are simple wrappers around the excellent NEMtropy
package. It also defines the pathcensus.inference submodule with
a utility class for facilitating approximate statistical inference based on
sampling from null models.
See the examples subfolder and the main documentation for more details.
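The general idea of sampling-based inference can be sketched in pure Python as follows. This is an illustration under stated assumptions: it uses a simple uniform random graph with a fixed number of edges as the null model (not the NEMtropy models wrapped by pathcensus), the triangle count as the test statistic, and hypothetical helper names. It does not reproduce the pathcensus.inference API.

```python
import random
from itertools import combinations


def count_triangles(edges):
    """Count triangles in an undirected graph given as a list of unique edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Common neighbours of an edge close triangles; each triangle has 3 edges
    return sum(len(adj[u] & adj[v]) for u, v in edges) // 3


def sample_gnm(n, m, rng):
    """Sample m distinct edges uniformly at random among n nodes."""
    return rng.sample(list(combinations(range(n), 2)), m)


def empirical_pvalue(observed_edges, n, n_samples=200, seed=0):
    """One-sided p-value for the observed triangle count under the null."""
    rng = random.Random(seed)
    observed = count_triangles(observed_edges)
    m = len(observed_edges)
    null = [count_triangles(sample_gnm(n, m, rng)) for _ in range(n_samples)]
    exceed = sum(1 for x in null if x >= observed)
    # Add-one correction keeps the estimate strictly positive
    return (exceed + 1) / (n_samples + 1)


# Observed graph: two disjoint 4-cliques on 8 nodes (12 edges, 8 triangles)
observed = [
    (u, v)
    for block in (range(0, 4), range(4, 8))
    for u, v in combinations(block, 2)
]
p = empirical_pvalue(observed, n=8)
print(count_triangles(observed), p)
```

A null model that preserves only the number of edges is deliberately crude; models that also constrain the degree sequence are usually a more appropriate baseline for structural coefficients.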
Installation
At the command line via pip:
# Install from PyPI
pip install pathcensus
The current development version (not guaranteed to be stable)
can be installed directly from the GitHub repo:
pip install git+ssh://git@github.com/sztal/pathcensus.git
How to cite?
Do you find the package useful? Please cite our work properly.
Main theory paper
Talaga, S., & Nowak, A. (2022). Structural measures of similarity and complementarity
in complex networks. Scientific Reports, (in press).
Usage
NOTE
Main internal functions for calculating path census are JIT-compiled
when used for the first time. Thus, the first initialization of a
PathCensus object may be quite slow as its execution time will include
the time required for compilation. However, this happens only once.
We will use igraph to generate graphs used in examples. However, even though
it is automatically integrated with pathcensus, igraph is not
a dependency and needs to be installed separately.
# Main imports used in the examples below
import random
import numpy as np
import igraph as ig
from pathcensus import PathCensus
# Set random and numpy rng seeds
random.seed(303)
np.random.seed(101)
More detailed examples can be found in the official documentation.
Path census & structural coefficients
Path census is a set of counts of different paths and cycles per edge, node
or in the entire graph. The counts are subsequently used to calculate different
kinds of structural coefficients.
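The three levels of aggregation can be illustrated with triangle counts alone. The sketch below is a simplified, pure-Python illustration and makes no claim to match the library's algorithm or its full set of path and cycle counts; the function name `triangle_census` is an assumption made for this example.

```python
def triangle_census(edges):
    """Count triangles per edge, per node and in the whole graph.

    `edges` is a list of unique undirected edges (u, v).
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Edge level: triangles through (u, v) = number of common neighbours
    per_edge = {(u, v): len(adj[u] & adj[v]) for u, v in edges}
    # Node level: every triangle at a node uses two of its incident edges
    sums = {node: 0 for node in adj}
    for (u, v), t in per_edge.items():
        sums[u] += t
        sums[v] += t
    per_node = {node: t // 2 for node, t in sums.items()}
    # Global level: every triangle is shared by three nodes
    total = sum(per_node.values()) // 3
    return per_edge, per_node, total


# Triangle 0-1-2 with a pendant node 3 attached to node 2
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
per_edge, per_node, total = triangle_census(edges)
print(per_edge, per_node, total)
```

Only the edge-level counts require looking at the graph; the node-level and global counts are obtained by summing and correcting for multiple counting.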
# Generate simple undirected ER random graph
G = ig.Graph.Erdos_Renyi(100, p=.05, directed=False)
# Initialize path census object.
# It precomputes path/cycle counts at the level of edges.
# Other counts are derived from them.
P = PathCensus(G)
# Get edge-level census
P.census("edges")
# Get node-level census
P.census("nodes") # or just P.census()
# Get global census
P.census("global")
# Column definitions (IPython/Jupyter help syntax)
?P.definitions
Once path census is computed it can be used to calculate structural
coefficients.
# Similarity coefficients
P.tclust() # triangle-clustering equivalent to local clustering coefficient
P.tclosure() # triangle-closure equivalent to local closure coefficient
P.similarity() # structural similarity (weighted average of clustering and closure)
# Edge-wise similarity
P.similarity("edges")
# Global similarity (equivalent to global clustering coefficient)
P.similarity("global")
The figure below sums up the design of structural similarity coefficients,
their geometric motivation and some of the main properties.
# Complementarity coefficients
P.qclust() # quadrangle-based clustering
P.qclosure() # quadrangle-based closure
P.complementarity() # structural complementarity (weighted average of clustering and closure)
# Edge-wise complementarity
P.complementarity("edges")
# Global complementarity
P.complementarity("global")
The figure below sums up the design and the geometric motivation of
complementarity coefficients as well as their main properties.
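As a rough illustration of why quadrangles signal complementarity, the sketch below counts 4-cycles and triangles in a small complete bipartite graph, where nodes connect only to nodes unlike themselves. It counts all 4-cycles regardless of chords, which may differ from the exact quadrangle definition used by pathcensus; the helper names are assumptions made for this example.

```python
from itertools import combinations
from math import comb


def count_quadrangles(adj):
    """Count 4-cycles in an undirected graph (node -> set of neighbours)."""
    # Two nodes with c common neighbours are opposite corners of comb(c, 2)
    # 4-cycles, and every 4-cycle has two pairs of opposite corners.
    total = sum(comb(len(adj[u] & adj[v]), 2) for u, v in combinations(adj, 2))
    return total // 2


def count_triangles(adj):
    """Count triangles in an undirected graph (node -> set of neighbours)."""
    # Each triangle is seen from 3 edges, each visited in both directions
    return sum(len(adj[u] & adj[v]) for u in adj for v in adj[u]) // 6


# Complete bipartite graph K(2, 3): groups {a, b} and {x, y, z}
bipartite = {
    "a": {"x", "y", "z"},
    "b": {"x", "y", "z"},
    "x": {"a", "b"},
    "y": {"a", "b"},
    "z": {"a", "b"},
}
n_quadrangles = count_quadrangles(bipartite)
n_triangles = count_triangles(bipartite)
print(n_quadrangles, n_triangles)
```

The bipartite graph contains quadrangles but no triangles, whereas a clique of similar nodes is dominated by triangles.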
Similarity and/or complementarity coefficients may be calculated in one
go using appropriate methods as shown below.
# Similarity + corresponding clustering and closure coefs
P.simcoefs() # node-wise
P.simcoefs("global") # global
# Complementarity + corresponding clustering and closure coefs
P.compcoefs() # node-wise
P.compcoefs("global") # global
# All coefficients
P.coefs()
# All coefficients + full path census
P.coefs(census=True)
Weighted coefficients
Below we create an ER random graph with random integer edge weights
between 1 and 10. As long as edge weights are stored in an edge attribute
with the standard name ("weight"), they should be detected automatically
and pathcensus will calculate the weighted census. However, the unweighted
census may be enforced by using weighted=False.
G = ig.Graph.Erdos_Renyi(100, p=0.05, directed=False)
G.es["weight"] = np.random.randint(1, 11, G.ecount())
P = PathCensus(G)
P.weighted # True
# Get all coefficients and full path census
P.coefs(census=True)
# Use unweighted census
P = PathCensus(G, weighted=False)
P.weighted # False
P.coefs(census=True)
Below is the summary of the construction of weighted coefficients.
Parallel PathCensus algorithm
PathCensus objects may be initialized using parallelized algorithms
by using parallel=True.
NOTE
Parallel algorithms require an extra compilation step so the first
time parallel=True is used there will be a significant extra
overhead.
NOTE
The parallel=True argument may not work and may lead to segmentation
faults on some macOS machines.
# By default all available threads are used
P = PathCensus(G, parallel=True)
# Use specific number of threads
P = PathCensus(G, parallel=True, num_threads=2)
Other features
Other main features of pathcensus are:
Null models based on the ERGM family.
Utilities for conducting statistical inference based on null models.
Integration with arbitrary classes of graph-like objects.
All these features are documented in the official documentation.
Testing
The repository with the package source code can be cloned easily
from the GitHub repo.
git clone git@github.com:sztal/pathcensus.git
It is recommended to work within an isolated virtual environment.
This can be done easily for instance using conda.
Remember to use a proper Python version (i.e. 3.8+).
conda create --name my-env python=3.8
conda activate my-env
After entering the directory in which pathcensus repository
was cloned it is enough to install the package locally.
pip install .
# Or in developer/editable mode
pip install --editable .
In order to run tests it is also necessary to install the test dependencies.
pip install -r ./requirements-tests.txt
# Now tests can be run
pytest
# Or alternatively
make test
# And to run linter
make lint
And similarly for building the documentation from source.
pip install -r ./requirements-docs.txt
# Now documentation can be built
make docs
Tests targeting different Python versions can be run using the tox test
automation framework. You may first need to install tox
(e.g. pip install tox).
make test-all
# Or alternatively
tox
Test coverage
Unit test coverage report can be generated easily.
make coverage
# Report can be displayed again after running coverage
make cov-report
Feedback
If you have any suggestions or questions about Path census, feel free to email me
at [email protected].
If you encounter any errors or problems with Path census, please let me know!
Open an issue in the main GitHub repository: http://github.com/sztal/pathcensus.
Authors
Szymon Talaga <[email protected]>