 # Blog

## Discrimination reveals reconstructability of multiplex networks from partial observations

11:31, 27 June, in Articles, by Website

### Framework for reconstructing multiplex layer structures

We first introduce the notation. Denote the adjacency matrix of layer α in a multiplex network M by $$M^{\alpha}$$ (α = 1, 2, …, L), where $$M_{ij}^{\alpha}=1$$ if there is an edge between nodes i and j in layer α and $$M_{ij}^{\alpha}=0$$ otherwise (i, j = 1, 2, …, N). It is hypothesized that each multiplex network M is generated by some process such that the probability of generating a multiplex network with adjacency matrices $$M^{\alpha}$$ is $$\prod_{\alpha}P(M^{\alpha}\,|\,\theta)$$, where θ represents the parameters of that process. An aggregate mechanism is a mapping from a multiplex network M to a monoplex (single-layer) topology $$A^{\mathcal{O}}$$, where $$A^{\mathcal{O}}\in\mathbb{R}^{N\times N}$$ is its adjacency matrix. In this article, for illustration, we describe the framework using multiplex networks aggregated by the OR mechanism, which is the most common case in domains ranging from biological to social networks (see Supplementary Note 1 for other aggregate mechanisms). Then, we have

$$A_{ij}^{\mathcal{O}}=1-\prod_{\alpha=1}^{L}\left(1-M_{ij}^{\alpha}\right).$$

(1)

Partial observation Γ denotes the set of observed edges in the multiplex network, $$\Gamma=\{(i,j,\alpha)\,|\,A_{ij}^{\mathcal{O}}=1\ \text{and}\ M_{ij}^{\alpha}\ \text{is observed}\}$$. Supposing that we have the aggregate topology $$A^{\mathcal{O}}$$ and the partial observation Γ from a multiplex network M with L layers, our goal is to predict whether there is a link between any two nodes i, j in layer α when (i, j, α) ∉ Γ (Fig. 2a). Given an aggregate topology $$A^{\mathcal{O}}$$, there are in total $$L\cdot|A^{\mathcal{O}}|$$ links to predict, where $$|A^{\mathcal{O}}|$$ denotes the number of edges in the aggregate topology. Notice that, among the $$L\cdot|A^{\mathcal{O}}|$$ potential links, we have already observed card(Γ) links, where card(·) denotes the cardinality of a set. Thus, we denote the fraction of partial observations by c (0 ≤ c < 1), i.e., the ratio between the number of observed edges in the layers and the number of all edges in the multiplex network, $$c=\mathrm{card}(\Gamma)/(L\cdot|A^{\mathcal{O}}|)$$.

Fig. 2: A schematic illustration of the reconstruction method for multiplex networks.
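As a concrete sketch of these definitions, the OR aggregation of Eq. (1) and the fraction of partial observations c can be computed as follows; the two adjacency matrices and the observation set `Gamma` are hypothetical toy values, not data from the paper:

```python
import numpy as np

# Hypothetical two-layer multiplex on N = 4 nodes; each entry of M is the
# symmetric adjacency matrix of one layer (no self-loops).
M = [
    np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 0],
              [0, 0, 0, 0]]),
    np.array([[0, 1, 0, 1],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [1, 0, 1, 0]]),
]

# Eq. (1): OR aggregation -- an edge exists in A if it exists in any layer.
A = 1 - np.prod([1 - Ma for Ma in M], axis=0)

# |A|: number of edges in the aggregate topology (undirected, counted once).
num_agg_edges = np.triu(A, k=1).sum()

# A partial observation Gamma: (i, j, alpha) triples with A_ij = 1 whose
# layer membership has been observed.
Gamma = {(0, 1, 0), (0, 1, 1), (1, 2, 0)}

# Fraction of partial observations c = card(Gamma) / (L * |A|).
c = len(Gamma) / (len(M) * num_agg_edges)
```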

The first step of the reconstruction is to find the most probable value of θ by maximizing the posterior probability $$P(\theta\,|\,A^{\mathcal{O}},\Gamma)$$, where θ is the network model parameter. Notice that the probability here provides a general description for any form of the network parameter θ (see Supplementary Note 2 for specific forms under a certain network model). Since there is no prior knowledge about the parameter θ, we assume it to have a uniform distribution, i.e., P(θ) = const.33. In this case, by Bayes' rule, the maximum posterior estimate is equivalent to maximizing the log-likelihood function

$$l(\theta)=\ln P(A^{\mathcal{O}},\Gamma\,|\,\theta),$$

(2)

which performs maximum likelihood estimation (MLE). The second step is to reconstruct the multiplex structure M by inferring the probability of each possible structure. Since many potential layer structures can produce the same aggregate topology, we denote the probability distribution over all multiplex structures by Q(M), where $$\sum_{\mathbf{M}}Q(\mathbf{M})=1$$. Then, the estimated parameter θ can reconstruct the multiplex structure by calculating the posterior distribution

$$Q(\mathbf{M})=P(\mathbf{M}\,|\,A^{\mathcal{O}},\Gamma,\theta).$$

(3)

However, there is a gap between the observations and the network model parameters θ, since the multiplex structure M is a hidden variable in l(θ): $$l(\theta)=\ln\sum_{\mathbf{M}}P(A^{\mathcal{O}},\Gamma,\mathbf{M}\,|\,\theta)$$. Notice that the sums over M here run over the structures M consistent with $$A^{\mathcal{O}}$$ and Γ. For any distribution Q(M), by Jensen's inequality, we have

$$l(\theta)=\ln\sum_{\mathbf{M}}P(A^{\mathcal{O}},\Gamma,\mathbf{M}\,|\,\theta)\ \ge\ \sum_{\mathbf{M}}Q(\mathbf{M})\ln\frac{P(A^{\mathcal{O}},\Gamma,\mathbf{M}\,|\,\theta)}{Q(\mathbf{M})},$$

(4)

where the equality holds if and only if Eq. (3) is satisfied. Thus, l(θ) and Q(M) are interdependent, and we perform an iterative process to obtain the MLE of the parameter θ and the posterior distribution Q(M) as follows. Given an arbitrary initial value θ(0), at iteration k we first compute the posterior distribution Q(k)(M) from θ(k−1) by Eq. (3), and then update the parameters to θ(k) by maximizing the right-hand side of Eq. (4) with Q(k)(M) held fixed, which performs a coordinate ascent on the log-likelihood function (Fig. 2b). The iterations above are derived from the expectation-maximization (EM) algorithm34 (details in the "Methods" section), and a toy example is shown in Fig. 2c. Note that if there is any prior on the parameters θ, the proposed framework can be improved by maximizing the product of the likelihood function $$P(A^{\mathcal{O}},\Gamma\,|\,\theta)$$ and the prior P(θ), i.e., the so-called maximum a posteriori (MAP) estimation.
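The alternating scheme can be sketched as a generic EM loop. The toy instance below is an illustrative assumption, not the paper's model: each observed aggregate value is the OR of two hidden Bernoulli(p) edges, so the E-step and M-step have closed forms and the loop converges to the MLE of p.

```python
def em(theta0, e_step, m_step, n_iter=200):
    """Generic EM loop: alternate the posterior computation (Eq. 3)
    and the maximization of the lower bound (Eq. 4)."""
    theta = theta0
    for _ in range(n_iter):
        q = e_step(theta)      # E-step: posterior over hidden structure
        theta = m_step(q)      # M-step: parameters maximizing the bound
    return theta

# Toy data: n aggregate pairs a = x1 OR x2, with hidden x1, x2 ~ Bernoulli(p);
# k of the n observed aggregates equal 1.
n, k = 100, 75

def e_step(p):
    # Posterior mean of each hidden edge given a = 1 (it is 0 given a = 0).
    return p / (1.0 - (1.0 - p) ** 2)

def m_step(q):
    # Average posterior edge count over the 2n hidden variables:
    # (k observations * 2 hidden edges * q) / (2n).
    return k * q / n

p_hat = em(0.9, e_step, m_step)
# The MLE satisfies 1 - (1 - p)^2 = k/n, i.e. p = 1 - sqrt(1 - k/n) = 0.5 here.
```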

In estimation and statistics theory, an unbiased estimator is called efficient if its variance reaches the Cramér-Rao lower bound (CRLB)35. Fortunately, the proposed framework yields a maximum likelihood estimate, which is asymptotically unbiased and exhibits asymptotic normality, i.e., the estimator converges in distribution to a normal distribution36. With this, we prove that the variance of the estimator designed in our framework decreases as the fraction of partial observations c increases, and reaches the CRLB when the network size N approaches infinity (see Supplementary Note 3 and Supplementary Fig. 4 for more details).

### Evaluations for the performance of the reconstruction

We now analyze the performance of reconstruction on various real-world multiplex networks. Notice that the framework works for any analytical generative model, such as the Erdős-Rényi random network model and the stochastic block model. For illustration, we employ the configuration model as our network model, which uses the parameter set D to describe the model parameters. For a multiplex network composed of L layers, the parameter set D contains L vectors $$\vec{d^{\alpha}}\in\mathbb{R}^{N}$$ (α = 1, 2, …, L) encoding the degree sequences of each layer, where the component dα(i) is the degree of node i in layer α. The configuration model significantly reduces the complexity from exponential to polynomial by exploiting the independence of each link, and it has been widely applied to analyze the relationship between the structure and function of complex networks37,38,39.
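For a two-layer case, the parameter set D is simply the per-layer degree sequences; a minimal sketch with hypothetical adjacency matrices:

```python
import numpy as np

# Hypothetical two-layer multiplex on N = 4 nodes.
M1 = np.array([[0, 1, 1, 0],
               [1, 0, 1, 0],
               [1, 1, 0, 0],
               [0, 0, 0, 0]])
M2 = np.array([[0, 0, 0, 1],
               [0, 0, 1, 1],
               [0, 1, 0, 0],
               [1, 1, 0, 0]])

# Parameter set D of the configuration model: one degree sequence per layer,
# with d_alpha(i) = degree of node i in layer alpha (row sums).
D = [Ma.sum(axis=1) for Ma in (M1, M2)]
```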

As mentioned in the last section, once a certain network model is determined, we can conduct specific derivations to find the most probable values of all parameters in D by maximizing the likelihood $$P(A^{\mathcal{O}},\Gamma\,|\,D)$$ (see Supplementary Note 2 for detailed derivations). After estimating the degree sequences D, the posterior probability $$Q_{ij}^{\alpha}$$ can be calculated by

$$Q_{ij}^{\alpha}=P(M_{ij}^{\alpha}=1\,|\,A^{\mathcal{O}},\Gamma,D),$$

(5)

which we call link reliability, measuring the probability that a link exists between node i and node j in layer α (see Supplementary Note 2 for the complete algorithm and complexity analysis in this case). We examine the reliability of all links in the testing set ET, consisting of all potential links except the partial observations, i.e., $$E^{T}=\{(i,j,\alpha)\notin\Gamma\,|\,A_{ij}^{\mathcal{O}}=1\}$$ (see Supplementary Fig. 5 for a schematic illustration). For this purpose, we calculate in ET the TP (true positive rate) $$P(M_{ij}^{\alpha}=1\,|\,Q_{ij}^{\alpha}>q)$$, FP (false positive rate) $$P(M_{ij}^{\alpha}=0\,|\,Q_{ij}^{\alpha}>q)$$, TN (true negative rate) $$P(M_{ij}^{\alpha}=0\,|\,Q_{ij}^{\alpha}<q)$$ and FN (false negative rate) $$P(M_{ij}^{\alpha}=1\,|\,Q_{ij}^{\alpha}<q)$$, where q is the threshold that determines the classifier boundary between the two classes.
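These four rates, following the conditional definitions above (conditioned on the classifier's prediction), can be sketched as follows; the reliabilities `Q` and ground truth `M` are hypothetical toy values:

```python
import numpy as np

def classification_rates(Q, M_true, q=0.5):
    """Rates as defined in the text: among links predicted present (Q > q)
    or absent (Q < q), the fraction that are truly present / absent."""
    Q, M_true = np.asarray(Q, float), np.asarray(M_true, int)
    pos, neg = Q > q, Q < q
    tp = (M_true[pos] == 1).mean() if pos.any() else 0.0
    fp = (M_true[pos] == 0).mean() if pos.any() else 0.0
    tn = (M_true[neg] == 0).mean() if neg.any() else 0.0
    fn = (M_true[neg] == 1).mean() if neg.any() else 0.0
    return tp, fp, tn, fn

# Hypothetical link reliabilities for a testing set and the ground truth.
Q = [0.9, 0.8, 0.6, 0.4, 0.2, 0.1]
M = [1,   1,   0,   1,   0,   0]
tp, fp, tn, fn = classification_rates(Q, M, q=0.5)
```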

We first vary the threshold q from 0 to 1 and calculate the AUC (area under the receiver operating characteristic (ROC) curve) for two real-world datasets (the C. elegans neural network and the London transportation network) against different c. We also compare our results with three link-prediction methods that have performed well on inference tasks from partial observations. The first is by De Bacco et al., who proposed a generative model and an efficient expectation-maximization algorithm that allows one to perform inference tasks such as community detection and link prediction40. It works for multiplex networks with groups, but may fail on networks without group-based structures. The second is by Tarres-Deulofeu et al., who introduced a stochastic block model that can take full advantage of the information contained in the whole network41. The third baseline applies a single-layer link prediction method layer by layer in the multiplex network (Layered model). Notice that none of them takes the aggregate topology into consideration, while our method can provide valuable insights into how aggregate topology information helps reconstruction. The AUCs of the four methods for the C. elegans neural network and the London transportation network are shown in Fig. 3a, displaying that the aggregate topology truly provides important information (see Supplementary Fig. 13 for more empirical results).

Fig. 3: Performance of our reconstruction framework in various real multiplex networks.

Further, we display the ROC curves in ROC space for c = 0.01, 0.05, 0.25, 0.5, respectively (Fig. 3b, c). Here, ROC space is defined by plotting the false positive rate on the x-axis and the true positive rate on the y-axis, displaying the relative trade-off between false positives (costs) and true positives (benefits). Notice that the ROC curve describes performance as a continuous function of the threshold q, quantifying the true positive rate and the false positive rate for any given threshold. Thus, the AUC is a scalar that quantifies performance independently of the choice of threshold. The ROC analysis shows that our proposed framework is effective for any threshold in the interval (0, 1), and is thus stable across thresholds in various real-world networks. Further, the true positive rate is positively correlated with the false positive rate, and there exists a threshold above which the false positive rate increases faster than the true positive rate. It is, thereby, no longer justified to improve the true positive rate by increasing the false positive rate beyond such a threshold.
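The threshold-independence of the AUC is easiest to see from its rank-statistic form: it equals the probability that a randomly chosen true link scores higher than a randomly chosen non-link. A minimal sketch, with hypothetical reliabilities and labels:

```python
import numpy as np

def auc(scores, labels):
    """Threshold-free AUC: probability that a random positive outranks a
    random negative (equal to the area under the ROC curve)."""
    scores, labels = np.asarray(scores, float), np.asarray(labels, int)
    pos, neg = scores[labels == 1], scores[labels == 0]
    # All positive/negative pairwise comparisons; ties count one half.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# Hypothetical link reliabilities and ground-truth labels for a testing set.
scores = [0.9, 0.8, 0.6, 0.4, 0.2, 0.1]
labels = [1,   1,   0,   1,   0,   0]
```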

Next, we set specific thresholds for further analysis. Since we do not have any prior knowledge about the threshold q in this task, we set q = 0.5 to avoid an arbitrary choice, i.e., when the link reliability is larger than 0.5, an edge is considered to exist, and otherwise not. Then, we calculate the four metrics to evaluate the performance of multiplex reconstruction for the nine real-world datasets. As a result, the fraction of partial observations c, indicating the portion of observed edges, exhibits a positive correlation with the accuracy of the reconstruction, showing good performance even for quite small c (Fig. 3f) (see the "Methods" section and Supplementary Fig. 6 for more details on the evaluations). In practice, we sample the partial observations by vertex sampling and repeat the reconstruction 100 times for each value of c (see the "Methods" section for more details about the sampling of partial observations).

Besides link reliability, network characteristics such as the average degree, the degree distribution, and the length distribution of shortest paths are also significant for network reconstruction. One prominent advantage of the reconstruction framework is that we can simultaneously obtain other micro- or mesoscale network properties. For example, the expectation of the degree distribution in layer α can be obtained by E(pα) = ∑M Q(M) ⋅ pα(M), where pα(M) is the degree distribution in layer α for a specific multiplex structure M. The reconstructed degree distributions in two layers for different values of c are compared to the ground truth (c = 1), demonstrating that the degree distributions in all layers can be well reconstructed as c increases (Fig. 3d, e). Generally, the expectation of a property X is the first raw moment, E(X) = ∑M Q(M) ⋅ X(M), while the corresponding variance is the second central moment, $$D(X)=\sum_{\mathbf{M}}Q(\mathbf{M})\cdot\left[X(\mathbf{M})-E(X)\right]^{2}$$. Moreover, we can obtain the skewness, kurtosis, and higher moments of a property X. We will discuss how different network characteristics (e.g., average degrees and the overlap of edges between layers) impact the performance of reconstruction in the next section.
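In practice, the sum over all structures M is intractable, but E(X) and D(X) can be approximated by sampling structures from the link reliabilities. The sketch below assumes links are independent given $$Q_{ij}^{\alpha}$$ (as in the configuration model); the reliability matrix and the property (average degree) are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def property_moments(Q_alpha, prop, n_samples=2000):
    """Monte Carlo estimate of E(X) and D(X) for a layer property X,
    sampling layer structures link-by-link from the reliabilities
    Q_ij^alpha (links assumed independent)."""
    xs = []
    for _ in range(n_samples):
        upper = rng.random(Q_alpha.shape) < Q_alpha   # sample each link
        M = np.triu(upper, k=1)                        # keep upper triangle
        M = (M | M.T).astype(int)                      # symmetrize
        xs.append(prop(M))
    xs = np.asarray(xs, float)
    return xs.mean(), xs.var()

# Hypothetical reliabilities for one 4-node layer; X = average degree.
Q_alpha = np.array([[0.0, 0.9, 0.1, 0.5],
                    [0.9, 0.0, 0.8, 0.2],
                    [0.1, 0.8, 0.0, 0.4],
                    [0.5, 0.2, 0.4, 0.0]])
avg_degree = lambda M: M.sum() / len(M)
mean_k, var_k = property_moments(Q_alpha, avg_degree)
# E(avg degree) = 2 * (sum of upper-triangle reliabilities) / N = 1.45.
```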

### Reconstructability determined by the discrimination indicator

It is of great significance to investigate how various characteristics of multiplex networks affect the reconstructability, i.e., the reconstruction accuracy we measured. Without loss of generality, we conduct an analysis of two-layer multiplex networks to illustrate the method in detail. Once a link is observed in the aggregate network, the probability space of the potential multiplex structure contains three events: the potential link exists (i) only in layer 1, (ii) only in layer 2, or (iii) in both layers. Then, the uncertainty of all links in the multiplex network M can be quantified by the information entropy

$$\mathcal{H}(\mathbf{M}\,|\,A^{\mathcal{O}})=\mathcal{H}(M^{1}\,|\,A^{\mathcal{O}})+\mathcal{H}(M^{2}\,|\,A^{\mathcal{O}}).$$

(6)

Generally speaking, the smaller the information entropy $$\mathcal{H}$$, the more certain the potential multiplex structure, and vice versa.

To study how different characteristics of multiplex networks impact the information entropy, we first introduce the ratio of the average degrees of the two layers, $$r=\langle k_{1}\rangle/\langle k_{2}\rangle$$, where $$\langle k_{1}\rangle$$ and $$\langle k_{2}\rangle$$ are the average degrees of layer 1 and layer 2, respectively. We assume, without loss of generality, that $$\langle k_{1}\rangle\le\langle k_{2}\rangle$$, such that 0 < r ≤ 1. Then, we consider the overlap of edges v between the two layers. A high overlap indicates that a link is more likely to exist in one layer if the corresponding link exists in the other layer, i.e., a low uncertainty. To measure the overlap v of a multiplex network, we use the Jaccard index of the two edge sets $$\mathcal{E}_{1}$$ and $$\mathcal{E}_{2}$$ of the two layers, i.e., $$v=|\mathcal{E}_{1}\cap\mathcal{E}_{2}|/|\mathcal{E}_{1}\cup\mathcal{E}_{2}|$$ (see Supplementary Fig. 9 for more details about multiplex network characteristics).
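Both characteristics are direct set computations; a minimal sketch with hypothetical edge sets (undirected edges stored as sorted node pairs):

```python
# Hypothetical edge sets of a two-layer multiplex.
E1 = {(1, 2), (2, 3), (3, 4)}
E2 = {(1, 2), (2, 3), (1, 4), (2, 4), (3, 4), (1, 3)}

# Ratio of average degrees r = <k1>/<k2>; on a shared node set this is
# just the ratio of edge counts.
r = len(E1) / len(E2)

# Edge overlap v: Jaccard index of the two edge sets.
v = len(E1 & E2) / len(E1 | E2)
```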

We next explore how these factors impact the information entropy of a multiplex network and further determine the performance of reconstruction. By Eq. (6), we can calculate the information entropy $$\mathcal{H}$$ with a mean-field approximation, obtaining

$$\mathcal{H}(\mathbf{M}\,|\,A^{\mathcal{O}})=-|A^{\mathcal{O}}|\cdot\sum_{\alpha=1}^{2}\left[\bar{p}_{\alpha}\cdot\ln\bar{p}_{\alpha}+(1-\bar{p}_{\alpha})\cdot\ln(1-\bar{p}_{\alpha})\right],$$

(7)

where

$$\bar{p}_{1}=E[P(M_{ij}^{1}=1\,|\,A_{ij}^{\mathcal{O}}=1)]=\frac{\hat{v}+\hat{r}}{1+\hat{r}},$$

(8)

and

$$\bar{p}_{2}=E[P(M_{ij}^{2}=1\,|\,A_{ij}^{\mathcal{O}}=1)]=\frac{1+\hat{v}\cdot\hat{r}}{1+\hat{r}},$$

(9)

indicating the average probability of the existence of any link in layer 1 and layer 2, respectively. Notice that in Eqs. (8) and (9), $$\hat{v}$$ and $$\hat{r}$$ are the estimates of v and r when we only have the partial observations Γ, and we approximate them empirically by cv and rc (see the "Methods" section for more details). Thus, the information entropy of a given multiplex network is highly related to the fraction of partial observations c, the ratio of average degrees r, and the overlap v. The uncertainty of the probability space decreases as c and v increase; hence, the information entropy $$\mathcal{H}$$ is a monotonically decreasing function of c and v over the domain. For r, however, the information entropy is a monotonically increasing function as r increases from 0 to 1 (Fig. 4a). Clearly, $$\mathcal{H}$$ describes the microscale discrimination between the layers of a multiplex network, since high discrimination (e.g., r tending to 0) leads to low information entropy.
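The mean-field entropy of Eqs. (7)-(9) is a simple function of the estimates $$\hat{v}$$ and $$\hat{r}$$; a sketch per aggregate edge (i.e., Eq. (7) divided by $$|A^{\mathcal{O}}|$$):

```python
import math

def layer_probs(v_hat, r_hat):
    """Mean-field link probabilities of Eqs. (8)-(9)."""
    p1 = (v_hat + r_hat) / (1.0 + r_hat)
    p2 = (1.0 + v_hat * r_hat) / (1.0 + r_hat)
    return p1, p2

def entropy_per_edge(v_hat, r_hat):
    """Eq. (7) divided by |A|: binary entropy summed over the two layers."""
    def h(p):
        if p in (0.0, 1.0):
            return 0.0
        return -(p * math.log(p) + (1 - p) * math.log(1 - p))
    p1, p2 = layer_probs(v_hat, r_hat)
    return h(p1) + h(p2)

# Identical layers (v = 1) leave no uncertainty about layer membership,
# while disjoint layers of equal density (v = 0, r = 1) are maximally
# uncertain: each aggregate edge is a fair coin in each layer, 2*ln(2).
```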

Generally, the reconstruction accuracy is expected to be determined by the information entropy $$\mathcal{H}$$, since $$\mathcal{H}$$ is the primary variable that affects the uncertainty of the distribution of potential multiplex structures. Empirically, we find that the accuracy is linearly determined by the indicator $$\rho\cdot\mathcal{H}$$ (Fig. 4b), i.e.,

$$\mathrm{Accuracy}\approx 1-\rho\cdot\mathcal{H},$$

(10)

where ρ is a scaling factor satisfying

$$\rho=\frac{1}{2\ln 2\cdot|A^{\mathcal{O}}|}\cdot\left(1-\frac{1-v}{1+v}\cdot c^{s}\right).$$

(11)

In Eq. (11) above, s = s(M) is a constant related to the topology of the multiplex network M (see Supplementary Table 2 for approximate values of s(M)). The term (1 − v)/(1 + v) in Eq. (11) indicates that the uncertainty of links in the testing set can be reduced by partial observations, and the parameter s describes the extent to which partial observations reduce that uncertainty (details in the "Methods" section). We further find that s(M) is closely proportional to the cosine similarity of the degree sequences of the two layers, i.e., $$s\propto\cos\langle\vec{d^{1}},\vec{d^{2}}\rangle$$ (Fig. 4c). Clearly, $$\cos\langle\vec{d^{1}},\vec{d^{2}}\rangle$$ describes the similarity between the degree sequences of the two layers of a multiplex network, indicating the mesoscale discrimination, which is generally independent of the microscale discrimination captured by r, v, and $$\mathcal{H}$$ (Fig. 4d).
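The mesoscale discrimination is just the cosine of the angle between the two degree sequences; a minimal sketch with hypothetical sequences:

```python
import math

def cosine_similarity(d1, d2):
    """Cosine of the angle between two degree sequences."""
    dot = sum(a * b for a, b in zip(d1, d2))
    norm = math.sqrt(sum(a * a for a in d1)) * math.sqrt(sum(b * b for b in d2))
    return dot / norm

# Hypothetical degree sequences of the two layers.
d1 = [2, 2, 2, 0]
d2 = [1, 2, 1, 2]
sim = cosine_similarity(d1, d2)
```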

Thus, the reconstructability is determined by the discrimination indicator ($$1-\rho\cdot\mathcal{H}$$) from both the microscale and mesoscale views. This discovery indicates that the reconstruction accuracy can be predicted by the discrimination indicator, with high accuracy obtained where either ρ or $$\mathcal{H}$$ is small. For example, the reconstructability is enhanced when the difference in average degrees between layers is vast (r tends to 0). Notice that we can approximate s by the cosine similarity if the exact value of s is not available empirically, since s is highly related to the cosine similarity. We next discuss how to allocate the partial observations among layers when a specific budget $$\bar{c}$$ is given.
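Combining Eqs. (7), (10) and (11), the predicted accuracy can be sketched as below. Note the assumptions: the estimates are taken as $$\hat{v}=cv$$ and $$\hat{r}=rc$$ following the empirical approximation in the text, and the factor $$|A^{\mathcal{O}}|$$ cancels between Eq. (7) and Eq. (11), so the computation works per aggregate edge:

```python
import math

def binary_entropy(p):
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log(p) + (1 - p) * math.log(1 - p))

def predicted_accuracy(c, v, r, s):
    """Discrimination indicator 1 - rho*H of Eq. (10), combining the
    mean-field entropy of Eqs. (7)-(9) with the scaling factor of Eq. (11)."""
    v_hat, r_hat = c * v, r * c            # empirical estimates (assumed form)
    p1 = (v_hat + r_hat) / (1 + r_hat)     # Eq. (8)
    p2 = (1 + v_hat * r_hat) / (1 + r_hat) # Eq. (9)
    H_per_edge = binary_entropy(p1) + binary_entropy(p2)
    # rho * |A| from Eq. (11); |A| cancels against Eq. (7).
    rho_times_A = (1 - (1 - v) / (1 + v) * c ** s) / (2 * math.log(2))
    return 1 - rho_times_A * H_per_edge
```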

### Allocating limited budget for partial observations

Usually, we have a limited budget for conducting observations in practice. It is, thereby, necessary to investigate how to allocate the budget (partial observations Γ) among the different layers to optimize the performance of reconstruction (e.g., the accuracy) as far as possible. We denote the average fraction of partial observations over the layers by $$\bar{c}$$, i.e., $$\bar{c}=\sum_{\alpha}c_{\alpha}/L$$, where cα indicates the fraction of partial observations in layer α, and denote $$\mathcal{H}(M^{\alpha}\,|\,A^{\mathcal{O}})$$ by $$\mathcal{H}_{\alpha}$$ for simplicity. Similarly, employing the mean-field approximation (details in the "Methods" section), we can predict the accuracy by the function F defined as

$$F(c_{1},c_{2})=1-\frac{1-c_{1}}{1-c_{1}+(1-c_{2})/\hat{r}}\cdot\rho_{1}\cdot\mathcal{H}_{1}-\frac{(1-c_{2})/\hat{r}}{1-c_{1}+(1-c_{2})/\hat{r}}\cdot\rho_{2}\cdot\mathcal{H}_{2},$$

(12)

where

$$\rho_{\alpha}=\frac{1}{2\ln 2\cdot|A^{\mathcal{O}}|}\cdot\left(1-\frac{1-v}{1+v}\cdot c_{3-\alpha}^{s}\right),\quad\alpha=1,2.$$

(13)

We next explore how the performance of reconstruction is impacted by the ratio c1/c2 when a certain budget is given for partial observations. Once $$\bar{c}$$ is given, we regard F as a unary function of c1, i.e., F = F(c1), since $$c_{2}=2\bar{c}-c_{1}$$. The domain of F(c1) is then $$[0,2\bar{c}]$$ if $$\bar{c}\le 0.5$$, and $$[2\bar{c}-1,1]$$ if $$\bar{c}>0.5$$.

We notice that the function F(c1) is monotonically increasing over the domain if $$\bar{c}$$ is small, but first decreases and then increases if $$\bar{c}$$ is large. Theoretical analysis shows that $$F(0)\ge F(2\bar{c})$$ if $$\bar{c}\le 0.5$$, and $$F(2\bar{c}-1)\ge F(1)$$ if $$\bar{c}>0.5$$ (details in the "Methods" section). The result indicates that it is always better to allocate as much of the budget as possible to the layer whose average degree is lower, which yields the optimal strategy and the highest accuracy for the given network model. Moreover, there exists a threshold $$0<\bar{c}_{0}(\mathbf{M})<1$$ for each multiplex network M, where $$\bar{c}_{0}$$ is the solution to the equation

$$F(0)=F(2\bar{c}_{0}).$$

(14)

If the budget $$\bar{c}$$ is less than $$\bar{c}_{0}$$, the accuracy increases as c1/c2 increases and reaches its maximum in the limit of large c1/c2. If the budget is large ($$\bar{c}>\bar{c}_{0}$$), however, the accuracy increases as c1/c2 tends to either extreme, reaching its maximum at an extreme ratio (Fig. 5a), indicating that the multiplex network can be reconstructed once the aggregate topology and either of the two layers is observed (Fig. 5b). The reason is as follows. The partial observations in the different layers capture the maximal characteristics of each layer when c1/c2 = 1. However, a high overlap between the observations in different layers leads to more redundancy and lower accuracy, making the performance even worse when c1/c2 = 1 and $$\bar{c}$$ is large. The theoretical results enable us to devise the best strategy for allocating the budget and thus obtain the optimal reconstruction of multiplex networks in this case. Furthermore, results from real-world datasets verify our theoretical analysis (Fig. 5a, c). We will discuss how different multiplex network characteristics impact the performance of reconstruction from a dynamical-behavior point of view in the next section.
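The allocation analysis above amounts to sweeping F(c1) over its feasible domain for a given budget. A minimal sketch of Eqs. (12)-(13); all parameters other than c1, c2 (the estimate `r_hat`, overlap `v`, exponent `s`, entropies `H1`, `H2`, and edge count `A`) are hypothetical placeholders for a concrete network:

```python
import math

def F(c1, c2, r_hat=0.5, v=0.3, s=1.0, H1=1.0, H2=1.0, A=1.0):
    """Predicted accuracy of Eq. (12) with the scaling factors of Eq. (13)."""
    def rho(c_other):
        # Eq. (13): rho_alpha depends on the observations in the OTHER layer.
        return (1 - (1 - v) / (1 + v) * c_other ** s) / (2 * math.log(2) * A)
    w1 = 1 - c1
    w2 = (1 - c2) / r_hat
    total = w1 + w2
    return 1 - (w1 / total) * rho(c2) * H1 - (w2 / total) * rho(c1) * H2

# Given an average budget c_bar, sweep the allocation c1 (c2 = 2*c_bar - c1)
# over its feasible domain to find the best split.
c_bar = 0.3
lo, hi = (0.0, 2 * c_bar) if c_bar <= 0.5 else (2 * c_bar - 1, 1.0)
grid = [lo + (hi - lo) * k / 100 for k in range(101)]
best_c1 = max(grid, key=lambda c1: F(c1, 2 * c_bar - c1))
```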