Distance-Preserving Graph Compression Techniques

We study the problem of distance-preserving graph compression for weighted paths and trees. The problem entails a weighted graph $G = (V, E)$ with non-negative weights and a subset of edges $E^{\prime} \subset E$ that is to be removed from $G$ (with the endpoints of each removed edge merged into a supernode). The goal is to redistribute the weights of the deleted edges in a way that minimizes the error, defined as the sum of the absolute differences of the shortest path lengths between pairs of nodes before and after contracting $E^{\prime}$. Based on this error function, we propose optimal approaches for merging any subset of edges in a path and a single edge in a tree. Previous works on graph compression aimed at preserving different graph properties (such as the chromatic number) or focused solely on identifying the optimal set of edges to contract. Our focus in this paper, in contrast, is on achieving optimal edge contraction (when the contracted edges are provided as input), specifically for weighted paths and trees.


Introduction and Related Work
Graphs have become increasingly relevant for solving real-world problems, leveraging their numerous characteristics [25,27]. However, many of these graphs are incredibly large, consisting of trillions of edges and vertices, which poses scalability challenges for modern systems [7,17,20]. Consequently, graph compression techniques have garnered significant research interest in recent years, aiming to obtain a smaller graph while retaining the essential properties of the original input graph. Different names, such as graph compression [24], graph summarization [21], graph modification [10], and graph contraction [19], have been used in the literature to describe this problem, each within its specific context, leading to various proposed approaches. Regardless of the terminology or context, most of these problems focus on reducing the size of the graph while preserving a specific property [24], while some approaches aim to modify a graph to satisfy a given property [15,28]. Furthermore, graphs can be compressed in different ways, such as vertex deletions, edge deletions, and edge contractions. It is worth noting that many of the resulting graph modification problems are NP-hard, as indicated in [15].
A relevant problem, commonly referred to as the blocker problem [19], is defined as follows. Given a graph G, integers k and d, an invariant π : G → R, and some modification operations (such as edge contractions), a blocker problem asks whether there exists a set of at most k graph modification operations such that in the resulting graph G′, π(G′) ≤ π(G) − d holds. In recent years, blocker problems have been studied for various graph properties, such as the chromatic number [3,9], maximum weight independent set and minimum weight vertex cover [4], maximum independent set [8,9], the clique number [9], the total domination number [12], the diameter [14], and maximum weight clique [22]. Many of these blocker problems are defined as contraction problems, in which graphs can only be modified via edge contractions. More precisely, given a graph G, integers k and d, and an invariant π : G → R, CONTRACTION(π) asks whether there exists a set of at most k edge contractions that results in a graph G′ with π(G′) ≤ π(G) − d. Galby et al. [11] studied contraction problems in which a specific edge is provided as input (in addition to π). As an important contribution, Galby et al. [11] proved that, unless P = NP, there exists no polynomial-time algorithm that decides whether contracting a given edge reduces the total domination number. Biedl et al. [6] studied the problem of flow-preserving graph simplification, which is the problem of finding a set of edges whose removal does not change the maximum flow of the underlying network.
Shortest path queries are crucial to various domains, including search engines [13], networks [1,16], and transportation [2,30]. In work more closely related to this paper, Bernstein et al. [5] studied a slightly different variant of CONTRACTION. In their work, Bernstein et al. [5] focused on compressing a given graph as much as possible while permitting only a limited amount of distance distortion between any pair of vertices. Given a tolerance function φ(x) = x/α − β, with α ≥ 1 and β ≥ 0, Bernstein et al. [5] studied the problem of finding a maximum-cardinality set of edges whose contraction results in a graph G′ such that d_G′(u, v) ≥ φ(d_G(u, v)) for all u, v ∈ G. However, they only focused on finding an optimal set instead of optimally redistributing the weights. More specifically, after finding the optimal set E′, they set the weight of each edge e ∈ E′ to zero.
Unlike the work by Bernstein et al. [5], the work by Sadri et al. [24] focuses on optimally redistributing the weights. However, they do not provide any guaranteed bounds on the amount of error; instead, they assess the efficiency of their proposed approach through a set of experimental studies. Moreover, their weight redistribution approach for trees ignores the size of each subtree rooted at the endpoints of a given contracted edge. This is a key factor in deciding an optimal assignment, as we will show in this paper. More recently, Liang et al. [18] studied the problem of reachability-preserving graph compression. There have also been other works related to graph compression for unweighted and weighted graphs, as listed in [23,26]. Zhou et al. [29] proposed an efficient approach to remove a large portion of the edges in a network without affecting the overall connectivity by much. Ruan et al. [23] studied the minimum gate-vertex set discovery (MGS) problem. The MGS problem is concerned with finding a minimum-cardinality set of vertices, designated as gate vertices, using which every non-local pair of vertices (whose distance is above some threshold) is able to recover its distance in the original network. However, the work by Ruan et al. [23] only studies unweighted graphs.
Where our work stands in the literature: To the best of our knowledge, all existing works have either focused only on finding an optimal set of edges to contract or have not provided any bounds on the amount of error. We study a different problem: instead of choosing which edge to contract, we are interested in how to contract a given edge optimally. Even though we still study distance-preserving graph compression, our focus is mainly on optimally modifying the graph after a given edge has been contracted. Our primary modification operation is changing the edge weights of the graph. It is worth noting that this problem has received limited attention in the literature, with the closest existing work being the study by Sadri et al. [24]. Their approach involves solving a system of equations to determine the new edge weights in the resulting graph. However, their analysis of the problem has certain limitations. Firstly, they do not offer any optimality guarantees for their weight distribution technique. Furthermore, their weight redistribution method does not account for the sizes of the individual subgraphs connected to a given edge. In contrast, as we will demonstrate throughout this paper, the sizes of subtrees (particularly in the context of paths and trees) play a crucial role in achieving optimal weight redistribution.

The organization of the paper: The remainder of this paper is organized as follows. In Section 1.1, we present a summary of our main results along with some comments and details regarding each contribution. In Section 2.1, we describe the notation used in the paper, using which we formally define the scope of our paper in Section 2.2. In Section 3, we study the problem of distance-preserving graph compression for weighted paths, where we prove optimal approaches for contracting any set of k edges. In Section 4, we study the problem of graph compression for weighted trees, where we provide an optimal linear-time algorithm for contracting a single edge. We present the concluding remarks of this paper, along with some potential avenues for future work, in Section 5.

Contributions and Results
In Section 3, we study the problem of distance-preserving graph compression for weighted paths.
As a warm-up, we prove an optimal bound for merging a single edge in a path topology in Section 3.1; the main result is stated in Theorem 1.
In Section 3.1, we present a method for transforming any weight redistribution for a given merged edge e* into another redistribution in which only the weights of its neighbouring edges are altered.
We present Algorithm 1 for merging any set of k ≤ n/2 independent edges (edges that have no endpoints in common and induce a matching on the path) in a path of size n.
We note that Algorithm 1 produces suboptimal solutions when applied to a contiguous subpath (a connected subgraph) of the given input path. We relate this suboptimal performance to the distinction between merging two regular vertices and merging two supernodes. We thoroughly investigate this distinction in Lemma 3, where we present an optimal redistribution for merging two supernodes.
Having the suboptimal performance of Algorithm 1 for merging subpaths in mind, in Section 3.3 we study the problem of finding the optimal redistribution for any connected subgraph of a given input path. The optimal method for contracting any contiguous subpath of the input path is presented in Theorem 2.
After studying the case of contiguous subpaths, we present Theorem 3 as another generalization of the single-edge case (Theorem 1). When the edge set to be compressed consists of k ≤ n/2 independent edges that induce a matching on the input path, Theorem 3 presents an optimal method for graph compression. Theorem 3 also provides a correctness proof for Algorithm 1.
In Section 4, we study the problem of distance-preserving graph compression for the tree topology, where we present optimal approaches for merging a single edge in a weighted tree. To this end, we define a relevant problem, which we refer to as the marking problem. The objective of the marking problem is to minimize the error, as defined in Section 2, by marking a subset of the neighbouring edges of the merged edge e*. For merging an edge e* with weight w*, an edge e_i is said to be marked if its new weight is w′(e_i) = w(e_i) + w*. As a warm-up, in Section 4.3, we study the marking problem for a tree in which the neighbouring subtrees of e* are of equal sizes. For such edges, we show that the optimal marking is achieved when all edges either to the left or to the right (but not both) of e* are marked. In Section 4.4, we generalize the findings of Section 4.3 and present an optimal marking for any merged edge e* in a weighted tree.
The definition of the marking problem implies that an edge can either be fully marked or unmarked. It is non-trivial to see whether fractionally marking the edges produces better results. Therefore, in Section 4.5, we thoroughly investigate the distinction between the marking problem (Definition 9) and the fractional marking problem (Definition 14) and conclude that any solution to the latter can be transformed into a solution to the former without worsening the error value.
We present Algorithm 2, an O(|V|)-time algorithm, for finding an optimal marking for e* in a weighted tree.

Preliminaries
In this section, we first discuss the common notation (Section 2.1) and then present some additional definitions (Section 2.2) that help describe the scope of our paper. In Section 2.3, we present a simple number-theoretic lemma, which is later used in some of the proofs in this paper. Throughout this section, we use the path in Figure 1 as the running example for the definitions.

Notation
Let G = (V, E) denote a graph with V and E as its sets of vertices and edges, respectively. With every edge e ∈ E, we associate a weight w(e), where w : E → R≥0. We sometimes denote an edge e by (u, v), where u, v ∈ V are referred to as the endpoints of e. Throughout this paper, we frequently denote edges and vertices using subscripts (for instance, e_i and v_i) and superscripts for merged edges (for instance, e*). When the context is clear, we sometimes abuse the weight notation and denote the weights of e_i and e* by w_i and w*, respectively. We denote the number of vertices by n = |V| and a path on n vertices by P_n. Throughout this paper, we frequently use n_L and n_R in different contexts to denote different quantities. However, in most cases, we denote by n_L and n_R the number of vertices to the left and right of a given vertex of a path, respectively (including itself). For instance, in Figure 1-(a), n_L and n_R denote the number of vertices to the left of (and including) v′_3 and to the right of (and including) v′_4, respectively. Formally, letting H denote the corresponding disconnected subgraph, let G_1 be the connected component of H that is adjacent to v′_3, and let G_2 be the connected component of H that is adjacent to v′_4; we then have n_L = |V(G_1)| and n_R = |V(G_2)|.

Figure 1: The path used as the running example in Section 2: (a) a path of 8 vertices, with regular edges denoted by e_i (w_i = w(e_i)) and the contracted edge highlighted in red and denoted by e* = (u_1, v_1) (w* = w(e*)); (b) the same path after contracting e* and marking e_3 by setting w′(e_3) = w_3 + w*. In this example n_L = n_R, but this is not always the case.

Throughout this paper, we assume that the graph is laid out in the plane and that the edge to be merged (e* in Figure 1-(a)) is horizontal. This assumption will simplify the description of our results.

Additional Definitions
We now provide some additional definitions that delimit the scope of our paper.
Definition 1 For a weighted graph G = (V, E), the distance between two vertices u, v ∈ V, denoted by d G (u, v), is the length of the shortest weighted path between u and v in G.
Definition 2 A merged edge, or a contracted edge, is one whose endpoints are merged, and the edge itself is removed from the graph.
For instance, e* = (u_1, v_1) in Figure 1-(a) (highlighted in red) is a contracted edge. After contracting e*, the path of Figure 1-(a) is transformed into the one in Figure 1-(b).
Definition 3 A supernode is a node containing a subset V′ ⊂ V of the nodes of the original graph, resulting from a series of edge contractions. We denote the set of all supernodes by V_s.

Definition 4 For a supernode v ∈ V_s, the cardinality of v, denoted by C(v), where C : V_s → N, is the number of regular vertices it contains.
In the path of Figure 1, the supernode resulting from contracting e* = (u_1, v_1) has cardinality two. We denote by V_m ⊆ V the set of merged vertices, that is, the vertices incident to a contracted edge; in the path of Figure 1, we have V_m = {u_1, v_1}. The set of unmerged vertices is defined as V̄_m = V \ V_m.

With reference to a set of merged edges E_m ⊂ E and a weight redistribution w′, let G′ be the resulting graph after contracting the edges in E_m and setting the new edge weights according to w′. The error associated with w′ with respect to E_m is denoted by |ΔE| and calculated as:

|ΔE| = Σ_{u,v ∈ V : {u,v} ∩ V_m ≠ ∅} |d_G(u, v) − d_G′(u, v)|.  (1)

In other words, the error is equal to the sum of the absolute differences (between G and G′) of all shortest path lengths between vertices u, v, at least one of which is in V_m. Returning to our example in Figure 1, the error function of Eq. (1) sums up the absolute values of the shortest path differences between the vertices of V̄_m and the vertices of V_m = {u_1, v_1}, as well as among the vertices of V_m themselves. As the final example, we now explain how the distance difference between one such pair of vertices is calculated. In Figure 1, the shortest path length between v′_1 and u_1 changes from w_1 + w_2 + w_3 in G (Figure 1-(a)) to w_1 + w_2 + w_3 + w* in G′ (Figure 1-(b)). The error induced by this change is thus equal to |(w_1 + w_2 + w_3) − (w_1 + w_2 + w_3 + w*)| = w*.

We are now ready to present the formal definition of our first studied problem:

Definition 8 Distance-Preserving Graph Compression: Given a graph G and a set of contracted edges E_m, the problem of distance-preserving graph compression is to find a weight redistribution w′ for which |ΔE| is minimized.
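To make Eq. (1) concrete, the following is a minimal sketch (ours, not from the paper, with hypothetical weights) that computes |ΔE| for contracting a single edge of a weighted path, given an arbitrary redistribution of the surviving edge weights:

```python
from itertools import accumulate

def path_distances(weights):
    """Return a distance oracle for a path: d(i, j) = |pre[j] - pre[i]|."""
    pre = [0] + list(accumulate(weights))
    return lambda i, j: abs(pre[j] - pre[i])

def contraction_error(w, m, w_new):
    """|dE| of Eq. (1) for contracting edge m = (v_m, v_{m+1}) of a path.

    w     -- original edge weights (path on len(w)+1 vertices)
    m     -- index of the contracted edge e*
    w_new -- weights of the surviving edges after redistribution, in order
    """
    n = len(w) + 1
    d_old = path_distances(w)
    d_new = path_distances(w_new)
    merged = (m, m + 1)                       # V_m: the endpoints of e*
    image = lambda i: i if i <= m else i - 1  # vertex i of G -> vertex of G'
    return sum(abs(d_old(u, v) - d_new(image(u), image(v)))
               for u in range(n) for v in range(u + 1, n)
               if u in merged or v in merged)

# Contract e* = (v3, v4) with w* = 4 and mark the left neighbour e2:
err = contraction_error([1, 2, 3, 4, 5, 6, 7], 3, [1, 2, 7, 5, 6, 7])
print(err)  # 28
```

The brute-force double loop is only meant to mirror Eq. (1) literally; on a path, prefix sums already give all distances in linear time.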

A Number-Theoretic Lemma
The following lemma is used in some of the proofs.

Lemma 1 Let A, B, C ≥ 0, and for a real number x define α_1 = |x − A| + |x − A − B| and α_2 = |x − C| + |x − B − C|. Then α_1 ≥ B and α_2 ≥ B; moreover, α_1 = B for A ≤ x ≤ A + B, and α_2 = B for C ≤ x ≤ B + C.

Proof: Let us prove α_1 ≥ B; the other proof is analogous. For the sake of contradiction, assume that α_1 < B. We have four cases depending on whether the values inside the absolute value function (x − A and x − A − B) are positive or negative. Note that |a| = a when a ≥ 0, and |a| = −a otherwise.

Case 1: x < A and x < A + B: α_1 = (A − x) + (A + B − x) = 2A + B − 2x, and α_1 < B implies x > A, which contradicts the assumption (x < A).

Case 2: x < A and x ≥ A + B: these two conditions imply that B < 0. We have α_1 = (A − x) + (x − A − B) = −B, and α_1 < B implies −B < B, i.e., B > 0, which is a contradiction since B < 0.

Case 3: x ≥ A and x < A + B: α_1 = (x − A) + (A + B − x) = B, which contradicts the assumption α_1 < B.

Case 4: x ≥ A and x ≥ A + B: α_1 = (x − A) + (x − A − B) = 2x − 2A − B, and α_1 < B implies x < A + B, which contradicts the assumption (x ≥ A + B).
Since we get a contradiction in every possible case, we have α_1 ≥ B, and α_1 = B for A ≤ x ≤ A + B. Similarly, we have α_2 ≥ B, and α_2 = B for C ≤ x ≤ B + C. □

We will also use the following corollary.

Distance-Preserving Compression of Weighted Paths

In this section, we study the problem of distance-preserving graph compression for a weighted path with non-negative weights. The paths in this section all have n ≥ 3 vertices, since the compression problem for a two-vertex path is trivial. The remainder of this section is organized as follows. As a warm-up, we provide optimal bounds for merging a single edge in Section 3.1. In Section 3.2, we study the path compression problem for an edge connecting two supernodes, each consisting of a subset of nodes of the path. Two generalizations of the results of Section 3.1, for contracting any subpath (a contiguous subpath of the original path) and any set of independent edges (that induce a matching in the original path), are provided in Section 3.3 and Section 3.4, respectively.
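Before proceeding, the inequality of Lemma 1 can be sanity-checked numerically. The snippet below (ours, with arbitrary values for A and B) sweeps x over a range around the interval [A, A + B]:

```python
def alpha1(x, A, B):
    """The quantity alpha_1 = |x - A| + |x - A - B| from Lemma 1."""
    return abs(x - A) + abs(x - A - B)

A, B = 3.0, 5.0
xs = [A - 5 + 0.01 * k for k in range(1500)]          # sweep x over [A-5, A+10)
assert all(alpha1(x, A, B) >= B - 1e-9 for x in xs)   # alpha_1 >= B everywhere
assert all(abs(alpha1(x, A, B) - B) < 1e-9            # equality on [A, A+B]
           for x in xs if A <= x <= A + B)
print("Lemma 1 check passed")
```

This is, of course, only a numeric illustration of the triangle inequality |x − A| + |x − A − B| ≥ |(x − A) − (x − A − B)| = B, not a proof.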

A Tight Lower Bound for Merging One Edge
This section presents a tight lower bound on the optimal error (Eq. (1)) associated with merging a single edge in a path topology. As seen in Figure 2, the edge between v_2 and v_3 is merged, and only the immediate edge weights are altered to x and y. Later in this section (Lemma 2), we show why it is sufficient to alter only the immediate edge weights (those with original weights A and C in Figure 2) to obtain the minimum amount of error. Note that, for merging a single edge (Figure 2), we have n = n_L + n_R + 2. The following theorem is now presented:

Theorem 1 Let |ΔE| be the error associated with merging a single edge e* = (v_2, v_3) (with weight B) in a path P_n, n ≥ 3 (Figure 2). Furthermore, let n_L and n_R denote the number of vertices of the subpaths to the left and to the right of e*, respectively. Then |ΔE| ≥ (n_L + n_R + 1) · B. Moreover, this lower bound is tight and can be achieved by marking the left neighbour of the merged edge. If the merged edge has no left or right neighbour, the lower bound can be achieved by simply contracting the edge, and no further modifications (weight changes) are required.
Proof: Figure 2 depicts the situation in which edge e* with weight B is merged. We first assume that e* has a left neighbour, and we handle the no-neighbour exception at the end of the proof. As seen in Figure 2-(b), let x and y denote the new edge weights of the neighbouring edges of e*, and let G_1 (with n_L vertices) and G_2 (with n_R vertices) denote the subpaths rooted at v_1 and v_4, respectively. We denote the error by |ΔE| and classify it into different parts (in accordance with Eq. (1)):

1. The error between the two merged vertices v_2 and v_3: their distance drops from B in G to 0 in G′, contributing exactly B.

2. Between two vertices u, v ∈ G_1, there is no error, because the shortest path length between all such pairs of vertices is unchanged. Similarly, between two vertices u, v ∈ G_2, there is no error.
3. The error between a vertex u ∈ G_1 and the vertices in {v_2, v_3}: in the path from u to v_2, the only changed (with reference to edge weights) subpath is the one between v_1 and v_2, whose weight changes from A (Figure 2-(a)) to x (Figure 2-(b)); hence this error is |x − A|. Similarly, the error between some vertex u ∈ G_1 and v_3 is |x − A − B|, as that is the amount by which the weight of the subpath from v_1 to v_3 changes. The total amount of error between all vertices u ∈ G_1 and the vertices in {v_2, v_3} is therefore n_L (|x − A| + |x − A − B|).

4. By similar reasoning to the one provided above, the total amount of error between all vertices u ∈ G_2 and the vertices in {v_2, v_3} is n_R (|y − C| + |y − B − C|).

Therefore, we can formulate |ΔE| as

|ΔE| = B + n_L · α_1 + n_R · α_2,

where α_1 and α_2 are the values defined in Lemma 1 (with y as the variable of α_2). Using Lemma 1, we know α_1 ≥ B and α_2 ≥ B, and hence |ΔE| ≥ (n_L + n_R + 1) · B, which proves the first part of the theorem.

As for the second part, we now show that this lower bound is tight. By marking the left neighbouring edge of e* (effectively setting x = A + B and y = C) we get α_1 = B and α_2 = B, and thus |ΔE| = (n_L + n_R + 1) · B. This analysis concludes the proof for the case where e* has a left neighbour. If e* has no left neighbour, then n_L = 0 and no shortest path on the left side of e* is affected; contracting e* without any weight changes leaves y = C, so α_2 = B, and together with the merged pair itself the error equals (n_R + 1) · B, matching the lower bound. □

Figure 2: (a) The original graph, before contracting e* = (v_2, v_3).

Observe that marking the left neighbouring edge is not the only way of achieving the lower bound, as it can also be achieved by marking the right neighbouring edge. In fact, any assignment of values to x and y such that x = A + ϵ_1, y = C + ϵ_2, and ϵ_1 + ϵ_2 = B (with ϵ_1, ϵ_2 ≥ 0) has the same impact. Therefore, for merging a single edge in a weighted path, the marked neighbour can be chosen arbitrarily, and the error value is oblivious to the marking direction. However, this observation (being oblivious to the marking direction) only holds for merging two regular nodes. As we will show in Lemma 3, for merging two supernodes the optimal error is obtained by marking the edge adjacent to the smaller node with respect to cardinality.

Theorem 1 assumes that, to achieve the minimum amount of error, we have to alter only the immediate edges directly connected to the endpoints of the merged edge. We now prove the correctness of this assumption. We first define some notation. For any edge e_i ∈ E, we denote its new weight as w′(e_i) = w(e_i) + ϵ_i, where ϵ_i is a real number (see Figure 3). This definition allows us to increase or decrease the weight of any given edge e_i by ϵ_i. We call this assignment of weights a redistribution for e*. We refer to an edge e_i as altered if ϵ_i ≠ 0, and unaltered otherwise. Moreover, let V_L, V_R ⊂ V be the vertices to the left and right of the merged edge, respectively, as depicted in Figure 3. The problem is now to show that there exists an optimal solution with only the immediate edges altered. For simplicity, we slightly abuse the notation and write w(e_i) as w_i and w(e*) as w*. In the following lemma, we show a construction for transforming any redistribution into another equivalent redistribution in which only the immediate edges are altered.
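Theorem 1 can be checked numerically. The sketch below (ours; the weights are hypothetical) computes the error of Eq. (1) when only the immediate neighbours of e* are altered, confirms that marking the left neighbour attains (n_L + n_R + 1) · B, and verifies that a grid search over x and y finds nothing better:

```python
from itertools import accumulate

def merge_error(w, m, x, y):
    """Error of contracting edge m of a path when only the weights of its
    left/right neighbouring edges are replaced by x and y."""
    n = len(w) + 1
    w_new = w[:m - 1] + [x, y] + w[m + 2:]
    pre, pre2 = [0] + list(accumulate(w)), [0] + list(accumulate(w_new))
    image = lambda i: i if i <= m else i - 1
    return sum(abs((pre[v] - pre[u]) - (pre2[image(v)] - pre2[image(u)]))
               for u in range(n) for v in range(u + 1, n)
               if u in (m, m + 1) or v in (m, m + 1))

# 6 vertices; e* is edge 2 with B = 5, neighbours A = 2 and C = 3
w, m = [4, 2, 5, 3, 6], 2
A, B, C, nL, nR = 2, 5, 3, 2, 2
mark_left = merge_error(w, m, A + B, C)          # x = A + B, y = C
assert mark_left == (nL + nR + 1) * B            # tight bound: 25
best = min(merge_error(w, m, x / 2, y / 2)       # grid search over x, y
           for x in range(31) for y in range(31))
assert best == mark_left                         # nothing beats marking left
```

The grid search is only a sanity check on one instance; the lower bound of Theorem 1 is what rules out better assignments in general.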
Lemma 2 (See Figure 3 and Figure 4) For a merged edge e* (in a weighted path) that has both left and right neighbouring edges, any weight redistribution can be transformed into another weight redistribution in which only the left neighbouring edge of e* is altered (i.e., ϵ_i = 0 for all i ≠ n_1 in Figure 3; see Figure 4). The error associated with this redistribution is no worse than that of the original one.
Proof: We prove the lemma by presenting a construction for transforming any arbitrary weight redistribution into another one in which only the left neighbouring edge is altered. Furthermore, we show that this transformation does not worsen the error. The illustration is mainly based on Figure 3 and Figure 4. The construction is simple (for illustration, see Figure 4): set ϵ_i = 0 for all i ≠ n_1, and ϵ_{n_1} = w*. Note that this new redistribution may cause some parts of the error to increase. However, we will use Corollary 1 to provide an upper bound on any potential error increase and show that there will always be enough decrease in error to counterbalance the increase. In the original redistribution, the error between two vertices v_i, v_j ∈ V_L (i < j) is:

|Σ_{k=i}^{j−1} ϵ_k|.  (3)

The indices i and j used in this proof are based on the ones depicted in Figure 3 and Figure 4.
Assume that the path of Figure 3 is labelled as shown, with the merged edge e* lying between v_{n_1+1} and v_{n_1+2}. For instance, v_1 and v_3 are two vertices from V_L in Figure 3, and the original shortest path length between v_1 and v_3 is w_1 + w_2 (Figure 3-(a)). In the original redistribution of Figure 3-(b) (which we transform into Figure 4-(b)), this length is (w_1 + ϵ_1) + (w_2 + ϵ_2). Analogous expressions can be defined for any two vertices v_i, v_j ∈ V_R. Moreover, the error between a vertex v_i ∈ V_L and a vertex v_j ∈ V_R (i < j) in the original redistribution is equal to:

|w* − Σ_{k=i}^{j−1} ϵ_k|,  (4)

where the sum skips the contracted edge e*, since the contribution w* of e* disappears and is (partially) compensated by the alterations ϵ_k along the path from v_i to v_j. Transforming the weight redistribution (as shown in Figure 4) changes the error value. We break this change down into five different cases:

JGAA, 28(1) 179-224 (2024) 191
Case 1: The error between two vertices v_i, v_j ∈ V_L (i < j, j ≠ n_1 + 1) decreases by |Σ_{k=i}^{j−1} ϵ_k|, because after the construction this error is equal to zero (compare Figure 4-(b) with Figure 3-(a) for v_1 and v_3); using Eq. (3), the change is equal to 0 − |Σ_{k=i}^{j−1} ϵ_k|.

Case 2: The error between some vertex v_i ∈ V_L, v_i ≠ v_{n_1+1}, and every vertex v_j ∈ V_R decreases. Specifically, the error between v_i and v_{n_1+2} decreases by |w* − Σ_{k=i}^{n_1} ϵ_k|, using Eq. (4) (see Figure 5).

Case 3: The error between some vertex v_i ∈ V_L, v_i ≠ v_{n_1+1}, and the merged vertex v_{n_1+1} changes from |Σ_{k=i}^{n_1} ϵ_k| (in the original redistribution) to w* (after the construction). In other words, if the construction causes an increase in error, it is at most equal to |w* − Σ_{k=i}^{n_1} ϵ_k|. However, from Case 2 we know that each such vertex v_i also has an error decrease of |w* − Σ_{k=i}^{n_1} ϵ_k|, which will be enough to nullify this increase.

Case 4: The error between two vertices v_i, v_j ∈ V_R decreases, because after the construction this error is equal to zero.

Case 5:
The error between some vertex v_j ∈ V_R, v_j ≠ v_{n_1+2}, and v_{n_1+1} changes by an amount determined by the alterations Σ ϵ_k on the subpath between v_{n_1+2} and v_j. This change may lead to an increase in error; however, we use Corollary 1 to bound this increase. Therefore, we have enough decrease from Case 4 to nullify this increase.

□
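The construction of Lemma 2 can also be exercised empirically. The sketch below (ours) draws random redistributions that alter every surviving edge and checks that the left-neighbour-only redistribution (ϵ_{n_1} = w*) is never worse:

```python
import random
from itertools import accumulate

def contraction_error(w, m, w_new):
    """Error of Eq. (1) for contracting edge m of a path, where w_new lists
    the redistributed weights of the surviving edges, in order."""
    n = len(w) + 1
    pre, pre2 = [0] + list(accumulate(w)), [0] + list(accumulate(w_new))
    image = lambda i: i if i <= m else i - 1
    return sum(abs((pre[v] - pre[u]) - (pre2[image(v)] - pre2[image(u)]))
               for u in range(n) for v in range(u + 1, n)
               if u in (m, m + 1) or v in (m, m + 1))

random.seed(1)
for _ in range(200):
    w = [random.randint(1, 9) for _ in range(6)]    # path on 7 vertices
    m = 3                                           # e* = (v3, v4), w* = w[3]
    survivors = w[:m] + w[m + 1:]
    # arbitrary redistribution: perturb every surviving edge
    arbitrary = [wi + random.uniform(-1, 1) for wi in survivors]
    # Lemma 2 construction: alter only the left neighbour, by +w*
    left_only = survivors[:]
    left_only[m - 1] += w[m]
    assert (contraction_error(w, m, left_only)
            <= contraction_error(w, m, arbitrary) + 1e-9)
```

Random trials cannot replace the case analysis above, but they make the "no worse than any redistribution" claim easy to probe on concrete instances.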
Corollary 2 For a given merged edge e* with left and right neighbouring edges, there exists an optimal redistribution in which only the left neighbouring edge of e* is marked and all other edges are unmarked.
Based on Theorem 1, the algorithm for merging a set S of edges in a path G = P_n is presented in Algorithm 1. Algorithm 1 repeatedly applies Theorem 1: while S is non-empty, it picks an edge e ∈ S (in an arbitrary order), marks the left neighbouring edge of e, removes e from G while merging its endpoints, and removes e from S.

Unfortunately, Algorithm 1 may produce suboptimal results when applied to specific kinds of inputs. Precisely, it may produce suboptimal results when merging a connected subpath of the given path. The reason behind this suboptimal performance lies in the difference between merging two regular nodes and merging two supernodes. An example of merging a subpath of size four is depicted in Figure 6-(a), for which Algorithm 1 may produce the suboptimal solution of Figure 6-(b). Later, in Section 3.3, we shall show that the optimal solution for this example is the one depicted in Figure 6-(c). Furthermore, in Theorem 3, we prove that when the input to Algorithm 1 consists of independent edges in G (i.e., if S induces a matching in G), it produces optimal results.

As seen above, Algorithm 1 may find suboptimal solutions when given an entire subpath of size k. The main reason behind this suboptimal performance lies in the difference between merging regular nodes and supernodes. Recall from Definition 3 that a supernode contains more than one node of the original graph. In this section, we show that merging two supernodes differs from merging two regular vertices, and we provide a generalized version of Theorem 1. Interestingly, unlike merging two regular nodes, where the error value is oblivious to the marking direction, for merging supernodes this direction is directly affected by the cardinality (Definition 4) of each endpoint. In the following lemma, we shall see that for merging an edge e* = (u, v) connecting two supernodes u and v, the optimal solution is obtained by marking the edge adjacent to the lighter vertex (the one with the smaller cardinality) among u and v.

Merging Supernodes
Lemma 3 Suppose we have supernodes v and u (as shown in Figure 7), with C(v) = k and C(u) = k′ (where k ≥ k′), connected to vertices w_1 and w_2, respectively. The error incurred by merging the edge e* = (u, v) admits a tight lower bound, which can be achieved by marking the neighbouring edge adjacent to the smaller vertex among v and u in terms of cardinality (the edge e = (u, w_2) in Figure 7). If the smaller vertex, with reference to cardinality, has no neighbouring edge other than e* = (u, v), then the optimal error can be achieved by contracting e* without any further modifications or weight changes.
Proof: The analysis is similar to the case of merging regular vertices: we enumerate all possible error values and then deduce the optimal assignment. We first assume that u (the smaller vertex) is adjacent to another edge e′ ≠ e*, and we handle the other case (u adjacent only to e*) later in the proof. We denote the error by |ΔE|. Because k ≥ k′, we can simplify |ΔE| accordingly, and using Lemma 1, together with the fact that k − k′ ≥ 0, we obtain a lower bound on |ΔE| (Eq. (7)). This lower bound is tight and can be achieved by setting y = B + C and x = A in Eq. (5). Now, if u is only adjacent to e*, the lower bound can be achieved by just contracting e* and leaving the weight function unchanged: in that case n_R = 0, and the lower bound is attained without any weight changes. It is worth noting that, similar to Theorem 1, we assume that it is sufficient to alter only the neighbouring edges of the merged edge e* = (u, v). The proof of this assumption is almost identical to that of Lemma 2: any arbitrary redistribution can be transformed into another redistribution in which only the edge adjacent to the smaller vertex (e = (u, w_2)) is marked. Then, as in the proof of Lemma 2, the decrease in error is always sufficient to counterbalance any potential error increase. The only difference is that the decrease in error and any potential error increase are now weighted by k and k′, respectively. Since k ≥ k′, the proof follows. □

Remark 1 Lemma 3 is a generalization of Theorem 1. Thinking of each regular vertex as a supernode with cardinality one, we have k = k′ = 1, and using Lemma 3 the optimal error is achieved by arbitrarily marking one of the neighbouring edges (since the endpoints have equal cardinalities).
Remark 2 Using Lemma 3, we can now explain the suboptimal performance of Algorithm 1 on edges that are not independent and form a contiguous subpath. For inputs of this kind, Algorithm 1 continuously marks the left neighbouring edges of all edges e* ∈ S, potentially marking an edge adjacent to the heavier endpoint of some e* ∈ S along the way and thereby violating the conditions of Lemma 3.
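For concreteness, the left-marking rule of Algorithm 1 can be sketched as follows (our own Python rendering; per Theorem 3, it assumes S induces a matching and that every edge of S has a left neighbour):

```python
def algorithm1(w, S):
    """Contract the edges of S (indices into the weight list w), marking the
    left neighbour of each contracted edge, and return the new weight list."""
    w = list(w)
    for m in S:             # arbitrary processing order
        w[m - 1] += w[m]    # mark the left neighbour: w'(e_{m-1}) += w*
    return [wi for i, wi in enumerate(w) if i not in S]

# hypothetical path on 7 vertices; S = {1, 4} induces a matching
print(algorithm1([3, 5, 2, 4, 6, 1], {1, 4}))   # [8, 2, 10, 1]
```

Because S is a matching, no edge of S is itself the left neighbour of another edge of S, so the processing order is irrelevant; on a contiguous subpath this rule would mark an edge adjacent to a heavier supernode, which is exactly the failure mode described in Remark 2.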
In the next section, we study the problem of optimally merging an entire contiguous subpath of the path. This section presents an optimal way of merging any contiguous subpath (or connected subpath) of a given path. For convenience, we refer to contiguous subpaths simply as subpaths. Let P′ ⊆ P be the desired subpath, consisting of k edges (see Figure 8 for an illustration). Throughout this section, we assume k is even; otherwise, we can convert P′ into an equivalent subpath of even length by adding a dummy edge of weight zero. As depicted in Figure 8, P′ partitions the remaining vertices into two subsets, V_L and V_R, with n_L and n_R vertices, respectively. We denote the error associated with contracting P′ by E and break it down into three components:

Merging Contiguous Subpaths
E L , the error between the vertices in V L and the ones inside P ′ , E R , the error between the vertices in V R and the ones inside P ′ , and E LR the error between the vertices of V L and V R .
With this in mind, we formulate E as:

E = E_L + E_R + E_LR,  (8)

such that:

E_L = n_L · Σ_{i=0}^{k} |x − (w_0 + w_1 + ... + w_i)|,  (9)
E_R = n_R · Σ_{i=1}^{k+1} |y − (w_i + w_{i+1} + ... + w_{k+1})|,
E_LR = n_L · n_R · |x + y − (w_0 + w_1 + ... + w_{k+1})|,

where x and y are the new weights of the neighbouring edges of P′ (Figure 8-(b)).
We first prove the optimal solution for E_L and derive the optimal solution for E_R by symmetry. Let E_L^(i) denote the value of E_L when x = w_0 + w_1 + ... + w_i, for 0 ≤ i ≤ k. We prove the following lemma using induction on i.
Lemma 4 For 0 ≤ i < k, we have E_L^(i+1) = E_L^(i) + n_L · (2i + 1 − k) · w_{i+1}.

Proof: For the base case E_L^(0), assume x = w_0. By a simple replacement into Eq. (9) we get:

E_L^(0) = n_L · Σ_{j=1}^{k} (w_1 + w_2 + ... + w_j).

In other words, every w_j, 1 ≤ j ≤ k, is repeated k + 1 − j times, and:

E_L^(0) = n_L · Σ_{j=1}^{k} (k + 1 − j) w_j.

Now assume the lemma holds for all j < i + 1. When x changes from Σ_{j=0}^{i} w_j to Σ_{j=0}^{i+1} w_j, E_L first increases by n_L × (i + 1) w_{i+1}, because there are i + 1 clauses c_0, c_1, ..., c_i that do not include w_{i+1}, and then decreases by n_L × (k − i) w_{i+1}, because there are k − i clauses c_{i+1}, ..., c_k that include w_{i+1} and were not covered by the previous assignment of x (x = Σ_{j=0}^{i} w_j). Therefore, we have:

E_L^(i+1) = E_L^(i) + n_L (i + 1) w_{i+1} − n_L (k − i) w_{i+1} = E_L^(i) + n_L (2i + 1 − k) w_{i+1}. □

The following lemma states that the optimal value of E_L is equal to E_L^(k/2).

Lemma 5 The optimal value of E_L is obtained when x = w_0 + w_1 + ... + w_{k/2}.

Proof: It suffices to show that the optimal value of E_L is equal to E_L^(k/2). From the proof of Lemma 4, we know that E_L^(i+1) − E_L^(i) = n_L (2i + 1 − k) w_{i+1}, which is negative for i < k/2 and positive for i ≥ k/2 (recall that k is even). In other words, E_L^(k/2) is strictly better than (less than) any other E_L^(j). Note that the optimal solution also cannot occur when x = ϵ + Σ_{j=0}^{k/2} w_j for some 0 < ϵ < w_{k/2+1}, because in that case k/2 + 1 clauses of Eq. (9) increase by ϵ while only k/2 clauses decrease by ϵ. Using simple replacements, we can deduce that E_L in this case is strictly worse than E_L^(k/2). Finally, let E_L^(<) denote the value of E_L for some x < w_0. For such x, all clauses in Eq. (9) have negative values inside the absolute value. Recalling that |a| = −a when a < 0, we have E_L^(<) = E_L^(0) + n_L (k + 1)(w_0 − x) > E_L^(0). The other case (x > w_0 + ... + w_k) can be handled analogously. □

Lemma 6
The optimal value of $E_R$ is obtained when $y = w_{k/2+1} + w_{k/2+2} + \cdots + w_{k+1}$.

Proof: By symmetry, using Lemma 4 and Lemma 5. □

We now derive the following theorem, which states that the optimal way of contracting an entire subpath is to distribute the left and right halves of the edges in the subpath to the left and right neighbours, respectively.
Theorem 2 Let $P' \subseteq P$ be a contiguous subpath of $P$ (a weighted path on $n$ vertices) consisting of $k$ edges $\{e_1, \ldots, e_k\}$, and let $e_0$ and $e_{k+1}$ be the left and right neighbouring edges of $P'$, respectively. Furthermore, let $w_i = w(e_i)$ for all $i \in \{0, \ldots, k+1\}$. The optimal error for contracting $P'$ is obtained by setting $x = w_0 + w_1 + \cdots + w_{k/2}$ and $y = w_{k/2+1} + \cdots + w_{k+1}$, where $x$ and $y$ are the new edge weights of $e_0$ and $e_{k+1}$ respectively (see Figure 8). If $P'$ has no left neighbour ($e_0$ does not exist), the optimal error can be achieved by setting $y = w_{k/2+1} + \cdots + w_{k+1}$. If $P'$ has no right neighbour ($e_{k+1}$ does not exist), the optimal error can be achieved by setting $x = w_0 + \cdots + w_{k/2}$. Finally, if $P'$ has neither a left nor a right neighbour, the optimal error can be achieved by simply contracting $P'$; no further modifications (weight changes) are required.

Proof: The case with both neighbours existing is immediate from Lemma 5, Lemma 6, Eq. (8), and the fact that this assignment makes $E_{LR} = 0$ (since $x + y = w_0 + \cdots + w_{k+1}$, the length of every path between $V_L$ and $V_R$ is preserved). If $P'$ has no left neighbour ($e_0$ does not exist), we have $n_L = 0$ and consequently $E_{LR} = E_L = 0$. It follows that $E = E_R$, whose optimal value is obtained by setting $y = w_{k/2+1} + w_{k/2+2} + \cdots + w_{k+1}$ using Lemma 6. The other cases can be shown analogously.

To prove that it is sufficient to alter only the immediate neighbouring edges of $P'$, we only provide a sketch to avoid repetition; the idea is very similar to the proofs of Lemma 2 and Lemma 3. Suppose we have an arbitrary weight redistribution, which we transform into the one provided in this theorem. Let $u$ be some vertex in $V_L$ (as in Figure 8). In the original redistribution, let $x$ be the length of the shortest path from $u$ to the super vertex $v^* = \{v_2, v_3, \ldots, v_{k+2}\}$ in $P'$ (Figure 8-(b)), and let $E_1$ be the error between $u$ and all of the vertices in $v^*$; for a fixed $x$, $E_1$ corresponds to $E_L / n_L$ (Eq. (9)). In the new redistribution, the error between $u$ and all vertices in $v^*$ is equal to $E_L^{(k/2)} / n_L$. Therefore, using Lemma 5, this change in the weight redistribution cannot worsen the error associated with any $u \in V_L$. The other cases can be handled analogously. □
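Theorem 2 can be sanity-checked by brute force on a small instance. The sketch below (our illustration; the vertex/edge indexing scheme and the sample weights are ours) contracts a subpath of a weighted path, redistributes weight onto the two neighbouring edges, and sums the error over all vertex pairs, counting each original vertex inside the supernode separately:

```python
def path_error(w, s, t, x, y):
    """Error of contracting edges s..t of a path into one supernode.

    w    -- original edge weights; edge i joins vertices i and i+1
    s, t -- contracted subpath = edges s..t (neighbours s-1, t+1 exist)
    x, y -- new weights given to the neighbouring edges s-1 and t+1
    """
    n = len(w) + 1
    pre = [0]
    for wi in w:
        pre.append(pre[-1] + wi)            # original vertex coordinates
    coord = [0.0]
    for e in range(len(w)):                 # coordinates after contraction
        if s <= e <= t:
            step = 0.0                      # contracted edges collapse
        elif e == s - 1:
            step = x
        elif e == t + 1:
            step = y
        else:
            step = w[e]
        coord.append(coord[-1] + step)
    err = 0.0
    for a in range(n):
        for b in range(a + 1, n):
            if s <= a <= t + 1 and s <= b <= t + 1:
                continue                    # both inside the supernode
            err += abs((coord[b] - coord[a]) - (pre[b] - pre[a]))
    return err

w = [3, 1, 2, 4, 5]       # contract edges 1..2 (weights 1 and 2), so k = 2
opt = path_error(w, 1, 2, x=3 + 1, y=2 + 4)   # Theorem 2: split the halves
```

On this instance the theorem's assignment gives error 9, and no other non-negative pair $(x, y)$ does better.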

Merging a Set of Independent Edges
We now generalize the results of Section 3.1 by proving the correctness of Algorithm 1 for merging any set of independent edges. The proof of correctness consists of the following lemma and theorem, which are similar to Lemma 2 and Theorem 1, respectively.

Lemma 7 For merging a set of independent edges $E_m$ from a path $P_n$ on $n$ vertices, there exists an optimal redistribution in which, for each $e \in E_m$, only its left neighbouring edge is marked. If $e' \in E_m$ is the leftmost edge on $P_n$, then this optimal solution is obtained by marking the left neighbouring edge of all edges in $E_m$ except for $e'$.

Proof: The proof is similar to the proof of Lemma 2; we provide a sketch using Figure 9.
In Figure 9, the edges in $E_m$ and the vertices in $V_m$ are highlighted in red, and the vertices in $\overline{V}_m$ are depicted in blue. We assign an ordering to the vertices (of $V_m$ and $\overline{V}_m$) and the edges (of $E_m$ and $\overline{E}_m$) from left to right, as illustrated in Figure 9. Let $v_i$ and $u_j$ be the $i$-th and the $j$-th vertex in $\overline{V}_m$ and $V_m$ respectively, according to this ordering. Similarly, let $e_i$ and $e^*_j$ be the $i$-th and the $j$-th edge in $\overline{E}_m = E - E_m$ and $E_m$ respectively. For convenience, we denote $w(e_i)$ and $w(e^*_i)$ by $w_i$ and $w^*_i$ respectively. Figure 9-(b) depicts some arbitrary weight redistribution in which the new weight of each edge $e_i$ is set to $w(e_i) + \epsilon_i$. We shall show that the error associated with the weight redistribution of Figure 9-(c) (in which the left neighbours of $E_m$ are marked) is no worse than that of Figure 9-(b). We again assume that all edges in $E_m$ have left neighbours. First, observe how this new weight redistribution removes any error between the vertices in $\overline{V}_m$: for instance, in the path of Figure 9-(c), the shortest path value between $v_1, v_3 \in \overline{V}_m$ is the same as the one in the original path (Figure 9-(a)). Therefore, it suffices to study only the error between pairs of vertices $(u, v)$, $u \in V_m$, $v \in \overline{V}_m$. Using our ordering of edges, let $e^*_k = (u_j, u_{j+1}) \in E_m$ and let $v_i \in \overline{V}_m$ be a vertex to the left of $e^*_k$ (we will explain how the other case can be handled analogously). Continuing with our example of Figure 9, let $e^*_k = e^*_3 = (u_5, u_6)$ and $v_i = v_1$. Observe how between $v_i$ ($v_1$ in Figure 9-(a)) and $u_{j+1}$ ($u_6$ in Figure 9-(c)), there exists no error in the new redistribution, as they have equal shortest path values in the original graph (Figure 9-(a)) and the new distribution (Figure 9-(c)). We show that, going from the distribution of Figure 9-(b) to the one in Figure 9-(c), any increase in the error between $v_i$ and the left endpoint of $e^*_k$ ($u_j$) can be nullified by the decrease in the error between $v_i$ and $u_{j+1}$. The case where $v_i$ is located on the right of $e^*_k$ can be handled similarly.

For any $E' \subseteq E$ we define the following quantities: $W(E')$, the total weight of $E'$ in the original graph, and $W'(E')$, the sum of all $\epsilon_i$'s assigned to the edges of $E'$ in the distribution of Figure 9-(b). Let $\pi_{v,u}$, $\pi'_{v,u}$, and $\pi''_{v,u}$ denote the shortest path values between $v$ and $u$ in the original graph (Figure 9-(a)), the first redistribution (Figure 9-(b)), and the second redistribution (Figure 9-(c)) respectively. Moreover, let $E_{(u,v)}$ denote the set of edges on the unique shortest path from $u$ to $v$. We provide some examples of these quantities in Example 1 for better readability.

The error between $v_i$ and $u_{j+1}$ in the redistribution of Figure 9-(b) is $|\pi'_{v_i,u_{j+1}} - \pi_{v_i,u_{j+1}}|$; as mentioned before, the error between $v_i$ and $u_{j+1}$ in the weight redistribution of Figure 9-(c) is equal to zero, so transforming Figure 9-(b) into Figure 9-(c) decreases the error between $v_i$ and $u_{j+1}$ by exactly this quantity. The error between $v_i$ and $u_j$ in the redistribution of Figure 9-(b) is $|\pi'_{v_i,u_j} - \pi_{v_i,u_j}|$, while in the weight redistribution of Figure 9-(c) it is equal to $w^*_k$. Using Corollary 1, any increase in the error between $v_i$ and $u_j$ is at most the decrease in the error between $v_i$ and $u_{j+1}$; therefore, going from the first redistribution to the second one, the total error between the endpoints of $e^*_k = (u_j, u_{j+1})$ and $v_i$ does not increase. Since each edge $e^*_k \in E_m$ has exactly two endpoints, this concludes the proof for the first case ($v_i$ is on the left of $e^*_k$). The other case can be handled analogously. □

Example 1 Returning to our example of Lemma 7 and Figure 9, let $e^*_k = (u_j, u_{j+1}) = e^*_3 = (u_5, u_6)$, and $v_i = v_1$. The quantities defined above can then be read directly off Figure 9.

Theorem 3 Let $|\Delta E|$ be the optimal error resulting from merging a set of $k$ independent edges $e_1, e_2, \ldots, e_k$ with respective weights $w^*_1, w^*_2, \ldots, w^*_k$ from a path $P_n$ on $n$ vertices; then $|\Delta E| = (n - 2k) \sum_{j=1}^{k} w^*_j$. This optimal value can be achieved by marking the left neighbour of each edge in $E_m$ after contraction. If the leftmost edge in $E_m$ has no left neighbour, the optimal error can be achieved by marking the left neighbours of all other edges in $E_m$.
Proof: Let $w' : E \to \mathbb{R}_{\ge 0}$ be the weight redistribution that marks the left neighbouring edge (if any) of each edge in $E_m$ (Figure 9-(c)). That $w'$ is optimal follows directly from Lemma 7. We now derive the error associated with $w'$.

Since the edges in $E_m$ induce a matching on $P_n$, exactly $n - 2k$ vertices lie in $\overline{V}_m$. Recall from the proof of Lemma 7 that in $w'$, there exists no error between two vertices $v_1, v_2 \in \overline{V}_m$. Let us fix some $e^*_k \in E_m$. Using the proof of Lemma 7, we know that each vertex $v_i \in \overline{V}_m$ induces an error of $w^*_k$ with exactly one endpoint of $e^*_k$ (and no error with the other endpoint). Summing over all vertices $v_i \in \overline{V}_m$, we get that each edge $e^*_k \in E_m$ accumulates a total of $(n - 2k)\, w^*_k$ in error. Summing again over all edges $e^*_k \in E_m$ yields the desired bound. □
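The accounting in this proof can be replayed numerically. The sketch below (our illustration; the helper name and sample weights are ours) applies the left-neighbour redistribution to a small path and sums, as the proof does, the error between the contracted endpoints and the remaining vertices; pairs of contracted endpoints are left out, matching the count used in the proof, and the remaining vertices are checked to have no error among themselves:

```python
def left_mark_errors(w, merged):
    """Contract a matching `merged` (edge indices) in a path whose edge i
    joins vertices i and i+1; each contracted weight is added to the left
    neighbouring edge. Returns (error between V_m and the other vertices,
    error among the other vertices)."""
    n = len(w) + 1
    pre = [0]
    for wi in w:
        pre.append(pre[-1] + wi)             # original coordinates
    coord = [0.0]
    for e in range(len(w)):
        if e in merged:
            step = 0.0                       # contracted edge collapses
        elif e + 1 in merged:
            step = w[e] + w[e + 1]           # marked left neighbour
        else:
            step = w[e]
        coord.append(coord[-1] + step)
    V_m = {v for e in merged for v in (e, e + 1)}
    cross = others = 0.0
    for a in range(n):
        for b in range(a + 1, n):
            if a in V_m and b in V_m:
                continue                     # pairs of merged endpoints
            d = abs((coord[b] - coord[a]) - (pre[b] - pre[a]))
            if a in V_m or b in V_m:
                cross += d
            else:
                others += d
    return cross, others

w, merged = [2, 5, 3, 1, 4], {1, 3}          # n = 6, k = 2 independent edges
```

For this instance the first component equals $(n - 2k)\sum_j w^*_j = 2 \times 6 = 12$ and the second is zero, as the theorem predicts.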

Graph Compression for Trees
In this section, we study the problem of distance-preserving graph compression for weighted trees.
Precisely, we study a relevant problem, referred to as the marking problem, for a tree $T = (V, E)$, $|V| = n$, with weight function $w : E \to \mathbb{R}_{\ge 0}$. The remainder of this section is organized as follows. In Section 4.1, we formally define the marking problem. The adaptation of the error function (Eq. (1)) to the marking problem is explained in Section 4.2. As a warm-up, we study a special case of the marking problem in Section 4.3, after which we generalize the results in Section 4.4 and present a linear-time algorithm for solving the marking problem in Algorithm 2. As the final component of this section, we study the difference between the marking problem (Definition 9) and the fractional marking problem (Definition 14) in Section 4.5.

The Marking Problem for a Single Edge
As seen in Section 3, for merging a single edge in a weighted path, marking one of the neighbouring edges produces the optimal amount of error. An important question is how to generalize this result to weighted trees. We formally state the marking problem as follows.

Definition 9 The Marking Problem for Weighted Trees: Given a contracted edge $e^*$ in a weighted tree $T$, what subset of the neighbouring edges of $e^*$ should we mark such that the error value of Eq. (1) is minimized over all such possible subsets?

An example of the marking problem is depicted in Figure 10-(a), where edge $e^*$ with weight $w^*$ is contracted. As shown in Figure 10-(b), the goal in the marking problem is to mark a subset of the neighbouring edges of $e^*$, by setting the new weight of each marked edge $e_i$ to $w'(e_i) = w(e_i) + \epsilon_i$, $\epsilon_i \in \{0, w^*\}$, in a way that minimizes the error function of Eq. (1) over all such possible subsets. Note that the fractional case (where the weight of each marked edge $e_i$ is set to $w'(e_i) = w(e_i) + \epsilon_i$ with $\epsilon_i \in [0, w^*]$) is studied in Section 4.5.

In the tree of Figure 10-(a), $e^*$ has four neighbouring edges, namely $e_1 = (v_1, v_3)$, $e_2 = (v_1, v_4)$, $e_3 = (v_2, v_5)$, and $e_4 = (v_2, v_6)$. Different subsets of these neighbouring edges can be marked; for instance, in Figure 11-(a), $\{e_1, e_2\}$ is marked. In the remainder of this section, we may refer to each of these marked subsets as a marking for simplicity. For example, in Figure 11-(c), $\{e_1, e_3\}$ is a marking. An optimal marking is one that minimizes the error function of Eq. (1) over all possible markings.
Since, for merging an edge in a weighted path, marking one of the neighbouring edges gives the optimal amount of error, our intuition tells us that in a weighted tree, we have to mark all neighbouring edges on one side of the contracted edge $e^*$. As we shall show later, this intuition, though not completely correct, is optimal for specific kinds of input. To study the marking problem, we first present some definitions and observations, using Figure 10 and Figure 11 as our running examples. We assume the tree is laid out in the plane with $e^*$ (the edge to be merged) horizontal; this assumption simplifies the description of our results.

Definition 10 Let $T = (V, E)$ be a weighted tree with non-negative weights, and let $e^* = (v_1, v_2)$ be the merged edge with weight $w^*$, $V_m = \{v_1, v_2\}$, and $\overline{V}_m = V - V_m$. We denote by $L$ the number of subtrees to the left of $v_1$ and by $R$ the number of subtrees to the right of $v_2$. More formally, let $E' = E - e^*$; then $V_L = \{u \mid (u, v_1) \in E'\}$, $V_R = \{u \mid (u, v_2) \in E'\}$, $L = |V_L|$, and $R = |V_R|$.

For instance, in the tree of Figure 10, we have $V_L = \{v_3, v_4\}$ and $V_R = \{v_5, v_6\}$, and therefore $L = R = 2$. Removing $\{v_1, v_2\}$ from $T$ leaves a forest $F$, the components of which are used in our analyses and defined as follows:

Definition 11 Let $T$, $e^* = (v_1, v_2)$, $V_L$ and $V_R$ be as defined in Definition 10. Let $F$ be the forest $T - \{v_1, v_2\}$. Furthermore, assume that the connected components of $F$ are rooted at the vertices of $V_L$ or $V_R$, and let $C_L$ and $C_R$ be the sets of components of $F$ rooted at the vertices of $V_L$ and $V_R$ respectively. Then, we denote by $T^L_i$, $i \in \{1, \ldots, L\}$, the $i$-th member of $C_L$, and by $T^R_j$, $j \in \{1, \ldots, R\}$, the $j$-th member of $C_R$, given some arbitrary ordering on the members of $C_L$ and $C_R$.

In the tree of Figure 10, $L = 2$, and $C_L$ has two members (the subtrees rooted at $v_3$ and $v_4$); given some arbitrary ordering on the members of $C_L$, $T^L_1$ is the subtree rooted at $v_3$. We also formally define the cardinality of the subtrees of Definition 11 as follows:

Definition 12 Let $T^L_i$, $i \in \{1, \ldots, L\}$, and $T^R_j$, $j \in \{1, \ldots, R\}$, be as defined in Definition 11. We define $L_i = |\{v \mid v \in T^L_i\}|$ and $R_j = |\{v \mid v \in T^R_j\}|$. We refer to $L_i$ as the cardinality of the $i$-th edge on the left and $R_j$ as the cardinality of the $j$-th edge on the right.

A few examples of marking the edges of Figure 10 are provided in Figure 11-(a) to Figure 11-(c). In Figure 11-(a) and Figure 11-(b), all edges on one side of $e^*$ are marked, and in Figure 11-(c), a subset of edges from both sides is marked. Marking an edge can both increase and decrease the total amount of error. Before proceeding with the remainder of this section, we note the following lemma to justify our focus on minimizing the error between all pairs of vertices in $\overline{V}_m$.

Lemma 8 (See Figure 10) Let $e^* = (v_1, v_2)$ be the single merged edge in a weighted tree $T = (V, E)$, and let $\overline{V}_m = V - \{v_1, v_2\}$. Then, as long as every neighbouring edge of $e^*$ is either marked or unmarked, the error between any vertex $u \in \overline{V}_m$ and the vertices in $\{v_1, v_2\}$ is minimized.

Proof: This lemma is a direct result of Lemma 1 and Theorem 1. Let us fix some vertex $u \in T^L_2$ (see Figure 10-(b)); the error between $u$ and the endpoints of $e^*$, $v_1$ and $v_2$, can be formulated as $|\Delta E|' = |\epsilon_2| + |\epsilon_2 - w^*|$. Using Lemma 1, we have $|\Delta E|' \ge w^*$, and $|\Delta E|' = w^*$ for $0 \le \epsilon_2 \le w^*$. Therefore, when $(v_1, v_4)$ is either marked or unmarked, we have $\epsilon_2 \in \{0, w^*\}$, which satisfies the desired conditions. This analysis applies to all nodes $u \in \overline{V}_m$, and thus the lemma follows. □

In the remainder of this section, we therefore only focus on minimizing the error between all pairs of vertices $u_1, u_2 \in \overline{V}_m$, because by the definition of the marking problem (Definition 9), the conditions of Lemma 8 are automatically satisfied.
Figure 11: The figure used in Section 4.2 for formulating the marking error; the marked edges are highlighted in red. (a) Both edges on the left are marked. (b) Both edges on the right are marked. (c) A subset of edges from both sides is marked. We denote by $T^L_i$, $i \in \{1, 2\}$, and $T^R_j$, $j \in \{1, 2\}$, the subtrees rooted at the $i$-th edge to the left and the $j$-th edge to the right respectively. Moreover, $L_i = |\{v \mid v \in T^L_i\}|$ and $R_j = |\{v \mid v \in T^R_j\}|$ denote the number of vertices in each subtree.

Formulating the Error
Example 3 The error between $v_3$ and $v_5$ in Figure 11-(c) is $|(w_1 + w^*) + (w_3 + w^*) - (w_1 + w^* + w_3)| = w^*$, because in the original graph (Figure 10-(a)), $e^*$ appears only once on the unique path from $v_3$ to $v_5$, while in the modified graph (Figure 11-(c)), the weight of $e^*$ appears twice. The total amount of error between all pairs of vertices $u_1 \in T^L_1$ and $u_2 \in T^R_1$ is therefore $L_1 \times R_1 \times w^*$.

Example 4 In Figure 11-(c), the error between $v_5$ and $v_6$ is $|w_3 + w^* + w_4 - w_3 - w_4| = w^*$. The total amount of error between all pairs of vertices $u_1 \in T^R_1$ and $u_2 \in T^R_2$ is $R_1 \times R_2 \times w^*$.

Example 5 In Figure 11-(c), the total amount of error between all pairs of vertices $u_1 \in T^L_1$ and $u_2 \in T^L_2$ is $L_1 \times L_2 \times w^*$.

Example 6 In Figure 11-(a), the error between $v_3$ and $v_5$ is zero: the length of the unique path between $v_3$ and $v_5$ does not change compared with Figure 10-(a).
Observation 1 Between the vertices of two subtrees adjacent to the endpoints of $e^*$ (the vertices belonging to the subtrees rooted at those edges), there might exist some error. We classify this observation into the following cases:

1. Let $T^L_i$ and $T^L_j$ be the subtrees adjacent to two distinct marked edges on the left. Then, the total amount of error between all pairs of vertices $u_1 \in T^L_i$ and $u_2 \in T^L_j$ is $2 \times L_i \times L_j \times w^*$.

2. Let $T^R_i$ and $T^R_j$ be the subtrees adjacent to two marked edges on the right. Then, the total amount of error between all pairs of vertices $u_1 \in T^R_i$ and $u_2 \in T^R_j$ is $2 \times R_i \times R_j \times w^*$.

3. Let $T^L_i$ and $T^R_j$ be the subtrees adjacent to two marked edges on the left and right, respectively. Then, the total amount of error between all pairs of vertices $u_1 \in T^L_i$ and $u_2 \in T^R_j$ is $L_i \times R_j \times w^*$ (see Example 3).

4. Let $T^R_i$ and $T^R_j$ be the subtrees adjacent to a marked edge and an unmarked edge on the right, respectively. Then, the total amount of error between all pairs of vertices $u_1 \in T^R_i$ and $u_2 \in T^R_j$ is $R_i \times R_j \times w^*$ (see Example 4).

5. Let $T^L_i$ and $T^L_j$ be the subtrees adjacent to a marked edge and an unmarked edge on the left, respectively. Then, the total amount of error between all pairs of vertices $u_1 \in T^L_i$ and $u_2 \in T^L_j$ is $L_i \times L_j \times w^*$ (see Example 5).

6. Let $T^L_i$ be the subtree adjacent to a marked edge on the left, and $T^R_j$ be the subtree adjacent to an unmarked edge on the right. Then, the total amount of error between all pairs of vertices $u_1 \in T^L_i$ and $u_2 \in T^R_j$ is equal to zero (see Example 6).

7. Let $T^L_i$ be the subtree adjacent to an unmarked edge on the left, and $T^R_j$ be the subtree adjacent to a marked edge on the right. Then, the total amount of error between all pairs of vertices $u_1 \in T^L_i$ and $u_2 \in T^R_j$ is equal to zero.

Equal-Sized Subtrees
We now investigate a special case in which each subtree on the left has $n_L$ vertices and each subtree on the right has $n_R$ vertices, i.e., $L_i = n_L$ for $1 \le i \le L$, and $R_j = n_R$ for $1 \le j \le R$. Recall that every merged edge has two sides, left and right, one of which is designated as the preferable side. A given side is preferable if it produces a smaller amount of error when fully marked compared to its fully marked counterpart. For example, if the left side is preferable, we have:

$n_L^2 \, L(L-1) \, w^* \le n_R^2 \, R(R-1) \, w^*.$

The above inequality compares the error of the marking with the left side fully marked and the right side fully unmarked (Figure 11-(a)) against the opposite marking, with the right side fully marked and the left side fully unmarked (Figure 11-(b)). In the first marking, there exists no error between the left and the right sides (Observation 1, Case 6), but there are $\binom{L}{2}$ distinct pairs of marked edges on the left, each inducing an error of $n_L \times n_L \times 2w^*$ (Observation 1, Case 1). Therefore, the total amount of error for the first marking is equal to $\binom{L}{2} \times n_L \times n_L \times 2w^* = n_L^2 \, L(L-1) \, w^*$. The other marking can be analyzed analogously. Note that in the remainder of this section, we drop $w^*$ from each error term; each error term counts error units, where one unit equals $w^*$. Therefore, all quantities are implicitly multiplied by $w^*$ in the remainder of this section.
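Determining the preferable side is then a small arithmetic check. A minimal sketch (ours; the subtree sizes below are made-up examples, not taken from the paper's figures):

```python
def full_side_error(n_sub, num_edges):
    """Error count (in units of w*) of fully marking one side whose
    `num_edges` subtrees all have `n_sub` vertices, with the other side
    fully unmarked: C(num_edges, 2) marked pairs, each costing n_sub^2 * 2."""
    return n_sub * n_sub * num_edges * (num_edges - 1)

# three subtrees of size 2 on the left vs. two subtrees of size 3 on the right
left_cost = full_side_error(2, 3)     # 4 * 3 * 2 = 24 units
right_cost = full_side_error(3, 2)    # 9 * 2 * 1 = 18 units
preferable = "left" if left_cost <= right_cost else "right"
```

Here the right side is preferable (18 error units against 24).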
The following lemma states that, for a contracted edge $e^*$ whose subtrees on each side are equal-sized, the optimal solution is obtained by marking all edges on the preferable side of $e^*$ and leaving the other side completely unmarked.

Lemma 9 Given a merged edge $e^*$ (in a weighted tree) with two sides, left and right, such that the subtrees on each side have equal sizes, the optimal marking is obtained when one side (the preferable side) is fully marked and the other side is fully unmarked.

Proof: By contradiction. This lemma assumes each subtree on the left and right side has $n_L$ and $n_R$ vertices respectively, i.e., $L_i = n_L$ for $1 \le i \le L$ and $R_j = n_R$ for $1 \le j \le R$.
Without loss of generality, we assume the left side is preferable throughout this proof; therefore, we have $n_L^2 L(L-1) \le n_R^2 R(R-1)$. Let $i$ and $j$ denote the number of marked edges on the left and right, respectively. We define two functions: MARK LEFT, which marks one of the edges on the left, and UNMARK RIGHT, which unmarks one edge on the right. We will show that for all values $i < L$ or $j > 0$, one can achieve smaller error values by applying a series of MARK LEFT's and UNMARK RIGHT's, ending up at $i = L$ and $j = 0$, as desired. For a function $f \in F = \{\text{MARK LEFT}, \text{UNMARK RIGHT}\}$, we define $\Delta(f)$ as the amount of change in the error value after applying $f$ to the tree. Since we are interested in decreasing the error value using the functions in $F$, in this proof we look for conditions under which $\Delta(\text{MARK LEFT}) \le 0$ and $\Delta(\text{UNMARK RIGHT}) \le 0$.

We begin by investigating MARK LEFT. Note that this function sets $i \leftarrow i + 1$ and $j \leftarrow j$. We observe the following:

1. Because we are marking a new edge, the total amount of error between the marked edges on the left changes by $+2i\, n_L^2$.

2. The total amount of error between the unmarked edges and the marked ones on the left changes by $(L - 2i - 1)\, n_L^2$.

3. The total amount of error between the marked edges on the left and right changes by $+j\, n_L n_R$.

4. The total amount of error between the unmarked edges on the left and right changes by $-(R - j)\, n_L n_R$.

Therefore, $\Delta(\text{MARK LEFT}) = n_L^2 (L - 1) + n_L n_R (2j - R)$. Since we are looking for conditions under which $\Delta(\text{MARK LEFT}) \le 0$, rearranging the terms, we have

$n_L (L - 1) \le n_R (R - 2j). \quad (23)$

A similar reasoning can be used for UNMARK RIGHT. This function sets $i \leftarrow i$ and $j \leftarrow j - 1$. We have:

1. The total amount of error between the marked edges on the right changes by $-2(j - 1)\, n_R^2$.

2. The total amount of error between the unmarked edges and the marked ones on the right changes by $(2j - R - 1)\, n_R^2$.

3. The total amount of error between the marked edges on the left and right changes by $-i\, n_L n_R$.

4. The total amount of error between the unmarked edges on the left and right changes by $+(L - i)\, n_L n_R$.

Thus, $\Delta(\text{UNMARK RIGHT}) = n_R^2 (1 - R) + n_L n_R (L - 2i)$, and rearranging the terms, we have

$n_L (L - 2i) \le n_R (R - 1). \quad (24)$

We conclude the proof by stating that whenever $i < L$ or $j > 0$, one can achieve smaller error values by applying a series of MARK LEFT's and UNMARK RIGHT's and ending up at $i = L$ and $j = 0$. If few enough edges are marked on the right, Eq. (23) is satisfied; therefore, we repeatedly apply MARK LEFT until $i = L$, at which point Eq. (24) is satisfied, and we repeatedly apply UNMARK RIGHT until $j = 0$, as desired. Now suppose enough edges are marked on the right side that Eq. (24) is satisfied; this allows us to repeatedly apply UNMARK RIGHT until $j = 0$, at which point Eq. (23) is satisfied and we repeatedly apply MARK LEFT until $i = L$, as desired.

In the remaining boundary case, both Eq. (23) and Eq. (24) are unsatisfied and $\Delta(\text{UNMARK RIGHT}) = 0$. Therefore, we set $j \leftarrow 0$ without changing the error (since $\Delta(\text{UNMARK RIGHT}) = 0$), and then we apply MARK LEFT's until $i = L$, as desired; since $\Delta(\text{MARK LEFT}) \le 0$ at every step, this sequence of MARK LEFT's and UNMARK RIGHT's results in an error value no worse than that of the original marking, and we arrive at $i = L$ and $j = 0$ while obtaining a smaller error value. □

In the next section, we generalize Lemma 9 to the case in which different subtrees can have varying sizes.

Figure 12: An example of tree compression in which the edges on each side have different-sized subtrees: (a) a full marking of the edges on the left, with error count 32; (b) a full marking of the edges on the right, with error count 40; (c) a marking with edges marked on both sides, with error count 75; (d) the optimal solution, with error count 27. The optimal solution does not have a full marking on any side. However, the optimal solution only has marked edges on one side (this is always the case, as shown in Lemma 10).
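Both regimes can be checked exhaustively on small instances. The sketch below (ours; the subtree sizes are made-up, not those of Figure 12) enumerates every marking using the error units of Observation 1 and confirms that with equal-sized subtrees an optimal marking fully marks one side, while with varying sizes the optimum still never marks edges on both sides:

```python
from itertools import product

def error_units(subtrees, marks):
    """subtrees: list of (side, size); marks: 0/1 per subtree.
    Error count in units of w* over all vertex pairs (Observation 1)."""
    total = 0
    for a in range(len(subtrees)):
        for b in range(a + 1, len(subtrees)):
            (sa, na), (sb, nb) = subtrees[a], subtrees[b]
            lost = 1 if sa != sb else 0
            total += na * nb * abs(marks[a] + marks[b] - lost)
    return total

def best_markings(left_sizes, right_sizes):
    """Brute-force the optimal marking(s) over all 2^(L+R) subsets."""
    subtrees = [("L", n) for n in left_sizes] + [("R", n) for n in right_sizes]
    scored = [(error_units(subtrees, m), m)
              for m in product([0, 1], repeat=len(subtrees))]
    best = min(e for e, _ in scored)
    return best, [m for e, m in scored if e == best]

# equal-sized subtrees: the optimum fully marks one side (Lemma 9)
eq_best, eq_opt = best_markings([2, 2, 2], [3, 3])
```

For the equal-sized instance the optimum (18 units) is exactly the fully marked right side; for the varying-size instances we tried, no optimal marking uses both sides at once, in line with Lemma 10.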

Varying-Size Subtrees
As a generalization of Section 4.3, now assume the $i$-th subtree on the left ($1 \le i \le L$) has $L_i$ nodes, and the $j$-th subtree on the right ($1 \le j \le R$) is of size $R_j$. We observe that when each side has subtrees of different sizes, marking all edges on one side does not necessarily produce the optimal error. An example is depicted in Figure 12, where marking only one edge on the right produces the optimal amount of error. Although marking all edges on one side does not necessarily produce the optimal error, we observe that no optimal solution has markings on both sides, as the following lemma states. Similar to Section 4.3, we remove $w^*$ from all calculations and expressions in this section; all calculations are implicitly multiplied by $w^*$.

Lemma 10 Given a merged edge $e^*$ (in a weighted tree) with two sides, left and right, no optimal marking has marked edges on both sides.

Proof: By contradiction. We assume there exists such an optimal marking, and we strictly improve its error by unmarking everything on one of the two sides (thus obtaining a contradiction).

Partial Markings
In Lemma 10, we observed that no optimal marking has edges marked on both sides. In this section, we introduce the concept of partial markings, used to form an optimal marking after merging a given edge $e^*$. A partial left (respectively, right) marking, denoted by $M_L$ (respectively, $M_R$), is a marking with all edges on the right (respectively, left) unmarked, and a subset of the edges on the left (respectively, right) marked. We call a partial marking optimal if its error count is no greater than that of any other partial marking for its respective side. Let $M^*_L$ and $M^*_R$ denote the optimal partial left and right markings, respectively. The following lemma is easy to prove.

Lemma 11 After merging edge $e^*$ in a weighted tree with non-negative weights, the optimal marking $M^*$ is either $M^*_L$ or $M^*_R$, depending on which one produces a smaller amount of error.

Proof: Immediate from Lemma 10. □

Applying the results of Lemma 11, we can find an optimal marking $M^*$ by finding the optimal partial markings $M^*_L$ and $M^*_R$, comparing their respective error values, and choosing the one with the smaller error value as the optimal marking. The question of how to find the optimal partial markings is answered in the following lemma.

Lemma 12

The optimal partial marking $M^*_L$ consists of all edges $e_i$ (adjacent to $v_1$) satisfying

$S_L - S_R \le L_i, \quad (29)$

where $S_L = \sum_{i=1}^{L} L_i$ and $S_R = \sum_{j=1}^{R} R_j$. Similarly, the optimal partial marking $M^*_R$ consists of all edges $e_j$ (adjacent to $v_2$) satisfying

$S_R - S_L \le R_j. \quad (30)$

Proof: We first prove Eq. (29) and derive Eq. (30) by symmetry. Suppose we are trying to construct an optimal partial marking for the left side. We can do so by keeping the right side unmarked and marking edges on the left until the error can no longer be improved. Recalling Eq. (27) from the proof of Lemma 10, when marking an edge $e_i$ (with cardinality $L_i$), the change in the error value at each step is equal to:

$\Delta(\text{MARK LEFT}) = L_i \,(S_{LM} + S_{LU} + S_{RM} - S_{RU}),$

where $S_{LM}$ and $S_{LU}$ denote the total sizes of the subtrees adjacent to the other marked and unmarked edges on the left, and $S_{RM}$ and $S_{RU}$ are defined analogously for the right side. At each step, to get an improvement, we must have $\Delta(\text{MARK LEFT}) \le 0$. However, since we are calculating a partial marking for the left side, we know by definition that the right side has to remain fully unmarked at all times, so we have $S_{RU} = S_R$ and $S_{RM} = 0$. Inserting these values into the above equation, we get the following inequality for edges that improve the partial left marking:

$S_L - L_i - S_R \le 0.$

Then, we can deduce that if an edge $e_i$ on the left satisfies $S_L - S_R \le L_i$, it must be marked in $M^*_L$. Conversely, if an edge $e_i$ on the left is marked in $M^*_L$, it must satisfy $S_L - S_R \le L_i$. To see why, assume $M^*_L$ includes an edge $e_i$ with $S_L - S_R > L_i$. Then, we can improve $M^*_L$ by unmarking $e_i$ (see Eq. (26)), which contradicts the optimality of $M^*_L$ and proves Eq. (29). The other inequality (Eq. (30)) can be proven analogously by applying Eq. (28). □

We present our linear-time algorithm for finding the optimal marking after merging an edge $e^*$ in Algorithm 2:

Input: $T = (V, E)$ (a tree with $n$ vertices), an edge $e^* = (u, v)$ to be merged, the error function $E(\cdot)$
Output: A marking of edges $M^*$ with the optimal amount of error
1: Remove $e^*$ from $T$ and merge its endpoints
2: Compute $S_L$, $S_R$, and the cardinalities $L_i$ and $R_j$
3: $M^*_L \leftarrow \emptyset$; $M^*_R \leftarrow \emptyset$
4: for each $e_i \in E_L$ do
5:   if $S_L - S_R \le L_i$ then $M^*_L \leftarrow M^*_L \cup \{e_i\}$ end if
6: end for
7: for each $e_j \in E_R$ do
8:   if $S_R - S_L \le R_j$ then $M^*_R \leftarrow M^*_R \cup \{e_j\}$ end if
9: end for
10: $M^* \leftarrow \arg\min_{M \in \{M^*_L, M^*_R\}} E(M)$
11: Return $M^*$
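Lemma 12 reduces the algorithm to a pair of threshold scans. A minimal sketch of Algorithm 2's logic on subtree sizes alone (our code, with the partial-marking error written out from the cases of Observation 1; the sample sizes are invented):

```python
def partial_error(marked, unmarked, other_total):
    """Error count (units of w*) of a partial marking: `marked` and
    `unmarked` are subtree sizes on the marked side; the other side is
    fully unmarked and has `other_total` vertices in total."""
    m, u = sum(marked), sum(unmarked)
    mm = m * m - sum(s * s for s in marked)   # 2 units per marked pair
    return mm + m * u + u * other_total

def optimal_marking(L_sizes, R_sizes):
    """Mark, on one side, every subtree at least as large as the difference
    of the side totals (Eq. (29)/(30)); return the cheaper side."""
    S_L, S_R = sum(L_sizes), sum(R_sizes)
    ML = [s for s in L_sizes if s >= S_L - S_R]
    MR = [s for s in R_sizes if s >= S_R - S_L]
    eL = partial_error(ML, [s for s in L_sizes if s < S_L - S_R], S_R)
    eR = partial_error(MR, [s for s in R_sizes if s < S_R - S_L], S_L)
    return ("left", ML, eL) if eL <= eR else ("right", MR, eR)

side, marked_sizes, err = optimal_marking([5, 1, 1], [3, 2])
```

On this instance the right side wins: marking both right subtrees costs 12 units, while the best left partial marking (only the size-5 subtree) costs 20.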

Example 7
As an example, let us demonstrate how Algorithm 2 finds the optimal marking for the tree of Figure 12. The optimal partial marking $M^*_L$ consists of all edges on the left, because every edge on the left satisfies Eq. (29). On the other hand, the optimal partial marking $M^*_R$ consists of only one edge on the right, because only that edge satisfies Eq. (30). Moreover, because $M^*_R$ has a better error count than $M^*_L$, Algorithm 2 returns $M^*_R$ as the overall optimal marking $M^*$, which is the correct answer, as depicted in Figure 12. We now summarize our result in the following theorem.

Fractional Markings
In the previous section, we studied the marking problem under the assumption that each edge can either be fully marked or fully unmarked. In this section, we study a generalized version of the marking problem, called the fractional marking problem (to be defined momentarily). We show that Algorithm 2 does not err by assuming that each edge is either fully marked or fully unmarked.

Definition 13 With reference to a given merged edge $e^*$ in a graph $G = (V, E)$ with associated weight function $w : E \to \mathbb{R}_{\ge 0}$, and a new weight redistribution function $w' : E \to \mathbb{R}_{\ge 0}$, an edge $e_i$ is said to be fractionally marked if $w'(e_i) = w(e_i) + c_i\, w(e^*)$ for some $c_i \in (0, 1)$. The edge $e_i$ is fully marked if $c_i = 1$.

Each neighbouring edge $e_i$ thus has an assigned $c_i$, which denotes the (possibly fractional) amount by which it is marked. An edge $e_i$ is marked by $\epsilon$ if its corresponding $c_i$ is set to $c'_i = c_i + \epsilon$, and it is unmarked by $\epsilon$ if its corresponding $c_i$ is set to $c'_i = c_i - \epsilon$.

Definition 14 The Fractional Marking Problem for Weighted Trees: Given a contracted edge $e^*$ in a weighted tree $T$ with non-negative weights, what subset of the neighbouring edges of $e^*$ should we fully mark or fractionally mark such that the error value of Eq. (1) is minimized over all such possible subsets?

Similar to the previous section, we may omit some occurrences of $w^*$ from our calculations for convenience. We borrow our previous running example (Figure 10 and Figure 11) and extend it to present an example of fractional markings in Figure 14. Figure 14-(a) depicts the tree of Figure 10 with two edges fractionally marked. Figure 14-(b) illustrates a succinct representation of Figure 14-(a), where each fractionally marked edge $e_i$ is shown using its respective $c_i$ (Definition 13) and the weights of the unmarked edges are omitted. We use this succinct version often in the remainder of this section.
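The error of a fractional marking has the same pairwise form as before, with $c_a + c_b$ replacing the 0/1 mark counts. The sketch below (ours; the sizes and grid resolution are arbitrary choices) searches a grid of fractional markings and finds nothing better than the integral optimum of Algorithm 2, in line with the result of this section:

```python
from itertools import product

def frac_error(sizes, sides, cs):
    """Error (units of w*) of a fractional marking: subtree i has sizes[i]
    vertices, lies on side sides[i], and is marked by cs[i] in [0, 1]."""
    err = 0.0
    for a in range(len(sizes)):
        for b in range(a + 1, len(sizes)):
            lost = 1 if sides[a] != sides[b] else 0
            err += sizes[a] * sizes[b] * abs(cs[a] + cs[b] - lost)
    return err

sizes = [5, 1, 1, 3, 2]
sides = ["L", "L", "L", "R", "R"]
grid = [i / 4 for i in range(5)]               # c in {0, 0.25, ..., 1}
frac_best = min(frac_error(sizes, sides, cs)
                for cs in product(grid, repeat=len(sizes)))
```

Here `frac_best` equals 12.0, the error of the integral marking that fully marks the right side; on this instance the fractional relaxation buys nothing.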
As a warm-up, we first present a property of any optimal marking that has at least one fractionally marked edge.

Lemma 13 Let $M$ be an optimal marking for a contracted edge $e^*$ (in a weighted tree) that has at least one fractionally marked edge $e'$. Then, $M$ necessarily has marked edges on both sides.

Proof: By contradiction. Suppose $M$ is an optimal marking with a fractionally marked edge $e'$, and suppose $M$ is a partial left or right marking (Section 4.4.1) with marked edges only on the left or right, respectively. Without loss of generality, assume $M$ is a partial left marking with a fractionally marked edge $e' = e_1$. As depicted in Figure 15-(a), assume $e_1$ has cardinality $L_1$ and marking value $c_1$ (Definition 13). We can obtain another marking $M'$ by unmarking $e_1$ (Figure 15-(b)). Let $E$ be the error function; then $E(M) = E(M') + \Delta_1(\text{MARK LEFT})$, and $E(M) < E(M')$ because $M$ is an optimal marking. Therefore, $\Delta_1(\text{MARK LEFT}) < 0$ when marking $e_1$ back in $M'$. We now formulate $\Delta_1(\text{MARK LEFT})$ when marking $e_1$ in $M'$ by $c_1$.
Figure 16: Case 1 in the proof of Lemma 14: (a) there exist two edges $e_1$ and $e_2$ satisfying the properties of this subcase.

Given the properties of this subcase, we may get another marking $M'$ by unmarking $e_i$ (setting $c'_i = 0$) and then marking it by $c_j$, obtaining a third marking $M''$ with $E(M'') \le E(M)$. Similar to the proof of Lemma 13, the change amounts to $+(c_j - c_i) \times X$.

Case 2-2: For all marked edges $e_i$ (with $c_i > 0$) on the left, $c_i = \epsilon_1$, and for all marked edges $e_j$ (with $c_j > 0$) on the right, $c_j = \epsilon_2$, where $\epsilon_1 + \epsilon_2 \le 1$ (Figure 17-(b)).

For this case, we simply show that the error associated with the optimal partial marking (Lemma 12) is a lower bound on $E(M)$, that is, $\min(E(M^*_R), E(M^*_L)) \le E(M)$. Without loss of generality, consider a marked edge $e_i$ on the left. Unmarking $e_i$ by $\epsilon$ decreases the error between the vertices of $e_i$ and the vertices of all other edges on the left by $\epsilon \times w^*$ per pair; therefore, unmarking $e_i$ by $\epsilon$ changes $E(M)$ by some amount $\Delta_1(\text{UNMARK LEFT})$. Because $M$ is an optimal marking, $\Delta_1(\text{UNMARK LEFT}) \ge 0$, so any marked edge $e_i$ on the left must satisfy $L_i \ge S_L - S_R$. Conversely, we may assume that any edge $e_i$ on the left satisfying $L_i \ge S_L - S_R$ is marked in $M$, because otherwise we could improve $M$ by marking $e_i$. Similar reasoning can be applied to any marked edge $e_i$ on the right.

Otherwise, we repeatedly apply the construction method of Case 1 until $M$ satisfies the conditions of Case 2-1. We then repeatedly apply the construction method of Case 2-1 until $M$ satisfies the conditions of Case 2-2. Finally, if $M$ satisfies the conditions of Case 2-2, we have already shown that $E(M)$ is lower-bounded by the optimal partial marking of Lemma 12, which has no fractionally marked edges. □

Conclusion and Open Problems
In this paper, we studied the problem of distance-preserving graph compression for weighted paths and trees. We first presented a brief literature review of related work in this domain, noting that one particular aspect of the problem is understudied. More specifically, there has been little attention in the literature to the problem of optimally compressing a given set of edges. To address this, we presented optimal algorithms for compressing any set of k edges in a weighted path and for optimally compressing a single edge in a weighted tree. We tackled the problems in increasing order of difficulty. For weighted paths, we first solved the problem of optimally compressing a single edge, then generalized it to any set of k independent edges. Finally, we provided an optimal approach to compressing any contiguous subset of edges in a weighted path. We then generalized our scope to weighted trees, where we studied the problem of optimally compressing a single edge. To this end, we first studied the easier case in which the subtrees on both sides of the merged edge have equal sizes, and then generalized our results to the case in which the subtrees have different sizes. This research leads to several questions that require further exploration.
Problem 1 How can we optimally contract multiple edges in a tree?
This problem includes the cases where the compressed edges form a contiguous subtree, where they form a matching, or where a combination of both occurs. When multiple edges are to be contracted in a tree, many cases need to be considered in order to properly formulate the error function. Therefore, it may be worthwhile to investigate whether formulating the error when multiple edges are contracted in a tree yields any interesting observations from a combinatorial optimization point of view, like the ones mentioned in Section 3 or Section 4.2.
There are two main reasons why the problem of contracting multiple edges in a tree is not as straightforward as its counterpart in a path. First, the maximum degree in a tree is unbounded, whereas in a path it is two. Second (a direct consequence of the first), a node in an arbitrary tree can have many children. When an edge e * with weight w * is contracted in a path, there are only two groups of shortest paths that should not have w * added to their values: those that lie to the left of e * and those that lie to its right. However, as observed in Section 4.2, even for merging a single edge e * in a tree, there are significantly more cases to consider. With no restriction on the maximum degree of an arbitrary tree, any error-unit enumeration technique (such as the ones employed in Section 3 or Section 4.2) could quickly become intractable due to an explosion in the number of cases when many edges are contracted.
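As a concrete illustration of the two-sided structure on paths, the error of contracting a single edge can be evaluated by brute force. The sketch below (the path weights and the redistribution choices are illustrative, not taken from the paper) computes the error function of Section 2 for a small path:

```python
# Sketch (not the paper's algorithm): brute-force evaluation of the error
# function for contracting a single edge e* in a weighted path.
from itertools import combinations

def path_distances(weights):
    """All-pairs shortest-path lengths on a path v_0 - v_1 - ... - v_n,
    where weights[i] is the weight of edge (v_i, v_{i+1})."""
    prefix = [0.0]
    for w in weights:
        prefix.append(prefix[-1] + w)
    n = len(weights) + 1
    return {(i, j): prefix[j] - prefix[i] for i, j in combinations(range(n), 2)}

def contraction_error(weights, star, new_weights):
    """Error of contracting edge index `star`, replacing the remaining
    edge weights by `new_weights` (one entry fewer than `weights`)."""
    before = path_distances(weights)
    after = path_distances(new_weights)

    def merge(i):
        # vertices star and star+1 collapse into one supernode
        return i if i <= star else i - 1

    err = 0.0
    for (i, j), d in before.items():
        mi, mj = merge(i), merge(j)
        if mi == mj:            # both endpoints fell into the supernode
            continue
        err += abs(after[(mi, mj)] - d)
    return err

# Path v0-v1-v2-v3 with edge weights A=2, B=5 (contracted), C=3.
weights = [2.0, 5.0, 3.0]
unmarked = [2.0, 3.0]           # leave both neighbours unchanged
marked_left = [7.0, 3.0]        # add w* = 5 to the left neighbour
print(contraction_error(weights, 1, unmarked))      # 15.0
print(contraction_error(weights, 1, marked_left))   # 10.0
```

With all edges unmarked, every pair separated by e * loses w * from its distance; marking the left neighbour trades error between the two sides. It is exactly this left/right case analysis that stops scaling once nodes may have many children.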
Problem 2 Can we solve the distance-preserving graph compression problem for general graphs in polynomial time?
The above problem would be a natural extension of this paper. The complexity of the weight redistribution problem for general graphs is still unknown. However, the related problem of finding the contracted edges is unlikely to be solvable in polynomial time: Bernstein et al. [5] showed that CONTRACTION (defined in Section 1) is NP-hard even if the underlying graph is just a weighted cycle. In a graph with cycles, some vertices are connected via multiple paths. Therefore, after merging a single edge, several shortest paths that traverse that edge may need to be rerouted through completely different edges, making the analysis much more difficult.
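The rerouting phenomenon can be seen on a small example. In this sketch (the 4-cycle and its weights are illustrative assumptions, not from the paper), adding the contracted weight w * to a neighbouring edge fails to restore distances because shortest paths simply go around the heavier edge:

```python
# Sketch: why cycles are harder (Problem 2). On a path, adding the contracted
# weight w* to a neighbouring edge lengthens every shortest path through it by
# exactly w*; on a cycle, those paths may reroute around the heavier edge.

def all_pairs(n, edges):
    """Floyd-Warshall on an undirected weighted graph given as {(u, v): w}."""
    INF = float("inf")
    d = [[0.0 if i == j else INF for j in range(n)] for i in range(n)]
    for (u, v), w in edges.items():
        d[u][v] = d[v][u] = min(d[u][v], w)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                d[i][j] = min(d[i][j], d[i][k] + d[k][j])
    return d

# Cycle v0-v1-v2-v3-v0; contract e* = (v1, v2) with w* = 10 into supernode s.
before = all_pairs(4, {(0, 1): 1.0, (1, 2): 10.0, (2, 3): 1.0, (3, 0): 1.0})
# Mark the left neighbour, i.e. set w'(v0, s) = 1 + w* = 11:
after = all_pairs(3, {(0, 1): 11.0, (1, 2): 1.0, (2, 0): 1.0})
# Before: even d(v1, v2) ignores e* and goes around the cycle (1+1+1 = 3).
# After: d(v0, s) is 2 (via v3), not 11 -- the shortest path rerouted, so the
# marked weight never reaches the pairs it was meant to compensate.
print(before[1][2], after[0][1])   # 3.0 2.0
```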
Problem 3 Recall from Definition 7 that with reference to a set of merged edges E m ⊂ E, the set of merged vertices V m consists of all vertices that are an endpoint of at least one edge in E m , or V m = {v ∈ V | ∃u ∈ V, (u, v) ∈ E m }. How could we find an optimal redistribution strategy that also minimizes the error between all pairs of vertices in V m ? Note that even if some weight redistribution minimized the error between two nodes in different supernodes, it would still be non-trivial to do the same for two vertices placed in a single supernode. A trivial solution would be to store the shortest-path weights between the vertices of each supernode as separate table entries. However, such an approach would defeat the whole purpose of graph compression, which is to reduce memory requirements.
Problem 4 For the optimal weight redistribution problem, are there any better cost models (error functions)?
As stated in Section 2, in this paper we defined the error function as the sum of the absolute differences of the shortest path lengths between different pairs of nodes before and after redistributing the weights. However, exploring alternative cost functions that better capture the distance-based similarity between a modified graph and its original version could open up valuable research avenues.
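As a toy illustration of why the choice of cost model matters, the following sketch (the path weights and both redistributions are illustrative assumptions) compares two redistributions that the sum-of-absolute-differences error cannot distinguish, while a worst-case (maximum per-pair) error can:

```python
# Sketch for Problem 4: two redistributions with the same L1 error can differ
# sharply under other cost models, e.g. the maximum per-pair error.
from itertools import combinations

def errors(before_w, after_w, star):
    """Per-pair distance errors after contracting edge `star` of a path."""
    def prefix(ws):
        pre = [0.0]
        for w in ws:
            pre.append(pre[-1] + w)
        return pre
    b, a = prefix(before_w), prefix(after_w)
    merge = lambda i: i if i <= star else i - 1
    out = []
    for i, j in combinations(range(len(before_w) + 1), 2):
        if merge(i) != merge(j):  # skip pairs inside the supernode
            out.append(abs((a[merge(j)] - a[merge(i)]) - (b[j] - b[i])))
    return out

# Path v0-v1-v2-v3 with weights A=2, B=5 (contracted), C=3.
mark_left = errors([2, 5, 3], [7.0, 3.0], 1)    # w'(left) = A + B
split = errors([2, 5, 3], [4.5, 5.5], 1)        # spread B evenly on both sides

print(sum(mark_left), max(mark_left))   # 10.0 5.0
print(sum(split), max(split))           # 10.0 2.5
```

Both redistributions have total (L1) error 10, but splitting the contracted weight halves the worst-case per-pair error, so a max-based cost function would prefer it.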

Definition 5 Let G = (V, E) be a graph with weight function w : E → R ≥0 , and let e * ∈ E be the merged edge. A weight redistribution is a new weight function w ′ : E → R ≥0 . In Figure 1-(b), {u 1 , v 1 } is a supernode with cardinality 2, and the weight redistribution sets the edge weights of Figure 1-(a) as w ′ (e) = w(e) + w(e * ) if e = e 3 , and w ′ (e) = w(e) otherwise.

Definition 6 With reference to a given merged edge e * in a graph G = (V, E) with weight function w : E → R ≥0 and a weight redistribution function w ′ : E → R ≥0 , an edge e i is said to be marked if w ′ (e i ) = w(e i ) + w(e * ), unmarked if w ′ (e i ) = w(e i ), and altered otherwise. As shown in Figure 1-(b), e 3 is marked and all other edges are unmarked.

Definition 7 With reference to a set of merged edges E m ⊂ E, the set of merged vertices V m consists of all vertices that are an endpoint of at least one edge in E m , or V m = {v ∈ V | ∃u ∈ V, (u, v) ∈ E m }.

Figure 2: Merging a single edge e * = (v 2 , v 3 ) with weight B in a path of n vertices, with n L ≥ 0 and n R ≥ 0 vertices to the left of v 1 and to the right of v 4 , respectively (n L + n R = n − 2). (a) The original graph before merging v 2 and v 3 into a supernode; the neighbouring edges of e * have weights A and C. (b) The modified path after merging v 2 and v 3 into a supernode.
The only affected portion of such a shortest path is the subpath between v 1 and v 4 , whose value changes from A + B + C (Figure 2-(a)) to x + y (Figure 2-(b)). Summing over all such pairs yields the corresponding error term. Similarly, the shortest path between v 1 and the supernode changes from A (Figure 2-(a)) to x (Figure 2-(b)), inducing an error of |x − A|.

Between v 4 and the merged vertices v 2 and v 3 , there is an error of |y − C| + |y − B − C|. According to Lemma 1, |y − C| + |y − B − C| is minimized as long as C ≤ y ≤ B + C, which is the case if all edges are unmarked, i.e., y = C. We can use a similar argument if e * has neither a right nor a left neighbour. □
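The minimization claim attributed to Lemma 1 can be sanity-checked numerically; a minimal sketch with illustrative values B = 5 and C = 3 (not taken from the paper):

```python
# Sketch: |y - C| + |y - B - C| as a function of y, with B = 5, C = 3.
# By the triangle-inequality argument of Lemma 1, the sum equals B for every
# y in [C, B + C] and grows linearly outside that interval.
B, C = 5.0, 3.0
f = lambda y: abs(y - C) + abs(y - B - C)
print(f(3.0), f(8.0), f(5.5))   # endpoints and an interior point: all 5.0
print(f(2.0), f(9.0))           # outside the interval: 7.0 and 7.0
```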

(b) An arbitrary weight redistribution which is transformed in the proof of Lemma 2 to the redistribution of Figure 4-(b).

Figure 3: The figure used in the proof of Lemma 2. The vertices of V L and V R are depicted in red and blue, respectively.

Figure 4: The construction method of Lemma 2.

Figure 5: Case 2 in the proof of Lemma 2. (a) The original graph. (b) The original weight redistribution. (c) After applying the construction method of Lemma 2. The affected shortest path of Case 2 is highlighted in red.
and v n1+1 changes by |w * | − |Σ_{k=i}^{n1} ϵ_k |, because in the original redistribution the error is equal to |Σ_{k=i}^{n1} ϵ_k | (Eq. (3)), while in the new redistribution it is equal to |w * |. This change might lead to an increase in error; however, by using Corollary 1 and setting z = w * and x = Σ_{k=i}^{n1} ϵ_k , we have:

Figure 6: (a) An example of merging an entire subpath P ′ ⊂ P with four edges, (b) A suboptimal solution generated by Algorithm 1, and (c) The optimal solution.

Figure 7: An example of merging two supernodes with cardinalities k and k ′ .
If x and y denote the new weights of the edges adjacent to (v, u), we have:
|∆E| = n L × |x − A| × k (between the subpath of w 1 and the vertices in v)
+ n L × |x − A − B| × k ′ (between the subpath of w 1 and the vertices in u)
+ n R × |y − C| × k ′ (between the subpath of w 2 and the vertices in u)
+ n R × |y − B − C| × k (between the subpath of w 2 and the vertices in v)
+ n L × n R × |x + y − A − B − C| (between the subpaths of w 1 and w 2 )

Figure 8: The figure used in the proofs of Section 3.3. (a) Before merging an entire subpath P ′ ⊂ P with k edges, (b) After the merge.

Figure 9: The figure used in the proof of Lemma 7. (a) The original graph; the vertices and edges in V m and E m are depicted in red, and the remaining vertices are depicted in blue. (b) An arbitrary weight redistribution which assigns w ′ (e i ) = w(e i ) + ϵ i to every edge e i ∈ E \ E m . (c) Another weight redistribution that marks only the left neighbouring edge of each edge in E m , whose associated error is no worse than the one depicted in (b).

Figure 10: The figure used in Section 4.1 for defining the marking problem. We denote by T L i , i ∈ {1, 2}, and T R j , j ∈ {1, 2}, the subtrees rooted at the i-th edge to the left and the j-th edge to the right, respectively. Moreover, L i = |{v | v ∈ T L i }| and R j = |{v | v ∈ T R j }| denote the number of vertices in each subtree.

This section formally explains how marking a set of edges affects the error function. Using Figure 11, we first present some examples, which we generalize later in Observation 1. Throughout this section, we may sometimes refer to this error as units of error, where each unit is equal to w * .

Example 2 The error between v 3 and v 4 in Figure 11-(a) is equal to |w 1 + w * + w 2 + w * − w 1 − w 2 | = 2w * . In the original graph (Figure 10-(a)), e * does not appear on the unique path between v 3 and v 4 , while in the modified graph (Figure 11-(a)), the weight of e * appears twice. In the marking of Figure 11-(a), the total amount of error between all pairs of vertices u 1

Figure 13: The example tree used in the proof of Lemma 10. (a) An arbitrary marking in which edges from both sides are marked, and more vertices are connected to the unmarked edges on the right (S RU ≥ S LU ). (b) A strictly better marking than (a), in which the heavier side is fully unmarked, as described in Lemma 10.

4: Find all edges E L = {(u, w) | (u, w) ∈ E, w ≠ v}
5: L ← |E L |, such that each edge e i in E L is connected to a subtree of size L i for all i ∈ {1, . . . , L}

Theorem 4 Algorithm 2 computes the optimal marking for a merged edge e * in O(|V |) time.
Proof: Immediate from Lemma 10, Lemma 11, and Lemma 12. □

Figure 14: An extension of Figure 10 and Figure 11 as an example of fractional markings. (a) One edge from the left side and one from the right are fractionally marked. (b) A succinct representation of (a), in which each fractionally marked edge e i is shown using its respective c i (Definition 13).

Figure 17: Case 2 in the proof of Lemma 14: (a) Case 2-1: for all edges e 1 , e 2 on opposite sides, c 1 + c 2 ≤ 1, and there exist two edges e i , e j on one side with c i ≠ c j (for instance, 0.4 and 0.1 on the left). (b) Case 2-2: for all edges e 1 , e 2 on opposite sides, c 1 + c 2 ≤ 1; for all edges e i on the left, c i = ϵ 1 = 0.4, and for all edges e j on the right, c j = ϵ 2 = 0.5.