Effective failure recovery schemes for survivable networks
MetadataShow full item record
As we rely more and more on information exchanged through networks, avoiding prolonged disruptions to information exchange due to unexpected failures of network equipment becomes increasingly important. In this work, we focus on an important traffic engineering problem, which is dynamic establishment and release of bandwidth guaranteed survivable connections. The major challenges are how to allocate minimal amount of spare resources using fast algorithms, and in case a failure occurs, be able to quickly recover from it. We introduce a novel framework called Distributed Partial Information Management. It addresses several major challenges in achieving efficient shared path protection under distributed control with only partial information. We also design a novel, ultra-fast heuristic algorithm to address an NP-hard optimization problem, routing under above framework. It uses a so-called Potential Backup Cost (PBC) function when selecting an active path in the first phase, in order to take into consideration the backup bandwidth needed by the corresponding backup path which is yet to be chosen in the second phase. Since the traffic is carried on the active (working) path most of the time, it is useful to find a shortest possible active path. We address the Min-Min problem of finding a disjoint path pair such that the length of the shorter path (i.e. active path) is minimized. We prove that this problem is NP-complete in either single link cost or dual link cost networks. And in addition, it is NP-hard to obtain a k -approximation to the optimal solution for any k > 1. We design effective heuristic for the Min-Min problem and extend it to the case of shared path protection. In addition, we propose shared segment protection algorithm which make little or no compromise between high bandwidth efficiency and quick recovery. We develop an elegant Integer Linear Programming (ILP) model to determine an optimal set of segments to protect a given active path. We also design a fast heuristic algorithm based on dynamic programming to obtain a near-optimal set of segments and achieve the same bandwidth efficiency as some best-performing shared path protection schemes. We also address the survivability of the network with Shared Risk Link Group (SRLG) by focusing on two major open issues related to SRLG: avoiding failures in path determination caused by "traps", and maximizing bandwidth sharing. We propose an innovative trap avoidance heuristic that requires much less running time than ILP and existing methods and yet can effectively avoid almost all the avoidable traps and achieve the same bandwidth efficiency as an ILP based approach. Finally, we study shared path protection schemes that use different policies to deal with different dual-link failure scenarios in order to minimize total bandwidth consumption under deterministic or dynamic routing. We find that shared path protection with a policy can considerably reduce bandwidth consumption compared with shared path protection without a policy.