The full algorithm combines the two steps. First, we find a decycling set in a complete de Bruijn graph of order k and remove it from the graph, obtaining a DAG. Then, we repeatedly remove a vertex v with the largest hitting number T(v, ℓ) until there are no ℓ-long paths, recomputing T(u, ℓ) for all remaining vertices u after each removal. This is summarized below (Algorithm 1).
1: Generate a complete de Bruijn graph G of order k, set ℓ = L − k.
2: Find a decycling vertex set X using Mykkeltveit’s algorithm.
3: Remove all vertices in X from graph G, resulting in G′.
4: while there are still paths of length ℓ do
5: Calculate D(v, i) and F(v, i) for each vertex v and 0 ≤ i ≤ ℓ.
6: Calculate T(v, ℓ) for each vertex v.
7: Remove a vertex with maximum hitting number from G′, and add it to set X.
8: end while
9: Output set X.
Finding the decycling set takes O(|Σ|k). In the second phase, each iteration calculates the hitting number of all vertices in time O(|Σ|k+1ℓ). The number of iterations is 1 + p, where p is the number of vertices removed. Thus, the total running time is dominated by steps 4–8 and is O((1 + p)|Σ|k+1ℓ).
The exponential dependence of DOCKS on k limits the range of k to which it can be applied (see Results, Subsection DOCKS). This motivates us to develop two variants that trade larger solution sizes for faster running times in the different heuristics described next.
Do you have any questions about this protocol?
Post your question to gather feedback from the community. We will also invite the authors of this article to respond.