In this paper, we study the submodular hitting set problem (SHSP), which is a variant of the hitting set problem. In the SHSP, we are given a supergraph \(H = (V, \mathcal{C})\) and a nonnegative submodular function on the set \(2^{V}\). The objective is to determine a vertex subset to cover all hyperedges such that the cost of submodular covering is minimized. Our main work is to present a rounding algorithm and a primal-dual algorithm respectively for the SHSP and prove that they both have the approximation ratio \(k\), where \(k\) is the maximum number of vertices in all hyperedges.
The hitting set problem involves a supergraph \(H = (V, \mathcal{C})\), where \(V\) is a finite set with elements called vertices and \(\mathcal{C}\) is a family of subsets of \(V\) called hyperedges. Let \(k\) be the maximum size of all hyperedges and \(\Delta\) be the maximum degree of all vertices. A hitting set of \(H\), also called a vertex cover of \(H\), is a subset \(X\subseteq V\) such that each hyperedge in \(\mathcal{C}\) contains at least one vertex from \(X\). The hitting set problem is to select a vertex subset of minimum size. For each vertex \(u\in V\), if there is an associated nonnegative weight cost \(w(u)\), we obtain the weighted version of the problem, the goal of which is to select a vertex subset of minimum covering cost. The minimum hitting set problem is NP-hard [] and it is one of the fundamental combinatorial optimization problems. In this case, it is conventional that for any subset \(X\) of \(V\) the covering cost of \(X\) is the sum of the covering cost of all vertices in \(X\), that is, \(w(X)=\sum\limits_{u\in X}w(u)\). Then this function \(w(\cdot)\) induces a linear function from \(2^{V}\) to \(R_{+}\).
Most results of decision problems in combinatorial optimization, such as trees or cuts, vertex cover in graphs, facility location, set cover, are in general asking for the linear costs. However, nonlinear costs are more realistic in many settings and there have existed some works in the area of combinatorial optimization problems with submodular costs. Assume \(E\) is a nonempty set and \(2^{E}\) is the power set of \(E\). Let \(F(\cdot): 2^{E}\rightarrow R_{+}\) be a nonnegative-valued set function. We call \(F(\cdot)\) submodular if for any \(X, Y\in 2^{E}, F(X\cup Y)+F(X\cap Y)\leq F(X)+F(Y)\) and \(F(\cdot)\) modular if the equality holds. It is obvious that all modular functions are submodular and so all linear functions. We call \(F(\cdot)\) monotone if \(F(X)\leq F(Y)\) for each pair of subsets \(X, Y\in 2^{E}\) with \(X\subseteq Y\). Submodular functions occur very often in a variety of subjects such as economics, engineering, computer science and management science. There is a greedy approximation which achieves an approximation ratio of \(H(\Delta)\) for the hitting set problem with linear costs, where \(H(\Delta)\) is the \(\Delta\)-th harmonic number [1]. Moreover, using the greedy approximation, Wan et al. [2] gave some similar results for the hitting set problem with submodular costs.
One of the most fundamental problems related to submodular functions is the submodular function minimization problem. In this problem, we are given a finite set \(E\) and a submodular function \(F(\cdot)\) defined on \(2^{E}\), then the goal is to select a subset \(X\) of \(E\) minimizing \(F(X)\) among the power set of \(E\), i.e., to select a minimizer of submodular function \(F(\cdot)\) over \(2^{E}\). It is known that this problem can be solved in polynomial time. Recently, constrained variants of the submodular function minimization problem have been extensively studied in various fields ([3-6]). For instance, Hochbaum [4] studied the submodular minimization problem with linear constraints having at most two variables per inequality. Kamiyama [5] studied the the submodular function minimization problem with covering type linear constraints. Koufogiannakis and Young [3] studied the monotone submodular function submodular minimization problem with general covering constrains. Iwata and Nagano [6] studied the submodular function minimization problem with set cover constraints. In this paper, we are motivated to study the submodular hitting set problem (SHSP). That is, given a nonnegative submodular function \(f(\cdot): 2^{V}\rightarrow R_{+}\), the goal of the SHSP is to select a vertex subset \(X\subseteq V\) to cover all the hyperedges that minimizes the submodular cost \(f(X)\). We present a rounding algorithm and a primal-dual program for the SHSP and prove that they both have the approximation ratio \(k\).
The paper is organized as follows. In Section 2, we give some definitions about submodular function. In Section 3, we present a rounding algorithm and a primal-dual program respectively for the SHSP and prove that they both have the approximation ratio \(k\), where \(k\) is the maximum number of vertices in all hyperedges.
In this section, we introduce the \(Lov\acute{a}sz\) extension of a submodular function. Let \(F(\cdot): 2^{E}\rightarrow R_{+}\) be a submodular function.
Assume \(|E|=n\). Then for any vector \(\pmb{z}\in R^{n}\) and any non-empty subset \(X \subseteq E\), we define \[\pmb{z}(X)=\sum\limits_{i\in X}\pmb{z}(i),\] where \(\pmb{z}(i)\) is the \(i\)-th coordinate value of vector \(\pmb{z}\). It is easy to see that \(\pmb{z}(\cdot)\) is a modular function from \(2^{E}\rightarrow R.\)
For any \(X\subseteq E\), define \(\pmb{\chi}_{X}\in \{0, 1\}^{n}\) as follows: \[\pmb{\chi}_{X}(i)= \left\{ \begin{array}{ll} 1, & \hbox{if $i\in X$,} \\ 0, & \hbox{otherwise,} \end{array} \right.\] called the characteristic vector of \(X\).
Given a submodular function \(F(\cdot)\), denote \[P(F)=\{\pmb{z} \mid \pmb{z}\in R^{n}, \pmb{z}(X)\leq F(X), \forall X\subseteq E\}.\] We call \(P(F)\) a submodular polyhedron and its elements subbases. For any subbase \(\pmb{z}\in P(F)\), a subset \(X\) of \(E\) is called \(\pmb{z}\)-tight whenever \(\pmb{z}(X)=F(X)\). There is a property for the collection of all \(\pmb{z}\)-tight subsets.
Proposotion 1. <stron For any given subbase \(\pmb{z}\in P(F)\), the collection of all \(\pmb{z}\)-tight subsets is closed under union and intersection.
Proof. Since \(F(\cdot)\) is submodular and \(\pmb{z}(\cdot)\) is modular, the result holds. ◻
Now we introduce the \(Lov\acute{a}sz\) extension of a submodular function and its properties. Given a vector \(\pmb{p}\in R^{n}_{+}\) and suppose \({p_{1}, p_{2}, \cdots, p_{l}}\) are all distinct coordinate values of \(\pmb{p}\) with \(p_{1}> p_{2}> \cdots> p_{l}\). For each \(j=1, 2,\cdots, l\), denote \(N_{j}=\{i\mid \pmb{p}(i)\geq p_{j}\}\). Then we define a vector function over \(R^{n}_{+}\) as follows: \[\hat{F}(\pmb{p})=\sum_{j=1}^{l}(p_{j}-p_{j+1})F(N_{j}),\] where \(p_{l+1}=0\). The function \(\hat{F}(\cdot)\) is said to be the \(Lov\acute{a}sz\) extension of \(F(\cdot)\). It is easy to see that for any subset \(X\) of \(E\), \(\hat{F}(\pmb{\chi}_{X})=F(X)\) and \(\hat{F}(\pmb{0})=F(\emptyset)\).
It is known from [7] that \[\label{2020-4-5-N11} \hat{F}(\pmb{p})=\max\left\{\sum\limits_{i=1}^{n}\pmb{p}(i)\pmb{z}(i) \mid \pmb{z}\in P(F)\right\}.\tag{1}\] The problem (1) is a linear optimization over the submodular polyhedron, which is efficiently solved by Edmonds [8]. Combining (1) with the primal-dual theory, we get that \(\hat{F}(\pmb{p})\) is equal to the optimal value of the following program: \[\label{2020-4-5-N12} \begin{aligned} {\rm min} \sum_{X\subseteq E}F(X)\cdot\zeta(X) \\{\rm s.t.} \sum_{\substack{X\subseteq E \\ i\in X}}\zeta(X)=\pmb{p}(i), \forall i\in E, \\ \zeta(X)\geq 0, \forall X\subseteq E. \end{aligned}\tag{2}\]
There are some properties of the \(Lov\acute{a}sz\) extension function and we list them here for later use.
Lemma 1. ([7]) A set function \(F(\cdot)\) is submodular if and only if \(\hat{F}(\cdot)\) is convex.
Lemma 2. [9]
Assume \(\hat{F}(\cdot)\) is the \(Lov\acute{a}sz\) extension function of submodular function \(F(\cdot)\), then we have
(positively homogeneous) for any real number \(\alpha>0\) and any vector \(\pmb{v}\geq\pmb{0}\), \(\hat{F}(\alpha\pmb{v})=\alpha\hat{F}(\pmb{v})\);
the function \(\hat{F}(\cdot)\) is a monotonically increasing convex function.
In this section, we construct a rounding algorithm and a primal-dual algorithm for the SHSP.
For the given supergraph \(H = (V, \mathcal{C})\), assume \(|V|=n\), \(\mathcal{C}=\{T_1,T_2,\ldots,T_m\}\) and \(V = \cup_{i=1}^{m}\{{T_i}\}\). Denote \(k=\max\{|T_{i}|\mid T_{i}\in \mathcal{C}\}.\) Assume there is a submodular function \(f(\cdot): 2^{V}\rightarrow R_{+}\) with \(f(\emptyset)=0\), where \(f(X)\) means the covering cost of \(X\) for any subset \(X\) of \(V\).
Firstly we define a binary variable \(\zeta(X)\) for any subset \(X\subseteq V\) as follows: \[\zeta(X)= \left\{ \begin{array}{ll} 1, & \hbox{if $X$ is selected to cover some hyperedges,} \\ 0, & \hbox{otherwise.} \end{array} \right.\] Therefore, the integer linear program for the SHSP is
\[\label{E-20-4-5-p0} \begin{aligned} {\rm min} \sum_{X\subseteq V}f(X)\zeta(X) \\{\rm s.t.} \ \sum\limits_{\substack{X\subseteq V \\ X\cap T_{i}\neq\emptyset}} \zeta(X) \geq 1, \forall T_{i}\in \mathcal{C}, \\ \zeta(X)\in \{0,1\}, \forall X\subseteq V. \end{aligned}\tag{3}\]
In the above program, the first constraint ensures that each hyperedge \(T_{i} \in \mathcal{C}\) is covered by some vertex subset \(X\). Since \(f(\cdot)\) is a submodular function, there must exist a unique subset \(X\) of \(V\) such that \(\zeta(X)=1\) in the optimal solution of (3).
Recall that \(\hat{f}(\pmb{\chi}_{X})=f(X)\), the convex programming for the SHSP is: \[\label{equ5} \begin{aligned} {\rm min} \ \hat{f}(\pmb{x}) \\{\rm s.t.}\ \sum_{u\in T_{i}}\pmb{x}(u)\geq 1, \forall T_{i}\in \mathcal{C}, \\ \pmb{x}(u)\in \{0,1\}, \forall u\in V. \end{aligned}\tag{4}\] The relaxation of (4) is: \[\label{equ6} \begin{aligned} {\rm min} \ \hat{f}(\pmb{x}) \\{\rm s.t.}\ \sum_{u\in T_{i} }\pmb{x}(u)\geq1, \forall T_{i}\in \mathcal{C}, \\ \pmb{x}(u)\geq 0, \forall u\in V. \end{aligned}\tag{5}\]
For a submodular function \(f(\cdot)\), we define a set function \(f^{\circ}(\cdot): 2^{V}\rightarrow R_{+}\) as \[f^{\circ}(X)=\min\{f(Z)\mid X\subseteq Z\subseteq V \}, \forall X\subseteq V.\] Then \(f^{\circ}(\cdot)\) is monotone and submodular [10]. Moreover, this function \(\hat{f^{\circ}}(\cdot)\) has the following properties.
Lemma 3. ([6])
suppose \(X^{\circ}\) is the exactly minimal subset \(Z\) satisfying \(X \subseteq Z \subseteq V\) and \(f(Z)=f^{\circ}(X)\). Then \(f^{\circ}(X)=f(X^{\circ})\) and \(f(X)\geq f^{\circ}(X)\) for every subset \(X \subseteq V\);
for each \(\pmb{x} \in R_{+}^{n},\) \(\hat{f}(\pmb{x})\geq \hat{f}^{\circ}(\pmb{x})\);
\(\hat{f}^{\circ}(\cdot)\) is monotonically increasing and positively homogeneous.
Since the program (5) can be solved in polynomial time via the ellipsoid method, we denote \(\pmb{x^{*}}\in R_{+}^{n}\) an optimal solution of (5). Based on the program (5), we give the following rounding algorithm.
]
Step 0: Solve the program (5) and denote \(\pmb{x^{*}}\in R_{+}^{n}\) as the optimal
solution.
Step 1: Define the subset \(A\subseteq V\) by \[A=\{u\in V\mid \pmb{x^{*}}(u)\geq
\frac{1}{k}\}.\] Step 2: Define
\(A^{\circ}\) be the exactly minimal
subset of \(V\) such that \(A \subseteq A^{\circ}\) and \(f(A^{\circ})=f^{\circ}(A)\).
Step 3: Return \(A^{\circ}\).
Recall that \(k=\max\{|T_{i}|\mid T_{i}\in \mathcal{C}\}.\) For every hyperedge \(T_{i}\in \mathcal{C}\), since \(\sum\limits_{u\in T_{i}}\pmb{x}(u)\geq 1\), at least one of \(\pmb{x}^{*}(u)\) must be greater than \(\frac{1}{k},\) which ensures \(A\) is a hitting set. Since \(A\subseteq A^{\circ}\), \(A^{\circ}\) is also a hitting set. Moreover, the hitting set \(A^{\circ}\) can be obtained in polynomial time by submodular function minimization theory. Therefore, Algorithm 1 can be implemented in polynomial time and the obtained \(A^{\circ}\) is a hitting set. Finally, we analyze the approximation ratio of Algorithm 1.
Theorem 1. With reference to \(f(\cdot), f^{\circ}(\cdot)\) and \(A^{\circ}\) defined above, then we have \(f(A^{\circ})\leq kf(M)\) for any hitting set \(M\), where \(k=\max\{|T_{i}|\mid T_{i}\in \mathcal{C}\}.\)
Proof. Since \(\pmb{x^{*}}\in R_{+}^{n}\) is an optimal solution of (5) and \(\hat{f}(\pmb{\chi}_{M})=f(M)\) for any hitting set \(M\), we find \[\hat{f}(\pmb{x^{*}})\leq \hat{f}(\pmb{\chi}_{M})=f(M).\] Lemma 3(2) guarantees \[\hat{f}^{\circ}(\pmb{x^{*}})\leq \hat{f}(\pmb{x^{*}})\] and hence \[\label{equ-2020-4-13-A0} \hat{f}^{\circ}(\pmb{x^{*}})\leq f(M).\tag{6}\] The fact \(\pmb{\chi}_{A} \leq k\pmb{x^{*}}\) and Lemma 3(3) ensure that \[\hat{f}^{\circ}(\pmb{\chi}_{A})\leq \hat{f}^{\circ}(k\pmb{x^{*}})=k\hat{f}^{\circ}(\pmb{x^{*}}).\] Note that \(\hat{f}^{\circ}(\pmb{\chi}_{A})=f^{\circ}(A)=f(A^{\circ})\). Then we have \[\label{equ-2020-4-13-A1} f(A^{\circ})\leq k\hat{f}^{\circ}(\pmb{x^{*}}).\tag{7}\] Combining (6) with (7), we obtain \(f(A^{\circ})\leq kf(M)\) as desired. ◻
In this subsection, we give a primal-dual algorithm for the SHSP. For any subset \(X\subseteq V\), we denote \(\mathcal{S}_{X}=\{T_{i}\in \mathcal{C}\mid T_{i}\cap X\neq \emptyset\}\) and we call \(X\) is a hitting set if \(\mathcal{S}_{X}=\mathcal{C}.\)
Recall that for any given vector \(\pmb{p}\in R^{n}_{+}\), \(\hat{f}(\pmb{p})\) is equal to the optimal value of (2). Combining (2) with (5), we consider the following relaxation program for the SHSP : \[\label{2020-4-5-N13} \begin{aligned} {\rm min} \sum_{X\subseteq V}f(X)\cdot\zeta(X) \\ {\rm s.t.}\sum_{u\in T_{i} }\pmb{x}(u)\geq1, \forall T_{i}\in \mathcal{C}, \\ \sum_{\substack{X\subseteq V \\ u\in X}}\zeta(X)= \pmb{x}(u), \forall u\in V, \\ \zeta(X)\geq 0, \forall X\subseteq V. \end{aligned}\tag{8}\] We do not need to add the constraint that \(\pmb{x}(u)\geq 0\) for the second constraint of (8) ensures it for all \(u\in V\). And the dual program for (8) is: \[\label{equ-2020-4-6-10} \begin{aligned} {\rm max} \sum_{T_{i}\in \mathcal{C}}y(T_{i}) \\ {\rm s.t.}\sum_{\substack{T_{i}\in\mathcal{C} \\ u\in T_{i}}}y(T_{i})=\pmb{z}(u), \forall u\in V, \\ \pmb{z}\in P(f), \\ y(T_{i})\geq 0, \forall T_{i}\in \mathcal{C}. \end{aligned}\tag{9}\] Before proposing the primal-dual algorithm for the SHSP, we give some necessary facts.
Given a subbase \(\pmb{z}\in P(f).\) Let \((f-\pmb{z})(\cdot):2^{V}\rightarrow R_{+}\) be defined by \[(f-\pmb{z})(X)=f(X)- \pmb{z}(X),\forall X\subseteq V.\] Then \((f-\pmb{z})(\cdot)\) is a submodular function. Observe that the definition of \((f-\pmb{z})(\cdot)\) is related to the modularity of \(\pmb{z}(\cdot)\). So we get the function \((f-\pmb{z})(\cdot)\) for any subbase \(\pmb{z}\in P(f)\) and there is a proposition for the function.
Proposotion 2. Given a subbase \(\pmb{z}\in P(f)\), there exists the unique maximal subset \(X\) of \(V\) for the function \((f-\pmb{z})(\cdot)\) such that \(f(X)= \pmb{z}(X).\)
Proof. Note that \(\pmb{z}\in P(f)\), it is clear that \[(f-\pmb{z})(X)\geq 0, \forall X\subseteq V.\] Moreover, we get that \(\min\limits_{X\subseteq V}(f-\pmb{z})(X)=(f-\pmb{z})(\emptyset)=0.\) This implies that the existence of \(X\). According to Proposition 1, we see that for every pair of minimizers \(X, Y\) of function \((f-\pmb{z})(\cdot)\), \(X\cup Y\) is also a minimizer. Thus, there exists the unique maximal subset \(X \subseteq V\) for the function \((f-\pmb{z})(\cdot)\) such that \(f(X)= \pmb{z}(X)\). ◻
Lemma 4. Let \(\pmb{z}\in P(f)\) and \(A\) be the unique maximal subset of \(V\) satisfying \(f(A)= \pmb{z}(A)\). Assume \(\mathcal{C}\backslash \mathcal{S}_{A}\neq\emptyset\) and a hyperedge \(T_{j} \in \mathcal{C}\backslash \mathcal{S}_{A}\). If we define \[\alpha=\min_{\substack{X\subseteq V\\\pmb{\chi}_{T_{j}}(X)\neq 0}}\frac{f(X)-\pmb{z}(X)}{\pmb{\chi}_{T_{j}}(X)}\] and \[\pmb{z}'=\pmb{z}+\alpha \cdot \pmb{\chi}_{T_{j}},\] then we have
\(\pmb{z}'\in P(f)\);
furthermore, we define \(A'\) as the maximal subset of \(V\) such that \(f(A')= \pmb{z}'(A').\) Then we have \(A\subsetneq A'\).
Proof. To prove \(\pmb{z}'\in P(f)\), it suffices to prove \(\pmb{z}'(X)\leq f(X)\) for every subset \(X\) of \(V\). The assumption \(\pmb{z} \in P(f)\) implies that \[\pmb{z}(X)\leq f(X),\forall X\subseteq V.\] On the one hand, for every subset \(X\) of \(V\) such that \(\pmb{\chi}_{T_{j}}(X)= 0\), we have \[\pmb{z}'(X)=\pmb{z}(X)+\alpha \cdot0=\pmb{z}(X)\leq f(X).\] On the other hand, for every subset \(X\) of \(V\) such that \(\pmb{\chi}_{T_{j}}(X)\neq 0\), we have \[\pmb{z}'(X)=\pmb{z}(X)+\alpha \cdot \pmb{\chi}_{T_{j}}(X)\leq \pmb{z}(X)+ \frac{f(X)-\pmb{z}(X)}{\pmb{\chi}_{T_{j}}(X)}\cdot \pmb{\chi}_{T_{j}}(X)= f(X),\] where the inequality follows from the definition of \(\alpha.\) The result follows.
Next we prove \(A\subsetneq A'\). The assumption \(T_{j} \in \mathcal{C}\backslash \mathcal{S}_{A}\) indicates \(A\cap T_{j}= \emptyset\). The definition of \(\pmb{z}'\) implies \(\pmb{z}'(u)=\pmb{z}(u)\) for every element \(u \in A.\) Moreover, we get \(\pmb{z}'(A)=\pmb{z}(A)=f(A).\) The definition of \(A'\) implies \(A\subseteq A'\). Suppose \(Z\) is a subset of \(V\) which satisfies both \(\pmb{\chi}_{T_{j}}(Z)\neq 0\) and \[\alpha=\frac{f(Z)-\pmb{z}(Z)}{\pmb{\chi}_{T_{j}}(Z)}.\] Then we have \(f(Z)=\pmb{z}'(Z).\) The definition of \(A'\) implies \(Z\subseteq A'\). Since \(\pmb{\chi}_{T_{j}}(Z)\neq 0\), we have \(T_{j}\cap Z\neq 0.\) Combining \(A\cap T_{j}=\emptyset\) with \(T_{j}\cap Z\neq 0,\) we obtain \(A\nsubseteq Z\). Since \(A\nsubseteq Z, A\subseteq A'\) and \(Z\subseteq A',\) it is immediate that \(A\subsetneq A'\). ◻
Based on the dual program (9), we give the following primal-dual algorithm.
Algorithm 2(H). Step 0: Set \(t=1.\) Initialize \(y_{1}=0, \pmb{z}_{1}=0\). Define \(A_{1}\) as the unique maximal subset of
\(V\) such that \(f(A_{1})= \pmb{z}_{1}(A_{1})\).
Step 1: If \(\mathcal{C}\backslash
\mathcal{S}_{{A}_{t}}=\emptyset\), go to step 2. If not, select a
hyperedge \(T_{j}\in \mathcal{C}
\backslash\mathcal{S}_{{A}_{t}}\).
Define the real number \(\alpha_{t}\) by \[\alpha_{t}=\min_{\substack{X\subseteq V\\\pmb{\chi}_{T_{j}}(X)\neq 0}}\frac{f(X)-\pmb{z}_{t}(X)}{\pmb{\chi}_{T_{j}}(X)}\] Define \(y_{t+1}\) by \[y_{t+1}(T_{i})= \begin{cases} y_{t}(T_{i})+\alpha_{t},& \text{if $T_{i}=T_{j}$ ,}\\ y_{t}(T_{i}),& \text{otherwise} \end{cases}\] and let \[\pmb{z}_{t+1}= \pmb{z}_{t}+\alpha_{t}\cdot \pmb{\chi}_{T_{j}}.\]Define \(A_{t+1}\) to be the unique maximal subset of \(V\) such that \(\pmb{z}_{t+1}(A_{t+1})=f(A_{t+1})\).
Set \(t=t+1\).
Step 2: Let \(T=t\) and \(A=A_{T}\). Return \(A\).
To obtain the ratio of Algorithm 2, we first prove the following lemmas.
Lemma 5. Algorithm 2 can be implemented in polynomial time.
Proof. Combining the definition of \(A\) with Lemma 4(2), we get that the algorithm iterates after at most \(|V|+1\) times. That is, \(T\) is at most \(|V|+1\). At every iteration, we can compute \(\alpha_{t}\) in polynomial time [11]. The Proposition 2 guarantees there exists the unique maximal \(A_{t+1} \subseteq V\) such that \(\pmb{z}_{t+1}(A_{t+1})=f(A_{t+1})\) and we can compute it in polynomial time [7]. Therefore, Algorithm 2 can be implemented in polynomial time. ◻
Algorithm outputs \(\{(y_{1}, \pmb{z}_{1}), (y_{2}, \pmb{z}_{2}), \cdots ,(y_{T}, \pmb{z}_{T})\}\). Next we will prove it is a feasible region of the dual program (9).
Lemma 6. For each \(t \in\{1, 2, \cdots, T\}\), \((y_{t}, \pmb{z}_{t})\) is a feasible solution of the dual program (9).
Proof. Need to show for each \(t \in\{1, 2, \cdots, T\}\), \((y_{t}, \pmb{z}_{t})\) satisfies all constrains of (9). We prove this result by using induction on \(t\).
Since \((y_{1}, \pmb{z}_{1})=\pmb{0}\) and \(f(\cdot)\) is a nonnegative function, we obtain \((y_{1}, \pmb{z}_{1})\) is a feasible solution of (9). Assume that this lemma holds when \(t=k\). Suppose \(t=k+1\) and \(T_{j}\in \mathcal{C}\backslash \mathcal{S}_{{A}_{k}}.\) Firstly, according to Lemma 4 (1), it is obvious that \(\pmb{z}_{k+1}\in P(f).\) Combining \(\pmb{z}_{k}\in P(f)\) with the definition of \(\alpha_{k}\), we have \(\alpha_{k}\geq 0\), which implies \(y_{k+1}\geq 0.\) Next, we prove that for each \(u\in V,\) \((y_{k+1}, \pmb{z}_{k+1})\) satisfies \[\label{2020-4-5-a0} \sum_{T_{i}\in\mathcal{C}:u\in T_{i}}y_{k+1}(T_{i})=\pmb{z}_{k+1}(u).\tag{10}\] Since \((y_{k}, \pmb{z}_{k})\) is a feasible solution of (9), we have \[\label{2020-4-5-a1} \sum_{T_{i}\in\mathcal{C}:u\in T_{i}}y_{k}(T_{i})=\pmb{z}_{k}(u),\forall u\in V.\tag{11}\] Note that \(V=T_{j}\cup (V\backslash T_{j})\) and \(\pmb{z}_{k+1}=\pmb{z}_{k}+\alpha_{k} \cdot \pmb{\chi}_{T_{j}}\). For each \(u\in V\backslash T_{j}\), we obtain \[\label{2020-4-5-a2} \pmb{z}_{k+1}(u)=\pmb{z}_{k}(u)\tag{12}\] and \[\label{2020-4-5-a3} \sum_{T_{i}\in\mathcal{C}:u\in T_{i}}y_{k+1}(T_{i})=\sum_{T_{i}\in\mathcal{C}:u\in T_{i}}y_{k}(T_{i}).\tag{13}\] From (11)-(13), it is immediate that (10) holds for each \(u\in V\backslash T_{j}\). For each \(u\in T_{j}\), we have \[\pmb{z}_{k+1}(u)-\pmb{z}_{k}(u)=\alpha_{k}\] and \[\sum_{T_{i}\in\mathcal{C}:u\in T_{i}}y_{k+1}(T_{i})-\sum_{T_{i}\in\mathcal{C}:u\in T_{i}}y_{k}(T_{i})=y_{k+1}(T_{j})-y_{k}(T_{j})=\alpha_{k},\] which implies that (10) holds for each \(u\in T_{j}\). Therefore, we have (10) for each \(u\in V\). ◻
We now analyze the approximation ratio of Algorithm 2.
Theorem 2. Algorithm 2 is a \(k\)-approximation algorithm, where \(k=\max\{|T_{i}|\mid T_{i}\in \mathcal{C}\}.\)
Proof. Algorithm 2 returns a hitting set \(A\) and the final feasible solution \((y_{T}, \pmb{z}_{T})\) for the dual program (9). Let OPT denote the optimal objective value for the SHSP. In what follows, we prove that \[f(A)\leq k\cdot OPT.\] Based on Lemma 6 and the dual program (9), we get \[\label{2020-4-5-a10} f(A)=\pmb{z}_{T}(A)=\sum_{u\in A}\pmb{z}_{T}(u)=\sum_{u\in A}\sum_{\substack{T_{i}\in\mathcal{C}\\ u\in T_{i}}}y_{T}(T_{i}).\tag{14}\] Using the definitions of \(k\) and \(A\), it is immediate that \[\label{2020-4-5-a11} \sum_{u\in A}\sum_{\substack{T_{i}\in \mathcal{C}\\ u\in T_{i}}}y_{T}(T_{i})\leq k \sum_{T_{i}\in \mathcal{C}}y_{T}(T_{i}).\tag{15}\] The final feasible solution \((y_{T}, \pmb{z}_{T})\) for (9) indicates that \[\label{2020-4-5-a12} \sum_{T_{i}\in \mathcal{C}}y_{T}(T_{i})\leq OPT.\tag{16}\] From (14)-(16), we get \(f(A)\leq k\cdot OPT.\) ◻
The authors would like to thank Professor Ding-Zhu Du for his many valuable advices during their study of approximation algorithm.
This work is supported by the NSF of China (No. 11971146), the NSF of Hebei Province (No. A2019205089, No. A2019205092), Hebei Province Foundation for Returnees (CL201714) and Overseas Expertise Introduction Program of Hebei Auspices (25305008).