1. Introduction
The hitting set problem involves a hypergraph $H = (V, \mathcal{E})$, where $V$ is a finite set whose elements are called vertices and $\mathcal{E}$ is a family of subsets of $V$ called hyperedges. Let $k$ be the maximum size of all hyperedges and $\Delta$ be the maximum degree of all vertices. A hitting set of $H$, also called a vertex cover of $H$, is a subset $S \subseteq V$ such that each hyperedge in $\mathcal{E}$ contains at least one vertex from $S$. The hitting set problem is to select a hitting set of minimum size. For each vertex $v \in V$, if there is an associated nonnegative weight cost $c(v)$, we obtain the weighted version of the problem, the goal of which is to select a hitting set of minimum covering cost. The minimum hitting set problem is NP-hard [] and is one of the fundamental combinatorial optimization problems. In this case, it is conventional that for any subset $S$ of $V$ the covering cost of $S$ is the sum of the covering costs of all vertices in $S$, that is, $c(S) = \sum_{v \in S} c(v)$. This cost function induces a linear function from $2^V$ to $\mathbb{R}_{\geq 0}$.
Most decision problems in combinatorial optimization, such as spanning trees, cuts, vertex cover in graphs, facility location and set cover, are in general stated with linear costs. However, nonlinear costs are more realistic in many settings, and there exist some works on combinatorial optimization problems with submodular costs. Assume $V$ is a nonempty set and $2^V$ is the power set of $V$. Let $f : 2^V \to \mathbb{R}_{\geq 0}$ be a nonnegative-valued set function. We call $f$ submodular if
$$f(X) + f(Y) \geq f(X \cup Y) + f(X \cap Y)$$
for any $X, Y \subseteq V$, and modular if the equality holds. It is obvious that all modular functions are submodular, and so are all linear functions. We call $f$ monotone if $f(X) \leq f(Y)$ for each pair of subsets $X, Y \subseteq V$ with $X \subseteq Y$. Submodular functions occur very often in a variety of subjects such as economics, engineering, computer science and management science. There is a greedy approximation which achieves an approximation ratio of $H(\Delta)$ for the hitting set problem with linear costs, where $H(d) = \sum_{i=1}^{d} 1/i$ is the $d$-th harmonic number [1]. Moreover, using the greedy approximation, Wan et al. [2] gave some similar results for the hitting set problem with submodular costs.
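The greedy rule mentioned above can be sketched in Python. The instance encoding (a vertex list, hyperedges as sets, and a cost dictionary) is our own illustrative choice, not notation from the paper; the rule repeatedly picks the most cost-effective vertex.

```python
from fractions import Fraction

def greedy_hitting_set(vertices, hyperedges, cost):
    """Greedy for the weighted hitting set problem: repeatedly pick the
    vertex with the smallest cost per newly hit hyperedge."""
    uncovered = [set(e) for e in hyperedges]
    solution = set()
    while uncovered:
        # cost-effectiveness: cost(v) / number of still-uncovered hyperedges hit by v
        best = min(
            (v for v in vertices if any(v in e for e in uncovered)),
            key=lambda v: Fraction(cost[v], sum(1 for e in uncovered if v in e)),
        )
        solution.add(best)
        uncovered = [e for e in uncovered if best not in e]
    return solution
```

On the toy instance with hyperedges $\{a,b\}$ and $\{b,c\}$ and unit costs, the rule selects $b$, which hits both hyperedges at cost-effectiveness $1/2$.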
One of the most fundamental problems related to submodular functions is the submodular function minimization problem. In this problem, we are given a finite set $V$ and a submodular function $f$ defined on $2^V$; the goal is to select a subset of $V$ minimizing $f$ among the power set of $V$, i.e., to select a minimizer of the submodular function $f$ over $2^V$. It is known that this problem can be solved in polynomial time. Recently, constrained variants of the submodular function minimization problem have been extensively studied in various fields ([3-6]). For instance, Hochbaum [4] studied the submodular minimization problem with linear constraints having at most two variables per inequality. Kamiyama [5] studied the submodular function minimization problem with covering type linear constraints. Koufogiannakis and Young [3] studied the monotone submodular function minimization problem with general covering constraints. Iwata and Nagano [6] studied the submodular function minimization problem with set cover constraints. In this paper, we are motivated to study the submodular hitting set problem (SHSP): given a hypergraph $H = (V, \mathcal{E})$ and a nonnegative submodular function $f : 2^V \to \mathbb{R}_{\geq 0}$, the goal of the SHSP is to select a vertex subset that covers all the hyperedges and minimizes the submodular cost $f$. We present a rounding algorithm and a primal-dual algorithm for the SHSP and prove that they both have the approximation ratio $k$.
The paper is organized as follows. In Section 2, we give some definitions about submodular functions. In Section 3, we present a rounding algorithm and a primal-dual algorithm, respectively, for the SHSP and prove that they both have the approximation ratio $k$, where $k$ is the maximum number of vertices in all hyperedges.
2. Lovász Extension of a Submodular Function
In this section, we introduce the Lovász extension of a submodular function. Let $f : 2^V \to \mathbb{R}_{\geq 0}$ be a submodular function. Assume $V = \{v_1, v_2, \dots, v_n\}$. Then for any vector $x \in \mathbb{R}^V$ and any non-empty subset $S \subseteq V$, we define $x(S) = \sum_{v \in S} x(v)$, where $x(v)$ is the $v$-th coordinate value of the vector $x$. It is easy to see that $x(\cdot)$ is a modular function from $2^V$ to $\mathbb{R}$. For any $S \subseteq V$, define $\chi_S \in \{0, 1\}^V$ as follows: $\chi_S(v) = 1$ if $v \in S$ and $\chi_S(v) = 0$ otherwise; $\chi_S$ is called the characteristic vector of $S$.
Given a submodular function $f$, denote
$$P(f) = \{y \in \mathbb{R}^V : y(S) \leq f(S) \ \text{for all} \ S \subseteq V\}.$$
We call $P(f)$ a submodular polyhedron and its elements subbases. For any subbase $y \in P(f)$, a subset $S$ of $V$ is called $y$-tight whenever $y(S) = f(S)$. There is a property for the collection of all $y$-tight subsets.

Proposition 1. For any given subbase $y \in P(f)$, the collection of all $y$-tight subsets is closed under union and intersection.

Proof. Let $S$ and $T$ be $y$-tight. Since $f$ is submodular and $y(\cdot)$ is modular, we have
$$f(S) + f(T) = y(S) + y(T) = y(S \cup T) + y(S \cap T) \leq f(S \cup T) + f(S \cap T) \leq f(S) + f(T),$$
so equality holds throughout and both $S \cup T$ and $S \cap T$ are $y$-tight. $\square$
Now we introduce the Lovász extension of a submodular function and its properties. Given a vector $x \in \mathbb{R}_{\geq 0}^V$, suppose $p_1 > p_2 > \cdots > p_m$ are all the distinct coordinate values of $x$. For each $j \in \{1, \dots, m\}$, denote $U_j = \{v \in V : x(v) \geq p_j\}$. Then we define a function $\hat{f}$ over $\mathbb{R}_{\geq 0}^V$ as follows:
$$\hat{f}(x) = \sum_{j=1}^{m} (p_j - p_{j+1}) f(U_j),$$
where $p_{m+1} = 0$. The function $\hat{f}$ is said to be the Lovász extension of $f$. It is easy to see that for any subset $S$ of $V$, $\hat{f}(\chi_S) = f(S)$ and $\hat{f}(\mathbf{0}) = 0$.

It is known from [7] that
$$\hat{f}(x) = \max\{\langle x, y \rangle : y \in P(f)\}. \qquad (1)$$
The problem (1) is a linear optimization over the submodular polyhedron, which is efficiently solved by Edmonds' greedy algorithm [8]. Combining (1) with linear programming duality, we get that $\hat{f}(x)$ is equal to the optimal value of the following program:
$$\min \sum_{S \subseteq V} f(S) \lambda_S \quad \text{s.t.} \quad \sum_{S : v \in S} \lambda_S = x(v) \ \ \forall v \in V, \quad \lambda_S \geq 0 \ \ \forall S \subseteq V. \qquad (2)$$
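As a toy illustration of the definition of $\hat{f}$ and of Edmonds' greedy algorithm for problem (1), the sketch below evaluates both on a small submodular function; the encoding of $f$ as a plain Python function on frozensets (with $f(\emptyset) = 0$) is an assumption for illustration.

```python
def lovasz_extension(f, V, x):
    """Evaluate the Lovász extension at a nonnegative vector x (a dict),
    using the level sets U_j = {v : x[v] >= p_j} of the distinct values."""
    levels = sorted(set(x.values()), reverse=True)  # p_1 > p_2 > ... > p_m
    total = 0.0
    for j, p in enumerate(levels):
        p_next = levels[j + 1] if j + 1 < len(levels) else 0.0
        U = frozenset(v for v in V if x[v] >= p)
        total += (p - p_next) * f(U)
    return total

def edmonds_greedy(f, V, x):
    """Edmonds' greedy algorithm: maximize <x, y> over the submodular
    polyhedron P(f) by scanning vertices in decreasing order of x."""
    order = sorted(V, key=lambda v: -x[v])
    y, prefix = {}, frozenset()
    for v in order:
        y[v] = f(prefix | {v}) - f(prefix)  # marginal gain along the order
        prefix = prefix | {v}
    return sum(x[v] * y[v] for v in V)
```

For example, with $f(S) = \min\{|S|, 2\}$ on $V = \{a, b, c\}$ and $x = (3, 1, 0)$, both routines return $4$, as (1) predicts.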
There are some properties of the Lovász extension $\hat{f}$, and we list them here for later use.

Lemma 1. ([7]) A set function $f$ is submodular if and only if its Lovász extension $\hat{f}$ is convex.

Lemma 2. ([9]) Assume $\hat{f}$ is the Lovász extension of a submodular function $f$. Then we have:
(1) (positive homogeneity) for any real number $\alpha \geq 0$ and any vector $x \in \mathbb{R}_{\geq 0}^V$, $\hat{f}(\alpha x) = \alpha \hat{f}(x)$;
(2) if $f$ is monotone, then $\hat{f}$ is a monotonically increasing convex function.
3. Approximation Algorithms

In this section, we construct a rounding algorithm and a primal-dual algorithm for the SHSP.

For the given hypergraph $H = (V, \mathcal{E})$, assume $V = \{v_1, \dots, v_n\}$, $\mathcal{E} = \{e_1, \dots, e_m\}$ and $k = \max_{e \in \mathcal{E}} |e|$. Assume there is a submodular function $f : 2^V \to \mathbb{R}_{\geq 0}$ with $f(\emptyset) = 0$, where $f(S)$ means the covering cost of $S$ for any subset $S$ of $V$. Firstly we define a binary variable $x_S$ for any subset $S \subseteq V$ as follows: $x_S = 1$ if $S$ is selected and $x_S = 0$ otherwise. Therefore, the integer linear program for the SHSP is
$$\min \sum_{S \subseteq V} f(S) x_S \quad \text{s.t.} \quad \sum_{S : S \cap e \neq \emptyset} x_S \geq 1 \ \ \forall e \in \mathcal{E}, \quad x_S \in \{0, 1\} \ \ \forall S \subseteq V. \qquad (3)$$
In the above program, the first constraint ensures that each hyperedge $e$ is covered by some vertex subset $S$. Since $f$ is a submodular function with $f(\emptyset) = 0$, it is subadditive, so there must exist an optimal solution of (3) in which exactly one subset $S$ of $V$ has $x_S = 1$. Recall that $f(S) = \hat{f}(\chi_S)$; the convex program for the SHSP is
$$\min \hat{f}(x) \quad \text{s.t.} \quad x(e) \geq 1 \ \ \forall e \in \mathcal{E}, \quad x \in \{0, 1\}^V. \qquad (4)$$
The relaxation of (4) is
$$\min \hat{f}(x) \quad \text{s.t.} \quad x(e) \geq 1 \ \ \forall e \in \mathcal{E}, \quad x \geq \mathbf{0}. \qquad (5)$$
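For intuition about the objective of (3), a brute-force exact solver for toy instances can be sketched as follows; the exponential enumeration is purely illustrative and is not part of the paper's algorithms.

```python
from itertools import combinations

def exact_shsp(V, hyperedges, f):
    """Exhaustively find a minimum-cost hitting set; exponential in |V|,
    intended only to illustrate the objective on toy instances."""
    best = None
    for r in range(len(V) + 1):               # try subsets by increasing size
        for S in combinations(V, r):
            S = frozenset(S)
            if all(S & set(e) for e in hyperedges):   # S hits every hyperedge
                if best is None or f(S) < f(best):
                    best = S
    return best
```

With $f(S) = |S|$ (the unweighted case) and hyperedges $\{a,b\}$, $\{b,c\}$, the optimum is the single vertex $b$.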
3.1. A Rounding Algorithm for the SHSP

For a submodular function $f$, we define a set function $\bar{f}$ as
$$\bar{f}(S) = \min\{f(T) : S \subseteq T \subseteq V\}.$$
Then $\bar{f}$ is monotone and submodular [10]. Moreover, this function has the following properties.
Lemma 3. ([6])
(1) Suppose $T_S$ is the minimal subset of $V$ satisfying $S \subseteq T_S$ and $f(T_S) = \bar{f}(S)$. Then $T_S$ is unique and $\bar{f}(U) = \bar{f}(S)$ for every subset $U$ with $S \subseteq U \subseteq T_S$;
(2) $\bar{f}(S) \leq f(S)$ for each $S \subseteq V$;
(3) the Lovász extension $\hat{\bar{f}}$ of $\bar{f}$ is monotonically increasing and positively homogeneous.
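On instances small enough to enumerate, $\bar{f}$ and the minimal minimizer $T_S$ of Lemma 3(1) can be computed directly from the definition. The brute-force sketch below is illustrative only (in the algorithm this step is done by submodular function minimization); the graph cut function used in the example is our own choice of a nonnegative, non-monotone submodular $f$.

```python
from itertools import combinations

def monotone_closure(f, V, S):
    """Compute fbar(S) = min{ f(T) : S subset of T subset of V } together
    with the minimal minimizer T_S, by enumerating supersets of S in order
    of increasing size (toy sizes only)."""
    rest = [v for v in V if v not in S]
    best_val, best_T = None, None
    for r in range(len(rest) + 1):
        for extra in combinations(rest, r):
            T = frozenset(S) | frozenset(extra)
            val = f(T)
            if best_val is None or val < best_val:   # strict <: keeps smallest minimizer
                best_val, best_T = val, T
    return best_val, best_T
```

For the cut function of the path $a - b - c$, we get $\bar{f}(\{b\}) = 0$ with $T_{\{b\}} = \{a, b, c\}$, even though $f(\{b\}) = 2$, illustrating Lemma 3(2).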
Since the program (5) can be solved in polynomial time via the ellipsoid method, we denote by $x^*$ an optimal solution of (5). Based on the program (5), we give the following rounding algorithm.

Algorithm 1.
Step 0: Solve the program (5) and denote by $x^*$ the optimal solution.
Step 1: Define the subset $S$ by $S = \{v \in V : x^*(v) \geq 1/k\}$.
Step 2: Define $T$ to be the minimal subset of $V$ such that $S \subseteq T$ and $f(T) = \bar{f}(S)$.
Step 3: Return $T$.
Recall that $x^*(e) \geq 1$ for every hyperedge $e \in \mathcal{E}$. Since $|e| \leq k$, at least one of the coordinates $x^*(v)$, $v \in e$, must be at least $1/k$, which ensures that $S$ is a hitting set. Since $S \subseteq T$, $T$ is also a hitting set. Moreover, the hitting set $T$ can be obtained in polynomial time by submodular function minimization theory. Therefore, Algorithm 1 can be implemented in polynomial time and the obtained $T$ is a hitting set. Finally, we analyze the approximation ratio of Algorithm 1.
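Steps 1 and 2 of Algorithm 1 can be sketched as follows, taking an already-computed optimal solution $x^*$ of (5) as input (solving (5) itself is omitted). The minimal superset is found here by brute force, whereas the paper obtains it via submodular function minimization.

```python
from itertools import combinations

def round_fractional(x_star, V, hyperedges, f):
    """Rounding steps of Algorithm 1 (sketch): threshold the fractional
    solution at 1/k, then extend S to the minimal T with f(T) = fbar(S)."""
    k = max(len(e) for e in hyperedges)
    S = frozenset(v for v in V if x_star[v] >= 1.0 / k)   # Step 1
    # Step 2: brute-force minimal superset of S minimizing f (toy sizes only)
    rest = [v for v in V if v not in S]
    best_val, best_T = None, None
    for r in range(len(rest) + 1):
        for extra in combinations(rest, r):
            T = S | frozenset(extra)
            if best_val is None or f(T) < best_val:
                best_T, best_val = T, f(T)
    return best_T
```

On the toy instance with hyperedges $\{a,b\}$, $\{b,c\}$, $f(S) = |S|$ and the fractional point $x^* = (1/2, 1/2, 0)$, thresholding at $1/k = 1/2$ gives $S = \{a, b\}$, which is already minimal.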
Theorem 1. With reference to $S$ and $T$ defined above, we have $f(T) \leq k \cdot f(A)$ for any hitting set $A$, where $k = \max_{e \in \mathcal{E}} |e|$.

Proof. Since $x^*$ is an optimal solution of (5) and $\chi_A$ is feasible for (5) for any hitting set $A$, we find
$$\hat{f}(x^*) \leq \hat{f}(\chi_A) = f(A). \qquad (6)$$
Lemma 3(2) guarantees $\bar{f} \leq f$ and hence $\hat{\bar{f}}(x) \leq \hat{f}(x)$ for every $x \geq \mathbf{0}$. The fact $\chi_S \leq k x^*$ and Lemma 3(3) ensure that
$$\hat{\bar{f}}(\chi_S) \leq \hat{\bar{f}}(k x^*) = k \hat{\bar{f}}(x^*).$$
Note that $f(T) = \bar{f}(S) = \hat{\bar{f}}(\chi_S)$. Then we have
$$f(T) \leq k \hat{\bar{f}}(x^*) \leq k \hat{f}(x^*). \qquad (7)$$
Combining (6) with (7), we obtain $f(T) \leq k \cdot f(A)$, as desired. $\square$
3.2. A Primal-dual Algorithm

In this subsection, we give a primal-dual algorithm for the SHSP. For any subset $S \subseteq V$, we denote $\mathcal{E}(S) = \{e \in \mathcal{E} : e \cap S \neq \emptyset\}$, and we call $S$ a hitting set if $\mathcal{E}(S) = \mathcal{E}$. Recall that for any given vector $x \geq \mathbf{0}$, $\hat{f}(x)$ is equal to the optimal value of (2). Combining (2) with (5), we consider the following relaxation program for the SHSP:
$$\min \sum_{S \subseteq V} f(S) \lambda_S \quad \text{s.t.} \quad x(e) \geq 1 \ \ \forall e \in \mathcal{E}, \quad \sum_{S : v \in S} \lambda_S = x(v) \ \ \forall v \in V, \quad \lambda_S \geq 0 \ \ \forall S \subseteq V. \qquad (8)$$
We do not need to add the constraint that $x \geq \mathbf{0}$, for the second constraint of (8) ensures it for all $v \in V$. The dual program for (8) is
$$\max \sum_{e \in \mathcal{E}} z_e \quad \text{s.t.} \quad y \in P(f), \quad y(v) = \sum_{e : v \in e} z_e \ \ \forall v \in V, \quad z_e \geq 0 \ \ \forall e \in \mathcal{E}. \qquad (9)$$
Before proposing the primal-dual algorithm for the SHSP, we give some necessary facts.
Given a subbase $y \in P(f)$, let $f_y$ be defined by $f_y(S) = f(S) - y(S)$ for all $S \subseteq V$. Then $f_y$ is a submodular function. Observe that the definition of $f_y$ relies on the modularity of $y(\cdot)$. So we get the function $f_y$ for any subbase $y$, and there is a proposition for this function.

Proposition 2. Given a subbase $y \in P(f)$, there exists the unique maximal subset $W_y$ of $V$ for the function $f_y$ such that $f_y(W_y) = \min_{S \subseteq V} f_y(S) = 0$.

Proof. Note that $y \in P(f)$; it is clear that $f_y(S) = f(S) - y(S) \geq 0$ for every $S \subseteq V$. Moreover, we get that $f_y(\emptyset) = f(\emptyset) = 0$. This implies that the minimum value of $f_y$ is $0$ and that the minimizers of $f_y$ are exactly the $y$-tight subsets. According to Proposition 1, we see that for every pair of minimizers of the function $f_y$, their union is also a minimizer. Thus, there exists the unique maximal subset $W_y$ for the function $f_y$ such that $f_y(W_y) = 0$. $\square$
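Propositions 1 and 2 together say that $W_y$ is simply the union of all $y$-tight sets. A brute-force sketch (enumerating all subsets, so toy sizes only; a real implementation would use submodular function minimization):

```python
from itertools import combinations

def maximal_tight_set(f, V, y):
    """Union of all y-tight sets; by Proposition 1 this union is itself
    y-tight, i.e. it is the unique maximal minimizer of f_y(S) = f(S) - y(S)."""
    tight_union = frozenset()
    for r in range(len(V) + 1):
        for S in combinations(V, r):
            S = frozenset(S)
            if abs(f(S) - sum(y[v] for v in S)) < 1e-9:   # y(S) = f(S)
                tight_union |= S
    return tight_union
```

For $f(S) = \min\{|S|, 2\}$ and the subbase $y = (1, 1, 0)$, the sets $\{a\}$, $\{b\}$, $\{a,b\}$ and $\{a,b,c\}$ are all $y$-tight, so $W_y = \{a, b, c\}$.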
Lemma 4. Let $y \in P(f)$ and let $W_y$ be the unique maximal subset of $V$ satisfying $y(W_y) = f(W_y)$. Assume $e$ is a hyperedge with $e \cap W_y = \emptyset$. If we define
$$\theta = \min\left\{ \frac{f(S) - y(S)}{|S \cap e|} : S \subseteq V,\ S \cap e \neq \emptyset \right\}$$
and $y' = y + \theta \chi_e$, then we have (1) $y' \in P(f)$; (2) $W_y \subsetneq W_{y'}$.

Proof. To prove (1), it suffices to prove $y'(S) \leq f(S)$ for every subset $S$ of $V$. On the one hand, for every subset $S$ of $V$ such that $S \cap e = \emptyset$, we have $y'(S) = y(S) \leq f(S)$. On the other hand, for every subset $S$ of $V$ such that $S \cap e \neq \emptyset$, we have
$$y'(S) = y(S) + \theta |S \cap e| \leq y(S) + (f(S) - y(S)) = f(S),$$
where the inequality follows from the definition of $\theta$. The result follows.

Next we prove (2). The assumption $e \cap W_y = \emptyset$ indicates $y'(W_y) = y(W_y) = f(W_y)$, so $W_y$ is still $y'$-tight and hence $W_y \subseteq W_{y'}$. Suppose $S^*$ is a subset of $V$ which attains the minimum in the definition of $\theta$. Then
$$y'(S^*) = y(S^*) + \theta |S^* \cap e| = f(S^*),$$
so $S^*$ is $y'$-tight and hence $S^* \subseteq W_{y'}$. Since $S^* \cap e \neq \emptyset$ and $W_y \cap e = \emptyset$, it is immediate that $W_y \subsetneq W_{y'}$. $\square$
Based on the dual program (9), we give the following primal-dual algorithm.

Algorithm 2.
Step 0: Set $t = 0$. Initialize $y^0 = \mathbf{0}$ and $z^0 = \mathbf{0}$. Define $W_0$ as the unique maximal subset of $V$ such that $y^0(W_0) = f(W_0)$.
Step 1: If $\mathcal{E}(W_t) = \mathcal{E}$, go to Step 2. If not, select a hyperedge $e \in \mathcal{E} \setminus \mathcal{E}(W_t)$. Define the real number $\theta_t$ by
$$\theta_t = \min\left\{ \frac{f(S) - y^t(S)}{|S \cap e|} : S \subseteq V,\ S \cap e \neq \emptyset \right\}.$$
Define $y^{t+1}$ by $y^{t+1} = y^t + \theta_t \chi_e$ and let $z^{t+1}_e = z^t_e + \theta_t$, keeping all other coordinates of $z^t$ unchanged. Define $W_{t+1}$ to be the unique maximal subset of $V$ such that $y^{t+1}(W_{t+1}) = f(W_{t+1})$.
Set $t = t + 1$ and return to Step 1.
Step 2: Let $T = W_t$ and $(y, z) = (y^t, z^t)$. Return $T$.
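The whole of Algorithm 2 can be sketched by brute force on toy instances; computing $\theta_t$ and the maximal tight set by subset enumeration here stands in for the polynomial-time routines cited in the paper, and the dual variables $z$ are kept implicit in $y$.

```python
from itertools import combinations

def primal_dual_shsp(V, hyperedges, f):
    """Primal-dual sketch for the SHSP: raise the dual variable of an
    unhit hyperedge until a new set becomes y-tight; return the final
    maximal y-tight set. All minima are brute-forced (toy sizes only)."""
    y = {v: 0.0 for v in V}

    def subsets():
        for r in range(1, len(V) + 1):
            for S in combinations(V, r):
                yield frozenset(S)

    def maximal_tight():
        W = frozenset()
        for S in subsets():
            if abs(f(S) - sum(y[v] for v in S)) < 1e-9:
                W |= S
        return W

    W = maximal_tight()
    while True:
        unhit = [e for e in hyperedges if not (W & set(e))]
        if not unhit:
            return W
        e = set(unhit[0])
        # theta = min over S with S ∩ e nonempty of (f(S) - y(S)) / |S ∩ e|
        theta = min(
            (f(S) - sum(y[v] for v in S)) / len(S & e)
            for S in subsets() if S & e
        )
        for v in e:           # y(v) = sum of z_e over hyperedges containing v
            y[v] += theta
        W = maximal_tight()
```

On the toy instance with hyperedges $\{a,b\}$, $\{b,c\}$ and $f(S) = |S|$, one dual-raising step makes $\{a, b\}$ tight and the algorithm returns it: cost $2$, within the factor $k = 2$ of the optimum $\{b\}$.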
To obtain the ratio of Algorithm 2, we first prove the following lemmas.

Lemma 5. Algorithm 2 can be implemented in polynomial time.

Proof. Combining the definition of $W_t$ with Lemma 4(2), we get that the algorithm iterates at most $|V|$ times; that is, $t$ is at most $n$. At every iteration, we can compute $\theta_t$ in polynomial time [11]. Proposition 2 guarantees that there exists the unique maximal $W_t$ such that $y^t(W_t) = f(W_t)$, and we can compute it in polynomial time [7]. Therefore, Algorithm 2 can be implemented in polynomial time. $\square$
Algorithm 2 outputs $(y, z) = (y^t, z^t)$. Next we will prove that it is a feasible solution of the dual program (9).

Lemma 6. For each $t$, $(y^t, z^t)$ is a feasible solution of the dual program (9).

Proof. We need to show that for each $t$, $(y^t, z^t)$ satisfies all constraints of (9). We prove this result by induction on $t$.

Since $y^0 = \mathbf{0}$, $z^0 = \mathbf{0}$ and $f$ is a nonnegative function, we obtain that $(y^0, z^0)$ is a feasible solution of (9). Assume that this lemma holds when $t = s$. Suppose the hyperedge selected at iteration $s$ is $e$, so that $y^{s+1} = y^s + \theta_s \chi_e$ and $z^{s+1}_e = z^s_e + \theta_s$ with all other coordinates of $z^s$ unchanged. Firstly, according to Lemma 4(1), it is obvious that $y^{s+1} \in P(f)$. Combining $\theta_s \geq 0$ with the induction hypothesis, we have $z^{s+1} \geq \mathbf{0}$. Next, we prove that for each $v \in V$,
$$y^{s+1}(v) = \sum_{e' : v \in e'} z^{s+1}_{e'}. \qquad (10)$$
Since $(y^s, z^s)$ is a feasible solution of (9), we have
$$y^s(v) = \sum_{e' : v \in e'} z^s_{e'} \ \text{for each} \ v \in V. \qquad (11)$$
Note that $y^{s+1} = y^s + \theta_s \chi_e$ and that $z^{s+1}$ differs from $z^s$ only in the coordinate indexed by $e$. For each $v \in e$, we obtain
$$y^{s+1}(v) = y^s(v) + \theta_s \qquad (12)$$
and
$$\sum_{e' : v \in e'} z^{s+1}_{e'} = \sum_{e' : v \in e'} z^s_{e'} + \theta_s. \qquad (13)$$
From (11)-(13), it is immediate that (10) holds for each $v \in e$. For each $v \notin e$, we have $y^{s+1}(v) = y^s(v)$ and $\sum_{e' : v \in e'} z^{s+1}_{e'} = \sum_{e' : v \in e'} z^s_{e'}$, which implies that (10) holds for each $v \notin e$. Therefore, (10) holds for each $v \in V$. $\square$
We now analyze the approximation ratio of Algorithm 2.

Theorem 2. Algorithm 2 is a $k$-approximation algorithm, where $k = \max_{e \in \mathcal{E}} |e|$.

Proof. Algorithm 2 returns a hitting set $T$ and the final feasible solution $(y, z)$ for the dual program (9). Let OPT denote the optimal objective value for the SHSP. In what follows, we prove that $f(T) \leq k \cdot \mathrm{OPT}$. Based on Lemma 6 and weak duality between (8) and (9), we get
$$\sum_{e \in \mathcal{E}} z_e \leq \mathrm{OPT}. \qquad (14)$$
Using the definitions of $T$ and $(y, z)$, it is immediate that
$$f(T) = y(T) = \sum_{v \in T} \sum_{e : v \in e} z_e = \sum_{e \in \mathcal{E}} |e \cap T| \, z_e. \qquad (15)$$
Since $|e \cap T| \leq |e| \leq k$ for every hyperedge $e$ and $z \geq \mathbf{0}$, we have
$$\sum_{e \in \mathcal{E}} |e \cap T| \, z_e \leq k \sum_{e \in \mathcal{E}} z_e. \qquad (16)$$
From (14)-(16), we get $f(T) \leq k \cdot \mathrm{OPT}$. $\square$
Acknowledgment
The authors would like to thank Professor Ding-Zhu Du for his valuable advice during their study of approximation algorithms.
This work is supported by the NSF of China (No. 11971146), the NSF of Hebei Province (No. A2019205089, No. A2019205092), Hebei Province Foundation for Returnees (CL201714) and the Overseas Expertise Introduction Program of Hebei Auspices (25305008).