3. The Assembly Steps Problem is NP-Complete
An instance of the Vertex Cover Problem, which is known to be NP-hard, consists of a graph G = (V, E), where V is the set of vertices and E is the set of edges, and a number k, the cardinality of a set of vertices that includes at least one vertex of every edge of G. To show that any such instance can be reformulated in polynomial time as an instance of the Assembly Index Problem, the following procedure is offered (cf. [1], Section 4.2). For a given instance of the Vertex Cover Problem
, where
, and
is the vertex cover number (the size of a minimum vertex cover), an instance of the Assembly Index Problem
is constructed, where
is a constructed assembly space, and
is the target string for which the assembly index
is to be determined. It is then claimed that a certificate for the Vertex Cover Problem
containing a subset
of vertices of
G that includes at least one vertex of every edge of
G can be used to produce a certificate
for the Assembly Index Problem and vice versa, where
is a rooted subspace (cf. [
1] Definition 15) of the assembly space
containing only a proper subset
of the strings of the form
. Hence, such an instance of the Assembly Index Problem would be logically equivalent to an instance of the Vertex Cover Problem from which it was constructed.
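The Vertex Cover side of this reduction can be illustrated with a minimal sketch. It is not part of the construction in [1]; the function names and the example graph (vertices labelled 1, 2, 3, with two edges sharing vertex 2) are assumptions chosen for illustration. The first function checks a certificate, and the second finds a minimum vertex cover by exhaustive search, which is feasible only for very small graphs.

```python
from itertools import combinations


def is_vertex_cover(edges, cover):
    """Return True if every edge has at least one endpoint in `cover`."""
    return all(u in cover or v in cover for u, v in edges)


def minimum_vertex_cover(vertices, edges):
    """Brute-force search for a smallest vertex cover (exponential time)."""
    for k in range(len(vertices) + 1):
        for subset in combinations(vertices, k):
            if is_vertex_cover(edges, set(subset)):
                return set(subset)


# Hypothetical example graph: two edges that share vertex 2.
vertices = [1, 2, 3]
edges = [(1, 2), (2, 3)]
cover = minimum_vertex_cover(vertices, edges)
print(len(cover), cover)  # prints: 1 {2}
```

Checking a certificate takes time linear in the number of edges, whereas the exhaustive search over subsets grows exponentially with the number of vertices.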
The construction of
(cf. [
1], Section 4.2) begins with defining the basis of the assembly space
(cf. Eqs. (17), (50)), i.e., the unit-length strings
containing
symbols of vertices
, and a special symbol that here we call "0" (it is defined as "#" in [
1]). Hence,
. Then, a set of
vertex strings
is assembled (Eq. (18)). Subsequently, a set of
edge strings
is assembled (Eq. (19)). The last step of the construction of
is a sequence of
strings
and
strings
defined in [
1] by Eqs. (20)-(25), where the target string
is defined as the last string of this sequence and
Finally ([
1], Section 4.2.3) it is claimed that given
is a certificate for the Assembly Index Problem if the set (
5) is a vertex cover of
G with size
k, i.e., a certificate for the Vertex Cover Problem is given, wherein
is the assembly index of string
and
which depends on
k and is minimal if
.
By construction, the basic symbols (
6), the edge strings (
8), and the sequence strings (
9a) and (
9b) contained in
must also be contained in
(certificate). However, the vertex strings (
7) of the form
are the exception, as each of the edge strings (
8) can be assembled from strings (
7) in one of the two mutually exclusive steps (cf. [
1] Eqs. (53), (54))
leaving some of the strings
or
redundant. This can be seen by comparing the cardinalities of the spaces
(
10) and
(
11), which, as expected, leads to
. There are
strings (
7) in
and only
strings in
.
By construction (
9a), (
9b), the target string has the form
where the
is a vertex-specific part of
depending solely on
and its explicit form is given by the formula (
9a), and the
is an edge-specific part of
, generated by the formula (
9b), and depending both on
, edge vertex assignments and the order of labeling of the edges of graph
G. However,
, as the length of each edge string (
8) is five and there are
such strings in
and
. Therefore, the length of the target string is
Furthermore, by construction
contains two copies of the string
of length
having the assembly index equal to
as it does not contain any repeated substrings. We can take advantage of the fact that m copies of an n-plet contained in a string decrease the assembly index of this string by at least
[
3], where
is the assembly index of this
n-plet, to estimate the upper bound for the assembly index of
reduced by the presence of these two copies of
. Furthermore, excluding the degenerate cases of empty and disjoint graphs
G, we can further infer some information about
. That is, since any vertex
is a part of some edge
,
contains at least two repetitions of doublets
(or
), with
as the string
also contains
such doublets, and each repetition decreases the assembly index by one. Hence, the upper bound must be further decreased by
. Finally, each string
contains
repetitions of a doublet
and, hence, the upper bound must be further decreased by
. Therefore, the initial upper bound on the assembly index that amounts to
[
5] if
[
3] decreases to
which, in contrast to
(
11), is independent of
k.
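The counting behind reductions of this kind can be sketched as follows. This is only an illustrative derivation, not the bound of [3] itself. It assumes that the m copies of the n-plet do not overlap and that the remainder of the string is appended one symbol per step. The symbols N (the string length) and a_n (the assembly index of the n-plet) are notation introduced here for the sketch.

```latex
\begin{align}
  \text{steps without reuse} &= N - 1, \\
  \text{steps with reuse}    &\le a_n + \bigl(N - mn + m\bigr) - 1, \\
  \text{reduction of the upper bound}
      &\ge (N - 1) - \bigl(a_n + N - mn + m - 1\bigr)
       = m(n - 1) - a_n .
\end{align}
```

In the second line the n-plet is assembled once in a_n steps and then treated as a single unit, so the string consists of N - mn + m units that are joined in one step each. For a doublet (n = 2, a_n = 1) present in m = 2 copies this gives a reduction of one step, in agreement with the statement that each repetition of a doublet decreases the assembly index by one.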
We have examined a few simple graphs, shown in
Figure 2, obtaining the results listed in
Table 1.
As an example, consider the simple graph
, shown in
Figure 2(b), which has two edges that share one vertex. Hence, its vertex cover number is
. In this case, (
10)
and the target string generated by sequences (
9a) and (
9b) has the form
As the vertex cover of the graph
G is the vertex 2, the subspace
(the certificate) is devoid of triplets
and
, since the edges
and
share the vertex 2, and the edge strings (
8) could be assembled as
and
. Therefore, the number of steps on the assembly pathway of
defined by
, given by the relation (
11), amounts to
, as shown in
Figure 3(a) (also illustrating the assembly depth [
6] (
) of this string: 7 steps (1-7) for vertex strings (
7), 2 steps (8, 9) for edge strings (
8), 6 steps (10-15) for sequence strings (
9a), and 2 steps (16, 17) for sequence strings (
9b)) which corresponds to the vertex cover number
, provided that the string (16) is assembled using the set of allowed assembly operations defined by Eqs. (38)-(45) of [
1].
However, imposing such a set of allowed assembly steps deviates from the principles of assembly theory that assume the possibility of assembling any
object from any two
objects in the assembly pool. Even if we assume that only some steps are allowed and some are not due to peculiarities of the assembled data structures, this is certainly not the case for strings, considered in [
1] in the proof of Lemma 3. All strings are possible and mathematically well defined [
7]. What could be the reason for allowing the assembly of a string
and disallowing the assembly of a string
from a set of basic symbols
? The evolution of information became possible as soon as a first bit, not a first
particle or
object, became accessible [
2,
3,
8,
9].
Therefore, the assembly index of the string (
16) is
. One of the shortest pathways of the string (
16) is shown in
Figure 3(c) with
. A quadruplet
present in two independent copies is assembled in step 5, and a 5-plet
present in two copies is assembled in step 6. Furthermore,
contains two independent copies of
,
, and
. A slightly longer pathway leading to the string of length (
15) is shown in
Figure 3(b).
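For short strings, the assembly index under the unrestricted rule, in which any two objects in the assembly pool may be joined, can be found by exhaustive search. The following is a minimal sketch under that assumption, not the procedure of [1] and not the method used to obtain the pathways in Figure 3. The function name is hypothetical, the test strings are arbitrary examples rather than the string (16), and the running time grows exponentially with the string length.

```python
from itertools import count


def assembly_index(target: str) -> int:
    """Smallest number of joins needed to build `target` from its symbols."""
    if len(target) <= 1:
        return 0
    # On a shortest pathway every intermediate string is a substring of the target.
    substrings = {
        target[i:j]
        for i in range(len(target))
        for j in range(i + 1, len(target) + 1)
    }
    basis = frozenset(target)  # unit-length strings

    def search(pool: frozenset, steps_left: int) -> bool:
        if target in pool:
            return True
        if steps_left == 0:
            return False
        # Each join can at most double the length of the longest string.
        longest = max(len(s) for s in pool)
        if longest * (2 ** steps_left) < len(target):
            return False
        # Any two objects in the pool may be joined (unrestricted rule).
        candidates = {
            x + y
            for x in pool
            for y in pool
            if x + y in substrings and x + y not in pool
        }
        return any(search(pool | {c}, steps_left - 1) for c in candidates)

    # Iterative deepening: try 1 step, then 2 steps, and so on.
    for limit in count(1):
        if search(basis, limit):
            return limit


print(assembly_index("abcd"))    # prints: 3
print(assembly_index("abab"))    # prints: 2
print(assembly_index("abcabc"))  # prints: 3
```

A string without repeated substrings, such as "abcd", needs one step per appended symbol, whereas reusing the doublet "ab" or the triplet "abc" shortens the pathway.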