Reduction from CLIQUE to Perl Regular Expression Matching

Credit: Greg Bacon

The CLIQUE Problem

A graph G = (V, E) is complete if u-v is an element of E for all distinct vertices u and v in V, i.e., all vertices in V are adjacent. The CLIQUE problem is: Given a graph and an integer k at least one, does the graph contain a complete subgraph of size k (also known as a k-clique)?

Transformation to Perl Regular Expression Match

Label the vertices of the graph with distinct natural numbers and sort the labels in ascending order. Given this labeling, construct an adjacency list according to the following criterion: the adjacency list contains [u, v] if the graph contains an edge u-v and u < v. For example, the following graph

[Example graph]

is represented as

    @V = (1, 2, 3, 4);
    @E = ([1,2], [1,3], [2,3], [2,4], [3,4]);

Construct a string as follows:

    $string = (join ',' => @V) .  ';'
            . (join ',' => map "$_->[0]-$_->[1]", @E);

Then construct a regular expression as follows:

    $regex = '^ .*\b '
           . join(' , .*\b ' => ('(\d+)') x $k)
           . '\b .* ;'
           . "\n";

    for ($i = 1; $i < $k; $i++) {
        for ($j = $i+1; $j <= $k; $j++) {
            $regex .= '(?= .* \b ' . "\\$i-\\$j" . ' \b)' . "\n";

Our graph contains a k-clique if and only if

    $string =~ /$regex/x

is true. If so, the backreference variables $1, $2, ..., $k contain the labels of the vertices in a particular k-clique of G.


Suppose the graph is as depicted above. Then the value of $string is


and the value of $regex is

    ^ .*\b (\d+) , .*\b (\d+) , .*\b (\d+)\b .* ;
    (?= .* \b \1-\2 \b)
    (?= .* \b \1-\3 \b)
    (?= .* \b \2-\3 \b)

For $k = 3, the test $string =~ /$regex/x succeeds, so the answer to the question is yes. Moreover, the process of matching sets $1, $2, and $3 to 2, 3, and 4 respectively, indicating that vertices 2, 3, and 4 form a 3-clique. (Finding 3-cliques is easy for humans because a 3-clique usually looks like a triangle.) For an example of a no result, set $k to 4.


Clique detection is NP-complete. If there were an efficient (polynomial-time) algorithm for computing whether a regular expression matches a string, we could use it to quickly compute solutions to CLIQUE and, by extension, to the knapsack problem, the travelling salesman problem, etc.

See Also

More reductions from NP-complete problems to Perl regular expression matching, compiled by Mark-Jason Dominus.