Chained Matrix Multiplication(연쇄행렬곱셈)

Basic Concept

Consider the multiplication of the following four maxtrices.

$$A \quad\times\quad B \quad\times\quad C \quad\times\quad D$$

$$20\times 2 \qquad 2\times 30 \qquad 30\times 12 \qquad 12\times 8$$

There are five different orders in which we can multiply four matrices, each possibly resulting in a different number of elementary multiplications. The third order is the optimal order for multiplying the four matrices. Our goal is to develop an algorithm that determines the optimal order for multiplying n matrices. The optimal order depends only on the dimension of the matrices.

Example Five different orders in which we can multiply four matrices

A(B(CD))  (30×12×8) + ( 2×30× 8) + (20× 2× 8) =  3,680
(AB)(CD)  (20×2×30) + (30×12× 8) + (20×30× 8) =  8,880
A((BC)D)  (2×30×12) + ( 2×12× 8) + (20× 2× 8) =  1,232 // optimal order
((AB)C)D  (20×2×30) + (20×30×12) + (20×12× 8) = 10,320
(A(BC))D  (2×30×12) + (20× 2×12) + (20×12× 8) =  3,120

👉 연쇄적으로 행렬을 곱할 때, 어떤 행렬곱셈을 먼저 수행하느냐에 따라서 필요한 기본적인 곱셈의 횟수가 달라지게 된다. 이번 알고리즘의 목적은 n 개의 행렬을 연쇄적으로 곱할 때 최적의 순서를 찾는 것이다.

네 개의 행렬 A, B, C, D를 연쇄적으로 곱하는 경우, 총 다섯 가지 방법의 곱셈 순서가 존재한다. 이 중 세 번째 순서(A((BC)D))로 곱셈을 수행하는 것이 네 개의 행렬을 곱할 때의 최적의 순서이다.

Chained Matrix Multiplication Algorithm

Algorithm Design

We can obtain the following recursive property when multiplying n matrices.
For 1 ≤ i ≤ j ≤ n

$$ \begin{align} M[i][j] = \quad& minimum_{(i \le k \le j-1)}(M[i][k] + M[k+1][j] + d_{i-1}d_kd_j), \text{ if } i < j \notag{}\\M[i][i] = \quad& 0 \notag{}\end{align}$$

Example Suppose we have the following six matrices:

$$ A_1 \quad \times \quad A_2 \quad \times \quad A_3 \quad \times \quad A_4 \quad \times \quad A_5 \quad \times \quad A_6 \\ (5 \ \times \ 2) \quad (2 \ \times \ 3) \quad (3 \ \times \ 4) \quad (4 \ \times \ 6) \quad (6 \ \times \ 7) \quad (7 \ \times \ 8) \\ (d_0 \times d_1) \quad (d_1\times d_2) \quad (d_2\times d_3) \quad (d_3\times d_4) \quad (d_4\times d_5) \quad (d_5\times d_6)$$

Compute Diagonal 1: M[1][2]는 두 개의 행렬을 곱할 때의 연산횟수를 계산한 값이다. M[2][3], M[3][4], M[4][5], M[5][6] 값들 역시 같은 방법으로 계산해준다.

M[1][2] = minimum(M[1][k] + M[k+1][2] + d0dkd2) // (1 ≤ k ≤ 1)
        = M[1][1] + M[2][2] + d0d1d2
        = 0 + 0 + 5 × 2 × 3 = 30.

Compute Diagonal 2: M[1][3]은 A1 부터 A3 까지의 행렬을 곱하는데 필요한 최소 횟수이다. 위에서 알 수 있듯이 세 개의 행렬을 곱하는 경우의 수는 두 가지(A1(A2A3), (A1A2)A3)이며, 이 중 더 적은 연산을 수행하는 값이 M[1][3]이 된다.

M[1][3] 값을 계산하기 위해 앞서 구한 M[1][2], M[2][3] 값들이 이용된다. M[2][4], M[3][5], M[4][6] 값들 역시 같은 방법으로 계산해준다.

M[1][3] = minimum(M[1][k] + M[k+1][3] + d0dkd3) // (1 ≤ k ≤ 2)
        = minimum(M[1][1] + M[2][3] + d0d1d3,   // ( A1(A2A3) )
                  M[1][2] + M[3][3] + d0d2d3)   // ( (A1A2)A3 )
        = minimum(0 + 24 + 5 × 2 × 4,
                  30 + 0 + 5 × 3 × 4) = 64.

Compute Diagonal 3: M[1][4]는 네 개의 행렬을 곱할 때의 연산횟수를 계산한 값이다. 앞서 구한 M[1][2], M[2][4], …, M[1][3] 등이 계산에 이용된다. 값을 재계산하지 않고 찾아서 재사용하는 것이 동적계획법의 특징이다. M[2][5] and M[3][6] 값들 역시 같은 방법으로 계산한다.

M[1][4] = minimum(M[1][k] + M[k+1][4] + d0dkd4) // (1 ≤ k ≤ 3)
        = minimum(M[1][1] + M[2][4] + d0d1d4,
                  M[1][2] + M[3][4] + d0d2d4,
                  M[1][3] + M[4][4] + d0d3d4)
        = minimum(0 + 72 + 5 × 2 × 6,
                  30 + 72 + 5 × 3 × 6,
                  64 + 0 + 5 × 4 × 6) = 132.

Compute Diagonal 4 and 5: diagonal 4와 5 둘다 위의 과정을 반복해주면 된다. 예제의 경우 diagonal 5의 계산이 끝나면 최소한의 연쇄행렬곱셈의 결과값 M[1][6] = 348 을 구할 수 있다.

The Optimal Order is.. (A1((((A2A3)A4)A5)A6)) = 348.

Pseudo Code

Chained Matrix Multiplication from Godbole (1973)

int minmult(int n, const int d[ ], index p[ ][ ])
{
  index i, j, k , diagonal;
  int M[1...n][1...n];

  for (i = 1; i <= n; i++)
    M[i][i] = 0;

  for (diagonal = 1; diagonal <= n-1; diagonal++)
    for (i = 1; i <= n-diagonal; i++)
    {
      j = i + diagonal;
      M[i][j] = minimum(M[i][k] + M[k+1][j] + d[i-1]*d[k]*d[j]);
          i <= k <= j-1
      P[i][j] = a value of k that gave the minimum;
    }
  return M[1][n];
}

Source Code

minmult.h

// File: minmult.h
#ifndef CHAINED_MATRIX_MULTI_H
#define CHAINED_MATRIX_MULTI_H
#include <iostream>

namespace algorithms
{
  int minmult(int n, const int *d, int **P);
  // Problem: Determining the minimum number of elementary multiplications
  // needed to multiply n matrices and an order that produces that minimum
  // number.
  // Inputs: the number of matrices n, and an array of integers d, indexed
  // from 0 to n, where d[i-1] * d[i] is the dimension of the ith matrix.
  // Outputs: minmult, the minimum number of elementary multiplications
  // need to multiply the n matrices; a two-dimensional array P from which
  // the optimal order can be obtained. P has its rows indexed from 1
  // to n-1 and its columns indexed from 1 to n. P[i][j] is the point
  // where matrices i through j are split in an optimal order for
  // multiplying the matrices.

  void print_order(int i, int j, int **P);
  // Problem: Print the optimal order for multiplying n matrices.
  // Inputs: Positive integer n, and the array P produced by Algorithm 3.6.
  // P[i][j] is the point where matrices i through j are split in an optimal
  // order for multiplying those matrices.
  // Outputs: the optimal order for multiplying the matrices.

  template <typename Item>
  void print_matrix(Item **data, const int m, const int n);

  template <typename Item>
  Item **get_matrix_space(const int m, const int n)
  // Postcondition: Return the array containing m*n spaces
  {
    Item **a;
    a = new Item *[m];
    --a; // Offset the pointer to get row index from 1 to m
    for (int i = 1; i <= m; ++i)
    {
      a[i] = new Item[n];
      --a[i];
    }

    // Initialize the matrix
    for (int i = 1; i <= m; ++i)
      for (int j = 1; j <= n; ++j)
        a[i][j] = Item();

    return a;
  }

  template <typename Item>
  void release_matrix_space(Item **d, int n)
  {
    for (int i = 1; i <= n; ++i)
      delete[](d[i] + 1);
    delete (d + 1);
  }
} // namespace algorithms

#endif

minmult.cpp

// File: minmult.cpp
#include <iostream>
#include <climits> // Provides INT_MAX
#include <iomanip> // Provides setw
#include "minmult.h"
using namespace std;

namespace algorithms
{
  int minmult(int n, const int* d, int** P)
  {
    int i, j, k, diagonal;
    int** M = get_matrix_space<int>(n, n);

    for (i = 1; i <= n; ++i)
      M[i][i] = 0;

    for (diagonal = 1; diagonal <= n-1; ++diagonal)
    {
      for (i = 1; i <= n-diagonal; ++i)
      {
        j = i + diagonal;
        int minimum = INT_MAX;
        for (k = i; k <= j-1; ++k)
        {
          if ((M[i][k]+M[k+1][j] + d[i-1]*d[k]*d[j]) < minimum)
          {
            M[i][j] = M[i][k] + M[k+1][j] + d[i-1] * d[k] * d[j];
            minimum = M[i][k] + M[k+1][j] + d[i-1] * d[k] * d[j];
            P[i][j] = k; // a value of k that gave minimum
          }
        }
      }
    }

    print_matrix(M, n, n);
    return M[1][n];
  }

  void print_order(int i, int j, int** P)
  {
    int k;

    if (i == j)
      cout << "A" << i;
    else
    {
      k = P[i][j];
      cout << "(";
      print_order(i, k, P);
      print_order(k+1, j, P);
      cout << ")";
    }
  }

  template <typename Item>
    void print_matrix(Item **data, const int m, const int n)
    {
        // Print matrix's index
        cout << "index";
        for (int i = 1; i <= m; ++i)
            cout << setw(7) << i;
        cout << endl
            << "     ___________________________________________" << endl;
        // Print matrix's data
        for (int i = 1; i <= m; ++i)
        {
            cout << "  " << i << " |";
            for (int j = 1; j <= n; ++j)
                cout << setw(7) << data[i][j];
            cout << endl;
        }
        cout << endl;
    }
} // namespace algorithms

minmulttest.cpp

// File: minmulttest.cpp
#include <iostream>
#include "minmult.h"
using namespace std;
using namespace algorithms;

int main( )
{
  const int NUM_OF_MATRICES = 5; // A1, A2, A3, A4, A5, A6
  int* d;
  int** P;

  d = new int[NUM_OF_MATRICES]; // Do NOT need for offset
  d[0] = 10;
  d[1] = 4;
  d[2] = 5;
  d[3] = 20;
  d[4] = 2;
  d[5] = 50;

  P = get_matrix_space<int>(NUM_OF_MATRICES, NUM_OF_MATRICES);

  minmult(NUM_OF_MATRICES, d, P); // Minimum multiplications
  print_matrix(P, NUM_OF_MATRICES, NUM_OF_MATRICES);

  cout << "The Optimal Order is.. ";
  print_order(1, NUM_OF_MATRICES, P); // Print optimal order
  cout << endl;

  delete d;
  release_matrix_space(P, NUM_OF_MATRICES);
  return EXIT_SUCCESS;
}

Time Complexity Analysis

Basic operation

The instructions executed for each value of k.

Input size

n, the number of matrices to be multiplied.

Every-Case Time Complexity

$\displaystyle T(n) = \sum_{diagonal=1}^{n-1}[(n-diagonal) \times diagonal] = \frac{n(n-1)(n+1)}{6} \in \Theta(n^3)$

구분	수행 조건	수행 횟수
diagonal루프 수행 횟수	1 to n - 1	n - 1
i-루프 수행 횟수	1 to n - diagonal	n - diagonal
j-루프 수행 횟수	i to j - 1	(j - 1) - i + 1 = (i + diagonal -1) -i + 1 = diagonal ※ j = i + diagonal

The bruth-force Time Complexity

n 개의 행렬을 곱하는 경우의 수는 $t_{n} \ge 2^{(n-2)}$ 이므로, 무작정 방법으로 연쇄행렬 곱셈의 최적의 순서를 찾는다면 그 시간복잡도는 exponential-time과 같다.

두 개의 행렬을 곱하는 경우의 수 = $t_{2}$ = 1.

AB

세 개의 행렬을 곱하는 경우의 수 = $t_{3}$ = 2.

 (AB)C, A(BC)

네 개의 행렬을 곱하는 경우의 수 = $t_{4}$ = 5.

((AB)C)D, (A(BC))D, A((BC)D), A(B(CD)), (AB)(CD)

n개의 행렬을 곱하는 경우의 수 = $t_{n} \ge 2^{(n-2)}$

Other Algorithms Time Complexity

Θ(n3) – Godbole (1973)

Θ(n2) – Yao (1982)

(n lg n) – Hu and Shing (1982, 1984)

Principle of Optimality

Chained Matrix Multiplication problem is an optimization problem. Therefore, the optimal solution to the instance contains optimal solutions to all subinstances, and the principle of optimality applies.

🔗 Dynamic Programming and Optimization Problems(동적계획법과 최적화 문제)

Exercises

Find the optimal order, and its cost, for evaluating the product A1 × A2 × A3 × A4 × A5, where

$$A_1 \quad\times\quad A_2 \quad\times\quad A_3 \quad\times\quad A_4 \quad\times\quad A_5 \\(10 \times 4) \quad (4 \times 5) \quad (5 \times 20) \quad (20 \times 2) \quad (2 \times 50)$$

Show the final matrices M and P produced by our Minimum Multiplications algorithm.

My Answer

matrices M and P

/* matrix M */
index      1      2      3      4      5
     _____________________________________
  1 |      0    200   1200    320   1320
  2 |      0      0    400    240    640
  3 |      0      0      0    200    700
  4 |      0      0      0      0   2000
  5 |      0      0      0      0      0

/* matrix P */
index      1      2      3      4      5
     _____________________________________
  1 |      0      1      1      1      4
  2 |      0      0      2      2      4
  3 |      0      0      0      3      4
  4 |      0      0      0      0      4
  5 |      0      0      0      0      0

The Optimal Order is.. ((A1(A2(A3A4)))A5)

'Algorithms > Dynamic Programming' 카테고리의 다른 글

Optimal Binary Search Trees(최적이진탐색트리) (4)	2011.01.14
Floyd's Algorithm for Shortest Paths(플로이드 알고리즘) (0)	2011.01.12
Binomial Coefficient(이항계수) (0)	2011.01.11
Dynamic Programming(동적계획법) (0)	2011.01.10

Basic Concept

Chained Matrix Multiplication Algorithm

Algorithm Design

Pseudo Code

Source Code

Time Complexity Analysis

Basic operation

Input size

Every-Case Time Complexity

The bruth-force Time Complexity

Other Algorithms Time Complexity

Principle of Optimality

Exercises

My Answer

'Algorithms > Dynamic Programming' 카테고리의 다른 글

티스토리툴바