Projected Variadic Fold

Projected variadic fold is a fold over transformed arguments, and is functionally equivalent to std::inner_product but the input is passed as function arguments rather than iterators.

This article is a continuation of Variadic Fold, and will use the same techniques described therein, such as parameter pack expansion and integer sequences, without further explanation.

Prologue

While fold is useful for simple cumulative algorithms such as calculating the sum or the product, other cumulative algorithms are more involved. An example is the dot product, which exists in the standard library as std::inner_product.

The dot product of two $N$-dimensional vectors

$$\begin{eqnarray*} \mathbf{a} &=& (a_0, a_1, \ldots, a_{N-1}) \\ \mathbf{b} &=& (b_0, b_1, \ldots, b_{N-1}) \end{eqnarray*}$$

is given by

$$\begin{eqnarray*} \mathbf{a} \cdot \mathbf{b} &=& \sum_{k=0}^{N-1} a_k\,b_k \end{eqnarray*}$$

The elements $a_k$ and $b_k$ are multiplied pair-wise and the results are summed up.

Projected variadic fold is a generalization of the above, where the addition and multiplication are replaced by arbitrary numeric functions.

template <typename F, typename G, typename... Args>
constexpr auto variadic_fold_over(F&& f, G&& g, Args&&... args);

The algorithm first applies an inner function g repeatedly across the parameter pack, and then folds the results using an outer function f. The first step is called transform or projection, and the second is called reduction or fold.

Each projection consumes W arguments from the parameter pack, where W is the arity of the inner function g. The parameter pack must contain a multiple of W elements. Using a binary inner function splits the parameter pack into slices with two elements each.

variadic_fold_over(
  f2, // Outer function (fold)
  g2, // Inner function (projection)
  a0, b0, a1, b1, a2, b2);

becomes the fold of

variadic_fold(
  f2,
  g2(a0, b0),
  g2(a1, b1),
  g2(a2, b2));

which will be expanded as

f2(f2(g2(a0, b0),
      g2(a1, b1)),
   g2(a2, b2));

We assume that the arguments $a_k$ and $b_k$ are interleaved by the user when calling this algorithm.

The dot product is a projected fold using multiplication as the projection and addition as the fold.

// Dot product
auto result = variadic_fold_over(
  std::plus<>{},
  std::multiplies<>{},
  a0, b0, a1, b1, a2, b2);

assert(result == a0 * b0 + a1 * b1 + a2 * b2);

Before the implementation of projected fold is explained, we will take a detour into extended invocation to pick up some useful building-blocks.

Extended Invocation

The standard library contains std::invoke which invokes the given function with the given arguments. For instance

std::invoke(f, x0, x1, x2)

becomes

f(x0, x1, x2)

Selective Invocation

Selected invocation invokes the given function with selected arguments determined by an integer sequence.

Selective invocation is declared as

template <typename F, std::size_t... I, typename... Args>
constexpr auto invoke_sequence(F&& f, std::index_sequence<I...>, Args&&... args);

The integer sequence holds the indices of the arguments to be used.

invoke_sequence(f, std::index_sequence<0, 2>{}, x0, x1, x2)

becomes

std::invoke(f, args[0], args[2])

using C++26 pack indexing, which amounts to

std::invoke(f, x0, x2)

Selective invocation is fairly easy to implement. We are not going to use pack indexing because of the C++26 requirement, but rather we are going to forward the arguments as a tuple to a helper function that uses std::get to select the arguments.

template <typename F, std::size_t... I, typename Tuple>
constexpr auto tuple_apply_sequence(F&& f,
                                    std::index_sequence<I...>,
				    Tuple&& tuple) {
  return std::invoke(std::forward<F>(f),
                     std::get<I>(tuple)...);
}

template <typename F, std::size_t... I, typename... Args>
constexpr auto invoke_sequence(F&& f,
                               std::index_sequence<I...> seq,
			       Args&&... args) {
  return tuple_apply_sequence(
    std::forward<F>(f),
    seq,
    std::forward_as_tuple(std::forward<Args>(args)...));
}

As a side-note, tuple_apply_sequence can also be used to implement std::apply.

template <typename F, typename T>
constexpr auto apply(F&& f, T&& tuple) {
  return tuple_apply_sequence(
    std::forward<F>(f),
    std::make_index_sequence<std::tuple_size<std::remove_cvref_t<T>>::value>{},
    std::forward<T>(tuple));
}

Enumerated Integer Sequences

The standard provides functionality to create an integer sequence with the numbers from $0$ to $N-1$, but we are going to need a facility to create integer sequences where we can choose the starting value and the step size between values.

We therefore introduce an extension to create integer sequences that takes 3 arguments where N is the number of integers, Offset is the starting value, and Step is the step size. Offset defaults to 0, and Step defaults to 1.

index_sequence_enumerate<N, Offset, Step>

becomes

index_sequence<Offset,
               Offset + Step,
               Offset + Step * 2,
               Offset + Step * 3,
               ...,
               Offset + Step * (N - 1)>

A couple of concrete examples

// Same as std::make_index_sequence<4>
index_sequence_enumerate<4, 0, 1> // becomes index_sequence<0, 1, 2, 3>

// Shifted
index_sequence_enumerate<4, 1, 1> // becomes index_sequence<1, 2, 3, 4>

// Even
index_sequence_enumerate<4, 0, 2> // becomes index_sequence<0, 2, 4, 6>

// Odd
index_sequence_enumerate<4, 1, 2> // becomes index_sequence<1, 3, 5, 7>

// Repeated
index_sequence_enumerate<4, 1, 0> // becomes index_sequence<1, 1, 1, 1>

The implementation is outside the scope of this article so we just stipulate that such a facility is available.

Projected Invocation

Projected invocation is similar to the before-mentioned projected fold, except the outer function f is not a fold function, but just a normal function that must take as many arguments as produced by the projection.

Projected invocation is declared as

template <typename F, typename G, typename... Args>
constexpr auto invoke_over(F&& f, G&& g, Args&&... args);

If f3 takes three arguments, and g2 takes two then

invoke_over(f3, g2, a0, b0, a1, b1, a2, b2)

becomes

f3(g2(a0, b0),
   g2(a1, b1),
   g2(a2, b2))

Projected invocation has to split the parameter pack into smaller slices, which is done using selective invocation. The intermediate steps can be illustrated as follows.

invoke_over(f3, g2, args...)

The projection applies g2 to slices of the parameter pack using generated integer sequences

std::invoke(f3,
            invoke_sequence(g2, index_sequence<0, 1>{}, args...),
            invoke_sequence(g2, index_sequence<2, 3>{}, args...),
            invoke_sequence(g2, index_sequence<4, 5>{}, args...))

which expands to

std::invoke(f3,
            std::invoke(g2, args[0], args[1]),
            std::invoke(g2, args[2], args[3]),
            std::invoke(g2, args[4], args[5]))

The generic implementation may use a projection function of arbitrary arity, so the injection of invoke_sequence becomes a matter of juggling with integer sequences to select the proper slices. While it is possible to determine the arity automatically, we pass it as an argument below for simplicity.

template <typename F, typename G, std::size_t Rank, typename... Args>
constexpr auto invoke_over(F&& f,
                           G&& g,
                           std::integral_constant<std::size_t, Rank> rank,
			   Args&&... args) {
  static_assert((sizeof...(Args) % Rank) == 0,
                "Number of arguments must be a multiple of projection arity");

  return invoke_over_sequence(
    std::forward<F>(f),
    std::forward<G>(g),
    rank,
    // Determine starting offsets of slices.
    // index_sequence<0, Rank, Rank * 2, ..., Rank * (M - 1)>
    // where M is the number of slices (sizeof...(Args) / Rank)
    index_sequence_enumerate<sizeof...(Args) / Rank, 0, Rank>{},
    std::forward<Args>(args)...);
}

This injects an integer sequence with the starting offsets of the slices and calls a helper function invoke_over_sequence below, which generates the calls to invoke_sequence for each slice.

We start with an integer sequence containing the start of each slice, and the individual slices will be generated from that.

The helper function is given two parameter packs: Args... and Offset... which are expanded at different levels, so Args... is expanded for each Offset.

template <typename F, typename G, typename... Args,
          size_t Rank, size_t... Offset>
constexpr auto invoke_over_sequence(F&& f,
                                    G&& g,
				    std::integral_constant<std::size_t, Rank>,
				    std::index_sequence<Offset...>,
				    Args&&... args) {
  return std::invoke(
    std::forward<F>(f),
    invoke_sequence(
      std::forward<G>(g),
      // The kth offset produces
      // index_sequence<Offset[k],
      //                Offset[k] + 1,
      //                ...,
      //                Offset[k] + (Rank - 1)>
      //
      // If Rank = 3 then Offset... = {0, 3, 6} and thus
      //   k=0: index_sequence<0, 1, 2>
      //   k=1: index_sequence<3, 4, 5>
      //   k=2: index_sequence<6, 7, 8>
      index_sequence_enumerate<Rank, Offset>{},
      std::forward<Args>(args)... // Expands args...
      )... // Expands Offset...
    );
}

Projected Fold

Projected fold is a projection invocation where the outer function is a fold.

template <typename F, typename G, typename... Args>
constexpr variadic_fold_over(F&& f, G&& g, Args&&... args) {
  return invoke_over(std::bind_front(variadic::fold, std::forward<F>(f)),
                     std::forward<G>(g),
                     std::forward<Args>(args)...);
}

This creates a function that uses f to fold over the results. The fold uses partial application in the form of C++20 std::bind_front where

bind_front(fold, f)(args...)

becomes

fold(f, args...)

In order for the binding to work we have changed the variadic fold function into a function object.

namespace variadic {

inline constexpr struct {
  template <typename F, typename G, typename... Args>
  constexpr auto operator()(F&& f, G&& g, Args&&... args) const {
    return variadic_fold(std::forward<F>(f),
                         std::forward<G>(g),
                         std::forward<Args>(args)...);
  }
} fold{};

} // namespace variadic

More on function objects in a future article.

This article has used various facilities from newer C++ standards, but these facilities can be backported to C++11.