Introduce exec::function<...> #2040

Open
ispeters wants to merge 15 commits into NVIDIA:main from ispeters:frame_allocator
@ispeters
Contributor

This PR proposes a new type-erased sender named exec::function. There's an in-code comment giving a bunch of examples, but a simple example is:

exec::function<int(int)> fn(42, [](int i) { return ex::just(i); });

auto [result] = ex::sync_wait(std::move(fn)).value();

assert(result == 42);

There are a bunch of TODOs left, including lots of tests that are missing, but the API is ready to collect early feedback. If this looks like a promising direction, I intend to write a paper for Brno proposing this type for inclusion in C++29.

ispeters added 15 commits April 21, 2026 15:35
This diff starts the work to add a type-erased sender named
`io_sender<Return(Args...)>`. The intent is for such a sender to
represent "an async function from `Args...` to `Return`", a bit like a
task coroutine, but with different trade-offs. The sender itself stores
a `std::tuple<Args...>` and a `sender auto(Args&&...)` factory that can
construct the intended erased sender from the stored arguments on
demand. This representation allows us to defer allocation of the
type-erased operation state until `connect` time, giving us
coroutine-like behaviour but allowing us to choose the frame allocator
by querying the eventual receiver's environment.
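
The deferred-construction idea above can be sketched with the standard library alone. This is a minimal illustration, not the PR's implementation: `deferred_call`, `materialize`, and `deferred_demo` are hypothetical names, and the sketch models neither senders, receivers, nor allocation. It only shows a tuple of arguments and a factory being stored eagerly and combined later.

```cpp
#include <cassert>
#include <tuple>
#include <utility>

// Hypothetical stand-in for the idea described above: store the arguments
// and a factory now, and only invoke the factory later (at "connect" time).
// This is not stdexec code and does not model senders or receivers.
template <class Factory, class... Args>
class deferred_call {
  std::tuple<Args...> args_;
  Factory factory_;

public:
  constexpr deferred_call(Args... args, Factory factory)
    : args_(std::move(args)...), factory_(std::move(factory)) {}

  // Consumes the stored arguments and builds the result on demand.
  constexpr auto materialize() && {
    return std::apply(std::move(factory_), std::move(args_));
  }
};

// Illustrative helper: the factory does not run until materialize() is called.
inline int deferred_demo() {
  deferred_call<int (*)(int), int> call(41, +[](int i) { return i + 1; });
  return std::move(call).materialize();
}
```

In the real design the factory would return a sender and `materialize` would correspond to `connect`, which is the point where the receiver's environment is available.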

The completion signatures for an `io_sender<Return(Args...)>` are:
 - `set_value_t(Return&&)`
 - `set_error_t(std::exception_ptr)`
 - `set_stopped_t()`

We may be able to eliminate the error channel for
`io_sender<R(A...) noexcept>` but that direction requires more thought.

This first diff proves that we can store a tuple of arguments and a
factory and, at `connect` time, use those values to allocate a
type-erased operation state. The test cases cover only basic cases, and
all allocations happen through `::operator new`. Future changes will
expand the test cases and invent a `get_frame_allocator` environment
query that can be used to control frame allocations. The expectation is
that we can meet Capy's performance characteristics with a slightly
different API in a sender-first way.
This diff changes the name of `io_sender<R(A...)>` to
`function<R(A...)>` after some discussion with other folks working on
`std::execution`. `exec::function<...>` is a type-erased wrapper around
an async function with the given signature (elided here as `...`). More
features are coming in future diffs.
Move to an implementation that spreads `completion_signatures`
throughout the internals so that we're not restricted to `R(A...)`-style
constraints.

The tests still only validate `R(A...)`-style constraints, with no
validation of no-throw functions, or controlling the completion
signature and environment; that'll come next.

This implementation also relies on virtual inheritance of a pack of
abstract base classes, which feels like a kludge. I should figure out
how to reimplement the virtual dispatch in terms of a hand-rolled
vtable.
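
For readers unfamiliar with the term, a hand-rolled vtable is a static table of function pointers, one per concrete type, paired with a type-erased object pointer. The sketch below is a generic illustration with hypothetical names (`shape_vtable`, `vtable_for`, `shape_ref`); it is not taken from the PR and says nothing about how `exec::function` ends up dispatching.

```cpp
#include <cassert>

// Hypothetical example of a hand-rolled vtable: one static table of function
// pointers per concrete type, plus a type-erased pointer to the object.
struct shape_vtable {
  int (*area)(const void*);
};

// One table instance per T; the lambda converts to a plain function pointer.
template <class T>
inline constexpr shape_vtable vtable_for{
  [](const void* self) { return static_cast<const T*>(self)->area(); }};

// Non-owning, type-erased reference that dispatches through the table.
class shape_ref {
  const void* object_;
  const shape_vtable* vtable_;

public:
  template <class T>
  explicit shape_ref(const T& object)
    : object_(&object), vtable_(&vtable_for<T>) {}

  int area() const { return vtable_->area(object_); }
};

struct square { int side; int area() const { return side * side; } };
struct rect { int w, h; int area() const { return w * h; } };
```

No base class or `virtual` keyword is involved, so the erased types do not need to share an inheritance hierarchy.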
Thanks to a suggestion from @RobertLeahy, I've been able to rework the
virtual function inheritance to not need virtual inheritance.
`function<ex::sender_tag(Args...), ex::completion_signatures<...>>` now
declares an async function mapping `Args...` to the explicitly specified
completion signatures.
Add support for explicit completion signatures, an environment, or both in the
declaration of an `exec::function`.
Rework the dynamically allocated operation state type to support
allocators, but always use `std::allocator` for now.
This diff needs tests, but the existing tests build and pass, which
seems like a good signal. I've added a `get_frame_allocator` query, and
a defaulting cascade from `get_frame_allocator` -> `get_allocator` ->
`std::allocator<std::byte>` to the allocation of `_derived_op`.
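
The shape of that defaulting cascade can be sketched as follows. This is an assumption-laden illustration: the environments here expose plain member functions named `get_frame_allocator` and `get_allocator`, whereas real stdexec environments answer queries through a different mechanism, and the tag types and `choose_frame_allocator` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical allocator stand-ins, tagged so the chosen one is observable.
struct frame_alloc_tag { int id; };
struct plain_alloc_tag { int id; };

// Hypothetical environments; member names are illustrative only.
struct env_with_both {
  frame_alloc_tag get_frame_allocator() const { return {1}; }
  plain_alloc_tag get_allocator() const { return {2}; }
};
struct env_with_allocator {
  plain_alloc_tag get_allocator() const { return {2}; }
};
struct empty_env {};

// Detection traits: true when the environment provides the member.
template <class Env, class = void>
inline constexpr bool has_frame_allocator_v = false;
template <class Env>
inline constexpr bool has_frame_allocator_v<
  Env, std::void_t<decltype(std::declval<const Env&>().get_frame_allocator())>> = true;

template <class Env, class = void>
inline constexpr bool has_allocator_v = false;
template <class Env>
inline constexpr bool has_allocator_v<
  Env, std::void_t<decltype(std::declval<const Env&>().get_allocator())>> = true;

// Prefer the frame allocator, then the general allocator, then the default.
template <class Env>
auto choose_frame_allocator(const Env& env) {
  if constexpr (has_frame_allocator_v<Env>) {
    return env.get_frame_allocator();
  } else if constexpr (has_allocator_v<Env>) {
    return env.get_allocator();
  } else {
    (void)env;  // nothing to query; fall back to the default allocator
    return std::allocator<std::byte>{};
  }
}
```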
This diff marks almost every function `constexpr`. It doesn't mark the
implementation of `complete` in the CRTP `_func_op_completion` class
template because Clang rejects the down-cast to `Derived` as not a
core constant expression; apparently, `Derived` is incomplete when it's
being evaluated as a side effect of constraint satisfaction testing.

This `constexpr` "hole" means `exec::function` can't be used at compile
time, but maybe it can be worked around later.
Validate that more kinds of senders can be erased and then connected and
started. Also clean up the captures in some lambdas in `connect` and
run `clang-format`.
Still TODO is that the `get_frame_allocator` query shouldn't have to be
specified in the `function`'s custom environment (and, come to think of
it, neither should `get_allocator`), but, when specified, it works.
This needs cleaning up and a *lot* more tests, but the current tests
build and pass with a synthesized polymorphic environment.
This diff does some tidying and adds documentation. There are still some
TODOs, but this is in good enough shape that I can start sharing it, I
think.
@copy-pr-bot
copy-pr-bot Bot commented Apr 21, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Collaborator

@ericniebler left a comment

why capture a function and arguments instead of just a sender that aggregates the arguments? what does lazy construction of the sender offer here?

could this be implemented in terms of:

template <class Result,
          class ReceiverQueries = queries<>,
          class Completions = completion_signatures<set_error_t(exception_ptr),
                                                    set_stopped_t()>,
          class SenderQueries = queries<>>
auto function(auto sndr) 
{
  using _completions_t =
    __minvoke<__mpush_back<__q<completion_signatures>>, Completions, set_value_t(Result)>;

  using _sender_t =
    any_sender<any_receiver<_completions_t, ReceiverQueries>, SenderQueries>;

  return _sender_t(let_value(read_env(get_frame_allocator),
                             [=](auto const& alloc)
                             {
                               return __uses_frame_allocator(sndr, alloc);
                             }));
}

EDIT: also, the exec::function interface suggests to me that it would be used like:

exec::function<int(int)> fn([](int i) { return ex::just(i); });

auto [result] = ex::sync_wait(fn(42)).value();

the lazy construction of the sender would then make sense.

Comment thread include/exec/function.hpp
Comment on lines +84 to +85
template <class Return, class Query, class... Args, bool NoThrow>
inline constexpr bool is_query_function_v<Return(Query, Args...) noexcept(NoThrow)> = true;

i don't think NoThrow can be deduced this way in standard C++, despite the fact that some compilers accept it. you need two specializations, one with noexcept and the other without.
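
A minimal sketch of the two-specialization approach, assuming the trait keeps its current meaning (a function type whose first parameter is the query tag). The tag `get_answer_t` is hypothetical, and the sketch does not settle whether deducing `NoThrow` is conforming; it only shows the form that avoids relying on it.

```cpp
#include <cassert>

// Hypothetical trait mirroring the shape under review: a "query function"
// is a function type whose first parameter is the query tag.
template <class T>
inline constexpr bool is_query_function_v = false;

// One partial specialization for potentially-throwing signatures...
template <class Return, class Query, class... Args>
inline constexpr bool is_query_function_v<Return(Query, Args...)> = true;

// ...and a separate one for noexcept signatures, so the noexcept operand
// never has to be deduced.
template <class Return, class Query, class... Args>
inline constexpr bool is_query_function_v<Return(Query, Args...) noexcept> = true;

struct get_answer_t {};  // illustrative query tag

static_assert(is_query_function_v<int(get_answer_t)>);
static_assert(is_query_function_v<int(get_answer_t, long) noexcept>);
static_assert(!is_query_function_v<int()>);
static_assert(!is_query_function_v<int>);
```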

Comment thread include/exec/function.hpp
Comment on lines +96 to +99
template <class... Queries>
requires(_qry_detail::is_query_function_v<Queries> && ...)
struct queries
{};

there already is an exec::queries type list thing in exec/any_sender_of.hpp. we should share it.

Comment thread include/exec/function.hpp
// a special case in the recursion: when there is only one query in the pack, there's
// no base implementation of query to put in the using statement
template <class Return, class Query, class... Args, bool NoThrow>
struct _env_of_queries<Return(Query, Args...) noexcept(NoThrow)>

same thing here and elsewhere about deducing NoThrow

Comment thread include/exec/function.hpp

protected:
~_env_of_queries() = default;
};

i feel like we can reuse the _iquery_memfn interface from exec/any_sender_of.hpp and __any from stdexec/__detail/__any.hpp.

Comment thread include/exec/function.hpp
{
return STDEXEC::get_env(*completer_);
}
};

this duplicates a ton of any_sender_of.hpp.

Comment thread include/exec/function.hpp

private:
connect_result_t<Sender, Receiver> op_;
[[no_unique_address]]

Suggested change
- [[no_unique_address]]
+ STDEXEC_ATTRIBUTE(no_unique_address)

there are too many compilers that generate bad code with this attribute

Comment thread include/exec/function.hpp
// Some testing shows it's being evaluated when Derived is incomplete
// during constraint satisfaction testing.
//
// static_assert(std::derived_from<_func_op_completion, Derived>);

you could turn this into a runtime assert instead, i think. could be wrong about that.

Comment thread include/exec/function.hpp
// and/or pointer-to-member functions can be made to work
template <STDEXEC::__callable<Args...> Factory>
requires STDEXEC::__not_decays_to<Factory, _func_impl> //
&& std::constructible_from<Factory> //

Suggested change
- && std::constructible_from<Factory> //
+ && STDEXEC::__std::constructible_from<Factory> //

Comment thread include/exec/function.hpp
&& STDEXEC::__callable<Factory, Args...>
&& STDEXEC::sender_to<STDEXEC::__invoke_result_t<Factory, Args...>,
_func_rcvr<completion_signatures<Sigs...>, Queries...>>
constexpr explicit(sizeof...(Args) == 0) _func_impl(Args &&...args, Factory &&factory)

Suggested change
- constexpr explicit(sizeof...(Args) == 0) _func_impl(Args &&...args, Factory &&factory)
+ constexpr explicit _func_impl(Args &&...args, Factory &&factory)

Comment thread include/exec/function.hpp
&& STDEXEC::sender_to<STDEXEC::__invoke_result_t<Factory, Args...>,
_func_rcvr<completion_signatures<Sigs...>, Queries...>>
constexpr explicit(sizeof...(Args) == 0) _func_impl(Args &&...args, Factory &&factory)
noexcept((std::is_nothrow_constructible_v<Args, Args> && ...))

Suggested change
- noexcept((std::is_nothrow_constructible_v<Args, Args> && ...))
+ noexcept(STDEXEC::__nothrow_move_constructible<Args...>)
