This commit migrates "deno_core" from using "FuturesUnordered" to
"tokio::task::JoinSet". This makes every op to be a separate Tokio task
and should unlock better utilization of kqueue/epoll.
There were two quirks added to this PR:
- because of the fact that "JoinSet" immediately polls spawn tasks,
op sanitizers can give false positives in some cases, this was
alleviated by polling event loop once before running a test with
"deno test", which gives canceled ops an opportunity to settle
- "JsRuntimeState::waker" was moved to "OpState::waker" so that FFI
API can still use threadsafe functions - without this change the
registered wakers were wrong as they would not wake up the
whole "JsRuntime" but the task associated with an op
---------
Co-authored-by: Matt Mastracci <matthew@mastracci.com>
It's not used anymore. Subsequently allows removing
`ModuleMap::op_state`, allowing `ModuleMap` to have a sane default so
`JsRuntime::module_map` no longer needs to be optional.
This cleans up `JsRuntime` a bit more:
* We no longer print cargo's rerun-if-changed messages in `JsRuntime` --
those are printed elsewhere
* We no longer special case the OwnedIsolate for snapshots. Instead we
make use of an inner object that has the `Drop` impl and allows us to
`std::mem::forget` it if we need to extract the isolate for a snapshot
* The `snapshot` method is only available on `JsRuntimeForSnapshot`, not
`JsRuntime`.
* `OpState` construction is slightly cleaner, though I'd still like to
extract more
---------
Co-authored-by: Bartek Iwańczuk <biwanczuk@gmail.com>
Rather than disallowing `ext:` resolution, clear the module map after
initializing extensions so extension modules are anonymized. This
operation is explicitly called in `deno_runtime`. Re-inject `node:`
specifiers into the module map after doing this.
Fixes #17717.
Dispatches op per-realm, and allows JsRealm to be garbage collected.
Slight improvement to benchmarks, but opens opportunity to clean up event loop.
---------
Co-authored-by: Matt Mastracci <matthew@mastracci.com>
This commit ensures that JavaScript functions generated for registered
ops have proper names set up - the function name matches the name
of the op.
A test was added in `core/runtime.rs` that verifies this.
Closes https://github.com/denoland/deno/issues/19206
This now allows circular imports across extensions.
Instead of load + eval of all ESM files in declaration order, all files
are only loaded. Eval is done recursively by V8, only evaluating
files that are listed in `Extension::esm_entry_point` fields.
---------
Co-authored-by: Bartek Iwańczuk <biwanczuk@gmail.com>
The root cause of denoland/deno_std#3320, I believe, is that
`pending_promise_rejections` is a `HashMap` whose entries are in
arbitrary order, and as a result either of the two errors (`AddrInUse`
and `TypeError`) may be selected when determining which one to report. I
changed the field to a `VecDeque` so that the first error (`AddrInUse`
in this case) is always selected.
Fixes #18979.
This changes the predicate for allowing `ext:` specifier resolution from
`snapshot_loaded_and_not_snapshotting` to `ext_resolution_allowed` which
is only set to true during the extension module loading phase. Module
loaders as used in core
are now declared as `ExtModuleLoader` rather than `dyn ModuleLoader`.
This commits changes "deno_core" to use jemalloc allocator as an
allocator
for V8 array buffers. This greatly improves our GC characteristics as we
are using
a lot of short lived array buffers. They no longer go through the
expensive
malloc/free cycle using the default Rust allocator, but instead use
jemallocator's
memory pool.
As a result the flamegraphs for WS/HTTP server flamegraphs no longer
show
stacks for malloc/free around ops that use ZeroCopyBuf and &[u8].
---------
Co-authored-by: Bartek Iwańczuk <biwanczuk@gmail.com>
Migrates some of existing async ops to generated wrappers introduced in
https://github.com/denoland/deno/pull/18887. As a result "core.opAsync2"
was removed.
I will follow up with more PRs that migrate all the async ops to
generated wrappers.
About 2% improvement on WS/HTTP benchmarks, possibly unlocking more
optimizations in the future.
---------
Co-authored-by: Matt Mastracci <matthew@mastracci.com>
This commit changes how "disabled" ops behave. Instead of using "void"
functions under the hood, they now explicitly throw errors saying
that a given op doesn't exist.
Alternative to https://github.com/denoland/deno/pull/18726.
This was suggested by @piscisaureus. It's a bit ugly, but it does the
work and makes cloning `JsRealm` very cheap, while not requiring
invasive changes.
Also managed to remove some vector and `v8::Global` clones which yields
about 5% improvement in the "async_ops_deferred.js" benchmark.
This PR:
```
time 1689 ms rate 592066
time 1722 ms rate 580720
time 1629 ms rate 613873
time 1578 ms rate 633713
time 1585 ms rate 630914
time 1574 ms rate 635324
```
`main` branch:
```
time 1687 ms rate 592768
time 1676 ms rate 596658
time 1651 ms rate 605693
time 1652 ms rate 605326
time 1638 ms rate 610500
```
Currently we are "waking up" the runtime if at the end of the event loop
tick there are ops that haven't been polled. Waking up incurs a syscall
and it appears we can do another tick of the event loop, without
going through the "wake up" machinery.
This commit refactors "deno_core" to do fewer boundary crossings
from Rust to V8. In other words we are now calling V8 from Rust fewer
times.
This is done by merging 3 distinct callbacks into a single one. Instead
of having "op resolve" callback, "next tick" callback and "macrotask
queue" callback, we now have only "Deno.core.eventLoopTick" callback,
which is responsible for doing the same actions previous 3 callbacks.
On each of the event loop we were doing at least 2 boundary crosses
(timers macrotask queue callback and unhandled promise rejection
callback) and up to 4 crosses if there were op response and next tick
callbacks coming from Node.js compatibility layer. Now this is all done
in a single callback.
Closes https://github.com/denoland/deno/issues/18620
This commit changes "eager ops" to directly return a response value
instead of calling "opresponse" callback in JavaScript. This saves
one boundary crossing and has a fantastic impact on the "async_ops.js"
benchmark:
```
v1.32.4
$ deno run cli/bench/async_ops.js
time 329 ms rate 3039513
time 322 ms rate 3105590
time 307 ms rate 3257328
time 301 ms rate 3322259
time 303 ms rate 3300330
time 306 ms rate 3267973
time 300 ms rate 3333333
time 301 ms rate 3322259
time 301 ms rate 3322259
time 301 ms rate 3322259
time 302 ms rate 3311258
time 301 ms rate 3322259
time 302 ms rate 3311258
time 302 ms rate 3311258
time 303 ms rate 3300330
```
```
this branch
$ ./target/release/deno run -A cli/bench/async_ops.js
time 257 ms rate 3891050
time 248 ms rate 4032258
time 251 ms rate 3984063
time 246 ms rate 4065040
time 238 ms rate 4201680
time 227 ms rate 4405286
time 228 ms rate 4385964
time 229 ms rate 4366812
time 228 ms rate 4385964
time 226 ms rate 4424778
time 226 ms rate 4424778
time 227 ms rate 4405286
time 228 ms rate 4385964
time 227 ms rate 4405286
time 228 ms rate 4385964
time 227 ms rate 4405286
time 229 ms rate 4366812
time 228 ms rate 4385964
```
Prerequisite for https://github.com/denoland/deno/pull/18652
This is a follow-on to the earlier work in reducing string copies,
mainly focused on ensuring that ASCII strings are easy to provide to the
JS runtime.
While we are replacing a 16-byte reference in a number of places with a
24-byte structure (measured via `std::mem::size_of`), the reduction in
copies wins out over the additional size of the arguments passed into
functions.
Benchmarking shows approximately the same if not slightly less wallclock
time/instructions retired, but I believe this continues to open up
further refactoring opportunities.
This commit adds a new core API `opAsync2` to call an async op with
atmost 2 arguments. Spread argument iterators has a pretty big perf hit
when calling ops.
| name | avg msg/sec/core |
| --- | --- |
| 1.32.1 | `127820.750000` |
| #18506 | `140079.000000` |
| #18506 + #18509 | `150104.250000` |
| #18506 + #18509 + this | `157340.000000` |
This will improve diagnostics and catch any non-ASCII extension code
early.
This will use `debug_assert!` rather than `assert!` to avoid runtime
costs, and ensures (in debug_assert mode only) that all extension source
files are ASCII as we load them.
Reduce the number of copies and allocations of script code by carrying
around ownership/reference information from creation time.
As an advantage, this allows us to maintain the identity of `&'static
str`-based scripts and use v8's external 1-byte strings (to avoid
incorrectly passing non-ASCII strings, debug `assert!`s gate all string
reference paths).
Benchmark results:
Perf improvements -- ~0.1 - 0.2ms faster, but should reduce garbage
w/external strings and reduces data copies overall. May also unlock some
more interesting optimizations in the future.
This requires adding some generics to functions, but manual
monomorphization has been applied (outer/inner function) to avoid code
bloat.