About 2% improvement on WS/HTTP benchmarks, possibly unlocking more
optimizations in the future.
---------
Co-authored-by: Matt Mastracci <matthew@mastracci.com>
This commit changes how "disabled" ops behave. Instead of using "void"
functions under the hood, they now explicitly throw errors saying
that a given op doesn't exist.
Alternative to https://github.com/denoland/deno/pull/18726.
This was suggested by @piscisaureus. It's a bit ugly, but it does the
work and makes cloning `JsRealm` very cheap, while not requiring
invasive changes.
Also managed to remove some vector and `v8::Global` clones which yields
about 5% improvement in the "async_ops_deferred.js" benchmark.
This PR:
```
time 1689 ms rate 592066
time 1722 ms rate 580720
time 1629 ms rate 613873
time 1578 ms rate 633713
time 1585 ms rate 630914
time 1574 ms rate 635324
```
`main` branch:
```
time 1687 ms rate 592768
time 1676 ms rate 596658
time 1651 ms rate 605693
time 1652 ms rate 605326
time 1638 ms rate 610500
```
Currently we are "waking up" the runtime if at the end of the event loop
tick there are ops that haven't been polled. Waking up incurs a syscall
and it appears we can do another tick of the event loop, without
going through the "wake up" machinery.
This commit refactors "deno_core" to do fewer boundary crossings
from Rust to V8. In other words we are now calling V8 from Rust fewer
times.
This is done by merging 3 distinct callbacks into a single one. Instead
of having "op resolve" callback, "next tick" callback and "macrotask
queue" callback, we now have only "Deno.core.eventLoopTick" callback,
which is responsible for doing the same actions previous 3 callbacks.
On each of the event loop we were doing at least 2 boundary crosses
(timers macrotask queue callback and unhandled promise rejection
callback) and up to 4 crosses if there were op response and next tick
callbacks coming from Node.js compatibility layer. Now this is all done
in a single callback.
Closes https://github.com/denoland/deno/issues/18620
This commit changes "eager ops" to directly return a response value
instead of calling "opresponse" callback in JavaScript. This saves
one boundary crossing and has a fantastic impact on the "async_ops.js"
benchmark:
```
v1.32.4
$ deno run cli/bench/async_ops.js
time 329 ms rate 3039513
time 322 ms rate 3105590
time 307 ms rate 3257328
time 301 ms rate 3322259
time 303 ms rate 3300330
time 306 ms rate 3267973
time 300 ms rate 3333333
time 301 ms rate 3322259
time 301 ms rate 3322259
time 301 ms rate 3322259
time 302 ms rate 3311258
time 301 ms rate 3322259
time 302 ms rate 3311258
time 302 ms rate 3311258
time 303 ms rate 3300330
```
```
this branch
$ ./target/release/deno run -A cli/bench/async_ops.js
time 257 ms rate 3891050
time 248 ms rate 4032258
time 251 ms rate 3984063
time 246 ms rate 4065040
time 238 ms rate 4201680
time 227 ms rate 4405286
time 228 ms rate 4385964
time 229 ms rate 4366812
time 228 ms rate 4385964
time 226 ms rate 4424778
time 226 ms rate 4424778
time 227 ms rate 4405286
time 228 ms rate 4385964
time 227 ms rate 4405286
time 228 ms rate 4385964
time 227 ms rate 4405286
time 229 ms rate 4366812
time 228 ms rate 4385964
```
Prerequisite for https://github.com/denoland/deno/pull/18652
This is a follow-on to the earlier work in reducing string copies,
mainly focused on ensuring that ASCII strings are easy to provide to the
JS runtime.
While we are replacing a 16-byte reference in a number of places with a
24-byte structure (measured via `std::mem::size_of`), the reduction in
copies wins out over the additional size of the arguments passed into
functions.
Benchmarking shows approximately the same if not slightly less wallclock
time/instructions retired, but I believe this continues to open up
further refactoring opportunities.
This commit adds a new core API `opAsync2` to call an async op with
atmost 2 arguments. Spread argument iterators has a pretty big perf hit
when calling ops.
| name | avg msg/sec/core |
| --- | --- |
| 1.32.1 | `127820.750000` |
| #18506 | `140079.000000` |
| #18506 + #18509 | `150104.250000` |
| #18506 + #18509 + this | `157340.000000` |
This will improve diagnostics and catch any non-ASCII extension code
early.
This will use `debug_assert!` rather than `assert!` to avoid runtime
costs, and ensures (in debug_assert mode only) that all extension source
files are ASCII as we load them.
Reduce the number of copies and allocations of script code by carrying
around ownership/reference information from creation time.
As an advantage, this allows us to maintain the identity of `&'static
str`-based scripts and use v8's external 1-byte strings (to avoid
incorrectly passing non-ASCII strings, debug `assert!`s gate all string
reference paths).
Benchmark results:
Perf improvements -- ~0.1 - 0.2ms faster, but should reduce garbage
w/external strings and reduces data copies overall. May also unlock some
more interesting optimizations in the future.
This requires adding some generics to functions, but manual
monomorphization has been applied (outer/inner function) to avoid code
bloat.
This commit changes the build process in a way that preserves already
registered ops in the snapshot. This allows us to skip creating hundreds of
"v8::String" on each startup, but sadly there is still some op registration
going on startup (however we're registering 49 ops instead of >200 ops).
This situation could be further improved, by moving some of the ops
from "runtime/" to a separate extension crates.
---------
Co-authored-by: Divy Srivastava <dj.srivastava23@gmail.com>
Follow-up to #18210:
* we are passing the generated `cfg` object into the state function
rather than passing individual config fields
* reduce cloning dramatically by making the state_fn `FnOnce`
* `take` for `ExtensionBuilder` to avoid more unnecessary copies
* renamed `config` to `options`
This implements two macros to simplify extension registration and centralize a lot of the boilerplate as a base for future improvements:
* `deno_core::ops!` registers a block of `#[op]`s, optionally with type
parameters, useful for places where we share lists of ops
* `deno_core::extension!` is used to register an extension, and creates
two methods that can be used at runtime/snapshot generation time:
`init_ops` and `init_ops_and_esm`.
---------
Co-authored-by: Bartek Iwańczuk <biwanczuk@gmail.com>
This PR cleans up APIs related to snapshot creation and how ops are
initialized.
Prerequisite for #18080
---------
Co-authored-by: Divy Srivastava <dj.srivastava23@gmail.com>
This commit removes "deno_core::RuntimeOptions::extensions_with_js".
Now it's embedders' responsibility to properly register extensions
that will not contains JavaScript sources when running from an existing
snapshot.
Prerequisite for https://github.com/denoland/deno/pull/18080
This commit changes "ModuleMap" initialization to over-allocate by 16
for vectors storing module information and module V8 handles. In 99%
cases there's gonna be at least one additional module loaded, so it's
very wasteful to have to reallocate when the module is executed (IIRC
Rust will double the size of the vector) and move all of the elements.
```
Benchmark 1: deno run -A ../empty.js
Time (mean ± σ): 20.5 ms ± 0.5 ms [User: 13.4 ms, System: 5.1 ms]
Range (min … max): 19.8 ms … 24.0 ms 119 runs
Benchmark 2: target/release/deno run -A ../empty.js
Time (mean ± σ): 18.8 ms ± 0.3 ms [User: 13.0 ms, System: 4.9 ms]
Range (min … max): 18.3 ms … 19.9 ms 129 runs
```
---------
Co-authored-by: Bartek Iwańczuk <biwanczuk@gmail.com>
This commit renames "deno_core::InternalModuleLoader" to
"ExtModuleLoader" and changes the specifiers used by the
modules loaded from this loader to "ext:".
"internal:" scheme was really ambiguous and it's more characters than
"ext:", which should result in slightly smaller snapshot size.
Closes https://github.com/denoland/deno/issues/18020
There's no point for this API to expect result. If something fails it should
result in a panic during build time to signal to embedder that setup is
wrong.
Runtime generation of async op wrappers contributed to increased startup
time and core became unusable with
`--disallow-code-generation-from-strings` flag. The optimization only
affects very small microbenchmarks so this revert will not cause any
regressions.
This commit further improves startup time by:
- no relying on "JsRuntime::execute_script" for runtime bootstrapping,
this is instead done using V8 APIs directly
- registering error classes during the snapshot time, instead of on
startup
Further improvements can be made, mainly around removing
"core.initializeAsyncOps()" which takes around 2ms.
This commit should result in ~1ms startup time improvement.
Instead of relying on "serde_v8" which is very inefficient in
serializing enums, I'm hand rolling serde for "ModuleMap" data
that is stored in the V8 snapshot to make ES modules
snapshottable.
```
// this branch
Benchmark #2: ./target/release/deno run empty.js
Time (mean ± σ): 21.4 ms ± 0.9 ms [User: 15.6 ms, System: 6.4 ms]
Range (min … max): 20.2 ms … 24.4 ms
// main branch
Benchmark #2: ./target/release/deno run empty.js
Time (mean ± σ): 23.1 ms ± 1.2 ms [User: 17.0 ms, System: 6.2 ms]
Range (min … max): 21.0 ms … 26.0 ms
```
This allows to not include source code into the binary (because
it will already be included in the V8 snapshot).
Nothing changes for the embedders - everything should still build the
same.
This commit brings the binary size from 87Mb to 82Mb on M1.
Alternative to https://github.com/denoland/deno/pull/17820 and
https://github.com/denoland/deno/pull/17653
---------
Co-authored-by: Leo Kettmeir <crowlkats@toaxl.com>
This commit changes definition of "ExtensionFileSource", by changing
"code" field to being "ExtensionFileSourceCode" enum. Currently the enum
has only a single variant "IncludedInBinary". It is done in preparation
to allow embedders to decide if they want to include the source code in the
binary when snapshotting (in most cases they shouldn't do that).
In the follow up commit we'll add more variants to
"ExtensionFileSourceCode".
"include_js_files_dir!" macro was removed in favor "include_js_files!"
macro which can now accept "dir" option.