About 2% improvement on WS/HTTP benchmarks, possibly unlocking more
optimizations in the future.
---------
Co-authored-by: Matt Mastracci <matthew@mastracci.com>
This commit changes "eager ops" to directly return a response value
instead of calling "opresponse" callback in JavaScript. This saves
one boundary crossing and has a fantastic impact on the "async_ops.js"
benchmark:
```
v1.32.4
$ deno run cli/bench/async_ops.js
time 329 ms rate 3039513
time 322 ms rate 3105590
time 307 ms rate 3257328
time 301 ms rate 3322259
time 303 ms rate 3300330
time 306 ms rate 3267973
time 300 ms rate 3333333
time 301 ms rate 3322259
time 301 ms rate 3322259
time 301 ms rate 3322259
time 302 ms rate 3311258
time 301 ms rate 3322259
time 302 ms rate 3311258
time 302 ms rate 3311258
time 303 ms rate 3300330
```
```
this branch
$ ./target/release/deno run -A cli/bench/async_ops.js
time 257 ms rate 3891050
time 248 ms rate 4032258
time 251 ms rate 3984063
time 246 ms rate 4065040
time 238 ms rate 4201680
time 227 ms rate 4405286
time 228 ms rate 4385964
time 229 ms rate 4366812
time 228 ms rate 4385964
time 226 ms rate 4424778
time 226 ms rate 4424778
time 227 ms rate 4405286
time 228 ms rate 4385964
time 227 ms rate 4405286
time 228 ms rate 4385964
time 227 ms rate 4405286
time 229 ms rate 4366812
time 228 ms rate 4385964
```
Prerequisite for https://github.com/denoland/deno/pull/18652
- bump deps: the newest `lazy-regex` need newer `oncecell` and
`regex`
- reduce `unwrap`
- remove dep `lazy_static`
- make more regex cached
---------
Co-authored-by: Bartek Iwańczuk <biwanczuk@gmail.com>
This commit changes the build process in a way that preserves already
registered ops in the snapshot. This allows us to skip creating hundreds of
"v8::String" on each startup, but sadly there is still some op registration
going on startup (however we're registering 49 ops instead of >200 ops).
This situation could be further improved, by moving some of the ops
from "runtime/" to a separate extension crates.
---------
Co-authored-by: Divy Srivastava <dj.srivastava23@gmail.com>
Join two independent ops into one. A fast impl of one + a slow callback
of another. Here's an example showing optimized paths for latin-1 via
fast call and the next-best fallback using V8 apis.
```rust
#[op(v8)]
fn op_encoding_encode_into_fallback(
scope: &mut v8::HandleScope,
input: serde_v8::Value,
// ...
#[op(fast, slow = op_encoding_encode_into_fallback)]
fn op_encoding_encode_into(
input: Cow<'_, str>,
// ...
```
Benchmark results of the fallback path:
```
time target/release/deno run -A --unstable ./cli/tests/testdata/benches/text_encoder_into_perf.js
________________________________________________________
Executed in 70.90 millis fish external
usr time 57.76 millis 0.23 millis 57.53 millis
sys time 17.02 millis 1.28 millis 15.74 millis
target/release/deno_main run -A --unstable ./cli/tests/testdata/benches/text_encoder_into_perf.js
________________________________________________________
Executed in 154.00 millis fish external
usr time 67.14 millis 0.26 millis 66.88 millis
sys time 38.82 millis 1.47 millis 37.35 millis
```
In Rust, it is UB if a slice is mutated while borrowed except through
the slice itself, and it is also UB if a mutable slice is read while
borrowed. The op macro allows borrowing an `ArrayBuffer{,View}` as a
memory slice for the duration of an op, but this is not sound for async
ops, since the `ArrayBuffer` could be accessed from JS during the await
points. This PR therefore disallows such automatic borrowing only for
async ops.
Co-authored-by: Divy Srivastava <dj.srivastava23@gmail.com>
Currently realms are supported on `deno_core`, but there was no support
for async ops anywhere other than the main realm. The main issue is that
the `js_recv_cb` callback, which resolves promises corresponding to
async ops, was only set for the main realm, so async ops in other realms
would never resolve. Furthermore, promise ID's are specific to each
realm, which meant that async ops from other realms would result in a
wrong promise from the main realm being resolved.
This change takes the `ContextState` struct added in #17050, and adds to
it a `js_recv_cb` callback for each realm. Combined with the fact that
that same PR also added a list of known realms to `JsRuntimeState`, and
that #17174 made `OpCtx` instances realm-specific and had them include
an index into that list of known realms, this makes it possible to know
the current realm in the `queue_async_op` and `queue_fast_async_op`
methods, and therefore to send the results of promises for each realm to
that realm, and prevent the ID's from getting mixed up.
Additionally, since promise ID's are no longer unique to the isolate,
having a single set of unrefed ops doesn't work. This change therefore
also moves `unrefed_ops` from `JsRuntimeState` to `ContextState`, and
adds the lengths of the unrefed op sets for all known realms to get the
total number of unrefed ops to compare in the event loop.
This PR is a reland of #14734 after it was reverted in #16366, except
that `ContextState` and `JsRuntimeState::known_realms` were previously
relanded in #17050. Another significant difference with the original PR
is passing around an index into `JsRuntimeState::known_realms` instead
of a `v8::Global<v8::Context>` to identify the realm, because async op
queuing in fast calls cannot call into V8, and therefore cannot have
access to V8 globals. This also simplified the implementation of
`resolve_async_ops`.
Co-authored-by: Luis Malheiro <luismalheiro@gmail.com>
Fixes https://github.com/denoland/deno/issues/16934
Example compiler error:
```
error: mutable opstate is not supported in async ops
--> core/ops_builtin.rs:122:1
|
122 | #[op]
| ^^^^^
|
= note: this error originates in the attribute macro `op` (in Nightly builds, run with -Z macro-backtrace for more info)
```
Uses SeqOneByteString optimization to do zero-copy `&str` arguments in
fast calls.
- [x] Depends on https://github.com/denoland/rusty_v8/pull/1129
- [x] Depends on
https://chromium-review.googlesource.com/c/v8/v8/+/4036884
- [x] Disable in async ops
- [x] Make it work with owned `String` with an extra alloc in fast path.
- [x] Support `Cow<'_, str>`. Owned for slow case, Borrowed for fast
case
```rust
#[op]
fn op_string_len(s: &str) -> u32 {
str.len() as u32
}
```
This PR introduces Wasm ops. These calls are optimized for entry from
Wasm land.
The `#[op(wasm)]` attribute is opt-in.
Last parameter `Option<&mut [u8]>` is the memory slice of the Wasm
module *when entered from a Fast API call*. Otherwise, the user is
expected to implement logic to obtain the memory if `None`
```rust
#[op(wasm)]
pub fn op_args_get(
offset: i32,
buffer_offset: i32,
memory: Option<&mut [u8]>,
) {
// ...
}
```
- [x] Avoid copying buffers.
https://encoding.spec.whatwg.org/#dom-textdecoder-decode
> Implementations are strongly encouraged to use an implementation
strategy that avoids this copy. When doing so they will have to make
sure that changes to input do not affect future calls to
[decode()](https://encoding.spec.whatwg.org/#dom-textdecoder-decode).
- [x] Special op to avoid string label deserialization and parsing.
(Ideally we should map labels to integers in JS)
- [x] Avoid webidl `Object.assign` when options is undefined.
Implements fast scheduling of deferred op futures.
```rs
#[op(fast)]
async fn op_read(
state: Rc<RefCell<OpState>>,
rid: ResourceId,
buf: &mut [u8],
) -> Result<u32, Error> {
// ...
}
```
The future is scheduled via a fast API call and polled by the event loop
after being woken up by its waker.
V8's JIT can do a better job knowing the argument count and also enable
fast call path (in future).
This also lets us call async ops without `opAsync`:
```js
const { ops } = Deno.core;
await ops.op_void_async();
```
this patch: 4405286 ops/sec
main: 3508771 ops/sec
example writeFile benchmark:
```
# before
time 188 ms rate 53191
time 168 ms rate 59523
time 167 ms rate 59880
time 166 ms rate 60240
time 168 ms rate 59523
time 173 ms rate 57803
time 183 ms rate 54644
# after
time 157 ms rate 63694
time 152 ms rate 65789
time 151 ms rate 66225
time 151 ms rate 66225
time 152 ms rate 65789
```
This revert has been discussed at length out-of-band (including with
@andreubotella). The realms work in impeding ongoing event loop and
performance work. We very much want to land realms but it needs to wait
until these lower-level refactors are complete. We hope to bring realms
back in a couple weeks.
<!--
Before submitting a PR, please read http://deno.land/manual/contributing
1. Give the PR a descriptive title.
Examples of good title:
- fix(std/http): Fix race condition in server
- docs(console): Update docstrings
- feat(doc): Handle nested reexports
Examples of bad title:
- fix #7123
- update docs
- fix bugs
2. Ensure there is a related issue and it is referenced in the PR text.
3. Ensure there are tests that cover the changes.
4. Ensure `cargo test` passes.
5. Ensure `./tools/format.js` passes without changing files.
6. Ensure `./tools/lint.js` passes.
-->
This PR makes pointer read methods of `Deno.UnsafePointerView` Fast API
compliant, with the exception of `getCString` which cannot be made fast
with current V8 Fast API.