This is a follow-on to the earlier work in reducing string copies,
mainly focused on ensuring that ASCII strings are easy to provide to the
JS runtime.
While we are replacing a 16-byte reference in a number of places with a
24-byte structure (measured via `std::mem::size_of`), the reduction in
copies wins out over the additional size of the arguments passed into
functions.
Benchmarking shows approximately the same if not slightly less wallclock
time/instructions retired, but I believe this continues to open up
further refactoring opportunities.
This commit adds a new core API `opAsync2` to call an async op with
atmost 2 arguments. Spread argument iterators has a pretty big perf hit
when calling ops.
| name | avg msg/sec/core |
| --- | --- |
| 1.32.1 | `127820.750000` |
| #18506 | `140079.000000` |
| #18506 + #18509 | `150104.250000` |
| #18506 + #18509 + this | `157340.000000` |
Reduce the number of copies and allocations of script code by carrying
around ownership/reference information from creation time.
As an advantage, this allows us to maintain the identity of `&'static
str`-based scripts and use v8's external 1-byte strings (to avoid
incorrectly passing non-ASCII strings, debug `assert!`s gate all string
reference paths).
Benchmark results:
Perf improvements -- ~0.1 - 0.2ms faster, but should reduce garbage
w/external strings and reduces data copies overall. May also unlock some
more interesting optimizations in the future.
This requires adding some generics to functions, but manual
monomorphization has been applied (outer/inner function) to avoid code
bloat.
There's no point for this API to expect result. If something fails it should
result in a panic during build time to signal to embedder that setup is
wrong.
Runtime generation of async op wrappers contributed to increased startup
time and core became unusable with
`--disallow-code-generation-from-strings` flag. The optimization only
affects very small microbenchmarks so this revert will not cause any
regressions.
This commit changes signature of "deno_core::ModuleLoader::resolve" to pass
an enum indicating whether or not we're resolving a specifier for dynamic import.
Additionally "CliModuleLoader" was changes to store both "parent permissions" (or
"root permissions") as well as "dynamic permissions" that allow to check for permissions
in top-level module load an dynamic imports.
Then all code paths that have anything to do with Node/npm compat are now checking
for permissions which are passed from module loader instance associated with given
worker.
This PR introduces Wasm ops. These calls are optimized for entry from
Wasm land.
The `#[op(wasm)]` attribute is opt-in.
Last parameter `Option<&mut [u8]>` is the memory slice of the Wasm
module *when entered from a Fast API call*. Otherwise, the user is
expected to implement logic to obtain the memory if `None`
```rust
#[op(wasm)]
pub fn op_args_get(
offset: i32,
buffer_offset: i32,
memory: Option<&mut [u8]>,
) {
// ...
}
```
V8's JIT can do a better job knowing the argument count and also enable
fast call path (in future).
This also lets us call async ops without `opAsync`:
```js
const { ops } = Deno.core;
await ops.op_void_async();
```
this patch: 4405286 ops/sec
main: 3508771 ops/sec
This commit adds a new op_write_all to core that allows writing an
entire chunk in a single async op call. Internally this calls
`Resource::write_all`.
The `writableStreamForRid` has been moved to `06_streams.js` now, and
uses this new op. Various other code paths now also use this new op.
Closes #16227
This commit introduces two new buffer wrapper types to `deno_core`. The
main benefit of these new wrappers is that they can wrap a number of
different underlying buffer types. This allows for a more flexible read
and write API on resources that will require less copying of data
between different buffer representations.
- `BufView` is a read-only view onto a buffer. It can be backed by
`ZeroCopyBuf`, `Vec<u8>`, and `bytes::Bytes`.
- `BufViewMut` is a read-write view onto a buffer. It can be cheaply
converted into a `BufView`. It can be backed by `ZeroCopyBuf` or
`Vec<u8>`.
Both new buffer views have a cursor. This means that the start point of
the view can be constrained to write / read from just a slice of the
view. Only the start point of the slice can be adjusted. The end point
is fixed. To adjust the end point, the underlying buffer needs to be
truncated.
Readable resources have been changed to better cater to resources that
do not support BYOB reads. The basic `read` method now returns a
`BufView` instead of taking a `ZeroCopyBuf` to fill. This allows the
operation to return buffers that the resource has already allocated,
instead of forcing the caller to allocate the buffer. BYOB reads are
still very useful for resources that support them, so a new `read_byob`
method has been added that takes a `BufViewMut` to fill. `op_read`
attempts to use `read_byob` if the resource supports it, which falls
back to `read` and performs an additional copy if it does not. For
Rust->JS reads this change should have no impact, but for Rust->Rust
reads, this allows the caller to avoid an additional copy in many
scenarios. This combined with the support for `BufView` to be backed by
`bytes::Bytes` allows us to avoid one data copy when piping from a
`fetch` response into an `ext/http` response.
Writable resources have been changed to take a `BufView` instead of a
`ZeroCopyBuf` as an argument. This allows for less copying of data in
certain scenarios, as described above. Additionally a new
`Resource::write_all` method has been added that takes a `BufView` and
continually attempts to write the resource until the entire buffer has
been written. Certain resources like files can override this method to
provide a more efficient `write_all` implementation.
Welcome to better optimised op calls! Currently opSync is called with parameters of every type and count. This most definitely makes the call megamorphic. Additionally, it seems that spread params leads to V8 not being able to optimise the calls quite as well (apparently Fast Calls cannot be used with spread params).
Monomorphising op calls should lead to some improved performance. Now that unwrapping of sync ops results is done on Rust side, this is pretty simple:
```
opSync("op_foo", param1, param2);
// -> turns to
ops.op_foo(param1, param2);
```
This means sync op calls are now just directly calling the native binding function. When V8 Fast API Calls are enabled, this will enable those to be called on the optimised path.
Monomorphising async ops likely requires using callbacks and is left as an exercise to the reader.
Streamlines a common middleware pattern and provides foundations for avoiding variably sized v8::ExternalReferences & enabling fully monomorphic op callpaths
This allows resources to be "streams" by implementing read/write/shutdown. These streams are implicit since their nature (read/write/duplex) isn't known until called, but we could easily add another method to explicitly tag resources as streams.
`op_read/op_write/op_shutdown` are now builtin ops provided by `deno_core`
Note: this current implementation is simple & straightforward but it results in an additional alloc per read/write call
Closes #12556
* refactor(ops): return BadResource errors in ResourceTable calls
Instead of relying on callers to map Options to Results via `.ok_or_else(bad_resource_id)` at over 176 different call sites ...