Commit Graph

922 Commits

Author SHA1 Message Date
Eric Anderson 2e96fbf1e8 netty: Associate netty stream eagerly to avoid client hang
In #12185, RPCs were randomly hanging. In #12207 this was tracked down
to the headers promise completing successfully, but the netty stream
was null. This was because the headers write hadn't completed but
stream.close() had been called by goingAway().
2025-07-17 21:55:53 +00:00
George Gensure a37d3eb349 Guarantee missing stream promise delivery
In observed cases, whether RST_STREAM or another failure from netty or
the server, listeners can fail to be notified when a connection yields a
null stream for the selected streamId. This causes hangs in clients,
despite deadlines, with no obvious resolution.

Tests which relied upon this promise succeeding must now change.
2025-07-17 21:55:16 +00:00
Alex Panchenko 22cf7cf2ac
tests: Replace usages of deprecated junit ExpectedException with assertThrows (#12103) 2025-05-30 10:55:37 +05:30
Abhishek Agrawal a57c14a51e
refactor: Stops exception allocation on channel shutdown
This fixes #11955.

Stops exception allocation and its propagation on channel shutdown.
2025-03-19 09:27:34 +05:30
MV Shiva 2f52a00364
netty: Swap to UniformStreamByteDistributor (#11954) 2025-03-11 22:39:54 +05:30
Kannan J cdab410b81
netty: Per-rpc authority verification against peer cert subject names (#11724)
Per-rpc verification of authority specified via call options or set by the LB API against peer cert's subject names.
2025-02-24 20:28:11 +05:30
Alex Panchenko 5a7f350537
optimize number of buffer allocations (#11879)
Currently this improves 2 flows

1. Known length message which length is greater than 1Mb. Previously the
first buffer was 1Mb, and then many buffers of 4096 bytes (from
CodedOutputStream), now subsequent buffers are also up to 1Mb

2. In case of compression, the first write is always 10 bytes buffer
(gzip header), but worth allocating more space
2025-02-14 05:59:21 -08:00
Abhishek Agrawal 7153ff8522
netty: Removed 4096 min buffer size (#11856)
* netty: Removed 4096 min buffer size
2025-01-30 12:52:37 +05:30
Eric Anderson 7b5d0692cc
Replace jsr305's CheckReturnValue with Error Prone's (#11811)
We should avoid jsr305 and error prone's has the same semantics.

Fixes #8687
2025-01-09 13:45:35 -08:00
Eric Anderson e61b03cb9f netty: Fix getAttributes() data races in NettyClientTransportTest
Since approximately the LBv2 API (the current API) was introduced, gRPC
won't use a transport until it is ready. Long ago, transports could be
used before they were ready and these old tests were not waiting for the
negotiator to complete before starting. We need them to wait for the
handshake to complete to avoid a test-only data race in getAttributes()
noticed by TSAN.

Throwing away data frames in the Noop handshaker is necessary to act
like a normal handshaker; they don't allow data frames to pass until the
handshake is complete. Without the handling, it goes through invalid
code paths in NettyClientHandler where a terminated transport becomes
ready, and a similar data race.

```
  Write of size 4 at 0x00008db31e2c by thread T37:
    #0 io.grpc.netty.NettyClientHandler.handleProtocolNegotiationCompleted(Lio/grpc/Attributes;Lio/grpc/InternalChannelz$Security;)V NettyClientHandler.java:517
    #1 io.grpc.netty.ProtocolNegotiators$GrpcNegotiationHandler.userEventTriggered(Lio/netty/channel/ChannelHandlerContext;Ljava/lang/Object;)V ProtocolNegotiators.java:937
    #2 io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(Ljava/lang/Object;)V AbstractChannelHandlerContext.java:398
    #3 io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(Lio/netty/channel/AbstractChannelHandlerContext;Ljava/lang/Object;)V AbstractChannelHandlerContext.java:376
    #4 io.netty.channel.AbstractChannelHandlerContext.fireUserEventTriggered(Ljava/lang/Object;)Lio/netty/channel/ChannelHandlerContext; AbstractChannelHandlerContext.java:368
    #5 io.grpc.netty.ProtocolNegotiators$ProtocolNegotiationHandler.fireProtocolNegotiationEvent(Lio/netty/channel/ChannelHandlerContext;)V ProtocolNegotiators.java:1107
    #6 io.grpc.netty.ProtocolNegotiators$WaitUntilActiveHandler.channelActive(Lio/netty/channel/ChannelHandlerContext;)V ProtocolNegotiators.java:1011
    ...

  Previous read of size 4 at 0x00008db31e2c by thread T4 (mutexes: write M0, write M1, write M2, write M3):
    #0 io.grpc.netty.NettyClientHandler.getAttributes()Lio/grpc/Attributes; NettyClientHandler.java:345
    #1 io.grpc.netty.NettyClientTransport.getAttributes()Lio/grpc/Attributes; NettyClientTransport.java:387
    #2 io.grpc.netty.NettyClientTransport.newStream(Lio/grpc/MethodDescriptor;Lio/grpc/Metadata;Lio/grpc/CallOptions;[Lio/grpc/ClientStreamTracer;)Lio/grpc/internal/ClientStream; NettyClientTransport.java:198
    #3 io.grpc.netty.NettyClientTransportTest$Rpc.<init>(Lio/grpc/netty/NettyClientTransport;Lio/grpc/Metadata;)V NettyClientTransportTest.java:953
    #4 io.grpc.netty.NettyClientTransportTest.huffmanCodingShouldNotBePerformed()V NettyClientTransportTest.java:631
    ...
```

```
  Read of size 4 at 0x00008f983a3c by thread T4 (mutexes: write M0, write M1):
    #0 io.grpc.netty.NettyClientHandler.getAttributes()Lio/grpc/Attributes; NettyClientHandler.java:345
    #1 io.grpc.netty.NettyClientTransport.getAttributes()Lio/grpc/Attributes; NettyClientTransport.java:387
    #2 io.grpc.netty.NettyClientTransport.newStream(Lio/grpc/MethodDescriptor;Lio/grpc/Metadata;Lio/grpc/CallOptions;[Lio/grpc/ClientStreamTracer;)Lio/grpc/internal/ClientStream; NettyClientTransport.java:198
    #3 io.grpc.netty.NettyClientTransportTest$Rpc.<init>(Lio/grpc/netty/NettyClientTransport;Lio/grpc/Metadata;)V NettyClientTransportTest.java:973
    #4 io.grpc.netty.NettyClientTransportTest$Rpc.<init>(Lio/grpc/netty/NettyClientTransport;)V NettyClientTransportTest.java:969
    #5 io.grpc.netty.NettyClientTransportTest.handlerExceptionDuringNegotiatonPropagatesToStatus()V NettyClientTransportTest.java:425
    ...

  Previous write of size 4 at 0x00008f983a3c by thread T56:
    #0 io.grpc.netty.NettyClientHandler$FrameListener.onSettingsRead(Lio/netty/channel/ChannelHandlerContext;Lio/netty/handler/codec/http2/Http2Settings;)V NettyClientHandler.java:960
    ...
```
2025-01-07 10:25:22 -08:00
Benjamin Peterson bac8b32043
Fix equality and hashcode of CancelServerStreamCommand. (#11785)
In e036b1b198, CancelServerStreamCommand got another field. But, its hashCode and equals methods were not updated.
2025-01-03 10:42:42 -08:00
Eric Anderson aafab74087 api: Use package-private IgnoreJRERequirement
This avoids the dependency on animalsniffer-annotations. grpc-api, and
particularly grpc-context, are used many low-level places and it is
beneficial for them to be very low dependency. This brings grpc-context
back to zero-dependency.
2024-12-23 12:45:26 -08:00
Eric Anderson 8ea3629378
Re-enable animalsniffer, fixing violations
In 61f19d707a I swapped the signatures to use the version catalog. But I
failed to preserve the `@signature` extension and it all seemed to
work... But in fact all the animalsniffer tasks were completing as
SKIPPED as they lacked signatures. The build.gradle changes in this
commit are to fix that while still using version catalog.

But while it was broken violations crept in. Most violations weren't
too important and we're not surprised went unnoticed. For example, Netty
with TLS has long required the Java 8 API
`setEndpointIdentificationAlgorithm()`, so using `Optional` in the same
code path didn't harm anything in particular. I still swapped it to
Guava's `Optional` to avoid overuse of `@IgnoreJRERequirement`.

One important violation has not been fixed and instead I've disabled the
android signature in api/build.gradle for the moment.  The violation is
in StatusException using the `fillInStackTrace` overload of Exception.
This problem [had been noticed][PR11066], but we couldn't figure out
what was going on. AnimalSniffer is now noticing this and agreeing with
the internal linter. There is still a question of why our interop tests
failed to notice this, but given they are no longer running on pre-API
level 24, that may forever be a mystery.

[PR11066]: https://github.com/grpc/grpc-java/pull/11066
2024-12-19 07:54:54 -08:00
vinodhabib f66d7fc54d
netty: Fix ByteBuf leaks in tests (#11593)
Part of #3353
2024-12-02 11:09:25 -08:00
Eric Anderson 3562380da5 Upgrade Gradle to 8.10.2 and upgrade plugins
com.github.johnrengelman.shadow is now com.gradleup.shadow (note the
redirect)
https://github.com/johnrengelman/shadow/releases/tag/8.3.0
2024-10-30 07:00:57 -07:00
Ran 735b3f3fe6
netty: add soft Metadata size limit enforcement. (#11603) 2024-10-28 10:25:17 -07:00
hlx502 62f409810d
netty: Avoid TCP_USER_TIMEOUT warning when not using epoll (#11564)
In NettyClientTransport, the TCP_USER_TIMEOUT attribute can be set only
if the channel is of the AbstractEpollStreamChannel.

Fixes #11517
2024-10-22 12:17:39 -07:00
Riya Mehta d628396ec7
s2a: Add S2AStub cleanup handler. (#11600)
* Add S2AStub cleanup handler.

* Give TLS and Cleanup handlers name + update comment.

* Don't add TLS handler twice.

* Don't remove explicitly, since done by fireProtocolNegotiationEvent.

* plumb S2AStub close to handshake end + add integration test.

* close stub when TLS negotiation fails.
2024-10-10 16:31:18 -07:00
Kannan J 1ded8aff81
On result2 resolution result have addresses or error (#11330)
Combined success / error status passed via ResolutionResult to the NameResolver.Listener2 interface's onResult2 method - Addresses in the success case or address resolution error in the failure case now get set in ResolutionResult::addressesOrError by the internal name resolvers.
2024-10-07 17:55:56 +05:30
Riya Mehta e75a044107
s2a,netty: S2AHandshakerServiceChannel doesn't use custom event loop. (#11539)
* S2AHandshakerServiceChannel doesn't use custom event loop.

* use executorPool.

* log when channel not shutdown.

* use a cached threadpool.

* update non-executor version.
2024-09-20 12:32:54 -07:00
Kurt Alfred Kluever 2fe1a13cd0
Migrate from `Charsets` to `StandardCharsets`. (#11482) 2024-08-22 12:11:43 +05:30
Eric Anderson ff8e413760
Remove direct dependency on j2objc
Bazel had the dependency added because of #5046, where Guava was
depending on it as compile-only and Bazel build have "unknown enum
constant" warnings. Guava now has a compile dependency on j2objc, so
this workaround is no longer needed. There are currently no version skew
issues in Gradle, which was the only usage.
2024-08-13 21:33:55 -07:00
Eric Anderson 4ab34229fb netty: Use DefaultELG with LocalChannel in test
LocalChannel is not guaranteed to be compatible with NioEventLoopGroup,
and is failing with Netty 4.2.0.Alpha3-SNAPSHOT.

See #11447
2024-08-12 15:39:12 -07:00
Kannan J 70ae83288d
Upgrade Netty to 4.1.110 and tcnative to 2.0.65 (#11444)
Upgrade Netty to 4.1.110 and tcnative to 2.0.65.
2024-08-06 20:38:08 +05:30
Kurt Alfred Kluever 06135a0745 Migrate from the deprecated `Charsets` constants (in Guava) to the `StandardCharsets` constants (in the JDK)
cl/658539667
2024-08-05 13:31:08 -07:00
Eric Anderson 9bed655c56 Revert "Netty upgrade to 4.1.110 in grpc-java (#11273)"
This reverts commit f9b072cfe2.

Changes from the release process got mixed in with the commit.
2024-08-02 15:30:31 -07:00
Kannan J f9b072cfe2
Netty upgrade to 4.1.110 in grpc-java (#11273)
* Bump Netty to 4.1.110.Final.
2024-08-03 01:05:44 +05:30
erm-g 516dec989a
util: Align AdvancedTlsX509{Key and Trust}Manager (#11385)
* Swap internal key/cert args

* Deprecate *FromFile methods

* Test for client trusted socket
2024-07-16 12:33:19 -04:00
Larry Safran 506192872e
Restore old behavior of NettyAdaptiveCumulator, but avoid using that class if Netty is on version 4.1.111 or later. (#11367) 2024-07-10 10:38:46 -07:00
erm-g ecae9b797d
security: Add updateIdentityCredentials methods to AdvancedTlsX509KeyManager. (#11358)
Add new 'updateIdentityCredentials' methods with swapped (chain, key) signatures.
2024-07-10 12:30:27 +05:30
Eric Anderson 47aa7b9bca Upgrade Checkstyle to 10.17.0
The code changes are to place all overloaded methods next to each other.
2024-07-09 08:12:33 -07:00
Larry Safran 0fea7dd32e
netty:Fix Netty composite buffer merging to be compatible with Netty 4.1.111 (#11294)
* Use addComponent instead of addFlattenedComponent and do not append to components that are composites.
2024-06-20 15:55:06 -07:00
erm-g 781b4c4575
security: Stabilize AdvancedTlsX509KeyManager. (#11139)
* Clean up and de-experimentalization of KeyManager

* Unit tests for API validity.
2024-05-30 13:54:11 -04:00
hakusai22 6ec744f2a0
Fix various typos (#11144) 2024-05-06 20:29:44 -07:00
Ryan P. Brewster e036b1b198 netty: Allow deframer errors to close stream with a status code
Today, deframer errors cancel the stream without communicating a status code
to the peer. This change causes deframer errors to trigger a best-effort
attempt to send trailers with a status code so that the peer understands
why the stream is being closed.

Fixes #3996
2024-04-24 14:37:37 -07:00
Benjamin Peterson fb9a10809f
netty: Release SendGrpcFrameCommand when stream is missing (#11116)
`sendGrpcFrame` owns the buffer in `SendGrpcFrameCommand`. If the frame is not handed off to netty, it needs to be released in the method.

https://github.com/grpc/grpc-java/issues/11115
2024-04-22 10:27:39 -07:00
Sergii Tkachenko e490273edd
netty: Handle write queue promise failures (#11016)
Handles Netty write frame failures caused by issues in the Netty
itself.

Normally we don't need to do anything on frame write failures because
the cause of a failed future would be an IO error that resulted in
the stream closure.  Prior to this PR we treated these issues as a
noop, except the initial headers write on the client side.

However, a case like netty/netty#13805 (a bug in generating next
stream id) resulted in an unclosed stream on our side. This PR adds
write frame future failure handlers that ensures the stream is
cancelled, and the cause is propagated via Status.

Fixes #10849
2024-04-16 16:27:51 -07:00
David Burns 00649913b0
bazel: Use the `artifact` macro for loading maven deps
The recommended way to load dependencies from `rules_jvm_external`
is to make use of the `@maven` workspace, and the most readable
way of doing that is to use the `artifact` macro provides.

This removes the need to generate the "compat" namespaces, which
`rules_jvm_external` provided for backwards compatibility with
older releases. This change also sets things up for supporting
`bzlmod`: this requires all workspaces accessed by a library to
be named "up front" in the `MODULE.bazel` file. This way, the
only repo that needs to be exported is `@maven`, rather than the
current huge list.
2024-03-28 14:33:32 -07:00
James Duong 2c83ef0632
Allow configuration of the queued byte threshold at which a Stream is considered not ready (#10977)
* Allow the queued byte threshold for a Stream to be ready to be configurable

- on clients this is exposed by setting a CallOption
- on servers this is configured by calling a method on ServerCall or ServerStreamListener
2024-03-21 15:37:26 -07:00
Eric Anderson 8feb919167 netty/shaded: Add missing evaluationDependsOn
`project(':grpc-netty').configurations` requires the grpc-netty project
to be configured, which requires evaluationDependsOn. Without
evaluationDependsOn, project loading order is arbitrary and you can get
random errors after small configuration changes.
2024-03-05 10:38:55 -08:00
Eric Anderson ac62c8b055 Fix tests and warnings on Java 17
SelfSignedCertificate is not available on Java 17 because
OpenJdkSelfSignedCertGenerator is not available. This only impacted
tests.

AccessController is being removed, and these locations are doing simple
reflection which is unlikely to require it even when a security policy
is in effect. There's other places we do reflection without the
AccessController, so either no security policies care or the users can
update their policies to allow it.
2024-02-29 16:55:46 -08:00
Eric Anderson 5f8958f65c Use Javadoc's -linkoffline instead of -link
`-link` does I/O to download the package list, for every javadoc
invocation. There is no caching, so this happens many times per build.
Swap to offline mode to avoid spamming the servers, and avoid build
failures if the servers aren't entirely healthy.
2024-02-23 17:22:48 -08:00
Eric Anderson f768c4222b Remove build usages of Jetty ALPN
It wasn't actually being used. Since Java 8u252 in early 2020 we've been
using ALPN from the JDK. The Jetty ALPN Agent has been a noop.

We do keep the Jetty ALPN support in the code and tests, but we don't
have the infrastructure to actually run it.
2024-02-23 15:27:33 -08:00
Eric Anderson a231d80756 Remove semi-circular dependency between core and util
Add the 'fake' dependency to grpc-netty instead of grpc-core.
grpc-okhttp already depends on grpc-util and probably would be fine
without round_robin on Android.

There's not actually a circular dependency, but some tools can't handle
the compile vs runtime distinction. Such tools are broken, but fixes
have been slow and this approach works with no real downfalls.

Works around #10576 #10701
2024-02-16 12:31:54 -08:00
Benjamin Peterson a68399a9b6
netty: improve server handling of writes to reset streams (#10258)
* netty: improve server handling of writes to reset streams

A server stream can be reset by the client while server writes are still queued. After the stream is reset, the netty connection will forget the stream object. The `NettyServerHandler` must deal with that situation. `sendResponseHandlers` already had some code to do that. This change standardizes that code and adds it to `sendGrpcFrame`. This fixes a potential bug where a `SendGrpcFrameCommand` with `endOfStream=true` would raise an `AssertionError` if written to a reset stream. (This bug is not currently reachable because `endOfStream=false` for all server `SendGrpcFrameCommand` objects.)

* Do not call into the encoder when we know the stream is gone.
2024-02-16 11:08:09 -08:00
Eric Anderson 91cfa97c28 netty: Avoid volatile attributes on client-side
6efa9ee3 added `volatile` to `attributes` after TSAN detected a data
race that was added in 91d15ce4. The race was because attributes may be
read from another thread after `transportReady()`, and the
post-filtering assignment occurred after `transportReady()`. The code
now filters the attributes separately so they are updated before calling
`transportReady()`.

Original TSAN failure:
```
  Read of size 4 at 0x0000cd44769c by thread T23:
    #0 io.grpc.netty.NettyClientHandler.getAttributes()Lio/grpc/Attributes; NettyClientHandler.java:327
    #1 io.grpc.netty.NettyClientTransport.getAttributes()Lio/grpc/Attributes; NettyClientTransport.java:363
    #2 io.grpc.netty.NettyClientTransport.newStream(Lio/grpc/MethodDescriptor;Lio/grpc/Metadata;Lio/grpc/CallOptions;[Lio/grpc/ClientStreamTracer;)Lio/grpc/internal/ClientStream; NettyClientTransport.java:183
    #3 io.grpc.internal.MetadataApplierImpl.apply(Lio/grpc/Metadata;)V MetadataApplierImpl.java:74
    #4 io.grpc.auth.GoogleAuthLibraryCallCredentials$1.onSuccess(Ljava/util/Map;)V GoogleAuthLibraryCallCredentials.java:141
    #5 com.google.auth.oauth2.OAuth2Credentials$FutureCallbackToMetadataCallbackAdapter.onSuccess(Lcom/google/auth/oauth2/OAuth2Credentials$OAuthValue;)V OAuth2Credentials.java:534
    #6 com.google.auth.oauth2.OAuth2Credentials$FutureCallbackToMetadataCallbackAdapter.onSuccess(Ljava/lang/Object;)V OAuth2Credentials.java:525
    ...

  Previous write of size 4 at 0x0000cd44769c by thread T24:
    #0 io.grpc.netty.NettyClientHandler$FrameListener.onSettingsRead(Lio/netty/channel/ChannelHandlerContext;Lio/netty/handler/codec/http2/Http2Settings;)V NettyClientHandler.java:920
    #1 io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder$FrameReadListener.onSettingsRead(Lio/netty/channel/ChannelHandlerContext;Lio/netty/handler/codec/http2/Http2Settings;)V DefaultHttp2ConnectionDecoder.java:515
    ...
```
2024-01-12 07:29:48 -08:00
yifeizhuang 6efa9ee37d
fix tsan data race in attributes (#10805) 2024-01-08 10:54:06 -08:00
joybestourous 91d15ce4e6
Add ClientTransportFilter (#10646)
* Add ClientTransportFilter
2024-01-03 10:45:22 -08:00
Eric Anderson d6830d7f99
Change many api deps to implementation deps
These look pretty fair now, mostly only exposing grpc-api and
annotations as api dependencies.
2023-12-15 15:14:29 -08:00
Ulf Adams 692696e2ed NettyServerBuilder: remove dead code
The KeepAliveManager clamps the keepalive and keepalive time values such that the result is larger than the minimum values specified here. Therefore, a second check here is unnecessary.
2023-12-12 14:02:00 -08:00