chore: Record metrics transactions (latency and count) #12051

Draft
lqiu96 wants to merge 6 commits into main from feat/datastore-transaction-metrics

@lqiu96 lqiu96 commented Mar 3, 2026

Initial draft for adding transaction metrics (latency + count)

lqiu96 added 2 commits March 3, 2026 17:28
…count

- Add ATTRIBUTES_KEY_STATUS and ATTRIBUTES_KEY_METHOD_NAME to TelemetryConstants
- Make MetricsRecorder public with @InternalExtensionOnly annotation
- Add MetricsRecorder field and getter to DatastoreOptions
- Wire MetricsRecorder into DatastoreImpl for transaction metrics
- Refactor TracedReadWriteTransactionCallable to delegate to ReadWriteTransactionCallable
- Record per-attempt transaction count with gRPC status code and method name
- Record overall transaction latency using Guava Stopwatch
- Add unit tests for OpenTelemetryMetricsRecorder, MetricsRecorder, and DatastoreImpl

The per-attempt transaction count should use METHOD_COMMIT (Commit) since
each attempt records a commit operation. The overall transaction latency
continues to use METHOD_TRANSACTION_RUN (Transaction.Run).
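For reviewers, a rough sketch of the split described in these two commits: one count per attempt tagged with Commit, and one overall latency tagged with Transaction.Run. This is not the code in the PR. `SketchMetricsRecorder`, `InMemoryRecorder`, `runWithMetrics`, and the recorder method names are hypothetical; `System.nanoTime()` stands in for the Guava Stopwatch so the snippet has no dependencies; and every failed attempt is labeled "UNKNOWN" instead of a real status code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for the MetricsRecorder named in the commit message.
interface SketchMetricsRecorder {
  void recordTransactionAttemptCount(long count, String status, String method);
  void recordTransactionLatency(double latencyMs, String status, String method);
}

// Test double that keeps recorded events in memory.
final class InMemoryRecorder implements SketchMetricsRecorder {
  final List<String> events = new ArrayList<>();

  @Override
  public void recordTransactionAttemptCount(long count, String status, String method) {
    events.add("count:" + method + ":" + status);
  }

  @Override
  public void recordTransactionLatency(double latencyMs, String status, String method) {
    events.add("latency:" + method + ":" + status);
  }
}

final class TransactionMetricsSketch {
  // Values taken from the commit message: METHOD_COMMIT (Commit),
  // METHOD_TRANSACTION_RUN (Transaction.Run).
  static final String METHOD_COMMIT = "Commit";
  static final String METHOD_TRANSACTION_RUN = "Transaction.Run";

  static <T> T runWithMetrics(
      Supplier<T> attempt, int maxAttempts, SketchMetricsRecorder recorder) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    long startNanos = System.nanoTime();
    String finalStatus = "UNKNOWN";
    try {
      RuntimeException last = null;
      for (int i = 0; i < maxAttempts; i++) {
        T result;
        try {
          result = attempt.get();
        } catch (RuntimeException e) {
          // One count per failed attempt; the status is a placeholder here.
          recorder.recordTransactionAttemptCount(1, "UNKNOWN", METHOD_COMMIT);
          last = e;
          continue;
        }
        // One count for the successful attempt.
        recorder.recordTransactionAttemptCount(1, "OK", METHOD_COMMIT);
        finalStatus = "OK";
        return result;
      }
      throw last;
    } finally {
      // A single latency measurement covering all attempts.
      double latencyMs = (System.nanoTime() - startNanos) / 1_000_000.0;
      recorder.recordTransactionLatency(latencyMs, finalStatus, METHOD_TRANSACTION_RUN);
    }
  }
}
```

With a callable that fails once and then succeeds, the in-memory recorder ends up with two Commit counts followed by one Transaction.Run latency.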
@lqiu96 lqiu96 requested a review from jinseopkim0 March 3, 2026 23:19
lqiu96 added 4 commits March 3, 2026 18:27
Move extractStatus and extractGrpcStatusCode into a single shared
DatastoreException.extractGrpcStatusCode(Throwable) method that walks
the exception cause chain. Both call sites in DatastoreImpl now delegate
to this shared method.

Rename extractGrpcStatusCode to extractStatusCode and remove the
io.grpc.Status dependency. The reason string on DatastoreException is
already set from GAX's StatusCode.Code which supports both gRPC and
HttpJson transports. Use a plain "UNKNOWN" string as fallback.
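A minimal sketch of the cause-chain walk these two commits describe, assuming the status is read from the exception's reason string. `SketchDatastoreException` is a hypothetical stand-in for DatastoreException, and the depth limit is an addition of this sketch, not something stated in the PR.

```java
// Hypothetical stand-in for DatastoreException; only the reason string matters here.
class SketchDatastoreException extends RuntimeException {
  private final String reason;

  SketchDatastoreException(String reason, String message, Throwable cause) {
    super(message, cause);
    this.reason = reason;
  }

  String getReason() {
    return reason;
  }
}

final class StatusCodeSketch {
  static final String UNKNOWN = "UNKNOWN";
  // Assumption of this sketch: cap the walk so a cyclic cause chain cannot loop forever.
  private static final int MAX_DEPTH = 100;

  // Returns the reason of the first SketchDatastoreException found in the
  // cause chain, or "UNKNOWN" when none carries a non-empty reason.
  static String extractStatusCode(Throwable throwable) {
    Throwable current = throwable;
    for (int depth = 0; current != null && depth < MAX_DEPTH; depth++) {
      if (current instanceof SketchDatastoreException) {
        String reason = ((SketchDatastoreException) current).getReason();
        if (reason != null && !reason.isEmpty()) {
          return reason;
        }
      }
      current = current.getCause();
    }
    return UNKNOWN;
  }
}
```

Because the lookup depends only on a string, the helper needs no transport-specific type, which matches the stated goal of dropping the io.grpc.Status dependency.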

Record transaction latency and attempt count at each individual RPC
when it involves a transaction:
- commit() when isTransactional
- lookup() when isTransactional
- runQuery() when isTransactional
- beginTransaction() (always transactional)
- rollback() (always transactional)
- AggregationQueryExecutor.execute() when transactional

Each RPC uses its own TelemetryConstants.METHOD_* constant. Removed
per-attempt recording from ReadWriteTransactionCallable since individual
RPCs now handle their own metrics. Made extractStatusCode public for
cross-package access from AggregationQueryExecutor.
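The per-RPC pattern in this last commit could look roughly like the sketch below: each call records its own latency and count under its own method name, and only when it is part of a transaction. `PerRpcMetricsSketch`, `callRpc`, and the in-memory lists are hypothetical, and the method name strings passed in the usage are illustrative rather than the actual TelemetryConstants.METHOD_* values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

final class PerRpcMetricsSketch {
  // In-memory sinks standing in for a real metrics recorder.
  final List<String> counts = new ArrayList<>();
  final List<Double> latenciesMs = new ArrayList<>();

  // Runs the RPC and records metrics only when it belongs to a transaction.
  <T> T callRpc(String methodName, boolean isTransactional, Supplier<T> rpc) {
    if (!isTransactional) {
      // Non-transactional calls are not recorded by the transaction metrics.
      return rpc.get();
    }
    long startNanos = System.nanoTime();
    String status = "OK";
    try {
      return rpc.get();
    } catch (RuntimeException e) {
      // Placeholder status; real code would derive it from the exception.
      status = "UNKNOWN";
      throw e;
    } finally {
      latenciesMs.add((System.nanoTime() - startNanos) / 1_000_000.0);
      counts.add(methodName + ":" + status);
    }
  }
}
```

Recording in `finally` means a failed RPC still produces one latency and one count before the exception propagates to the caller.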