Skip to content

Start iterating on #1682 On branch edburns/1682-java-tool-ergonomics#1692

Draft
edburns wants to merge 7 commits into
mainfrom
edburns/1682-java-tool-ergonomics
Draft

Start iterating on #1682 On branch edburns/1682-java-tool-ergonomics#1692
edburns wants to merge 7 commits into
mainfrom
edburns/1682-java-tool-ergonomics

Conversation

@edburns

@edburns edburns commented Jun 16, 2026

Copy link
Copy Markdown
Collaborator

new file: 1682-java-tool-ergonomics-prompts-remove-before-merge/20260615-prompts.md

@edburns edburns force-pushed the edburns/1682-java-tool-ergonomics branch from 6c3fe31 to 299c3e4 Compare June 16, 2026 23:52
@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

edburns and others added 6 commits June 17, 2026 11:25
new file:   1682-java-tool-ergonomics-prompts-remove-before-merge/20260615-prompts.md

Signed-off-by: Ed Burns <edburns@microsoft.com>
Packages the knowledge of creating net-new Java E2E integration tests
with handcrafted replay proxy YAML snapshots into a reusable Copilot skill.

New files:

- .github/skills/new-java-e2e-test-yaml-and-test/SKILL.md
  Main skill instructions covering: YAML snapshot format, proxy matching
  logic, Java IT test template, verification commands, key classes/files
  reference table, and common pitfalls. Includes the critical constraint
  that Java's CapiProxy always sets GITHUB_ACTIONS=true so snapshots must
  be handcrafted rather than recorded.

- .github/skills/new-java-e2e-test-yaml-and-test/examples.md
  Two working examples from the codebase: a simple single-turn test
  (Botanica identity REPLACE) and a multi-turn test with tool calls
  (system message transform with view tool).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add a new Java failsafe integration test and replay snapshot that exercise the
current explicit tool-definition APIs before ergonomic annotations are added.

- add `LowLevelToolDefinitionIT` covering `create`, `createOverride`,
  `getArgumentsAs(record)`, `getArguments()`, and `ToolSet` available tools
- add `tools/low_level_tool_definition.yaml` with multi-tool call and final
  response replay conversations
- assert handler-driven state mutation (`currentPhase`) and expected response
  content

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@edburns edburns force-pushed the edburns/1682-java-tool-ergonomics branch from 721eeb4 to ac67ecf Compare June 17, 2026 15:26
@github-actions

Copy link
Copy Markdown
Contributor

Cross-SDK Consistency Review ✅

This PR adds a Java E2E integration test (LowLevelToolDefinitionIT) and supporting YAML snapshot for the low-level tool definition APIs. I reviewed the APIs being exercised against all six SDK implementations.

APIs Verified Across All SDKs

Feature Java (this PR) Node.js Python Go .NET Rust
Custom tool definition ToolDefinition.create() defineTool() @define_tool DefineTool[T]() CopilotTool.DefineTool() define_tool()
Override built-in tool ToolDefinition.createOverride() overridesBuiltInTool: true overrides_built_in_tool=True Tool.OverridesBuiltInTool = true CopilotToolOptions { OverridesBuiltInTool = true } .with_overrides_built_in_tool(true)
Typed arg deserialization invocation.getArgumentsAs(Class<T>) Handler receives typed args: TArgs Handler receives typed Pydantic model Generic T via JSON round-trip Method params via AIFunctionFactory reflection serde_json::from_value(inv.arguments)
Raw map access invocation.getArguments() invocation.arguments: unknown invocation.arguments: Any inv.Arguments map[string]any via context inv.arguments: Value
ToolSet builder new ToolSet().addCustom().addBuiltIn() new ToolSet().addCustom().addBuiltIn() ToolSet().add_custom().add_builtin() NewToolSet().AddCustom().AddBuiltIn() new ToolSet().AddCustom().AddBuiltIn() ToolSet::new().add_custom().add_builtin()

All features tested in this PR already have consistent equivalents in every other SDK. No new APIs are being introduced that would require changes to other language implementations.

Minor Observations (not cross-SDK issues)

  1. grepOverrideTool handler is never invoked in the snapshot: The test registers ToolDefinition.createOverride("grep", ...) and includes it in setTools(...), but test/snapshots/tools/low_level_tool_definition.yaml only replays set_current_phase and search_items being called — the grep override handler never executes during the test run. This means createOverride usage is exercised at the registration level, but the handler path itself isn't end-to-end tested. Consider either:

    • Adding a second prompt/conversation to the snapshot that triggers grep, or
    • Noting in a comment that the grep tool is registered to verify schema/registration, not handler invocation.
  2. Planning files should be removed before merge: The directory 1682-java-tool-ergonomics-prompts-remove-before-merge/ is explicitly called out in the PR description as needing removal, and these files remain in the current commits.

  3. PR is still marked as draft — flagging in case this review was triggered prematurely.

Overall the test is a good addition — it provides concrete coverage of the low-level tool definition ergonomics in Java and parallels the existing tool E2E tests in Go (tools_e2e_test.go tests OverridesBuiltInTool) and other SDKs.

Generated by SDK Consistency Review Agent for issue #1692 · sonnet46 3.2M ·

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant