improvement(integrations): validate BigQuery/Forms/PageSpeed + regenerate integration docs#5109
Conversation
…rate integration docs - BigQuery: mark null-defaulted outputs optional (get_table type/numRows/numBytes/creationTime/lastModifiedTime/location, list_datasets location, list_tables type, query totalBytesProcessed) - Google Forms: add response pagination (pageToken + filter params, nextPageToken output), fix pageSize visibility, advanced-mode pagination subBlocks + filter wandConfig - PageSpeed: add a 7th BlockMeta template (competitor benchmark) - Regenerate integration docs; add manual intro sections to new datagma/dropcontact/enrow/icypeas/leadmagic pages
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
PR SummaryLow Risk Overview New enrichment integrations in docs: Adds full MDX pages and nav entries for Datagma, Dropcontact, Enrow, Icypeas, and LeadMagic, with hand-written intro sections and matching brand icons wired through Docs accuracy pass (generated + corrected): Many pages move from truncated or shared “mega” output tables to per-action inputs/outputs—notably RB2B, ClickHouse, File (compress/decompress, clearer read/get/fetch params), Google Sheets/Excel ( Risk: Low for users—mostly documentation and optional output metadata; Forms pagination is a behavioral addition worth verifying in workflows that list large response sets. Reviewed by Cursor Bugbot for commit 689cae6. Configure here. |
Greptile SummaryThis PR validates and improves three integrations (BigQuery, Google Forms, PageSpeed) and regenerates/extends integration docs. The code changes are targeted and correct: BigQuery output fields that can be absent for views/external tables are marked
Confidence Score: 5/5Safe to merge; all functional changes are additive (new optional params, optional output fields) with no breaking changes to existing behaviour. The BigQuery and PageSpeed changes are purely additive metadata corrections. The Google Forms pagination additions are isolated to the get_responses path and introduce no regressions for callers that don't supply the new params. The docs-generator regex fixes only affect the offline script output. No auth, data-loss, or breaking-change risk is introduced. apps/sim/tools/google_forms/get_responses.ts and apps/sim/tools/google_forms/utils.ts — the interaction between pageToken and filter when both are supplied warrants a second look. Important Files Changed
Sequence Diagram%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
participant User
participant Block as Google Forms Block
participant Tool as get_responses Tool
participant API as Google Forms API
User->>Block: execute(formId, pageSize, filter, pageToken?)
Block->>Tool: "params {formId, pageSize, filter, pageToken}"
alt responseId provided
Tool->>API: "GET /forms/{formId}/responses/{responseId}"
API-->>Tool: FormResponse
Tool-->>Block: "{response, raw}"
else list mode
Tool->>API: "GET /forms/{formId}/responses?pageSize=N&filter=...&pageToken=..."
API-->>Tool: "{responses[], nextPageToken?}"
Tool->>Tool: sort + normalizeResponse()
Tool-->>Block: "{responses[], nextPageToken (null if last page), raw}"
end
Block-->>User: output
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
sequenceDiagram
participant User
participant Block as Google Forms Block
participant Tool as get_responses Tool
participant API as Google Forms API
User->>Block: execute(formId, pageSize, filter, pageToken?)
Block->>Tool: "params {formId, pageSize, filter, pageToken}"
alt responseId provided
Tool->>API: "GET /forms/{formId}/responses/{responseId}"
API-->>Tool: FormResponse
Tool-->>Block: "{response, raw}"
else list mode
Tool->>API: "GET /forms/{formId}/responses?pageSize=N&filter=...&pageToken=..."
API-->>Tool: "{responses[], nextPageToken?}"
Tool->>Tool: sort + normalizeResponse()
Tool-->>Block: "{responses[], nextPageToken (null if last page), raw}"
end
Block-->>User: output
Reviews (2): Last reviewed commit: "fix(docs-gen): resolve tools defined in ..." | Re-trigger Greptile |
…ing docs The doc generator extracted tool descriptions with a character class that excluded both quote types (['"]([^'"]...)['"]), so a double-quoted description containing an apostrophe (e.g. "Find someone's email") was truncated at the apostrophe — the generated docs/catalog showed stubs like "Find someone". Anchor extraction on the actual opening quote (single/double/backtick), matching the existing extractDescription helper, in both buildToolDescriptionMap and extractToolInfo. Regenerated docs restore full descriptions across all affected integrations (Apollo, Ahrefs, LeadMagic, Findymail, OpenAI, Slack, etc.).
… per tool The doc generator located a tool's definition only by filename convention (decompress.ts / index.ts), so file_decompress — which lives in compress.ts alongside file_compress — fell back to index.ts and rendered an empty Input table. It also read the params block from the first tool in a multi-tool file, so every tool in such a file inherited the first tool's inputs/outputs. - getToolInfo: when no candidate file declares the exact tool ID, scan the whole tool-prefix directory for the file that does. - extractToolInfo: read the params block scoped to the specific tool, falling back to the full file for tools that inherit params via spread. Regenerated docs eliminate ~50 empty/incorrect input tables across integrations (clickhouse, rb2b, reddit, file, etc.); param-less OAuth-only tools correctly keep an empty input table.
|
Thanks @greptile — both findings were real and are now fixed at the root cause (the docs generator), not patched in the generated output. 1. 2. Stub descriptions ("Find someone" / "Find a person") — these weren't placeholder source text; the tool descriptions are full sentences (e.g. |
|
@greptile review |
|
@cursor review |
|
Both fixes are the right call — patching the generator rather than the generated output means all future regenerations are correct by default. The The regex fix for apostrophe-truncated descriptions is also clean — anchoring on the actual opening quote character rather than a character class that excluded both quote types is the correct approach. Good that it's consistent between Thanks for the detailed explanation of both commits. No further concerns here. |
There was a problem hiding this comment.
✅ Bugbot reviewed your changes and found no new issues!
Comment @cursor review or bugbot run to trigger another review on this PR
Reviewed by Cursor Bugbot for commit 689cae6. Configure here.
Summary
optional: true(get_tabletype/numRows/numBytes/creationTime/lastModifiedTime/location, list_datasetslocation, list_tablestype, querytotalBytesProcessed); these are genuinely absent for views/external tables/cached queries. Validated all 5 tools against the live REST API — no other issues.get_responses(pageToken+filterparams,nextPageTokenoutput) which was previously missing; fixedpageSizevisibility (user-only→user-or-llm); set pagination subBlocks to advanced and added afilterwandConfig. Validated full tool surface + OAuth scopes.runPagespeedendpoint so the one-tool integration is complete; every response field-path verified.Type of Change
Testing
Tested manually —
bun run lintclean,tsc --noEmitclean, docs regen idempotent.Checklist