
v0.7.10: models sorting, compress/decompress file block tools, new enrichment providers, deepseek models, db performance#5106

Open
waleedlatif1 wants to merge 26 commits into main from staging


waleedlatif1 commented Jun 17, 2026

Collaborator

waleedlatif1 and others added 9 commits June 16, 2026 11:21
…n each provider (#5099)

* improvement(models): sort model dropdown by latest release date within each provider

* fix(models): preserve input provider order and build catalog index once
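A minimal sketch of the ordering described in these two commits, assuming a catalog of entries with ISO `releaseDate` strings. `CatalogEntry`, `sortByLatestRelease`, and the `"unknown"` fallback provider are hypothetical names for illustration, not the actual implementation. The index is built once, providers keep their first-seen input order, and models are sorted newest-first inside each provider.

```typescript
// Hypothetical catalog shape; the real catalog type is not shown in the PR.
interface CatalogEntry {
  id: string;
  provider: string;
  releaseDate?: string; // ISO date, e.g. "2025-06-01"
}

function sortByLatestRelease(ids: string[], catalog: CatalogEntry[]): string[] {
  // Build the id -> entry index once instead of scanning the catalog per comparison.
  const index = new Map<string, CatalogEntry>();
  catalog.forEach((entry) => index.set(entry.id, entry));

  // Group ids by provider. Map preserves insertion order, so providers
  // stay in the order they first appear in the input.
  const groups = new Map<string, string[]>();
  ids.forEach((id) => {
    const provider = index.get(id)?.provider ?? "unknown";
    const group = groups.get(provider);
    if (group) group.push(id);
    else groups.set(provider, [id]);
  });

  // ISO dates compare correctly as strings; undated models sort last.
  const dateOf = (id: string): string => index.get(id)?.releaseDate ?? "";
  const out: string[] = [];
  groups.forEach((group) => {
    const sorted = group.slice().sort((a, b) => {
      const da = dateOf(a);
      const db = dateOf(b);
      return da < db ? 1 : da > db ? -1 : 0;
    });
    sorted.forEach((id) => out.push(id));
  });
  return out;
}
```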
…5100)

* feat(file): add Compress operation to bundle files into a .zip archive

* feat(file): add Decompress operation to extract .zip archives

Adds the inbound half of the archive pair: extracts a .zip back into the
workspace with zip-slip path sanitization, symlink skipping, and entry/
size caps to bound zip-bomb expansion. Extracted files are returned in the
files output, ready to chain downstream.
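The zip-slip sanitization mentioned above could look roughly like the sketch below. `sanitizeEntryPath` is a hypothetical helper, not the function used in this PR; it assumes entries are rejected outright (returned as `null`) when they are absolute or contain a `..` segment, rather than being rewritten.

```typescript
// Hypothetical helper: returns a safe relative path, or null if the entry
// must be skipped because it could escape the extraction root.
function sanitizeEntryPath(entryName: string): string | null {
  // Treat Windows separators like POSIX ones before inspecting segments.
  const normalized = entryName.replace(/\\/g, "/");

  // Reject absolute paths and drive-letter paths.
  if (normalized.charAt(0) === "/" || /^[A-Za-z]:/.test(normalized)) {
    return null;
  }

  // Drop empty and "." segments; any ".." segment makes the entry unsafe.
  const parts = normalized.split("/").filter((p) => p !== "" && p !== ".");
  if (parts.length === 0 || parts.some((p) => p === "..")) {
    return null;
  }
  return parts.join("/");
}
```

Rejecting instead of "fixing" a traversal path keeps the behavior easy to reason about: a skipped entry is never written anywhere.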

* fix(file): align archive ops with v5 output surface and zip mime

- Drop the single 'file' output reintroduced for compress/decompress; v5
  intentionally exposes only 'files' (plus id/name/size/url scalars), so
  compress/decompress reuse the existing surface with no new block output
- Add zip/gz to EXTENSION_TO_MIME (previously only in the reverse map), so
  archive extensions resolve to a real mime instead of octet-stream
- Update File v5 block test for the two new operations

* fix(file): harden compress naming per review

- Flatten zip entry names to a safe basename so untrusted fileInput names
  with .. or / cannot produce zip-slip entry paths (cursor)
- Treat archiveName as a flat name landing at the workspace root instead of
  passing it through splitWorkspaceFilePath, which silently created folders
  for names with separators (greptile)
- Add the upfront empty-input guard before any DB calls, matching the read
  and content operations (greptile)

* fix(file): make decompress extraction atomic and bound per-entry size

- Read and validate every entry before writing any file, so hitting a size
  cap no longer leaves partially-extracted files in the workspace (cursor)
- Enforce the per-entry cap on the materialized buffer in addition to the
  declared size, covering entries that omit an uncompressed size (cursor)
- Pre-check declared sizes up front to reject standard zip bombs before
  materializing, and return 422 when no files could be extracted (cursor)
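The validate-then-write flow in this commit can be sketched as below. `ZipEntry`, `Caps`, and `extractAtomically` are illustrative names, and the synchronous `read()` is an assumption made to keep the example small; the real route presumably reads entries asynchronously.

```typescript
// Hypothetical shapes for illustration only.
interface ZipEntry {
  name: string;
  declaredSize?: number; // may be absent in the archive metadata
  read(): Uint8Array;
}
interface Caps {
  maxEntries: number;
  maxEntryBytes: number;
  maxTotalBytes: number;
}

function extractAtomically(
  entries: ZipEntry[],
  caps: Caps,
  write: (name: string, data: Uint8Array) => void,
): number {
  if (entries.length > caps.maxEntries) throw new Error("too many entries");

  // Cheap pre-check on declared sizes rejects standard zip bombs early.
  let declaredTotal = 0;
  for (const entry of entries) {
    const declared = entry.declaredSize ?? 0;
    if (declared > caps.maxEntryBytes) throw new Error("entry too large (declared)");
    declaredTotal += declared;
  }
  if (declaredTotal > caps.maxTotalBytes) throw new Error("archive too large (declared)");

  // Materialize and validate every entry before writing anything, so a cap
  // hit cannot leave partially extracted files behind.
  const staged: { name: string; data: Uint8Array }[] = [];
  let total = 0;
  for (const entry of entries) {
    const data = entry.read();
    // Checked on the real buffer too, covering entries with no declared size.
    if (data.byteLength > caps.maxEntryBytes) throw new Error("entry too large (actual)");
    total += data.byteLength;
    if (total > caps.maxTotalBytes) throw new Error("archive too large (actual)");
    staged.push({ name: entry.name, data });
  }

  // Only now touch the destination.
  for (const item of staged) write(item.name, item.data);
  return staged.length;
}
```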

* fix(file): exclude skipped entries from caps and reject multi-archive decompress

- Resolve safe (sanitized) zip entries up front so unsafe/skipped entries
  no longer count toward the per-entry and total uncompressed-size caps (cursor)
- Reject decompress input that resolves to more than one archive with a clear
  error instead of silently extracting only the first (cursor)

* fix(file): enforce single-archive decompress at the API boundary

The block already rejects multiple archives, but the manage route is the
real boundary (callable directly and by the LLM tool) and still took the
first of multiple resolved inputs. Add the empty-input and >1-archive guards
in the route so extra archives are rejected with a clear error rather than
silently ignored (cursor).

* docs(file): correct compress description and stale file-output references

- Drop the misleading 'under provider upload limits' claim from the compress
  tool description (models cannot read zip archives)
- Fix bestPractices to reference the 'files' output, not a non-existent 'file'
- Remove the stale 'file' property from the compress test fixture so it
  matches the real API response (greptile)
…emoize Anthropic client (#5098)

* perf(execution): parallelize preflight gates, cache deployed state, memoize Anthropic client

- Memoize Anthropic + Azure-Anthropic SDK clients (new client-cache.ts) keyed
  by apiKey (+beta header; +baseURL/version/pinnedIP for Azure) so HTTP
  keep-alive connections are reused instead of a fresh TLS handshake per call.
  apiKey is the tenant boundary.
- Parallelize the read-only preflight gates in preprocessing.ts (ban +
  subscription, then usage + org-member + rate-limit) while preserving exact
  error precedence (ban 403 -> usage 402 -> rate 429) and keeping the sole
  write (admission reservation) last.
- Parallelize the independent workflow-state and env-var loads in execution-core.
- Cache deployed workflow state by immutable deploymentVersionId with
  deep-clone-on-read, oldest-first eviction, and a 5-min TTL bounding the
  credential-mapping edge across ECS tasks.
- Parallelize the independent personal-subscription + membership queries in
  getHighestPrioritySubscription.
- BYOK: drop the redundant getWorkspaceById existence check (auth already
  validates the workspace); read the key list fresh every call for zero
  cross-instance staleness.

Billing/usage/ban/permission reads stay fresh on the primary (no cache, no
replica). Adds tests for every new mechanism and fixes a pre-existing vitest
class-mock incompatibility that had execution-core.test.ts fully red on staging.

* fix(execution): run rate-limit gate only after ban/usage pass

The rate-limit gate is not read-only — checkRateLimitWithSubscription consumes
a token — so running it in parallel with the read-only gates debited rate-limit
quota for requests that the ban (403) or usage (402) gates reject, which the
original sequential flow never did.

Move the rate-limit gate to run sequentially after the ban and usage gates pass,
preserving the read-only gates' parallelism (ban + subscription + usage) and the
exact ban -> usage -> rate precedence. Add regression tests asserting the rate
limiter is not consumed when an earlier gate rejects, and is consumed once when
they pass.
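A rough sketch of the resulting gate order, assuming each gate is an async function returning a result object. `GateResult`, `Gate`, and `runPreflight` are hypothetical names, and the subscription read is left out for brevity. The read-only gates run concurrently, precedence is decided after both settle, and the token-consuming rate-limit gate runs only if they pass.

```typescript
// Hypothetical result shape; status codes follow the commit message
// (ban 403 -> usage 402 -> rate 429).
interface GateResult {
  ok: boolean;
  status?: number;
}
type Gate = () => Promise<GateResult>;

async function runPreflight(gates: {
  ban: Gate;
  usage: Gate;
  rateLimit: Gate;
}): Promise<GateResult> {
  // Read-only gates have no side effects, so they can run in parallel.
  const [ban, usage] = await Promise.all([gates.ban(), gates.usage()]);

  // Precedence is applied after both settle: ban wins over usage.
  if (!ban.ok) return ban;
  if (!usage.ok) return usage;

  // The rate limiter consumes a token, so it only runs once the
  // earlier gates have passed.
  return gates.rateLimit();
}
```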

Caught by Cursor Bugbot review.

* chore(execution): trim redundant preflight comments

Tighten the gate overview to match the sequential rate-limit gate and drop
inline notes that duplicated it or the runRateLimitGate doc.

* refactor(cache): address review — idle TTL for client cache, LRUCache for deployed state

- client-cache: add updateAgeOnGet so the TTL is genuinely idle-based (active
  clients keep their warm keep-alive connections; the JSDoc now matches behavior).
- deployed-state: replace the hand-rolled Map + manual FIFO eviction/TTL with
  LRUCache (real LRU eviction, built-in TTL), matching the effectiveDecryptedEnv
  and integration-tool-schema caches. TTL stays absolute (not reset on read) so
  the credential-migration remap still propagates across ECS tasks.

Both per review feedback from Greptile.
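The difference between the idle TTL (client cache) and the absolute TTL (deployed state) can be illustrated with a small hand-rolled stand-in. This is not the LRUCache package and omits size-bounded eviction; `TtlCache` and its injected `now` clock are assumptions for the sketch, with `updateAgeOnGet` named after the option the commit mentions.

```typescript
// Minimal TTL cache: when updateAgeOnGet is true, every hit refreshes the
// entry's age (idle TTL); when false, the entry expires a fixed time
// after it was set (absolute TTL).
class TtlCache<V> {
  private entries = new Map<string, { value: V; stamp: number }>();
  private ttlMs: number;
  private updateAgeOnGet: boolean;
  private now: () => number;

  constructor(ttlMs: number, updateAgeOnGet: boolean, now: () => number) {
    this.ttlMs = ttlMs;
    this.updateAgeOnGet = updateAgeOnGet;
    this.now = now;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, stamp: this.now() });
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() - entry.stamp >= this.ttlMs) {
      this.entries.delete(key);
      return undefined;
    }
    if (this.updateAgeOnGet) entry.stamp = this.now();
    return entry.value;
  }
}
```

With an absolute TTL a frequently read entry still expires on schedule, which is the property the commit relies on for the credential-migration remap to propagate.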

* test(execution): isolate rate-limit gate test from STEP 7 reservation

The 'consumes the rate-limit gate once' test reached the STEP 7 admission
reservation, which depends on Redis — it passed locally (reserve throws and is
swallowed) but failed in CI (reserve returns not-reserved -> 429). Pass
skipConcurrencyReservation so the test isolates the rate gate deterministically.

* perf(providers): memoize SDK clients where the pool is per-client (bedrock, vllm)

Generalize the Anthropic client cache into one shared memoizer
(providers/client-cache.ts) and apply it only where each new client owns its own
connection pool — so reuse actually keeps connections warm:

- bedrock: AWS SDK clients hold a per-client connection pool (reuse is the AWS
  best practice). Keyed by region + credential identity.
- vllm: a pinned endpoint creates its own undici Agent per call; key by the
  resolved IP so DNS re-validation still runs each request.
- anthropic + azure-anthropic: migrated onto the shared memoizer.

Deliberately NOT applied to the OpenAI-compatible providers, groq, cerebras, or
google: their SDKs share a process-global keep-alive pool (Node openai-sdk module
singleton agent; anthropic/global undici), so a fresh client per request already
reuses connections and memoization would add complexity with ~no benefit. litellm
uses a plain shared-agent client (no pinning) and is likewise skipped.

Bounded LRU (max 1000, 30m idle TTL) with no close-on-eviction, avoiding the
unbounded-growth and eviction-closes-in-use-client failure modes seen in similar
client caches.

* chore(perf): trim verbose comments to terse why-notes

* chore(perf): drop obvious inline comments, keep nuance as TSDoc

* fix(bedrock): key client cache on full credential, not just access key id

A corrected secret under the same access key id would otherwise keep serving the
stale cached client until TTL/eviction. Caught by Cursor Bugbot.

* test(execution,providers): fix preflight mock reset + isolate provider client cache in tests

- preprocessing.test: re-establish the checkOrgMemberUsageLimit mock in beforeEach
  (the only gate mock not re-set). In the full suite its implementation was reset
  so the success-path test got undefined -> threw -> 500 -> success:false. Mirrors
  how checkServerSideUsageLimits is handled.
- client-cache: add clearProviderClientCacheForTests; call it in the bedrock and
  vllm test beforeEach so construction assertions always start from a cache miss
  now that those providers memoize their client.

* test(execution): make RateLimiter mock constructable under vitest 4.x

The RateLimiter mock used an arrow factory (vi.fn(() => ({...}))). vitest 4.x
(CI) rejects `new` on an arrow-implemented mock ("not a constructor"); 3.2.4
allowed it. The new rate-gate test is the first to actually `new RateLimiter()`,
so it surfaced the failure only in CI. Switch the mock to a regular function and
drop the speculative beforeEach re-establishments that didn't address it.
…crease connector limits + better error propagation (#5089)

* fix(execution,connectors): offload large function inputs; harden KB connector size limits

Addresses a class of 10 MB limit failures:

- executor/variables: offload over-budget function block-output context values to
  durable large-value refs (lazy `sim.values.read`) so JS function blocks can merge
  medium files without exceeding the 10 MB inter-block request-body cap.
- connectors: stream downloads via `readBodyWithLimit` (memory-safe), and surface
  oversized files as visible `failed` KB documents instead of silently dropping them
  — listing-time for github/s3/dropbox/onedrive/sharepoint, fetch-time for
  gitlab/azure/google-drive via a shared `ConnectorFileTooLargeError`. Raise the
  per-file cap from a hardcoded 10 MB to the canonical 100 MB KB document limit
  (`CONNECTOR_MAX_FILE_BYTES`), except Google Drive's export path (Google's hard
  10 MB export-API limit).
- sync-engine: `classifyExternalDoc` + bulk `skipDocuments` (failed rows with a
  reason, excluded from retry), byte-bounded batch concurrency to cap peak worker
  memory at the raised cap, and a `metadata.fileSize ?? size` fallback.
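A minimal sketch of listing-time classification with the `metadata.fileSize ?? size` fallback. The names `classifyExternalDoc` and `CONNECTOR_MAX_FILE_BYTES` come from the commit message, but the signature, the return shape, and the exact byte value of "100 MB" used here are assumptions.

```typescript
// Assumed value: 100 MB expressed in binary megabytes.
const CONNECTOR_MAX_FILE_BYTES = 100 * 1024 * 1024;

// Hypothetical document and result shapes.
interface ExternalDoc {
  id: string;
  size?: number;
  metadata?: { fileSize?: number };
}
type Classification =
  | { action: "sync" }
  | { action: "skip"; reason: string };

function classifyExternalDoc(
  doc: ExternalDoc,
  maxBytes: number = CONNECTOR_MAX_FILE_BYTES,
): Classification {
  // Prefer the connector-reported metadata size, then the top-level size.
  const bytes = doc.metadata?.fileSize ?? doc.size;

  // Unknown sizes pass through here; they would be caught at fetch time.
  if (bytes !== undefined && bytes > maxBytes) {
    return {
      action: "skip",
      reason: `File is ${bytes} bytes, over the ${maxBytes}-byte limit`,
    };
  }
  return { action: "sync" };
}
```

Returning a reason instead of dropping the document is what lets the sync engine record a visible failed row.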

* fix zoom

* update skill

* address comments + fix terminal event in sse stream

* fix accounting issue
#5087)

* feat(integrations): hosted email-enrichment providers + cascade wiring

Add Datagma, Dropcontact, LeadMagic, Icypeas, and Enrow integrations —
tools, blocks, brand icons, and BYOK + metered hosted-key support — and
register each in the tool/block registries and BYOK provider list.

Wire the new finders/verifiers into the enrichment cascades:
- work-email: Datagma, LeadMagic, Dropcontact, Icypeas, Enrow
- phone-number: LeadMagic, Datagma, Dropcontact
- email-verification: Icypeas, Enrow
- company-info: Datagma, LeadMagic
- company-domain: Datagma

Add hosting tests for all five providers and cascade tests covering the
new providers (incl. new test files for email-verification, company-info,
and company-domain).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(enrichment): address PR review on Icypeas success + Datagma billing

- Icypeas find_email/verify_email postProcess return success:true for all
  terminal statuses (NOT_FOUND/DEBITED_NOT_FOUND included) so the cascade
  runner calls mapOutput and records invalid/not-found verdicts instead of
  throwing and inflating the error count
- Bill Icypeas verify FOUND (not just DEBITED*) per the documented 0.1-credit
  charge
- Datagma enrich_person only applies the 30-credit phone surcharge when a
  phone lookup (phoneFull) was requested
- Note Datagma's URL-param (apiId) auth in the hosted-key doc comment
- Update hosting tests to match

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(enrichment): only bill Enrow verify on a completed verification

getCost returned a flat 0.25 credits regardless of output, so a job that
fell back to the initial submit response (poll never completed, no
qualification) was still metered. Charge 0.25 only when a qualification is
present; 0 otherwise. Add a no-qualification test case.
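The corrected billing rule is small enough to sketch directly. `getVerifyCost` and the output shape are hypothetical; only the 0.25-credit figure and the "qualification present" condition come from the commit message.

```typescript
const ENROW_VERIFY_CREDITS = 0.25;

// Hypothetical output shape: qualification is missing when the poll never
// completed and the job fell back to the initial submit response.
function getVerifyCost(output: { qualification?: string | null }): number {
  return output.qualification ? ENROW_VERIFY_CREDITS : 0;
}
```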

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* chore(enrichment): peg hosted credit cost to each provider's lowest paid plan

Align *_CREDIT_USD to the entry tier Sim will provision:
- Datagma: Regular $49/3,000 emails → $0.0163 (was Popular $0.0132)
- LeadMagic: Basic $49/2,000 → $0.0245 (was Growth $0.0104)

Icypeas (Basic $0.019), Enrow (Starter $0.012), and Dropcontact (Starter
~$0.17) already reflect their lowest plan. Tests derive from the constants,
so values stay consistent.
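The per-credit figures above follow from dividing the plan price by its included volume. The sketch below reproduces that arithmetic; rounding to four decimal places is an assumption chosen to match the quoted values, and `creditUsd` is a hypothetical helper.

```typescript
// Price per credit in USD, rounded to 4 decimal places (assumed rounding).
function creditUsd(planUsd: number, credits: number): number {
  return Math.round((planUsd / credits) * 10000) / 10000;
}

const datagmaCreditUsd = creditUsd(49, 3000); // Regular: $49 / 3,000 emails
const leadMagicCreditUsd = creditUsd(49, 2000); // Basic: $49 / 2,000
```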

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(enrichment): address PR review on mononyms + Icypeas verify email map

- work-email LeadMagic: pass full_name + domain so single-token (mononym)
  names are no longer skipped
- work-email Icypeas: firstname/lastname are optional on the API, so run a
  mononym with firstname alone instead of self-skipping
- icypeas_verify_email mapItem reads item.email (verify payload shape) with a
  fallback to the nested results.emails[0].email

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(enrichment): case-insensitive Enrow find billing

getCost compared qualification to exactly 'valid' while the cascade
normalizes with toLowerCase(), so a differently-cased API qualifier could
zero out billing on a valid email. Lowercase before comparing; add a test.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(enrichment): drop Dropcontact from the phone-number cascade

Dropcontact is an email/company-data enrichment service, not a phone-discovery
provider — its phone/mobile_phone fields are unreliable and were surfacing
firmographic data (an employee-count range like "5000-20077") as the phone.
Keep the two purpose-built phone finders (LeadMagic find_mobile, Datagma
find_phone); Dropcontact stays in the work-email and company cascades where
its data is reliable.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(enrichment): only accept valid-qualified Enrow emails in work-email

Enrow's finder qualifies each email valid/invalid. The work-email mapOutput
accepted any non-empty email, so an invalid-qualified address could fill the
cell while hosted billing (which only charges on valid) charged zero. Gate the
cell on qualification === 'valid', consistent with billing.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
* fix(input-format): field not editable race condition

* remove dead code

* simplify
…ot-path write cleanups (#5105)

* perf(db): logs-list index, drop redundant indexes, replica routing, hot-path write cleanups

* fix(logs): keep /api/v1/logs on primary db — its permissions join is the auth gate, not replica-safe
…eletons (#5104)

* fix(sidebar): prefetch chats + workflows so cold loads don't flash skeletons

On a cold load (e.g. when the browser discards an idle tab and reloads),
the persistent sidebar started with an empty React Query cache and
client-fetched its chat + workflow lists, flashing loading skeletons.

Prefetch both lists server-side in the workspace layout and hydrate them
via HydrationBoundary, under the same query keys and mappers the client
hooks use, so the sidebar paints populated on the first render. The
prefetch runs concurrently with the existing org-settings fetch and
never throws, so it adds no blocking work in the common case and falls
back to client fetching on error.

* refactor(prefetch): call data layer directly instead of internal HTTP self-fetch

The sidebar and settings prefetches fetched their data by making internal
HTTP requests to our own API routes. Replace those self-fetches with direct
calls to shared server-side data functions, so each route handler and its
prefetch read from one source with no extra network hop, serialization, or
re-auth.

- Extract listWorkflowsForUser (lib/workflows/queries) and listMothershipChats
  (lib/copilot/chat) from their routes; both routes and the sidebar prefetch
  now call them.
- Extract getUserSettings/getUserProfile (lib/users/queries) shared by the
  settings/profile routes and their prefetches.
- Subscription prefetch calls the existing getSimplifiedBillingSummary +
  getEffectiveBillingStatus directly.
- Sidebar prefetch checks workspace access once via checkWorkspaceAccess and
  skips silently when denied.

* refactor(prefetch): share mothership chat list staleTime constant

Export MOTHERSHIP_CHAT_LIST_STALE_TIME from the chats hook and use it in both
useMothershipChats and the sidebar prefetch, mirroring WORKFLOW_LIST_STALE_TIME
so the prefetch and client hook can't drift.

* fix(prefetch): keep subscription prefetch on the wire shape via internal billing API

The billing summary returns Date fields (and an untyped metadata blob) that the
JSON API serializes to strings. Calling the data layer directly would cache Date
objects (App Router preserves them through RSC serialization), mismatching the
string wire shape the client useSubscriptionData hook caches. Route the
subscription prefetch through the internal billing API so server-hydrated and
client-fetched data share the exact same shape. The date-free general-settings
and profile prefetches keep calling the data layer directly.
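The shape mismatch is easy to reproduce in isolation: a JSON round trip turns `Date` fields into ISO strings. In this sketch `toWireShape` and the `periodEnd` field are hypothetical stand-ins for the billing summary and the JSON API boundary.

```typescript
// Simulates what the JSON API does to a server-side object.
function toWireShape(value: unknown): any {
  return JSON.parse(JSON.stringify(value));
}

// Data-layer shape: a real Date object.
const serverSummary = { periodEnd: new Date("2026-07-01T00:00:00.000Z") };

// Wire shape: the same field is now an ISO string.
const wireSummary = toWireShape(serverSummary);
```

A client hook that caches the wire shape would see a `Date` under the same query key if the prefetch skipped this serialization step, which is the drift this commit avoids.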

vercel Bot commented Jun 17, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| docs | Ready | Preview, Comment | Jun 18, 2026 2:43am |



cursor Bot commented Jun 17, 2026


PR Summary

Low Risk
Changes are mostly documentation, CI pinning, and CODEOWNERS; runtime risk is low aside from future supply-chain enforcement once bun audit becomes blocking.

Overview
This diff tightens CI and supply-chain hygiene and brings public integration docs in line with shipped product behavior from the broader v0.7.10 release.

CI / repo governance: GitHub Actions across workflows move from floating @vN tags to commit-SHA pins. test-build adds a bun audit step (continue-on-error until backlog is triaged). Release workflow pins Bun 1.3.13 instead of latest. CODEOWNERS assigns @simstudioai/deps to root and nested package.json, bun.lock, bunfig.toml, and .npmrc.

Review guidance: The memory-load-check agent skill adds KB connector file-download rules—CONNECTOR_MAX_FILE_BYTES, readBodyWithLimit, listing-time skip, and visible skipped rows—and a red flag against silently dropping oversized connector files.

Docs / UX catalog: New integration pages and icons for Datagma, Dropcontact, Enrow, Icypeas, and LeadMagic (plus meta.json and icon mapping). The File integration doc adds compress/decompress, clearer per-action inputs/outputs, and updated descriptions. Large regeneration passes fix truncated action copy and I/O tables for ClickHouse, Google (Sheets, Calendar, Forms, Maps Pollen/Solar, Custom Search, Slides, Groups), Kalshi, Azure DevOps, and others. Extend, Mistral Parse, and Pulse parser docs now require a single file input instead of URL/upload variants.

Reviewed by Cursor Bugbot for commit 597d7ea.

…-UI create gaps (#5107)

* fix(locks): enforce workflow/folder locks on the agent + close manual-UI create gaps

The copilot/agent workflow & folder mutation tools and the edit_workflow
tool bypassed lock enforcement, so the agent could edit a locked workflow
and move/create workflows into a locked folder. Add assertWorkflowMutable/
assertFolderMutable guards (from @sim/workflow-authz) to every agent
mutation path, mirroring the REST API.

Also close two parity gaps on the manual-UI REST side: creating a workflow
into a locked folder and creating a subfolder under a locked parent were
previously unguarded. The realtime collaborative canvas already enforced
workflow-level locks server-side.

* fix(locks): normalize optional folderId to null for assertFolderMutable

* refactor(locks): hoist constant folder-lock check out of move loop; scope test mocks with Once

* refactor(locks): drop redundant ensureWorkflowAccess fetch in rename
…rate integration docs (#5109)

* improvement(integrations): validate BigQuery/Forms/PageSpeed + regenerate integration docs

- BigQuery: mark null-defaulted outputs optional (get_table type/numRows/numBytes/creationTime/lastModifiedTime/location, list_datasets location, list_tables type, query totalBytesProcessed)
- Google Forms: add response pagination (pageToken + filter params, nextPageToken output), fix pageSize visibility, advanced-mode pagination subBlocks + filter wandConfig
- PageSpeed: add a 7th BlockMeta template (competitor benchmark)
- Regenerate integration docs; add manual intro sections to new datagma/dropcontact/enrow/icypeas/leadmagic pages

* fix(docs-gen): preserve apostrophes in tool descriptions when generating docs

The doc generator extracted tool descriptions with a character class that
excluded both quote types (['"]([^'"]...)['"]), so a double-quoted description
containing an apostrophe (e.g. "Find someone's email") was truncated at the
apostrophe — the generated docs/catalog showed stubs like "Find someone".

Anchor extraction on the actual opening quote (single/double/backtick), matching
the existing extractDescription helper, in both buildToolDescriptionMap and
extractToolInfo. Regenerated docs restore full descriptions across all affected
integrations (Apollo, Ahrefs, LeadMagic, Findymail, OpenAI, Slack, etc.).
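The truncation can be reproduced with a simplified pattern of the same shape. Both regexes below are illustrative reductions, not the generator's actual expressions: the first excludes both quote types from the captured text, the second anchors on whichever quote opened the string by using a backreference.

```typescript
// Buggy shape: the character class stops at the first quote of either kind.
const BUGGY_DESCRIPTION = /description:\s*['"]([^'"]*)['"]/;

// Fixed shape: capture the opening quote, then consume anything that is not
// that same quote (allowing backslash escapes), and close on the same quote.
const FIXED_DESCRIPTION = /description:\s*(['"`])((?:\\.|(?!\1).)*)\1/;

function extractBuggy(source: string): string | null {
  const match = BUGGY_DESCRIPTION.exec(source);
  return match ? match[1] : null;
}

function extractFixed(source: string): string | null {
  const match = FIXED_DESCRIPTION.exec(source);
  return match ? match[2] : null;
}
```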

* fix(docs-gen): resolve tools defined in a sibling file + scope params per tool

The doc generator located a tool's definition only by filename convention
(decompress.ts / index.ts), so file_decompress — which lives in compress.ts
alongside file_compress — fell back to index.ts and rendered an empty Input
table. It also read the params block from the first tool in a multi-tool file,
so every tool in such a file inherited the first tool's inputs/outputs.

- getToolInfo: when no candidate file declares the exact tool ID, scan the whole
  tool-prefix directory for the file that does.
- extractToolInfo: read the params block scoped to the specific tool, falling
  back to the full file for tools that inherit params via spread.

Regenerated docs eliminate ~50 empty/incorrect input tables across integrations
(clickhouse, rb2b, reddit, file, etc.); param-less OAuth-only tools correctly
keep an empty input table.
…tte (#5110)

* feat(search): actions, fuzzy matching, and highlighting in cmd+k palette

Add a context-aware actions layer to the cmd+k search palette (Run workflow,
Create workflow/folder, Import workflow, Fit to view, Copy link, Invite
teammates, Toggle theme), replace the substring matcher with a boundary-anchored
fuzzy matcher (initialisms, typos, multi-word) that is a strict superset of the
old behavior, highlight matched characters, and rank against clean human text
instead of structural id/uuid tokens. Expose invoke() on the global commands
provider so the palette runs real registered commands.

* fix(search): highlight the matched substring, not an earlier scattered occurrence

Contiguous substring matches (exact/prefix/contains) now report the substring's
own indices instead of the greedy subsequence scan positions, so HighlightedText
bolds the characters the user actually matched. Restructures fuzzyMatch to handle
the substring tier first; scores are unchanged for these cases.
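A stripped-down sketch of the substring-first ordering, with scoring, boundary anchoring, and typo tolerance left out. `fuzzyMatch` here is a simplified stand-in for the real function and returns only match positions; it also assumes lowercasing does not change string length.

```typescript
interface FuzzyResult {
  matched: boolean;
  positions: readonly number[];
}

function fuzzyMatch(text: string, query: string): FuzzyResult {
  const t = text.toLowerCase();
  const q = query.toLowerCase();
  if (q.length === 0) return { matched: false, positions: [] };

  // Substring tier first: report the contiguous match's own indices.
  const at = t.indexOf(q);
  if (at !== -1) {
    const positions: number[] = [];
    for (let i = 0; i < q.length; i++) positions.push(at + i);
    return { matched: true, positions };
  }

  // Fallback: greedy left-to-right subsequence scan.
  const positions: number[] = [];
  let qi = 0;
  for (let ti = 0; ti < t.length && qi < q.length; ti++) {
    if (t[ti] === q[qi]) {
      positions.push(ti);
      qi++;
    }
  }
  return qi === q.length
    ? { matched: true, positions }
    : { matched: false, positions: [] };
}
```

For the text "fellow workflow" and the query "flow", a greedy scan alone would pick the scattered f, l, o, w in "fellow"; handling the substring tier first highlights the contiguous "flow" in "workflow" instead.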

* fix(search): log clipboard copy failures and make fuzzy positions read-only

- Copy workflow link now logs on clipboard write failure instead of silently
  swallowing the error, matching the sidebar's copy-link convention.
- FuzzyResult.positions is now readonly and the NO_MATCH singleton's array is
  frozen, so the shared instance can never be mutated by a caller.
… flashes (#5111)

* fix(realtime): debounce the reconnecting toast to stop transient-blip flashes

The "Reconnecting..." persistent toast fired the instant isReconnecting
flipped true, so sub-second transport blips that self-heal on the first
retry flashed a scary alert. Add useStableFlag, an anti-flicker boolean
that delays the rising edge (2s, so brief blips never surface) and holds
the falling edge (1.5s min visible, so a drop just past the delay does not
flash-and-vanish). The socket flag stays accurate; only the user-facing
alarm is smoothed. State machine extracted into a framework-agnostic
controller with unit coverage for both flicker modes.
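The anti-flicker state machine could be sketched as a controller with an injected clock and scheduler, which is one way to make it testable without a framework. `createStableFlag`, its option names other than `delayMs`/`minVisibleMs`, and the `Schedule` type are assumptions, not the PR's actual API.

```typescript
type Cancel = () => void;
// Hypothetical scheduler: runs fn after ms and returns a cancel handle.
type Schedule = (fn: () => void, ms: number) => Cancel;

interface StableFlagOptions {
  delayMs: number; // rising-edge delay (2000 in the commit)
  minVisibleMs: number; // minimum time the flag stays true (1500 in the commit)
  now: () => number;
  schedule: Schedule;
  onChange: (value: boolean) => void;
}

function createStableFlag(opts: StableFlagOptions) {
  let stable = false;
  let shownAt = 0;
  let cancel: Cancel | null = null;

  const clear = (): void => {
    if (cancel) cancel();
    cancel = null;
  };
  const set = (value: boolean): void => {
    if (stable === value) return;
    stable = value;
    if (value) shownAt = opts.now();
    opts.onChange(value);
  };

  return {
    get active(): boolean {
      return stable;
    },
    update(raw: boolean): void {
      clear(); // any pending edge is superseded by the new raw value
      if (raw) {
        // Delay the rising edge so brief blips never surface.
        if (!stable) cancel = opts.schedule(() => set(true), opts.delayMs);
      } else if (stable) {
        // Hold the falling edge until the minimum visible time has elapsed.
        const remaining = opts.minVisibleMs - (opts.now() - shownAt);
        if (remaining <= 0) set(false);
        else cancel = opts.schedule(() => set(false), remaining);
      }
    },
  };
}
```

In a browser, `schedule` could wrap `setTimeout`/`clearTimeout` and `now` could be `Date.now`.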

* fix(realtime): reset stable-flag React state on options change; de-vacuous blip test

Address Greptile review:
- useStableFlag: reset React state to the fresh controller's baseline when
  the controller is recreated on an options change, so a dynamic consumer
  changing delayMs/minVisibleMs while active with value already false can no
  longer strand the flag at true.
- test: read the live probe.active getter in the blip test instead of a
  destructured snapshot, which was bound to false at destructure time and
  made the assertion vacuous.
…sign system (#5114)

* improvement(search): align cmd+k action icons + highlight with the design system

- Each Actions verb now uses the exact icon from its real location: Fit to view
  -> Scan (workflow-controls), Copy workflow link -> Duplicate (nav context menu),
  Invite teammates -> User (settings teammates nav). Run/Create/Import already matched.
- Remove the Toggle theme action and its now-dead useTheme wiring.
- Matched-text highlight now uses the design-system search tokens
  (--highlight-match-bg / --highlight-match-text), matching the SearchHighlight
  component used in knowledge-base and code search, instead of an ad-hoc font-semibold.

* improvement(search): use font-medium for matched-text emphasis in cmd+k

Drop the colored background highlight in favor of the design system's standard
emphasis weight (font-medium, used by Button/Label/Input/Table). Lighter than
the previous semibold and avoids a background, keeping the palette's clean,
undecorated text style.
…I fixes across Google integrations (#5113)

* feat(google): add Maps Pollen/Solar, expand Custom Search, fix Ads/Groups/Contacts/Slides

New capability:
- Google Maps: add Pollen Forecast and Solar Potential tools (API-key, google_cloud BYOK)
- Google Custom Search: add start/dateRestrict/fileType/safe/searchType/siteSearch/
  siteSearchFilter/lr/gl/sort params, htmlTitle/htmlSnippet/formattedUrl/mime/fileFormat/
  cacheId/image result fields, and nextPageStartIndex pagination

Fixes (validated against live API docs):
- Google Ads: bump all tools from sunset v19 to v24
- Google Groups: forward OAuth credential under oauthCredential (was dropping token in 11
  ops), forward all update_settings fields, JSON.stringify update_settings/add_alias bodies
- Google Contacts: include required metadata.sources[].etag in updateContact body (fixed 400)
- Google Slides: remove unsupported GIF thumbnail mimeType (API only allows PNG)
- Google Sheets: wire delete_rows/delete_sheet/delete_spreadsheet into the V2 block
- Google Custom Search: throw on API error responses instead of returning empty success;
  num optional + Number-coerced; pagemap typed unknown

* docs(google): regenerate integration docs for new and updated operations

* fix(google_maps): correct Solar requiredQuality enum to BASE

The Solar API ImageryQuality enum is HIGH/MEDIUM/BASE (+ UNSPECIFIED) per the
live docs; there is no LOW. Selecting "Low" sent requiredQuality=LOW which the
API rejects as INVALID_ARGUMENT, and the valid BASE tier was unreachable.
Replace LOW with BASE in the tool param/output descriptions, the type union,
and the block dropdown.

* fix(google_maps): guard !response.ok in Pollen/Solar; use ?? for color channels

Address Greptile review:
- Pollen and Solar transformResponse now check !response.ok || data.error
  (matches the Custom Search fix); a gateway error without an error key in the
  body no longer returns empty/zeroed output silently.
- Pollen color channels use ?? instead of || so a legitimate 0 isn't treated
  as missing (consistent with the other numeric fields in the file).

* fix(google_maps): guard against NaN days in Pollen forecast

Address Cursor Bugbot: a non-numeric `days` input parsed to NaN and was
forwarded as `days=NaN` (the tool's `?? 1` only catches undefined, not NaN),
breaking the forecast call. The block now coerces invalid input to undefined,
and the tool defaults to 1 unless `days` is a finite number.

* fix(google): clamp Pollen days to 1-5; stop forwarding stale group settings fields

Address Cursor Bugbot:
- Pollen: clamp days to the documented 1-5 range (truncating fractionals) so 0,
  negatives, or >5 can't be sent to the API.
- Google Groups update_settings: the block has no dedicated settings subblocks,
  so forwarding name/description from params could leak stale values from
  create_group/update_group and unintentionally rename the group. Forward only
  oauthCredential + groupEmail from the block (the tool's own param schema still
  exposes the settings fields for the agent path).
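The Pollen `days` handling from the last two commits (default to 1 unless finite, truncate, clamp to 1-5) can be condensed into one sketch. `normalizePollenDays` is a hypothetical helper that merges the block-side and tool-side checks into a single function for illustration.

```typescript
// Returns a whole number of forecast days in the documented 1-5 range.
function normalizePollenDays(input: unknown): number {
  const n =
    typeof input === "number" ? input : Number.parseFloat(String(input));

  // NaN, Infinity, undefined, and non-numeric strings all fall back to 1.
  if (!Number.isFinite(n)) return 1;

  // Drop the fractional part, then clamp into [1, 5].
  return Math.min(5, Math.max(1, Math.trunc(n)));
}
```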

* fix(google_sheets): fail fast on non-numeric delete indices

Address Cursor Bugbot: delete_sheet/delete_rows parsed deleteSheetId/startIndex/
endIndex with Number.parseInt but didn't validate, so non-numeric UI input became
NaN and was forwarded (the v2 delete tools only reject null/undefined), breaking
the batchUpdate. The block now throws a clear error when any of these is not a
valid number.
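A fail-fast parse along these lines would surface the problem at the block instead of at `batchUpdate`. `parseRequiredIndex` is a hypothetical helper and its error wording is an assumption; like `Number.parseInt`, it still accepts strings with a numeric prefix.

```typescript
// Parses a UI-provided index and throws a clear error instead of
// forwarding NaN to the API.
function parseRequiredIndex(raw: unknown, field: string): number {
  const n =
    typeof raw === "number" ? raw : Number.parseInt(String(raw), 10);
  if (!Number.isInteger(n)) {
    throw new Error(`${field} must be a valid number, got ${JSON.stringify(raw)}`);
  }
  return n;
}
```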

* fix(google_search): clamp num to 1-10 and normalize start

Address Cursor Bugbot: num was coerced with Number() but not bounded, so values
like 11 or fractionals reached the API and failed. The tool now truncates and
clamps num to the documented 1-10 range and only sends a positive integer start,
ignoring non-numeric/out-of-range input.
…t shapes + harden tools (#5112)

* improvement(supabase): add Edge Functions tool; correct storage output shapes + harden tools

- Add supabase_invoke_function tool (POST/GET/PUT/PATCH/DELETE /functions/v1/{name})
- upsert: support on_conflict; storage upload: support cache-control
- Fix storage copy/move/upload/delete-bucket output properties to match live API
- get_public_url: build URL via directExecution (no spurious network call)
- text_search: validate column identifier
- Strip non-TSDoc section-label comments

* fix(supabase): harden rpc/text_search identifiers; drop unused get_public_url apiKey

- rpc: validate + encode functionName (SSRF/injection parity with vector_search)
- text_search: validate language config interpolated into the PostgREST operator
- get_public_url: remove unused apiKey param + dead auth headers (public endpoint needs no auth)
- create_bucket: tighten output description to match the {name}-only response

* improvement(tavily): mark optional params advanced; fix empty content output

- Mark 29 optional search/extract/crawl/map subBlocks as mode: advanced (keep query/urls/url/apiKey basic)
- Fix search transformResponse: populate content from result.content (was result.snippet, always empty)
- Guard data.results with ?? []; correct country placeholder to a lowercase name
- Rewrite stale TavilySearch/Extract response interfaces; drop dead duplicate interfaces

* fix(supabase): valid Cache-Control directive on upload; clarify functionName description

- storage upload: expand a bare numeric cache-control to `max-age=<n>` (a raw number is not a valid Cache-Control header)
- block: functionName input description now covers RPC, vector search, and Edge Function invoke
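
The cache-control expansion might be implemented along these lines; `normalizeCacheControl` is a hypothetical name used only for this sketch:

```typescript
/**
 * Expand a bare number of seconds into a valid directive, and pass through
 * anything that already looks like a directive string.
 */
function normalizeCacheControl(value: string | number | undefined): string | undefined {
  if (value === undefined) return undefined
  const text = String(value).trim()
  if (text === '') return undefined
  return /^\d+$/.test(text) ? `max-age=${text}` : text
}
```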

* fix(supabase): edge-function error handling + reject array headers

- invoke_function: drop unreachable !response.ok branch (executor throws on non-OK before transformResponse runs and surfaces the error body); document the success-only contract
- invoke_function: ignore non-object (array) headers so JSON arrays can't produce numeric-index header names
- block: reject array/non-object Edge Function headers with a clear error in config.params
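
To see why arrays need rejecting: `Object.entries(['a', 'b'])` yields the keys `'0'` and `'1'`, which would become header names. A sketch of the block-level check, using hypothetical helper names and error text:

```typescript
/** True only for non-null, non-array objects. */
function isPlainHeaderObject(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value)
}

/** Parse a JSON headers string, rejecting arrays and primitives. */
function parseFunctionHeaders(raw: string): Record<string, string> {
  const parsed: unknown = JSON.parse(raw)
  if (!isPlainHeaderObject(parsed)) {
    throw new Error('Edge Function headers must be a JSON object')
  }
  const headers: Record<string, string> = {}
  for (const [name, value] of Object.entries(parsed)) {
    headers[name] = String(value)
  }
  return headers
}
```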

* fix(supabase): scope Edge Function method/body/headers to invoke_function

Prevents a stale `method` value (e.g. from the Edge Function field) from leaking
into other operations' params. The tool executor lets `params.method` override a
tool's static verb (tools/utils.ts), so an unscoped value could turn a read into
DELETE/POST against PostgREST. Now method/body/headers are only passed for the
invoke_function operation. Adds a block-level regression test.

* fix(supabase): only parse Edge Function body/headers for invoke_function

Stale or invalid functionBody/functionHeaders left in the block (common when
switching operations) were parsed and validated for every operation, so they
could throw before unrelated tools ran even while hidden. Moved parsing and
validation inside the invoke_function guard; added a regression test.
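
The two scoping fixes above can be sketched together. The param shape, the `buildToolParams` name, and the `query` operation are assumptions made for illustration; only `invoke_function`, `method`, `functionBody`, and `functionHeaders` come from the messages:

```typescript
// Hypothetical block params; other operation fields ride along in the index signature.
interface BlockParams {
  operation: string
  method?: string
  functionBody?: string
  functionHeaders?: string
  [key: string]: unknown
}

function buildToolParams(params: BlockParams): Record<string, unknown> {
  // Strip the Edge Function fields first so they never leak into other tools.
  const { method, functionBody, functionHeaders, ...rest } = params
  if (params.operation !== 'invoke_function') return rest

  // Parsing happens only inside the guard, so stale invalid JSON left in a
  // hidden field cannot throw for unrelated operations.
  return {
    ...rest,
    method,
    body: functionBody ? JSON.parse(functionBody) : undefined,
    headers: functionHeaders ? JSON.parse(functionHeaders) : undefined,
  }
}
```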

* fix(supabase): include last_accessed_at as a storage list sort option

The Storage list API accepts last_accessed_at for sortBy; add it to the tool
description and the block dropdown so the surfaced options match the API.
…parallel multipart uploader (#5108)

* improvement(tables): versioned CSV snapshot cache for table mounts + parallel multipart uploader

* chore(db): drop colliding 0239 migration (renumber pending)

* chore(db): renumber rows_version migration to 0240 (off staging's 0239)

* improvement(tables): mount snapshots by presigned URL so the sandbox fetches directly (raise cap to 500MB)

* fix(tables): allow url sandbox entries in the function-execute contract; key snapshot by column shape so schema edits invalidate it

* chore(e2b): log sandbox inputs split by url-fetch vs inline write

* improvement(tables): order export + snapshot rows by order_key so the CSV matches the grid under fractional ordering
* feat(connectors): use resource selectors for KB connector config

Replace raw ID text inputs with selector pickers (canonical selector +
manual-input pairs) across Google Drive/Docs/Forms/Sheets, Notion, Monday,
and Webflow KB connectors, so users pick folders/spreadsheets/pages/boards/
collections instead of pasting IDs — matching the workflow blocks.

- Add multi-select where the sync handler supports it (Drive/Docs/Forms
  folders, Monday boards, Webflow collections) via parseMultiValue
- Add shared escapeDriveQueryValue/buildDriveParentsClause helpers for safe
  multi-folder Drive queries
- Add ConnectorConfigField.mimeType, plumbed into the selector context
- Fix Webflow listingCapped not set on maxItems truncation (deletion-
  reconciliation data-loss safety)

Fully backward compatible: legacy single-string IDs and CSV both normalize
via parseMultiValue; resolved canonical keys are unchanged.
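
For readers unfamiliar with the helpers named above, here is one plausible shape for them. The bodies are guesses written for this note, not the repository's implementations. The `'<id>' in parents` form and backslash-escaping of quotes follow the Google Drive `q` query syntax:

```typescript
/** Normalize a legacy single ID, a CSV string, or an array into a string[]. */
function parseMultiValue(value: unknown): string[] {
  if (Array.isArray(value)) {
    return value.map((item) => String(item).trim()).filter((item) => item !== '')
  }
  if (typeof value !== 'string') return []
  return value
    .split(',')
    .map((item) => item.trim())
    .filter((item) => item !== '')
}

/** Escape backslashes and single quotes for use inside a quoted Drive query value. */
function escapeDriveQueryValue(value: string): string {
  return value.replace(/\\/g, '\\\\').replace(/'/g, "\\'")
}

/** OR together one `in parents` term per folder; empty input yields no clause. */
function buildDriveParentsClause(folderIds: string[]): string {
  if (folderIds.length === 0) return ''
  const terms = folderIds.map((id) => `'${escapeDriveQueryValue(id)}' in parents`)
  return `(${terms.join(' or ')})`
}
```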

* fix(webflow): set listingCapped on within-page maxItems truncation

When a collection's items fit in a single API page but maxItems cuts the
list within that page, neither hasMoreInCollection nor hasMoreCollections is
true, so listingCapped was not set and the sync engine could hard-delete
still-existing documents. Add the within-page drop signal to the guard.
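
A sketch of the extended guard. `hasMoreInCollection` and `hasMoreCollections` are named in the message; the two page-count fields are hypothetical stand-ins for however the connector detects a within-page drop:

```typescript
interface ListingState {
  hasMoreInCollection: boolean
  hasMoreCollections: boolean
  /** Items the API returned for the current page (hypothetical field). */
  fetchedInPage: number
  /** Items kept after applying maxItems (hypothetical field). */
  keptFromPage: number
}

/**
 * The listing is capped whenever any source item was left out, including the
 * case where maxItems truncates inside a single, final page.
 */
function isListingCapped(state: ListingState): boolean {
  const droppedWithinPage = state.keptFromPage < state.fetchedInPage
  return state.hasMoreInCollection || state.hasMoreCollections || droppedWithinPage
}
```

When this returns true, the sync engine should treat the listing as partial and skip deletion reconciliation for documents it did not see.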
…chip left of filter/sort (#5117)

* improvement(knowledge): align connected-sources rows and move source chip left of filter/sort

- Drop the -mx-2 on the connectors list so rows respect the ChipModalBody
  gutter: the row hover no longer bleeds to the modal edges and row content
  lines up with the px-4 header.
- Add a 'leading' slot to ResourceOptions (left of the filter/sort cluster) and
  render the knowledge connected-source chip there instead of the far-right
  'aside', so it reads as part of the control row. 'aside' stays right-aligned
  for the table editor's run/stop control.

* improvement(resource): render options aside left of filter/sort

The options-bar aside has a single other consumer (the table editor's embedded
run/stop control), so instead of adding a separate slot, render aside itself to
the left of the filter/sort cluster. Drops the extra slot and keeps one
canonical control position; the run/stop control moves left too, which is fine
for a status widget.

* fix(resource): keep options aside grouped with filter/sort without a search bar

Group aside + the filter/sort cluster in one ml-auto right-aligned container
instead of relying on the search's flex-1 to anchor them. Without this, an
options bar with no search (the embedded mothership table editor) split aside
to the far left and filter/sort to the far right via justify-between.

* docs(resource): clarify aside groups with filter/sort regardless of search
…5119)

* chore(deps): remove unused dependencies and harden CI supply chain

Dependency cleanup:
- Remove unused deps: papaparse, unified, and 6 unused Radix primitives
  (alert-dialog, radio-group, scroll-area, separator, toggle, visually-hidden)
  plus @tanstack/react-query-devtools (all verified zero imports repo-wide)
- Consolidate jwt-decode into the existing jose dependency (decodeJwt)
- Migrate react-window to @tanstack/react-virtual to drop a redundant
  virtualization library (terminal, structured-output, code viewer)
- Remove the better-auth-harmony plugin and its gating env flag

Supply-chain hardening:
- SHA-pin every GitHub Action to a full commit SHA with a version comment
- Pin CI bun-version to 1.3.13 (was "latest" in the release job)
- Raise bun minimumReleaseAge cooldown from 3 to 7 days
- Add a non-blocking `bun audit` step in CI
- Add a CODEOWNERS gate routing dependency-manifest changes to @simstudioai/deps

* chore(deps): remove unused apps/docs dependencies (@tabler/icons-react, dotenv-cli)

* style(search-modal): use Send icon for Invite teammates action

* feat(search-modal): surface New chat as the top action above Create workflow

* feat(search-modal): add Secrets to the pages list
…ground import/delete/update jobs (#5012)

* improvement(mothership): user_table speed parity — limit bounds, async import/delete/update jobs

- query_rows / filter ops clamp limit to the contract maxes; query_rows
  skips execution metadata.
- import_file / create_from_file (large CSV/TSV) and delete_rows_by_filter
  (>1000 unbounded matches) dispatch background table jobs, claiming the
  per-table job slot; inline paths claim the slot too.
- update_rows_by_filter now escalates the same way: >1000 unbounded matches
  run as a background table job (new 'update' job type + runTableUpdate worker
  + tableUpdateTask), so a broad update on a huge table no longer loads every
  row into the request. Best-effort/non-atomic and skips workflow recompute
  (documented); unique-column patches stay inline. Pagination is limit/offset.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(mothership): trim user_table catalog copy to the essentials

Drop the verbose doomedCount/affectedCount, delete-mask, workflow-recompute,
and unique-column asides from the bulk-op descriptions. The model only needs:
large ops return { jobId }, limit maxes at 1000, one job per table.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* improvement(mothership): make user_table limit cap internal, not model-facing

The model can now pass any limit — no "cannot exceed 1000" rejection. 1000
becomes an internal threshold: query_rows clamps the page to MAX_QUERY_LIMIT
(totalCount signals truncation; the model pages with offset), and bulk filter
ops above the cap run as background jobs.

update_rows_by_filter loads full row data inline, so an explicit limit above
the cap escalates to the background worker with a new maxRows budget (the worker
stops after maxRows; update has no read mask so the cap is exact). delete only
loads ids inline, so an explicit limit (any size) stays inline — only unbounded
deletes use the masked background path, which would over-hide a bounded delete.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
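
The query_rows side of this change amounts to a clamp plus a truncation signal. A sketch, where `MAX_QUERY_LIMIT` comes from the message and the default page size and helper names are assumptions:

```typescript
const MAX_QUERY_LIMIT = 1000
// Assumption: the default page size is not stated in the message.
const DEFAULT_QUERY_LIMIT = 100

/** Accept any requested limit, but never fetch more than MAX_QUERY_LIMIT rows per page. */
function resolvePage(limit?: number, offset?: number): { limit: number; offset: number } {
  const requested =
    limit === undefined || !Number.isFinite(limit) ? DEFAULT_QUERY_LIMIT : Math.trunc(limit)
  const safeOffset =
    offset === undefined || !Number.isFinite(offset) ? 0 : Math.max(0, Math.trunc(offset))
  return { limit: Math.min(Math.max(requested, 1), MAX_QUERY_LIMIT), offset: safeOffset }
}

/** totalCount signals truncation: more rows remain when the page ends before it. */
function hasMoreRows(totalCount: number, offset: number, returned: number): boolean {
  return offset + returned < totalCount
}
```

A caller that asks for 5000 rows gets a 1000-row page and keeps paging with `offset` until `hasMoreRows` is false.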

* improvement(mothership): bounded delete above the cap runs async, not inline

An explicit delete limit now mirrors update: ≤1000 runs inline, above the cap it
escalates to the background worker honoring the limit via maxRows — instead of
always staying inline. The worker stops after maxRows (per-page fetch capped to
the remaining budget).

Bounded background deletes skip pendingDeleteMask: the filter-based mask hides
every match, which would over-hide the rows beyond the cap the job never deletes.
Unmasked, a bounded delete is eventually consistent like a bounded update (rows
disappear as deleted), and doomedCount is omitted from the payload so the count
isn't double-subtracted.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
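
Taken together, the routing rules for the bulk filter ops after this change can be summarized as a small decision function. This is an illustrative model of the behavior described in these messages, with invented type and function names, not the actual dispatch code:

```typescript
const INLINE_CAP = 1000

type BulkPlan =
  | { mode: 'inline' }
  | { mode: 'background'; maxRows?: number; useDeleteMask: boolean }

/**
 * - Explicit limit at or below the cap: run inline.
 * - Explicit limit above the cap: background job bounded by maxRows, no mask.
 * - No limit and more than the cap matching: background job; only deletes
 *   use the filter-based mask, since updates have no read mask.
 */
function planBulkOp(op: 'update' | 'delete', matchCount: number, limit?: number): BulkPlan {
  if (limit !== undefined) {
    if (limit <= INLINE_CAP) return { mode: 'inline' }
    return { mode: 'background', maxRows: limit, useDeleteMask: false }
  }
  if (matchCount > INLINE_CAP) {
    return { mode: 'background', useDeleteMask: op === 'delete' }
  }
  return { mode: 'inline' }
}
```

The mask is skipped for bounded jobs because it hides every filter match, including rows beyond `maxRows` that the job never deletes.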

* docs(mothership): tidy user_table limit/offset param copy

Drop "Any value is allowed" from the limit description and restore the original
offset description.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(tables): skip pendingDeleteMask for bounded background deletes

The bounded-delete commit (f1ee3e9) persisted maxRows and omitted doomedCount
but the pendingDeleteMask guard that makes it work was left uncommitted, so the
shipped mask still hid every filter+cutoff match — over-hiding the rows beyond
maxRows that the job never deletes (they vanished from reads until the job ended,
then reappeared). Return no mask when maxRows is set: a bounded delete is
eventually consistent (rows disappear as deleted), like a bounded update.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(mothership): drop redundant background note from limit arg

The op descriptions already cover background escalation; the limit arg only
needs to say what the param does.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
* feat(files): stream large CSV previews and add import-as-table

* fix(files): validate fileId in csv-preview route, guard double-import, fix sniff perf and toggle flash

* fix(files): scope mothership preview-toggle loading guard to CSV files only
@github-actions github-actions Bot added the requires-mothership-merge Has a companion PR on the mothership/copilot side — merge in lockstep label Jun 18, 2026
@github-actions

github-actions Bot commented Jun 18, 2026

⚠️ Cross-repo companion check

One or more companion PRs aren't merged into main yet (aggregated across the feature PRs in this release). Merging this without them will leave copilot and sim out of sync — merge them in lockstep.

  • ⚠️ simstudioai/copilot#312 — merged into staging (this PR targets main) — contract(user_table): limit bounds, background import/delete/update docs

…5128)

The unconditional ml-auto from #5117 right-aligned the embedded table
editor's filter/sort cluster, which has no search bar. Only push the
aside + filter/sort group right when a search occupies the left; without
a search it stays left-aligned as before.
…n per-table cap (#5120)

* fix(tables): enforce row limits against the current plan, not a frozen per-table cap

* fix(tables): gate multi-batch CSV create + initial rows against the plan, harden limits cache bound

* fix(tables): thread running row count through copilot batchInsertAll capacity check

* chore(tables): align tx-variant capacity docstrings

* fix(tables): map row-limit errors to 400 in create-from-CSV import

* feat(tables): add Upgrade action to the row-limit toast

* fix(tables): keep CreateTableData.maxRows so staging callers type-check after merge

* improvement(tables): route row-limit Upgrade action to the explore-plans page

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 597d7ea.

Comment thread apps/sim/connectors/utils.ts

Labels

requires-mothership-merge Has a companion PR on the mothership/copilot side — merge in lockstep

3 participants