fix(media): don't crash when media block data is raw bytes #1695
Open
devteamaegis wants to merge 1 commit into
The Anthropic ('type':'base64') and Vertex ('type':'media') branches in
_find_and_process_media built a data URI via string concatenation with
data['data'], assuming it is always a str. When a logged payload carries
raw bytes there, this raised an uncaught TypeError that propagated out of
span creation. Guard both branches with isinstance(data['data'], str) so
non-string media data passes through unprocessed instead of crashing.
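A minimal sketch of the failure and the guard, for illustration only. The function names and sample blocks here are hypothetical stand-ins, not the library's actual code; only the concatenation pattern and the isinstance check come from the description above.

```python
# Hypothetical helpers that mirror the pattern described in the commit message.
from typing import Optional


def build_data_uri_unguarded(block: dict) -> str:
    # Assumes block["data"] is a str; raises TypeError when it is bytes.
    return f"data:{block['media_type']};base64," + block["data"]


def build_data_uri_guarded(block: dict) -> Optional[str]:
    # Build the URI only when the payload really is a base64 string.
    if isinstance(block.get("data"), str):
        return f"data:{block['media_type']};base64," + block["data"]
    return None  # signal to the caller: leave the block unprocessed


# Sample blocks (made up for this sketch).
bytes_block = {"type": "base64", "media_type": "image/png", "data": b"\x89PNG"}
str_block = {"type": "base64", "media_type": "image/png", "data": "iVBORw0KGgo="}

try:
    build_data_uri_unguarded(bytes_block)
    crashed = False
except TypeError:
    # str + bytes is not a valid concatenation in Python 3.
    crashed = True
```

Returning None for non-string data is one possible way to express "pass through"; the actual change is described as adding the check to the branch condition itself.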
68b0a20 to cf2af23
What's broken
MediaManager._find_and_process_media builds a data URI for Anthropic- and Vertex-style media blocks with f"...;base64," + data["data"]. The branch guards only check that the data key exists, not that its value is a str. If a logged payload carries raw image bytes there, the str + bytes concatenation raises an uncaught TypeError. Because the caller runs outside the masking try/except, the crash propagates out of span/observation creation.
Why it happens
The format guards verify the dict shape but assume data["data"] is a base64 string.
Fix
Add an isinstance(data["data"], str) check to both branches, so non-string data falls through to normal pass-through handling instead of crashing.
Test
Processing a base64 block whose data is bytes passes through unchanged instead of raising.
Greptile Summary
This PR guards two branches in _find_and_process_media that build base64 data URIs by adding isinstance(data["data"], str) checks, preventing a TypeError crash when a logged payload carries raw bytes instead of a base64 string in the "data" field of Anthropic- or Vertex-style media blocks.
- Anthropic branch (type == "base64"): now skips media processing and passes the dict through unchanged when data["data"] is not a str.
- Vertex branch (type == "media"): same guard added for the mime_type-keyed variant.
- New test: exercises both block shapes with bytes data and asserts no media is enqueued.
Confidence Score: 5/5
Safe to merge — the change is a minimal two-line guard that prevents a crash without altering any existing code path for valid string data.
Both guards touch only the condition that selects a code path; the string-data path is identical to before, and the bytes-data path now falls through to the existing generic dict recursion rather than throwing. The new test directly exercises both block shapes with raw bytes and confirms pass-through with no queued uploads.
No files require special attention.
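The pass-through behavior summarized above can be sketched roughly as follows. This is an assumption-laden illustration: process, uploads, and the placeholder value are hypothetical names invented for the sketch, and the real method uploads media and returns a LangfuseMedia reference rather than a placeholder string.

```python
from typing import Any, List

uploads: List[str] = []  # hypothetical stand-in for the media upload queue


def process(data: Any) -> Any:
    """Recursively walk a payload, handling media blocks with str data only."""
    if isinstance(data, dict):
        is_anthropic = data.get("type") == "base64" and "media_type" in data
        is_vertex = data.get("type") == "media" and "mime_type" in data
        # The guard: only treat the dict as media when data["data"] is a str.
        if (is_anthropic or is_vertex) and isinstance(data.get("data"), str):
            mime = data["media_type"] if is_anthropic else data["mime_type"]
            uploads.append(f"data:{mime};base64," + data["data"])
            copied = dict(data)
            copied["data"] = "<media reference>"  # placeholder for this sketch
            return copied
        # Non-media dicts, and media-shaped dicts with bytes data, land here.
        return {key: process(value) for key, value in data.items()}
    if isinstance(data, list):
        return [process(item) for item in data]
    return data


# Both block shapes carrying raw bytes instead of a base64 string.
payload = {
    "content": [
        {"type": "base64", "media_type": "image/png", "data": b"\x89PNG"},
        {"type": "media", "mime_type": "image/png", "data": b"\x89PNG"},
    ]
}
result = process(payload)
```

With bytes in data, both blocks fall through to the generic dict recursion, so result equals payload and nothing is queued.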
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[_process_data_recursively] --> B{isinstance data dict?}
    B -- No --> C[Other checks / pass-through]
    B -- Yes --> D{Anthropic block?\ntype==base64, media_type, data}
    D -- No --> E{Vertex block?\ntype==media, mime_type, data}
    D -- Yes --> F{isinstance data.data str?}
    E -- Yes --> G{isinstance data.data str?}
    F -- Yes --> H[Build base64 URI\nUpload media\nReturn copied dict with LangfuseMedia]
    F -- No --> I[Fall through to dict recursion\nPass-through unchanged]
    G -- Yes --> J[Build base64 URI\nUpload media\nReturn copied dict with LangfuseMedia]
    G -- No --> I
    E -- No --> K[Recurse into dict values]
Reviews (1): Last reviewed commit: "fix(media): don't crash when Anthropic/V..."