Vinayak Arannil
Sr. Applied Scientist, AWS
Vinayak Arannil is a Sr. Applied Scientist at Amazon Web Services, where he helps build capabilities on AgentCore and Strands that enable customers to evaluate agentic applications.
3 posts
Multimodal evaluators: MLLM-as-a-judge for image-to-text tasks in Strands Evals
Announcing four new MLLM-as-a-Judge evaluators for image-to-text tasks in Strands Evals: Overall Quality, Correctness, Faithfulness, and Instruction Following. These evaluators provide automated, image-grounded scoring with reasoning.
ToolSimulator: scalable tool testing for AI agents
ToolSimulator is an LLM-powered framework within Strands Evals that enables safe, scalable agent testing by using simulated tool responses instead of risky live API calls.
Simulate realistic users to evaluate multi-turn AI agents in Strands Evals
ActorSimulator in the Strands Evals SDK enables teams to test conversational agents through realistic, goal-driven simulated users rather than relying on static test cases or manual testing.