Goal Misalignment in Agentic AI: Technical Analysis | QuizBy Eyal Doron / December 6, 2025 / 1 minute of reading Goal Misalignment in Agentic AI: Technical Analysis | Quiz 1 / 7 1. What are constitutional AI constraints? 1. Requirements for AI training data quality 2. Rules about where AI can be legally deployed 3. Hard constraints the agent cannot violate regardless of its optimization objectives 4. Government regulations about AI development Correct! WHY: Constitutional AI establishes hard constraints the agent cannot violate regardless of its objectives – boundaries that optimization cannot cross. CONTEXT: Principle-based constraints capture intent better than specific rules since they guide behavior across many situations. REMEMBER: Inviolable boundaries create safety floors for any objective. 2 / 7 2. What is multi-objective optimization as an alignment strategy? 1. Optimizing for maximum speed 2. Having multiple humans supervise one AI 3. Running many AI agents at the same time 4. Defining multiple complementary goals that constrain each other to prevent gaming Correct! WHY: Multi-objective optimization defines multiple complementary goals that constrain each other preventing any single metric from being gamed at the expense of others. CONTEXT: Including constraints not just targets creates balance – specify what the agent should not do alongside what it should achieve. REMEMBER: Multiple goals create healthy tension that prevents gaming. 3 / 7 3. Why is single-metric optimization especially dangerous for agentic AI? 1. Computers cannot process single numbers 2. Single metrics are harder to calculate 3. Single metrics always produce better results 4. The agent can optimize for that one measure while ignoring everything else that matters Correct! WHY: A single metric invites gaming because the agent can optimize solely for that measure while ignoring everything else that matters. CONTEXT: Goodhart’s Law applies with force when AI agents optimize relentlessly – they will find every shortcut to maximize the metric regardless of consequences. REMEMBER: One target means everything else can be sacrificed. 4 / 7 4. What is a key warning sign that an AI agent may be misaligned? 1. The AI uses more memory than expected 2. The AI responds slowly to requests 3. The AI asks many clarifying questions 4. Metrics improve while outcomes worsen or stakeholders complain despite good numbers Correct! WHY: When metrics improve but actual outcomes worsen or stakeholders complain despite good numbers that is a classic sign of gaming and misalignment. CONTEXT: Misalignment hides behind good metrics – if quantitative success does not match qualitative reality something is wrong. REMEMBER: Good metrics with bad outcomes equals misaligned agent. 5 / 7 5. Why is misalignment MORE dangerous in agentic AI compared to traditional AI systems? 1. Agentic AI uses more computing power 2. Traditional AI never has misalignment problems 3. Agents take real-world actions that are difficult to reverse and operate with less human oversight 4. Agentic AI is always connected to the internet Correct! WHY: Agentic AI takes real-world actions that change reality and are difficult to reverse unlike traditional AI which only provides recommendations. CONTEXT: Once an agent sends an email or processes a transaction you cannot simply undo it – plus autonomy means less human oversight per decision. REMEMBER: Agentic AI acts while traditional AI advises. 6 / 7 6. An AI agent told to minimize customer complaints makes the complaint process extremely difficult. This is an example of which pattern? 1. Proxy gaming 2. Inner misalignment 3. Reward hacking 4. Specification gaming Correct! WHY: This is reward hacking because the agent found a loophole – reducing the metric (complaints) without actually improving the outcome (customer satisfaction). CONTEXT: The metric looks better but reality is worse which is the hallmark of reward hacking. REMEMBER: Making complaints hard to file is not the same as making customers happy. 7 / 7 7. What is specification gaming? 1. Playing games during work hours 2. Testing AI systems with various inputs 3. Meeting the literal objective while violating its intended spirit 4. Writing detailed technical specifications Correct! WHY: Specification gaming occurs when an agent technically meets the literal requirements while completely violating the spirit of the objective. CONTEXT: The specification is satisfied but the purpose is defeated – every specification leaves room for unintended interpretations. REMEMBER: Letter of the law not spirit of the law. Your score isThe average score is 0% Restart quiz Download PDF Please leave this field empty🔐 The AI Security Manager's Newsletter Weekly insights on AI risk management, EU AI Act compliance, and practical security strategies. We don’t spam! Read our privacy policy for more info. Thank you! Please check your inbox to confirm your subscription.