SteerEval: Measuring How Controllable LLMs Really Are
Evaluating LLM controllability isn’t just an academic exercise: it’s a critical factor in how effectively businesses and developers can deploy these models in real-world scenarios. As LLM adoption grows rapidly across industries like healthcare, finance, and customer service, the ability to steer outputs toward specific goals becomes non-negotiable. A medical chatbot must stay strictly factual; a marketing tool needs to adjust tone dynamically. Without precise control, even the most advanced models risk producing inconsistent, biased, or harmful outputs.

Consider a customer support system trained to resolve complaints. If the model can’t maintain a professional tone or shift between technical and layperson language, it might escalate conflicts or confuse users. Similarly, a financial advisor AI must avoid speculative language while adhering to regulatory standards. These scenarios highlight why behavioral predictability matters: it directly affects user trust, compliance, and operational efficiency. Studies show that 68% of enterprises using LLMs cite “uncontrolled outputs” as a top roadblock to scaling AI integration.

Controlling LLMs isn’t as simple as issuing commands. Current methods often rely on prompt engineering, which works inconsistently: asking a model to “write a neutral summary” might yield wildly different results depending on the input text. Building on concepts from the Benchmark Dataset Construction section, researchers have found that even state-of-the-art models struggle with multi-step direction, such as generating a response that is both concise and emotionally neutral. These limitations create friction for developers trying to build systems that balance creativity with reliability.
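One way to make “works inconsistently” measurable is to score how often a fixed steering instruction actually yields compliant outputs. The sketch below is a minimal illustration, not SteerEval’s actual scoring code: `generate` is a stub standing in for a real LLM API call, and the constraint checkers (`is_concise`, a word-count cap; `is_neutral`, a crude charged-word heuristic) are hypothetical simplifications of what a real benchmark would use.

```python
# Sketch of a multi-constraint steerability check. All names here are
# illustrative assumptions, not part of any published benchmark.

def generate(instruction: str, text: str) -> str:
    # Placeholder for a real LLM call; here it just truncates the input
    # to mimic a model that follows the "concise" constraint naively.
    return " ".join(text.split()[:20])

def is_concise(output: str, max_words: int = 25) -> bool:
    # Constraint 1: output stays under a word budget.
    return len(output.split()) <= max_words

def is_neutral(output: str) -> bool:
    # Constraint 2: crude heuristic flagging emotionally charged words.
    # A real evaluator would use a classifier, not a word list.
    charged = {"amazing", "terrible", "outrageous", "fantastic"}
    return not any(w.strip(".,!?").lower() in charged for w in output.split())

def steer_score(inputs: list[str], instruction: str) -> float:
    """Fraction of outputs satisfying BOTH constraints at once:
    the multi-step direction case that models struggle with."""
    hits = sum(
        is_concise(out) and is_neutral(out)
        for out in (generate(instruction, t) for t in inputs)
    )
    return hits / len(inputs)
```

Scoring joint compliance (rather than each constraint separately) is what exposes the multi-step failure mode: a model can satisfy “concise” and “neutral” individually while rarely satisfying both in the same response.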