
Jun 16, 2023

The Schillace Laws - Part 3: Trade-offs And Challenges When Using LLMs

Lukasz C.

This category of laws, Trade-offs And Challenges When Using LLMs, outlines crucial strategies for balancing utility and accuracy in LLM development. Unlike traditional deterministic systems, working with prompts and LLMs requires a different approach, one that embraces non-deterministic outcomes.

The Menu

Today, our menu consists of four new principles:

  4. Trade leverage for precision; use interaction to mitigate.

  5. The system will be as brittle as its most brittle part.

  6. Uncertainty is an exception throw.

  7. Hard for you is hard for the model.

4. Trade leverage for precision; use interaction to mitigate.

The power of LLMs to transform simple tasks into smart, dynamic interactions is like discovering recursion for the first time or leveraging a third-party framework to simplify a complex problem. It feels extraordinary, but it's easy to underutilize this tool, restricting it to the role of a smart translator, converter, or interpreter.

This law challenges us to break this habit and recognize the broader applications of LLMs. With well-designed prompts and dynamic templates, these models can perform complex tasks, way beyond simple smart components.

To illustrate, let's examine a key Semantic Kernel prompt template:

A planner takes a list of functions, a goal, and chooses which function to use. For each function, the list includes details about the input parameters.
[START OF EXAMPLES]
{{this.GoodExamples}}
{{this.EdgeCaseExamples}}
[END OF EXAMPLES]
[REAL SCENARIO STARTS HERE]
- List of functions:
{{this.ListOfFunctions}}
- End list of functions.
Goal: {{ $input }}

Despite its simplicity, this 'metaprompt' does a lot of heavy lifting: it sits at the core of the ActionPlanner, which is powerful enough to let Semantic Kernel use skills intuitively when needed (the equivalent of ChatGPT's plugins).
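To make the mechanics concrete, here is a minimal sketch of how such a dynamic template could be rendered before being sent to the model. This is illustrative Python, not Semantic Kernel's actual templating engine; the placeholder names mirror the template above, and the function descriptions' shape is an assumption.

from string import Template

# Illustrative stand-in for a dynamic metaprompt. The real engine uses
# {{...}} syntax; Python's $-based string.Template is used here purely
# to show the idea of filling a template at runtime.
METAPROMPT = Template(
    "A planner takes a list of functions, a goal, and chooses which "
    "function to use. For each function, the list includes details "
    "about the input parameters.\n"
    "[START OF EXAMPLES]\n$good_examples\n$edge_case_examples\n"
    "[END OF EXAMPLES]\n"
    "[REAL SCENARIO STARTS HERE]\n"
    "- List of functions:\n$list_of_functions\n- End list of functions.\n"
    "Goal: $goal"
)

def build_planner_prompt(functions, goal, good_examples, edge_case_examples):
    # Assumed shape: each function is a dict with name, description,
    # and a list of parameter names.
    listing = "\n".join(
        f"{f['name']}: {f['description']} ({', '.join(f['parameters'])})"
        for f in functions
    )
    return METAPROMPT.substitute(
        good_examples=good_examples,
        edge_case_examples=edge_case_examples,
        list_of_functions=listing,
        goal=goal,
    )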

This comes with a price: at any point, any prompt or sub-prompt can produce unexpected behavior. This is why it's important to mitigate the risk by pushing erroneous situations up the stack until they can be handled through interaction, either with the user or with the model.

The next rule reminds us of the danger.

5. The system will be as brittle as its most brittle part.

When using LLMs as more than a simple content tool – as intelligent scaffolding for your conventional code – it's quite easy to encounter a hiccup. The response might not be what you wanted (or might be missing entirely), and even if it is what you asked for, the format might be incorrect. To make things worse, these issues can be non-deterministic. The model doesn't care much about the correctness of JSON or XML, which is counter-intuitive after years of working with JSON serializers, XML writers, and the like.

This is why it's crucial to respond appropriately when you discover any of these issues, leading us to the next rule:

6. Uncertainty is an exception throw.

We need to validate the model's responses as much as we can. Since we trade leverage for precision, there will be instances where we do not get exactly what we wanted. There are multiple mitigation measures we can take:

  • Request a strict response format when transitioning from the "AI" side of the system to the "conventional" side, making it easier to detect incorrect or incomplete responses.

  • Throw meaningful exceptions when we identify an issue with a response.

  • Propagate the exception to a point where we can handle it effectively (interacting either with the model or with the user to get back on track).
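Put together, these measures might look like the following sketch. The complete() helper passed in as a parameter and the retry policy are assumptions for illustration, not a specific library's API; the point is that a malformed response becomes a meaningful, typed exception that propagates to a layer capable of interaction.

import json

class ModelResponseError(Exception):
    """Raised when the model's response fails validation."""

def ask_for_json(complete, prompt, retries=2):
    # Request a strict response format so deviations are easy to detect.
    strict_prompt = prompt + "\nRespond with valid JSON only, no prose."
    for attempt in range(retries + 1):
        raw = complete(strict_prompt)
        try:
            return json.loads(raw)  # validate the format explicitly
        except json.JSONDecodeError as err:
            if attempt == retries:
                # Propagate a meaningful exception to a layer that can
                # interact with the user or escalate differently.
                raise ModelResponseError(f"Expected JSON, got: {raw!r}") from err
            # Interact with the model: feed the error back and retry.
            strict_prompt = (
                prompt
                + f"\nYour previous answer was not valid JSON ({err}). "
                + "Respond with valid JSON only."
            )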

7. Hard for you is hard for the model.

When integrating a large language model into our software, we are faced with two pivotal decisions:

  • What tasks should we delegate to the model?

  • How should we instruct the model to execute the chosen tasks?

To determine answers to both these questions, remember this guiding principle: what is difficult for you is likely to be difficult for the model too. With respect to the first question, this implies that the model excels at tasks that humans find relatively straightforward. Requesting it to distill a single-word response from a larger text corpus is more likely to yield success than expecting it to unravel intricate mathematical problems.

As for the second question, humans tend to resort to a systematic, step-by-step strategy when faced with complex tasks, a method of breaking down the complex problem into smaller, more manageable fragments. This tactic, known as "divide-and-conquer" in the software development realm, proves equally effective with large language models.

Let's assume we asked the model to find the 10th letter of the Polish alphabet:

Question: What is the 10th letter of the Polish alphabet? Let the response be only the letter, nothing else.
Answer: Ł

It's probably not obvious to everyone, but this is not correct; it should be G. So let's try our step-by-step strategy with the same problem:

Question: Could you write each letter, from the first to the 10th, and then write the 10th one?
Answer: Certainly! Here are the first ten letters of the Polish alphabet, written from the beginning to the tenth letter:
A Ą B C Ć D E Ę F G
Therefore, the tenth letter of the Polish alphabet is "G".

That's more like it! The verbose strategy made the response more accurate, though longer. Looking at the response, can you see which letter is the correct one? Obviously. So, based on the rule, it should also be simple for the model.

Let's test the hypothesis:

Question: Write a one character response with the letter in question: "Certainly! Here are the first ten letters of the Polish alphabet, written from the beginning to the tenth letter: A Ą B C Ć D E Ę F G Therefore, the tenth letter of the Polish alphabet is "G"."
Response: G

Yes, we got it! As you can see, the rules in this chapter guided us to the correct solution.
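This two-step decomposition is easy to automate. Below is a minimal sketch that chains the verbose, step-by-step prompt with a follow-up extraction prompt; complete() is again a hypothetical stand-in for whichever LLM client you use, and the prompts mirror the ones above.

def tenth_letter(complete, alphabet_name="Polish"):
    # Step 1: let the model reason out loud, which is the easier task.
    verbose = complete(
        f"Could you write each letter of the {alphabet_name} alphabet, "
        "from the first to the 10th, and then write the 10th one?"
    )
    # Step 2: a simple extraction task over the verbose answer.
    letter = complete(
        "Write a one character response with the letter in question: "
        f'"{verbose}"'
    )
    return letter.strip()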

Next steps

In the next and final part of the series, "The Schillace Laws - Part 4: Limitations and Opportunities of Using LLMs," we will explore the distinction between the way a model and the human brain operate and how to benefit from this distinction.

Stay safe and happy coding!

