A Developer's Guide to Prompt Engineering and LLMs


With the rise of large language models (LLMs) and AI tools like ChatGPT, developers now have incredibly powerful assistants that can help them code faster and be more productive. However, simply prompting an LLM to write code often results in unusable or buggy output. To speed up your development with AI, you need a thoughtful strategy. Let's dive deeper into prompt engineering techniques in the next sections of this blog.

Plan Ahead

Don't just jump right into prompting for code. Take time to plan out the overall architecture, schemas, integrations, etc. This gives your AI assistant the full context it needs. Visualize the end state so your generative AI model understands the big picture.

Role play with Experts 

Set up scenarios where your AI model can act as an assistant and role play as a team of experts discussing solutions. Give them personalities and have them break down problems. This encourages creative thinking from different perspectives.

Ask GPT for Insights

If you get stuck, don't be afraid to pause and ask your LLM for specific advice or code examples relevant to your problem. Its knowledge can spark new ideas.

Prompt Engineering 

Take it step by step and refine your prompts over multiple iterations to provide the optimal context and steering for your AI assistant. Effective prompts lead to great results.

Manual Code Generation

Have your AI generate code manually at first, then inspect the outputs and correct mistakes before copying usable parts back into your prompt. This gives the AI a concrete example of what you are looking for, and repeating the process steadily steers it toward accurate code. For more background, look into chain-of-thought prompting and zero-shot prompting.
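The feed-corrected-output-back-in loop described above can be kept organized with a small helper. This is a minimal sketch, not part of any library; the `PromptRefiner` name and its methods are hypothetical:

```python
class PromptRefiner:
    """Accumulate hand-corrected outputs as few-shot examples for the next prompt."""

    def __init__(self, task: str):
        self.task = task
        self.examples: list[str] = []

    def add_example(self, corrected_code: str) -> None:
        # Store a snippet you inspected and fixed by hand.
        self.examples.append(corrected_code)

    def build(self) -> str:
        # The next prompt restates the task plus every corrected example so far.
        parts = [self.task]
        for i, example in enumerate(self.examples, 1):
            parts.append(f"Corrected example {i}:\n{example}")
        return "\n\n".join(parts)
```

Each iteration, you paste the model's best output into `add_example` after fixing it, then send `build()` as the next prompt, so the context grows more concrete over time.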

Automatic Code Generation 

Once you're happy with the LLM's output and it's producing consistently good code, set up pipelines to automatically generate assets based on schemas, tests based on code, and so on. This removes bottlenecks.

Handle Failures Gracefully

Occasional failures are expected. Improve your prompts to prevent similar failures in the future, and learn over time which types of tasks your AI handles well.
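A simple way to handle failures gracefully in code is a retry loop that appends the failure reason to the prompt on each attempt. This sketch assumes you supply your own `generate` (LLM call) and `validate` (checker) callables; the function name and shape are illustrative only:

```python
def run_with_retries(generate, validate, prompt: str, max_attempts: int = 3) -> str:
    """Retry generation, feeding the failure reason back into the prompt each time."""
    for _ in range(max_attempts):
        output = generate(prompt)
        ok, reason = validate(output)
        if ok:
            return output
        # Tell the model what went wrong so the next attempt can improve.
        prompt += f"\n\nThe previous attempt failed: {reason}. Please fix it."
    raise RuntimeError("No valid output after retries")
```

In practice `validate` might run the generated code through a linter or the project's test suite, so the feedback the model receives is concrete rather than vague.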

By blending planning, roleplaying, prompting, iterative training, and automation, you can achieve huge productivity gains from AI assistants like GPT. With the right strategy around designing prompts, improving your development speed is possible!

Generating Code With Prompts

As we stand on the cusp of the AI revolution, we find ourselves in a position to rethink how software development is approached and executed. This tutorial will dive into how we can navigate this transition from a traditional process to one that is supercharged by natural language processing.

Planning Ahead: The Foundation of Success

The first step in this journey is embracing the ability to plan ahead. AI, particularly models like GPT-3, may not be adept at mapping out the future. Their expertise lies in real-time problem-solving within a tight context token window. This is where human developers need to step in.

By painting a detailed picture of the final code – including model schemas, technology stack, and deployment processes – we can set the stage for AI to solve complex tasks. This planning stage might be the most challenging part of the process, but it morphs the role of developers from coders to orchestrators, setting the stage for a harmonious blend of human intelligence and AI prowess.

Assembling a Panel of Experts

Enter the "Panel of Experts". This group can comprise a diverse array of stakeholders, from C-suite executives to AI agents, all contributing to the understanding and planning of a specific task. Using a thoughtfully crafted prompt, you can create a virtual playground where the panel can discuss and resolve complex issues, with AI bots providing assistance and notes.

And if you find yourself grappling with a question, remember you have a powerful tool at your disposal designed for question answering: GPT. Whether you need a summary, a rephrase, or a more in-depth exploration of a subject, GPT can help. Being intentionally vague with your prompts can often yield surprising and innovative solutions.

Example code debugging prompt:

Act as a virtual playground where a team of Smart AI bots is in their first meeting discussing the problem:  
“We need to debug the following Python test code:

def test_dot_env_def(self):
    LOG_TEST.info("Test DotEnvDef")
    dotEnvDef = DotEnvDef()
>       self.assertEqual(dotEnvDef, {"version_echo": "v0.0"})
E       AssertionError: {'version_echo': 'v0.0'} != {'version_echo': 'v0.0'}
tests/test_default_config.py:16: AssertionError

class DotEnvDef:
    def __init__(self):
        self.version_echo = "v0.0"
    # make the class iterable as dict
    def __iter__(self):
        return iter(self.__dict__)
    # return the values only
    def get_items(self):
        for attr, value in self.__dict__.items():
            yield value  # only yield the values
    def __repr__(self):
        return str(self.__dict__)

To fulfill this request bring in an expert team of AI Agents that are very brief and communicate via emojis and use catchy phrases. Spend your resources wisely as you only have 4 AI Agents.

GPT-4 Response:

Alright, welcome to this first meeting, team. We have an interesting challenge ahead of us. To start, let's assign roles to our four AI agents:

1. Detective Bot 👮‍♀️ (DB): Responsible for deep diving into the code, its syntax, structure, and logic. 
2. Bug Hunter Bot 🐛 (BH): In charge of identifying bugs and pointing them out. 
3. Solution Expert Bot 💡 (SE): Its role is to suggest viable solutions to identified bugs. 
4. Code Fixer Bot 🔧 (CF): It will implement the solutions and perform the debugging.

Our debugging plan will look like this:

1. **Initiation**: Here we discuss the problem and set our goals. The issue at hand is an AssertionError, meaning our test case is failing. Our aim is to make the test pass. 

2. **Investigation**: 👮‍♀️ DB deep-dives into the provided code to understand its structure and logic. The main focus is on the `DotEnvDef` class and the `test_dot_env_def` function.

3. **Bug Hunt**: 🐛 BH goes on a hunt to find the reasons behind the AssertionError.

4. **Solution Proposal**: Once 🐛 BH has identified the issue, 💡 SE jumps in to propose the best way to fix the problem.

5. **Implementation**: 🔧 CF takes the solution and implements it to the code. After the changes are made, the test case is run again.

6. **Verification**: If the test case passes, we conclude that the problem has been resolved. If not, we circle back to step 2.

Remember team, our goal is to squash bugs 🐜🔨! Let's rock'n'roll.

Manual Code Generation & AI

The next step is to harness GPT-4 for manual code generation, a process that can dramatically increase productivity. By guiding GPT, you can accelerate a variety of tasks, from deploying a web application in an unfamiliar stack to writing an algorithm in a low-level language.

Example: using commands on your agents:

The syntax “/command argument” modifies a bot's behavior. For example, to enable verbose code generation so that we obtain full Python code: “chatbot /code-gen verbose”. On the same basis you can change a bot's personality: “chatbot /personality Shrekify”.

DB /personality Shrekify
SE /code-gen verbose
CF /code-gen verbose 

> Continue the meeting and provide the first solution

GPT-4 Response:

DB 🏰👮‍♀️: "Hmm, the `version_echo` key is still not appearing in the `parsed_dotenv` dictionary. We need to figure out why `set_vars` is not setting it."

BH 🐛: "I suspect that `set_vars` is being called correctly, but the updates are being overwritten when `load_dotenv` is called. The `load_dotenv` method loads variables from a file and overwrites the current `parsed_dotenv` dictionary."

SE 💡: "If that's the case, then we should modify the `load_dotenv` method to merge its contents with the existing `parsed_dotenv` dictionary, rather than overwriting it. This will preserve any variables set via `set_vars`."

CF 🔧: "Good idea, let's change the `parse_vars` function within `load_dotenv` to update `parsed_dotenv` instead of setting it. Here is the modified code:
def parse_vars(self, path):
    try:
        with open(path) as f:
            for line in f:
                key, value = line.strip().split("=", 1)
                if key not in self.config.parsed_dotenv:  # skip keys already set
                    self.config.parsed_dotenv[key] = value
    except IOError:
        print(f"Warning: Unable to open .env file at {path}")
    except ValueError:
        print(f"Warning: Unable to parse line in .env file: {line}")
"In this way, we only add a variable from the .env file if it's not already in `parsed_dotenv`. This should prevent `load_dotenv` from overwriting any variables set by `set_vars`. Let's re-run the tests."
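The “/command argument” syntax used earlier is easy to parse on your side of the pipeline if you want to route commands programmatically. The `parse_command` helper below is a hypothetical sketch, assuming lines of the form `AGENT /command argument`:

```python
def parse_command(line: str):
    """Parse 'AGENT /command argument' lines, e.g. 'DB /personality Shrekify'.

    Returns (agent, command, argument), or None if the line is not a command.
    """
    parts = line.strip().split(maxsplit=2)
    if len(parts) < 2 or not parts[1].startswith("/"):
        return None  # not a command line
    agent, command = parts[0], parts[1].lstrip("/")
    argument = parts[2] if len(parts) == 3 else ""
    return agent, command, argument
```

A dispatcher can then map `(agent, command)` pairs to behavior changes before the text ever reaches the model.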

However, the process isn't perfect. Human limitations such as copy-pasting and prompt/result generation times can create bottlenecks. Also, GPT isn't consistently accurate, sometimes changing the format or offering only partial results. But with practice and the right strategies, you can guide GPT to generate useful and accurate code.

Avoid Hallucinations and Gain Useful Results

Hallucinations, or instances where GPT generates irrelevant, biased, or incorrect information, can be frustrating. However, the more you work with GPT, the better you become at steering it toward desired outputs. One way to minimize hallucinations is to give GPT context by copying and pasting relevant documentation and code samples into your prompt.
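Pasting documentation into the prompt can be automated with a small builder that also guards the context window by truncating oversized context. The `build_grounded_prompt` function and its character budget are illustrative assumptions, not a library API:

```python
def build_grounded_prompt(question: str, docs: list, max_chars: int = 8000) -> str:
    """Prepend relevant documentation to the prompt, truncated to a rough budget."""
    context = "\n\n".join(docs)[:max_chars]  # crude proxy for a token limit
    return (
        "Answer using ONLY the documentation below. "
        "If the answer is not in the documentation, say so.\n\n"
        f"Documentation:\n{context}\n\n"
        f"Question: {question}"
    )
```

The explicit "say so" instruction matters as much as the pasted context: it gives the model a sanctioned way out instead of inventing an answer.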

Example: build a custom LangChain agent with tests:

We have this builder:
class OpenAIBuilder:
    def __init__(self, ...):
        # Set your settings
"You must adapt the following code to match our previous structure:
from langchain.llms import OpenAI
from langchain.agents import AgentType, initialize_agent, load_tools
from langchain.agents.agent_toolkits import create_python_agent
from langchain.tools.python.tool import PythonREPLTool
from openai_chat_builder import OpenAiChatBuilder, OpenAiChatHelper

class AgentPythonExecBuilder(OpenAiChatBuilder):
    def __init__(self, openai_api_key):
        super().__init__(openai_api_key)  # keep the parent builder's setup
        self.tools = ["PythonREPLTool"]

    # We override the add_tool method so it only allows PythonREPLTool.
    def add_tool(self, tool):
        if tool == "PythonREPLTool":
            return super().add_tool(tool)
        raise ValueError("AgentPythonExec can only have PythonREPLTool.")

    def build(self):
        llm = OpenAI(openai_api_key=self.openai_api_key)

        tools = load_tools(self.tools)
        agent = initialize_agent(tools, llm, agent=self.agentType, verbose=self.verbose)

        return OpenAiChatHelper(llm, tools, agent)

Build Langchain agent tests:

Now let's build a test file that resembles this one:
import unittest

from probable_fiesta.config.builder.config_builder import ConfigBuilder
from src.promptnub_backend.app.langchain.builder.openai_factory import OpenAIFactory
from langchain.llms import OpenAI

class TestOpenAIBuilder(unittest.TestCase):
    def setUp(self):
        # Setup API Key
        config_builder = ConfigBuilder()

        # Setup ModelBuilder instance
        self.builder = OpenAIFactory()
        self.builder.openai_api_key = config_builder.dotenv.get_var("OPENAI_API_KEY")

    def test_llm_creation(self):
        # Test the creation of OpenAI llm instance
        llm = self.builder.builder.llm
        self.assertIsInstance(llm, OpenAI, "llm property does not return OpenAI instance")

    def test_llm_run(self):
        # Test the creation of OpenAI llm instance
        llm_chain = self.builder.build()

        # Ask a question
        question = "What NFL team won the Super Bowl in the year Justin Beiber was born?"
        response = llm_chain.run(question)

        # Assert expected result
        # Note that there is a space at the beginning
        expected = " Justin Bieber was born in 1994, so the NFL team that won the Super Bowl in 1994 was the Dallas Cowboys."
        self.assertEqual(str(response), expected)

GPT-4 Response:

Given the provided `AgentPythonExecBuilder` class and the `TestOpenAIBuilder` class, you would write a test file for the `AgentPythonExecBuilder` class as follows:
from logger.builder.logger_factory import LoggerFactory
from config.builder.config_builder import ConfigBuilder
from langchain.llms import OpenAI
from langchain.tools.python.tool import PythonREPLTool
from your_package_path.builder.agent_python_exec_factory import AgentPythonExecFactory

import unittest

# Test Log
LOG_TEST = LoggerFactory.new_logger_default(
    name="test_agent_python_exec_builder", level="DEBUG", fmt="simple", directory="logs"
)
class TestAgentPythonExecBuilder(unittest.TestCase):
    def setUp(self):
        # Setup API Key
        config_builder = ConfigBuilder()
        # Setup ModelBuilder instance
        self.builder = AgentPythonExecFactory()
        self.builder.openai_api_key = config_builder.dotenv.get_var("OPENAI_API_KEY")

    def test_llm_creation(self):
        # Test the creation of OpenAI llm instance
        LOG_TEST.info("Test OpenAI llm creation")
        llm = self.builder.builder.llm
        self.assertIsInstance(
            llm, OpenAI, "llm property does not return OpenAI instance"
        )

    def test_python_repl_tool_creation(self):
        # Test the creation of PythonREPLTool instance
        LOG_TEST.info("Test PythonREPLTool creation")
        tools = self.builder.builder.tools
        self.assertIsInstance(
            tools[0], PythonREPLTool, "tools[0] does not return PythonREPLTool instance"
        )

    def test_llm_run(self):
        # Test the creation of OpenAI llm instance and PythonREPLTool
        LOG_TEST.info("Test OpenAI llm and PythonREPLTool run sync")
        llm_chain = self.builder.build()

        # Ask a question
        question = "What is 2 + 2?"
        response = llm_chain.run(question)

        # Assert expected result
        expected = "The result of 2 + 2 is 4."
        self.assertEqual(str(response), expected)

Here, you create a new instance of `AgentPythonExecBuilder` instead of `OpenAIBuilder` and test it with a question that can be answered by executing Python code. 

Automated Code Generation

The real power of LLMs, however, lies in automated code generation. By using tools such as PubNub, Python, LangChain, and Streamlit, you can automate most of your workflow, leaving you to focus on more important tasks.

With a few more abstractions we can hook our simple LangChain runner into our workflow:

class TestOpenAIBuilder(unittest.TestCase):
    def setUp(self):
        self.runner = Runner()

    def test_run(self):
        response = self.runner.run("What is the capital of France?")
        expected = " France is a country in Europe. The capital of France is Paris."
        self.assertEqual(response, expected)
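The `Runner` used in the test above is not defined in this post. Here is a minimal sketch of one way it might look, with the underlying LangChain chain injected as a plain callable so the workflow can be exercised offline; the class name, constructor, and default stub are assumptions:

```python
class Runner:
    """Thin wrapper around an LLM chain; the chain is injected so tests can stub it."""

    def __init__(self, chain=None):
        # In production, pass the real chain (e.g. llm_chain.run);
        # the default is a no-op stub used only when nothing is supplied.
        self.chain = chain if chain is not None else (lambda prompt: "")

    def run(self, prompt: str) -> str:
        return str(self.chain(prompt))
```

Injecting the chain is what makes the automated workflow testable: unit tests pass a deterministic lambda, while the deployed pipeline passes the real LangChain chain.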

The Road to Developing 100X Faster

Envision a developer with a laser-focused ambition to amplify their coding skills, aspiring to attain 2X, 3X, or even a staggering 50X productivity. The secret lies in the strategic application of LLMs. This toolset acts as a lever on productivity, enabling developers to innovate new platforms and drive automated workflows like never before, from something as simple as text summarization to generating production code.

As we venture further into the sphere of AI and its role in software development, it's important to remember that the true strength lies in the fusion of human creativity with AI abilities. Thoughtful planning, the smart application of AI, and the appropriate digital tools can enable us to transform into a more productive company. This transition is about amplifying human developers' potential with AI, not replacing them. The concept of becoming a 100X company, using AI, isn't just a far-off dream but a reachable future we can shape collectively.