#computerengineering Archives | www.artificialintelligenceupdate.com

Learning DSPy:Optimizing Question Answering of Local LLMs

Revolutionize AI!
Master question-answering with Mistral NeMo, a powerful LLM, alongside Ollama and DSPy. This post explores optimizing ReAct agents for complex tasks using Mistral NeMo’s capabilities and DSPy’s optimization tools. Unlock the Potential of Local LLMs: Craft intelligent AI systems that understand human needs. Leverage Mistral NeMo for its reasoning and context window to tackle intricate queries. Embrace the Future of AI Development: Start building optimized agents today! Follow our guide and code examples to harness the power of Mistral NeMo, Ollama, and DSPy.

Learning DSPy with Ollama and Mistral-NeMo

In the realm of artificial intelligence, the ability to process and understand human language is paramount. One of the most promising advancements in this area is the emergence of large language models like Mistral NeMo, which excel at complex tasks such as question answering. This blog post will explore how to optimize the performance of a ReAct agent using Mistral NeMo in conjunction with Ollama and DSPy. For further insights into the evolving landscape of AI and the significance of frameworks like DSPy, check out our previous blog discussing the future of prompt engineering here.

What is Mistral NeMo?

Mistral NeMo is a state-of-the-art language model developed in partnership with NVIDIA. With 12 billion parameters, it offers impressive capabilities in reasoning, world knowledge, and coding accuracy. One of its standout features is its large context window, which can handle up to 128,000 tokens of text—this allows it to process and understand long passages, making it particularly useful for complex queries and dialogues (NVIDIA).

Key Features of Mistral NeMo

Large Context Window: This allows Mistral NeMo to analyze and respond to extensive texts, accommodating intricate questions and discussions.
State-of-the-Art Performance: The model excels in reasoning tasks, providing accurate and relevant answers.
Collaboration with NVIDIA: By leveraging NVIDIA’s advanced technology, Mistral NeMo incorporates optimizations that enhance its performance.

Challenges in Optimization

While Mistral NeMo is a powerful tool, there are challenges when it comes to optimizing and fine-tuning ReAct agents. One significant issue is that the current documentation does not provide clear guidelines on implementing few-shot learning techniques effectively. This can affect the adaptability and overall performance of the agent in real-world applications (Hugging Face).

What is a ReAct Agent?

Before diving deeper, let’s clarify what a ReAct agent is. ReAct, short for "Reasoning and Acting," refers to AI systems designed to interact with users by answering questions and performing tasks based on user input. These agents can be applied in various fields, from customer service to educational tools (OpenAI).

Integrating DSPy for Optimization

To overcome the challenges mentioned above, we can use DSPy, a framework specifically designed to optimize ReAct agents. Here are some of the key functionalities DSPy offers:

Simulating Traces: This feature allows developers to inspect data and simulate traces through the program, helping to generate both good and bad examples.
Refining Instructions: DSPy can propose or refine instructions based on performance feedback, making it easier to improve the agent’s effectiveness.

Setting Up a ReAct Agent with Mistral NeMo and DSPy

Now that we have a good understanding of Mistral NeMo and DSPy, let’s look at how to set up a simple ReAct agent using these technologies. Below, you’ll find a code example that illustrates how to initialize the Mistral NeMo model through Ollama and optimize it using DSPy.

Code Example

Here’s a sample code that Uses a dataset called HotPotQA and ColBertV2 a Dataset Retrieval model to test and optimise a ReAct Agent that is using mistral-nemo-latest as the llm

Step-by-Step Breakdown of the Code

1. Importing Libraries configuring Datasets:

First We will import DSpy libraries evaluate,datasets,teleprompt.
The first one is used to check the performance of a dspy agent.
The second one is used to load inbuilt datasets to evaluate the performance of the LLms
The third one is used as an optimisation framework for training and tuning the prompts that are provided to the LLMs


import dspy
from dspy.evaluate import Evaluate
from dspy.datasets.hotpotqa import HotPotQA
from dspy.teleprompt import BootstrapFewShotWithRandomSearch

ollama=dspy.OllamaLocal(model='mistral-nemo:latest')
colbert = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')
dspy.configure(lm=ollama, rm=colbert)

2. Loading some data:

We will now load the Data and segment to into training data, testing data and development data


dataset = HotPotQA(train_seed=1, train_size=200, eval_seed=2023, dev_size=300, test_size=0)
trainset = [x.with_inputs('question') for x in dataset.train[0:150]]
valset = [x.with_inputs('question') for x in dataset.train[150:200]]
devset = [x.with_inputs('question') for x in dataset.dev]

# show an example datapoint; it's just a question-answer pair
trainset[23]

3. Creating a ReAct Agent:

First we will make a default (Dumb 😂) ReAct agent

agent = dspy.ReAct("question -> answer", tools=[dspy.Retrieve(k=1)])

4. Evaluting the agent:

Set up an evaluator on the first 300 examples of the devset.

config = dict(num_threads=8, display_progress=True, display_table=25)
evaluate = Evaluate(devset=devset, metric=dspy.evaluate.answer_exact_match, **config)

evaluate(agent)

5. Optimizing the ReAct Agent:

Now we will (try to) put some brains into the dumb agent by training it

config = dict(max_bootstrapped_demos=2, max_labeled_demos=0, num_candidate_programs=5, num_threads=8)
tp = BootstrapFewShotWithRandomSearch(metric=dspy.evaluate.answer_exact_match, **config)
optimized_react = tp.compile(agent, trainset=trainset, valset=valset)

6. Testing the Agent:

Now we will check if the agents have become smart (enough)

evaluate(optimized_react)

Conclusion

Integrating MistralNeMo with Ollama and DSPy presents a powerful framework for developing and optimizing question-answering ReAct agents. By leveraging the model’s extensive capabilities, including its large context window tool calling capabilities and advanced reasoning skills, developers can create AI agents that efficiently handle complex queries with high accuracy in a local setting.

However, it’s essential to address the gaps in current documentation regarding optimization techniques for Local and opensource models and agents. By understanding these challenges and utilizing tools like DSPy, developers can significantly enhance the performance of their AI projects.

As AI continues to evolve, the integration of locally running models like Mistral NeMo will play a crucial role in creating intelligent systems capable of understanding and responding to human needs. With the right tools and strategies, developers can harness the full potential of these technologies, ultimately leading to more sophisticated and effective AI applications.

By following the guidance provided in this blog post, you can start creating your own optimized question-answering agents using Mistral NeMo, Ollama, and DSPy. Happy coding!

References

Creating ReAct AI Agents with Mistral-7B/Mixtral and Ollama using … Creating ReAct AI Agents with Mistral-7B/Mixtral a…
Mistral NeMo – Hacker News Mistral NeMo offers a large context window of up to 128k tokens. Its reasoning, …
Lack of Guidance on Optimizing/Finetuning ReAct Agent with Few … The current ReAct documentation lacks clear instructions on optimizing or fine…
Introducing Mistral NeMo – Medium Mistral NeMo is an advanced 12 billion parameter model developed in co…
Optimizing Multi-Agent Systems with Mistral Large, Nemo … – Zilliz Agents can handle complex tasks with minimal human intervention. Learn how to bu…
mistral-nemo – Ollama Mistral NeMo is a 12B model built in collaboration with NVIDIA. Mistra…
Mistral NeMo : THIS IS THE BEST LLM Right Now! (Fully … – YouTube … performance loss. Multilingual Support: The new Tekken t…
dspy/README.md at main · stanfordnlp/dspy – GitHub Current DSPy optimizers can inspect your data, simulate traces …
Is Prompt Engineering Dead? DSPy Says Yes! AI&U

Your thoughts matter—share them with us on LinkedIn here.

Want the latest updates? Visit AI&U for more in-depth articles now.

## Declaration:

### The whole blog itself is written using Ollama, CrewAi and DSpy

👀

—

Is Prompt Engineering Dead? DSPy Says Yes!

DSPy,
a new programming framework, is revolutionizing how we interact with language models. Unlike traditional manual prompting, DSPy offers a systematic approach that enhances reliability and flexibility. By focusing on what you want to achieve, DSPy simplifies development and allows for more robust applications. This open-source Python framework is ideal for chatbots, recommendation systems, and other AI-driven tasks. Try DSPy today and experience the future of AI programming.

Introduction to DSPy: The Prompt Progamming Language

In the world of technology, programming languages and frameworks are the backbone of creating applications that help us in our daily lives. One of the exciting new developments in this area is DSPy, a programming framework that promises to revolutionize how we interact with language models and retrieval systems. In this blog post, we will explore what DSPy is, its advantages, the modular design it employs, and how it embraces a declarative programming style. We will also look at some practical use cases, and I’ll provide you with a simple code example to illustrate how DSPy works.

What is DSPy?

DSPy, short for "Declarative Systems for Prompting," is an open-source Python framework designed to simplify the development of applications that utilize language models (LMs) and retrieval models (RMs). Unlike traditional methods that rely heavily on manually crafted prompts to get responses from language models, DSPy shifts the focus to systematic programming.

Why DSPy Matters

Language models like GPT-3, llama3.1 and others have become incredibly powerful tools for generating human-like text. However, using them effectively can often feel like a trial-and-error process. Developers frequently find themselves tweaking prompts endlessly, trying to coax the desired responses from these models. This approach can lead to inconsistent results and can be quite fragile, especially when dealing with complex applications.

DSPy addresses these issues by providing a framework that promotes reliability and flexibility. It allows developers to create applications that can adapt to different inputs and requirements, enhancing the overall user experience.

Purpose and Advantages of DSPy

1. Enhancing Reliability

One of the main goals of DSPy is to tackle the fragility commonly associated with language model applications. By moving away from a manual prompting approach, DSPy enables developers to build applications that are more robust. This is achieved through systematic programming that reduces the chances of errors and inconsistencies.

2. Streamlined Development Process

With DSPy, developers can focus on what they want to achieve rather than getting bogged down in how to achieve it. This shift in focus simplifies the development process, making it easier for both experienced and novice programmers to create effective applications.

3. Modular Design

DSPy promotes a modular design, allowing developers to construct pipelines that can easily integrate various language models and retrieval systems. This modularity enhances the maintainability and scalability of applications. Developers can build components that can be reused and updated independently, making it easier to adapt to changing requirements.

Declarative Programming: A New Approach

One of the standout features of DSPy is its support for declarative programming. This programming style allows developers to specify what they want to achieve without detailing how to do it. For example, instead of writing out every step of a process, a developer can express the desired outcome, and the framework handles the underlying complexity.

Benefits of Declarative Programming

Simplicity: By abstracting complex processes, developers can focus on higher-level logic.
Readability: Code written in a declarative style is often easier to read and understand, making it accessible to a broader audience.
Maintainability: Changes can be made more easily without needing to rework intricate procedural code.

Use Cases for DSPy

DSPy is particularly useful for applications that require dynamic adjustments based on user input or contextual changes. Here are a few examples of where DSPy can shine:

1. Chatbots

Imagine a chatbot that can respond to user queries in real-time. With DSPy, developers can create chatbots that adapt their responses based on the conversation\’s context, leading to more natural and engaging interactions.

2. Recommendation Systems

Recommendation systems are crucial for platforms like Netflix and Amazon, helping users discover content they might enjoy. DSPy can help build systems that adjust recommendations based on user behavior and preferences, making them more effective.

3. AI-driven Applications

Any application that relies on natural language processing can benefit from DSPy. From summarizing articles to generating reports, DSPy provides a framework that can handle various tasks efficiently.

Code Example: Getting Started with DSPy

To give you a clearer picture of how DSPy works, let’s look at a simple code example. This snippet demonstrates the basic syntax and structure of a DSPy program.If you have Ollama running in your PC (Check this guide) even you can run the code, Just change the LLM in the variable model to the any one LLM you have.

To know what LLM you have to to terminal type ollama serve.

Then open another terminal type ollama list.

Let\’s jump into the code example:

# install DSPy: pip install dspy
import dspy

# Ollam is now compatible with OpenAI APIs
# 
# To get this to work you must include model_type='chat' in the dspy.OpenAI call. 
# If you do not include this you will get an error. 
# 
# I have also found that stop='\n\n' is required to get the model to stop generating text after the ansewr is complete. 
# At least with mistral.

ollama_model = dspy.OpenAI(api_base='http://localhost:11434/v1/', api_key='ollama', model='crewai-llama3.1:latest', stop='\n\n', model_type='chat')

# This sets the language model for DSPy.
dspy.settings.configure(lm=ollama_model)

# This is not required but it helps to understand what is happening
my_example = {
    question: What game was Super Mario Bros. 2 based on?,
    answer: Doki Doki Panic,
}

# This is the signature for the predictor. It is a simple question and answer model.
class BasicQA(dspy.Signature):
    Answer questions about classic video games.

    question = dspy.InputField(desc=a question about classic video games)
    answer = dspy.OutputField(desc=often between 1 and 5 words)

# Define the predictor.
generate_answer = dspy.Predict(BasicQA)

# Call the predictor on a particular input.
pred = generate_answer(question=my_example['question'])

# Print the answer...profit :)
print(pred.answer)

Understanding DSPy Code Step by Step

Step 1: Installing DSPy

Before we can use DSPy, we need to install it. We do this using a command in the terminal (or command prompt):

pip install dspy

What This Does:

pip is a tool that helps you install packages (like DSPy) that you can use in your Python programs.
install dspy tells pip to get the DSPy package from the internet.

Step 2: Importing DSPy

Next, we need to bring DSPy into our Python program so we can use it:

import dspy

What This Does:

import dspy means we want to use everything that DSPy offers in our code.

Step 3: Setting Up the Model

Now we need to set up the language model we want to use. This is where we connect to a special service (Ollama) that helps us generate answers:

ollama_model = dspy.OpenAI(api_base='http://localhost:11434/v1/', api_key='ollama', model='crewai-llama3.1:latest', stop='\n\n', model_type='chat')

What This Does:

dspy.OpenAI(...) is how we tell DSPy to use the OpenAI service.
api_base is the address where the service is running.
api_key is like a password that lets us use the service.
model tells DSPy which specific AI model to use.
stop='\n\n' tells the model when to stop generating text (after it finishes answering).
model_type='chat' specifies that we want to use a chat-like model.

Step 4: Configuring DSPy Settings

Now we set DSPy to use our model:

dspy.settings.configure(lm=ollama_model)

What This Does:

This line tells DSPy to use the ollama_model we just set up for generating answers.

Step 5: Creating an Example

We create a simple example to understand how our question and answer system will work:

my_example = {
    question: What game was Super Mario Bros. 2 based on?,
    answer: Doki Doki Panic,
}

What This Does:

my_example is a dictionary (like a box that holds related information) with a question and its answer.

Step 6: Defining the Question and Answer Model

Next, we define a class that describes what our question and answer system looks like:

class BasicQA(dspy.Signature):
    Answer questions about classic video games.

    question = dspy.InputField(desc=a question about classic video games)
    answer = dspy.OutputField(desc=often between 1 and 5 words)

What This Does:

class BasicQA(dspy.Signature): creates a new type of object that can handle questions and answers.
question is where we input our question.
answer is where we get the answer back.
The desc tells us what kind of information we should put in or expect.

Step 7: Creating the Predictor

Now we create a predictor that will help us generate answers based on our questions:

generate_answer = dspy.Predict(BasicQA)

What This Does:

dspy.Predict(BasicQA) creates a function that can take a question and give us an answer based on the BasicQA model we defined.

Step 8: Getting an Answer

Now we can use our predictor to get an answer to our question:

pred = generate_answer(question=my_example['question'])

What This Does:

We call generate_answer with our example question, and it will return an answer, which we store in pred.

Step 9: Printing the Answer

Finally, we print out the answer we got:

print(pred.answer)

What This Does:

This line shows the answer generated by our predictor on the screen.

Summary

In summary, this code sets up a simple question-and-answer system using DSPy and a language model. Here’s what we did:

Installed DSPy: We got the package we need.
Imported DSPy: We brought it into our code.
Set Up the Model: We connected to the AI model.
Configured DSPy: We told DSPy to use our model.
Created an Example: We made a sample question and answer.
Defined the Model: We explained how our question and answer system works.
Created the Predictor: We made a function to generate answers.
Got an Answer: We asked our question and got an answer.
Printed the Answer: We showed the answer on the screen.

Now you can ask questions about classic films and video games and get answers using this code! To know how, wait for the next part of the blog

Interesting Facts about DSPy

Developed by Experts: DSPy was developed by researchers at Stanford University, showcasing a commitment to improving the usability of language models in real-world applications.
User-Friendly Design: The framework is designed to be accessible, catering to developers with varying levels of experience in AI and machine learning.
Not Just About Prompts: DSPy emphasizes the need for systematic approaches that can lead to better performance and user experience, moving beyond just replacing hard-coded prompts.

Conclusion

In conclusion, DSPy represents a significant advancement in how developers can interact with language models. By embracing programming over manual prompting, DSPy opens up new possibilities for building sophisticated AI applications that are both flexible and reliable. Its modular design, support for declarative programming, and focus on enhancing reliability make it a valuable tool for developers looking to leverage the power of language models in their applications.

Whether you\’re creating a chatbot, a recommendation system, or any other AI-driven application, DSPy provides the framework you need to streamline your development process and improve user interactions. As the landscape of AI continues to evolve, tools like DSPy will be essential for making the most of these powerful technologies.

With DSPy, the future of programming with language models looks promising, and we can’t wait to see the innovative applications that developers will create using this groundbreaking framework. So why not give DSPy a try and see how it can transform your approach to building AI applications?

References

dspy/intro.ipynb at main · stanfordnlp/dspy – GitHub This notebook introduces the DSPy framework for Programming with Foundation Mode…
An Introduction To DSPy – Cobus Greyling – Medium DSPy is designed for scenarios where you require a lightweight, self-o…
DSPy: The framework for programming—not prompting—foundation … DSPy is a framework for algorithmically optimizing LM prompts and weig…
Intro to DSPy: Goodbye Prompting, Hello Programming! – YouTube … programming-4ca1c6ce3eb9 Source Code: Coming Soon. ……
An Exploratory Tour of DSPy: A Framework for Programing … – Medium In this article, I examine what\’s about DSPy that is promisi…
A gentle introduction to DSPy – LearnByBuilding.AI This blog post provides a comprehensive introduction to DSPy, focu…
What Is DSPy? How It Works, Use Cases, and Resources – DataCamp DSPy is an open-source Python framework that allows developers…
Who is using DSPy? : r/LocalLLaMA – Reddit DSPy does not do any magic with the language model. It just uses a bunch of prom…
Intro to DSPy: Goodbye Prompting, Hello Programming! DSPy [1] is a framework that aims to solve the fragility problem in la…
Goodbye Manual Prompting, Hello Programming With DSPy The DSPy framework aims to resolve consistency and reliability issues by prior…

Expand your professional network—let’s connect on LinkedIn today!

Enhance your AI knowledge with AI&U—visit our website here.

Declaration: the whole blog itself is written using Ollama, CrewAi and DSpy 👀

@keyframes blink {
    0%, 100% { opacity: 1; }
    50% { opacity: 0; }
}