Have you ever asked an LLM a tough math question and seen it confidently give the wrong answer? That doesn’t mean the LLM isn’t smart. LLMs are built to predict the next word in a sentence, not to solve equations or search the internet. They’re more like poets than calculators. The good news is we can give them extra tools. In this article, I’ll walk you through a guided project to add reasoning skills to your LLM apps.
Add Reasoning Skills to Your LLM
If I asked you to find the square root of 54,321, you wouldn’t just guess. You’d probably take out your phone, open the calculator, and then tell me the answer.
That’s exactly what we want our AI to do. This approach is called ReAct, which stands for Reason and Act. Here’s how it works:
- Thought: The AI realizes it doesn’t know the answer.
- Action: It decides to use a specific tool (like a Search Engine or Calculator).
- Observation: It sees the result of that tool.
- Answer: It synthesizes the final response.
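For example, for a question like “What is the petrol price in Delhi today?”, one pass through that pattern might look like this (a made-up trace, just to show the shape, not real model output):

Thought: I don't know today's petrol price, so I should look it up.
Action: search("petrol price in Delhi today")
Observation: Petrol in Delhi is around ₹94.81 per litre.
Answer: Petrol in Delhi costs roughly ₹94.81 per litre today.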
Let’s build this step by step.
Step 1: The Setup
Before we start coding, we need a few tools. We’ll use three main components:
- Python: Our programming language.
- Ollama: This lets you run a strong open-source LLM, like Llama 3, right on your computer.
- DuckDuckGo Search (ddgs): This gives our LLM access to up-to-date information from the web.
First, install Ollama and download the Llama 3 model by running: ollama pull llama3. Next, install the needed Python libraries:
pip install ollama ddgs
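If you want to check the setup before writing any agent code, a minimal sanity test like this (my own suggestion; it assumes the llama3 model has already been pulled) should print a short greeting:

import ollama

# Quick check that Ollama is running and llama3 is available
response = ollama.chat(
    model="llama3",
    messages=[{"role": "user", "content": "Say hello in one short sentence."}]
)
print(response["message"]["content"])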
Now we can start coding.
Step 2: Building the Tools
An agent is only as useful as the tools it has. We need to create Python functions the AI can use when it needs help.
First, we’ll build a web search tool. Since our AI only knows what it learned during training, it needs to search the web to answer current questions. Here’s how to make this tool:
import re
import ollama
from ddgs import DDGS

# TOOL 1: SEARCH
def search_web(query):
    try:
        with DDGS() as ddgs:
            results = list(ddgs.text(query, max_results=3))
        if results:
            return "\n".join([f"- {r.get('body','')} (source: {r.get('href','')})" for r in results])
        return "No results found."
    except Exception as e:
        return f"Error during search: {e}"
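Before plugging this into the agent, you can test the tool on its own. A quick check like this (the query is just an example) should print a few result snippets with their sources, or an error string if the request fails:

# Standalone test of the search tool
print(search_web("current petrol price in Delhi"))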
Next, we’ll create a tool for math. LLMs often struggle with arithmetic because they see numbers as text. We solve this by letting them use Python’s math features directly. Here’s how to build this tool:

# TOOL 2: CALCULATE
def calculate(expression: str):
    # allow only safe math characters
    if not re.fullmatch(r"[0-9\.\+\-\*\/\(\)\s\^]+", expression):
        return {"status": "error", "error": "Invalid math expression"}
    try:
        # support ^ as an exponent operator
        expression = expression.replace("^", "**")
        val = eval(expression, {"__builtins__": {}})
        return {"status": "success", "value": float(val)}
    except Exception as e:
        return {"status": "error", "error": str(e)}

Step 3: Building the Loop
This is the main part of the project. We’ll set up a loop where the AI can think and then act. The AI will follow a clear format: Thought, then Action, then Observation:
def run_agent(question):
    print(f"\n--- Question: {question} ---")

    system_prompt = (
        "You are a reasoning agent.\n"
        "You have 2 tools: SEARCH and CALCULATE.\n\n"
        "FORMAT:\n"
        "Thought: ...\n"
        "ACTION: SEARCH: <query>\n"
        "OR\n"
        "ACTION: CALCULATE: <expression>\n"
        "OR\n"
        "ANSWER: <final answer>\n\n"
        "Rules:\n"
        "- After you receive a successful calculation result, you MUST give ANSWER.\n"
        "- Do not repeat calculations unnecessarily.\n"
    )

    history = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question}
    ]

    max_steps = 12
    calc_count = 0
    search_count = 0
    last_calc_value = None

    for i in range(max_steps):
        response = ollama.chat(model="llama3", messages=history)
        content = response["message"]["content"].strip()
        history.append({"role": "assistant", "content": content})

        # ----------------------------
        # ACTION: SEARCH
        # ----------------------------
        if "ACTION: SEARCH:" in content:
            search_count += 1
            query = content.split("ACTION: SEARCH:")[1].split("\n")[0].strip()
            print(f"Step {i+1}: Searching web for '{query}'...")
            result = search_web(query)
            history.append({"role": "user", "content": f"OBSERVATION: {result}"})

        # ----------------------------
        # ACTION: CALCULATE
        # ----------------------------
        elif "ACTION: CALCULATE:" in content:
            calc_count += 1
            expr = content.split("ACTION: CALCULATE:")[1].split("\n")[0].strip()
            print(f"Step {i+1}: Calculating '{expr}'...")
            result = calculate(expr)
            history.append({"role": "user", "content": f"OBSERVATION: {result}"})

            # If calc worked, force final answer
            if result["status"] == "success":
                last_calc_value = result["value"]
                history.append({
                    "role": "user",
                    "content": f"OBSERVATION: Calculation succeeded with value {last_calc_value}. "
                               f"Now you MUST provide ANSWER using this value."
                })

            # prevent endless calculations
            if calc_count >= 4:
                history.append({
                    "role": "user",
                    "content": "OBSERVATION: You have made enough calculations. Provide ANSWER now."
                })

        # ----------------------------
        # ANSWER
        # ----------------------------
        elif "ANSWER:" in content:
            final_answer = content.split("ANSWER:")[1].strip()
            print(f"\n Final Answer: {final_answer}")
            return final_answer

        else:
            print(f"Step {i+1} (Thinking): {content}")

    # fallback finalization
    if last_calc_value is not None:
        fallback = f"Approx final value: {last_calc_value:.2f}"
        print("\n Fallback Answer:", fallback)
        return fallback

    print("\n Agent failed to answer within step limit.")
    return None

In this case, we give the model clear instructions on how to respond. For example, we tell it, “If you want to search, write ACTION: SEARCH:”. This approach changes the unpredictable text from an LLM into structured commands that our code can understand.
The key idea of the loop is that we don’t just ask once. We let the model take multiple steps, feeding each tool result back in as an OBSERVATION so it can decide what to do next.
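To make that concrete, here’s a tiny standalone illustration of how the parsing inside the loop turns free-form model text into a tool call; it uses the same split logic as run_agent on a made-up reply:

# Hypothetical model reply in the agreed format
content = "Thought: I need the current petrol price.\nACTION: SEARCH: petrol price delhi today"

if "ACTION: SEARCH:" in content:
    # Take everything after the marker, up to the end of that line
    query = content.split("ACTION: SEARCH:")[1].split("\n")[0].strip()
    print(query)  # prints: petrol price delhi today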
Step 4: Putting It All Together
Finally, let’s test our agent with a real example that needs both up-to-date information and math:
if __name__ == "__main__":
    run_agent("How many liters of petrol can I buy in Delhi today with ₹1500?")

When you run the script, you’ll see the AI working step by step in real time. It will look like this:
(env) (base) amankharwal@Amans-MacBook-Pro aiagent % python reasoning_LLM.py
--- Question: How many liters of petrol can I buy in Delhi today with ₹1500? ---
Step 1: Searching web for '"petrol price delhi"'...
Step 2: Calculating '₹1500 / 101.41 (current Delhi petrol price per litre)'...
Step 3: Calculating '₹1500 / (current Delhi petrol price per litre)'...
Step 4: Searching web for '"current petrol price delhi"'...
Step 5: Searching web for '"current petrol price delhi"'...
Step 6: Calculating '₹1500 / 94.81 (current Delhi petrol price per litre)'...
Step 7: Calculating '₹1500 / 94.81'...
Final Answer: You can buy approximately 15.83 liters of petrol in Delhi today with ₹1500.
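The same agent handles other questions that mix lookup and arithmetic. For instance, you could try the square-root question from the introduction (the exact trace will differ from run to run):

run_agent("What is the square root of 54321?")

One thing to note: the regex in calculate only allows digits and basic operators, so the model has to express the root as an exponent, for example 54321^0.5, rather than calling a sqrt function.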
Closing Thoughts
That’s how you can add reasoning skills to your LLM apps. Building agents like this is all about putting the right pieces together. We’re moving from chatbots that only talk to apps that can actually do things.
As you try out this code, keep in mind that the AI does the reasoning, but you design how it works. The better the tools you give it, the smarter it becomes. Today it’s a calculator; tomorrow, it could help with your email, calendar, or even your whole database.
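As a small sketch of what such an extension could look like (my own illustration, not part of the code above), adding a third tool mostly means writing the function, mentioning it in the system prompt, and handling it with one more elif branch in the loop:

from datetime import date

# TOOL 3: TODAY (hypothetical extra tool)
def get_today():
    return date.today().isoformat()

# In the system prompt, add a line such as:
#   "ACTION: TODAY\n"
# and in run_agent's loop, handle it with another branch:
#   elif "ACTION: TODAY" in content:
#       history.append({"role": "user", "content": f"OBSERVATION: {get_today()}"})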
If you found this article useful, you can follow me on Instagram for daily AI tips and practical resources. You might also like my latest book, Hands-On GenAI, LLMs & AI Agents. It’s a step-by-step guide to help you get ready for jobs in today’s AI field.