Day 1 - Module 1: Basic LLM Interaction & Tool Calling
Objective: Understand how to interact with Large Language Models (LLMs) programmatically, handle streaming responses, and enable models to use external tools.
Source Code: src/01-basics/
Introduction
At the heart of AI agents lies the ability to interact with powerful Large Language Models (LLMs). These models can understand and generate human-like text, answer questions, translate languages, and much more. In this first module, we will explore the fundamental ways to communicate with an LLM using the OpenAI API pattern (as used by the GitHub Models endpoint in this repository) and introduce the concept of "tool calling," which allows the LLM to interact with external systems or functions.
We will cover:
- Basic API Calls: Sending a prompt to the model and receiving a complete response.
- Streaming Responses: Receiving the model's response incrementally as it's generated.
- Tool Calling: Defining functions (tools) that the model can request to use to gather information or perform actions.
Core Concept: LLM Interaction
Understanding how to structure API calls, manage conversation history (messages), and interpret responses is foundational for building any LLM-powered application.
Setup Review
Prerequisites
Before running the examples, ensure you have:
- Cloned the agentic-playground repository.
- Installed the required Python packages (pip install -r requirements.txt).
- Created a .env file in the repository root with your GITHUB_TOKEN (a Personal Access Token; no specific permissions are needed for GitHub Models inference).
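For reference, a minimal .env file contains a single line (the token value below is a placeholder, not a real token):

GITHUB_TOKEN=ghp_your_personal_access_token_here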
1. Hello World: Basic API Interaction
File: src/01-basics/hello-world.py
This script demonstrates the simplest form of interaction: sending a message to the LLM and getting a single, complete response back.
Code Breakdown:
- Import necessary libraries: os for environment variables, OpenAI for the client, and load_dotenv to load the .env file.
- Initialize the OpenAI Client:
  - base_url: points to the GitHub Models inference endpoint.
  - api_key: reads the GITHUB_TOKEN from your environment variables (loaded from .env).
client = OpenAI(
    base_url="https://models.inference.ai.azure.com",
    api_key=os.environ["GITHUB_TOKEN"],
)
- Define the Conversation:
  - Messages are provided as a list of dictionaries, each with a role (system, user, or assistant) and content.
  - The system message sets the context or instructions for the model (e.g., "antworte alles in französisch" - answer everything in French).
  - The user message contains the user's query.
messages=[
    {
        "role": "system",
        "content": "antworte alles in französisch",
    },
    {
        "role": "user",
        "content": "What is the capital of France?",
    }
]
- Call the Chat Completions API:
  - client.chat.completions.create() sends the request.
  - messages: the conversation history/prompt.
  - model: specifies the model to use (e.g., gpt-4o-mini).
  - temperature, max_tokens, top_p: control the creativity, length, and sampling strategy of the response.
response = client.chat.completions.create(
    messages=messages,
    model="gpt-4o-mini",
    temperature=1,
    max_tokens=4096,
    top_p=1
)
- Print the Response:
  - The model's reply is found within the response object.
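A minimal way to access and print it (the exact statement in hello-world.py may differ slightly):

print(response.choices[0].message.content)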
To Run:
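Assuming the same directory layout used by the tool-calling example later in this module:

cd /home/ubuntu/agentic-playground/src/01-basics
python hello-world.py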
You should see the answer to "What is the capital of France?" printed in French.
2. Streaming Output
File: src/01-basics/streaming-output.py
Waiting for the entire response can take time, especially for longer answers. Streaming allows you to receive the response piece by piece, improving the perceived responsiveness of the interaction.
Code Breakdown:
- Client Setup: similar to hello-world.py.
- API Call with Streaming:
  - The key difference is stream=True.
  - stream_options={'include_usage': True} optionally requests token usage information at the end of the stream.
response = client.chat.completions.create(
    messages=[
        # ... (system and user messages) ...
    ],
    model=model_name,
    stream=True,
    stream_options={'include_usage': True}
)
- Processing the Stream:
  - The response object is now an iterator.
  - We loop through each update in the stream.
  - Each update can contain a small piece of the response text (delta.content). We print these pieces immediately.
  - If usage information is included, it appears in a final update.
usage = None
for update in response:
    if update.choices and update.choices[0].delta:
        print(update.choices[0].delta.content or "", end="")  # Print chunk without newline
    if update.usage:
        usage = update.usage

if usage:
    print("\n")  # Add newline after full response
    for k, v in usage.model_dump().items():
        print(f"{k} = {v}")
To Run:
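As with the other examples (same assumed directory layout):

cd /home/ubuntu/agentic-playground/src/01-basics
python streaming-output.py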
You will see the reasons for exercising appear on the console incrementally, followed by the token usage statistics.
3. Tool Calling: Extending LLM Capabilities
File: src/01-basics/tool-calling.py
LLMs are trained on vast datasets but lack real-time information and the ability to perform actions in the real world. Tool calling allows the LLM to request the execution of predefined functions (tools) to overcome these limitations.
Concept:
- Define Tools: You describe available functions (like getting the current time, searching the web, etc.) to the LLM, including their names, descriptions, and expected parameters.
- LLM Request: When the LLM determines it needs a tool to answer a user's query, it doesn't directly answer but instead outputs a special message indicating which tool to call and with what arguments.
- Execute Tool: Your code receives this request and executes the corresponding function with the provided arguments.
- Provide Result: You send the function's return value back to the LLM.
- Final Response: The LLM uses the tool's result to formulate the final answer to the user.
Why Tool Calling is Powerful
Tool calling transforms LLMs from passive text generators into active agents capable of interacting with APIs, databases, or custom code, vastly expanding their potential applications.
Code Breakdown:
- Import additional libraries: json for parsing arguments, pytz and datetime for the time function.
- Define the Tool Function: a standard Python function (get_current_time) that takes a city name and returns the time. Note the docstring, which helps the LLM understand what the function does.
import pytz
from datetime import datetime
def get_current_time(city_name: str) -> str:
"""Returns the current time in a given city."""
# ... (implementation using pytz) ...
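The body is elided above. A minimal sketch of one possible implementation, assuming a small hand-written city-to-timezone mapping (the actual lookup in tool-calling.py may differ):

import pytz
from datetime import datetime

def get_current_time(city_name: str) -> str:
    """Returns the current time in a given city."""
    # Hypothetical lookup table; the real script may support different cities.
    city_timezones = {
        "london": "Europe/London",
        "paris": "Europe/Paris",
        "new york": "America/New_York",
        "tokyo": "Asia/Tokyo",
    }
    tz_name = city_timezones.get(city_name.strip().lower())
    if tz_name is None:
        return f"Sorry, I don't have timezone data for {city_name}."
    now = datetime.now(pytz.timezone(tz_name))
    return now.strftime("%H:%M (%Y-%m-%d)")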
- Define the Tool Schema: a dictionary describing the function to the LLM.
  - type: always "function".
  - function: contains the details:
    - name: must match the Python function name.
    - description: crucial for the LLM to know when to use the tool.
    - parameters: describes the arguments (name, type, description, required).
tool={
    "type": "function",
    "function": {
        "name": "get_current_time",
        "description": """Returns information about the current time...""",
        "parameters": {
            "type": "object",
            "properties": {
                "city_name": {
                    "type": "string",
                    "description": "The name of the city...",
                }
            },
            "required": ["city_name"],
        },
    },
}
Designing Good Tool Descriptions
The description field in the tool schema is critical. It should clearly and concisely explain what the tool does and when it should be used. Use natural language that the LLM can easily understand.
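As an illustration (not taken from the repository), compare a terse description with one that also tells the model when the tool applies:

# Illustrative only: two candidate values for the "description" field.
too_terse = "get time"
actionable = (
    "Returns the current local time in a given city. "
    "Use this tool whenever the user asks what time it is in a specific place."
)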
- Initial API Call with Tools:
  - The tools parameter is added to the create call, listing the available tools.
response = client.chat.completions.create(
    messages=messages,
    tools=[tool],  # Pass the tool definition
    model=model_name,
)
- Handling Tool Call Response:
  - Check if finish_reason is tool_calls.
  - Append the model's request message to the history.
  - Extract the tool_call information (ID, function name, arguments).
  - Parse the JSON arguments.
  - Crucially, call the actual Python function (locals()[tool_call.function.name](**function_args)).
  - Append the tool's result back to the message history, using the tool_call_id and role: "tool".
if response.choices[0].finish_reason == "tool_calls":
    messages.append(response.choices[0].message)  # Append assistant's request
    tool_call = response.choices[0].message.tool_calls[0]
    if tool_call.type == "function":
        function_args = json.loads(tool_call.function.arguments)
        callable_func = locals()[tool_call.function.name]
        function_return = callable_func(**function_args)
        messages.append(  # Append tool result
            {
                "tool_call_id": tool_call.id,
                "role": "tool",
                "name": tool_call.function.name,
                "content": function_return,
            }
        )
- Second API Call: Call the model again with the updated message history (including the tool result).
response = client.chat.completions.create(
    messages=messages,
    tools=[tool],
    model=model_name,
)
print(f"Model response = {response.choices[0].message.content}")
To Run:
cd /home/ubuntu/agentic-playground/src/01-basics
# Install pytz if you haven't: pip install pytz
python tool-calling.py
You will see output indicating the function call (Calling function 'get_current_time'...), the function's return value, and finally the model's response incorporating the time information.
Further Reading & Resources
To deepen your understanding of basic LLM interaction and tool calling, explore these resources:
- General LLM Interaction:
- Tool Calling Specifics:
- LangChain Documentation: Tool calling concepts
- LangChain Blog Post: Tool Calling with LangChain
- Analytics Vidhya Guide: Guide to Tool Calling in LLMs
- Medium Tutorial: Tool Calling for LLMs: A Detailed Tutorial
- Apideck Introduction: An introduction to function calling and tool use
- Mistral AI Docs: Function calling
- The Register Guide: A quick guide to tool-calling in large language models
This module covered the basics of interacting with LLMs and enabling them to use tools. In the next module, we'll explore how models can handle different types of input, specifically images and voice.