How do LLMs use tools?

Tool use lets LLMs take actions beyond text generation. They can search the web, run code, query databases, and interact with external systems.

How can an AI search the web or run code when it's just predicting text?

On its own, an LLM can only generate text. It can't browse the internet, execute programs, or access databases. It has no hands, no eyes, no connection to the outside world.

But LLMs can be given tools: external functions they can choose to call. When the model decides it needs information or needs to take action, it outputs a special tool-calling format. The system intercepts this, executes the tool, and returns the result. The model continues with this new information.

Suddenly, a text predictor becomes an agent that can act in the world.

How tool use works

The system provides tool definitions: name, description, parameters. These definitions are injected into the prompt, or passed through a dedicated field when the API supports structured tool use.

Available tools:
- web_search(query: string): Search the web and return results
- calculator(expression: string): Evaluate a math expression
- get_weather(location: string): Get current weather

When the model determines it needs a tool, instead of generating normal text, it outputs a structured tool call:

{"tool": "web_search", "query": "current population of Tokyo"}

The system executes the search, gets results, and feeds them back to the model. The model then generates its response using this real information.
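This intercept-execute-return loop is the core of tool use. A minimal sketch in Python, with stand-in tool functions (a real system would call actual search and math backends, and the tool-call format would match the provider's API):

```python
import json

# Stand-in tool implementations; a real system would hit real backends.
def web_search(query):
    return f"[search results for: {query}]"

def calculator(expression):
    # Restricted eval for illustration only; never eval untrusted input.
    return str(eval(expression, {"__builtins__": {}}))

TOOLS = {"web_search": web_search, "calculator": calculator}

def handle_model_output(output):
    """Intercept a structured tool call, run the named tool, and return
    the result string to append to the model's context."""
    call = json.loads(output)
    args = {k: v for k, v in call.items() if k != "tool"}
    return TOOLS[call["tool"]](**args)
```

Calling `handle_model_output('{"tool": "calculator", "expression": "2 + 2"}')` returns `"4"`, which the system feeds back into the conversation so the model can continue with the real answer.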

Why tools matter

Without tools, LLMs are limited to their training data. They can't:

  • Access current information (their knowledge has a cutoff)
  • Perform precise calculations (they approximate math)
  • Interact with your systems (files, databases, APIs)
  • Take actions (send emails, create documents)

Tools bridge this gap. They connect the model's reasoning to real-world capabilities.

Function calling: the technical implementation

Modern APIs provide structured function calling. You define functions with typed parameters:

functions = [{
    "name": "get_weather",
    "description": "Get the current weather in a location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string", "description": "City and state"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
        },
        "required": ["location"]
    }
}]

The model outputs structured calls matching this schema. Your code validates and executes them. This is more reliable than parsing free-form text.
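Validation is worth spelling out, because models do occasionally emit calls that violate their own schema. A sketch of a minimal checker against the get_weather definition above (hand-rolled here for clarity; a real system might use a JSON Schema library instead):

```python
# The get_weather definition from above, trimmed to the parts we check.
GET_WEATHER = {
    "name": "get_weather",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["location"],
    },
}

def validate_call(args, schema):
    """Reject a model-proposed tool call that is missing required
    parameters, invents unknown ones, or violates an enum constraint."""
    params = schema["parameters"]
    for key in params.get("required", []):
        if key not in args:
            raise ValueError(f"missing required parameter: {key}")
    for key, value in args.items():
        spec = params["properties"].get(key)
        if spec is None:
            raise ValueError(f"unexpected parameter: {key}")
        if "enum" in spec and value not in spec["enum"]:
            raise ValueError(f"{key} must be one of {spec['enum']}")
    return True
```

Only calls that pass this check reach the actual tool; anything else is returned to the model as an error so it can retry.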

Common tool categories

Information retrieval:

  • Web search
  • Database queries
  • Document retrieval (RAG)
  • API calls to external services

Computation:

  • Calculator
  • Code execution (Python, JavaScript)
  • Spreadsheet operations

Actions:

  • File creation/modification
  • Email sending
  • Calendar management
  • System commands

Specialized:

  • Image generation
  • Data visualization
  • Translation services

The power of tool use is extensibility. New tools can be added without retraining the model.
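One way to make that extensibility concrete is a simple registry: a new tool is just a function plus a definition, and the model only ever sees the definition. A hypothetical sketch:

```python
TOOL_REGISTRY = {}

def tool(fn):
    """Register a function as an available tool. Its name and docstring
    can be turned into a definition and added to the prompt; the model
    itself is never retrained."""
    TOOL_REGISTRY[fn.__name__] = fn
    return fn

@tool
def get_weather(location, unit="celsius"):
    # Hypothetical stub; a real tool would query a weather API.
    return f"22 degrees {unit} in {location}"
```

Adding a database tool tomorrow means writing one more decorated function; nothing about the model changes.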

Trust and verification

Tools give LLMs real power. This requires caution.

If a model can execute code, it can potentially damage systems. If it can send emails, it can spam. If it can access databases, it can leak data.

Tool use requires careful permission design. What tools are available? What are their scope limits? What requires human approval? These decisions shape the safety of the system.
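One common pattern for this permission design is tagging each tool with a risk level and gating anything beyond read-only behind human approval. A hypothetical sketch (tool names and levels are illustrative, not from any real system):

```python
# Hypothetical policy: "read" tools run freely, "act" tools need approval.
# Unknown tools default to requiring approval (fail closed).
POLICY = {
    "web_search": "read",
    "calculator": "read",
    "send_email": "act",
    "run_shell": "act",
}

def needs_approval(tool_name):
    """Return True if this tool call should pause for a human decision."""
    return POLICY.get(tool_name, "act") != "read"
```

Failing closed matters: a tool the policy has never heard of should never run unattended by default.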

The model might also use tools incorrectly: wrong parameters, unnecessary calls, misinterpreted results. Tool outputs need validation just like any model output.
