Calling Out to LLMs

Codegen integrates natively with large language models (LLMs) through the codebase.ai(…) method, which lets you use them to generate, modify, and analyze code.

Configuration

Before using AI capabilities, you need to provide an OpenAI API key via codebase.set_ai_key(…):

# Set your OpenAI API key
codebase.set_ai_key("your-openai-api-key")
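Hard-coding the key in a codemod makes it easy to commit by accident. The following is a minimal sketch of reading it from the environment instead. The variable name OPENAI_API_KEY and the load_api_key helper are assumptions made for illustration; neither is part of Codegen.

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the key stored in env_var, or raise if it is missing or blank.

    Hypothetical helper: the environment variable name is an assumption.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


# Usage (assumes `codebase` is an existing Codebase object):
# codebase.set_ai_key(load_api_key())
```

Failing loudly when the variable is unset avoids sending requests with an empty key.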

Calling Codebase.ai(…)

The Codebase.ai(…) method takes three key arguments:

result = codebase.ai(
    prompt="Your instruction to the AI",
    target=symbol_to_modify,  # Optional: The code being operated on
    context=additional_info   # Optional: Extra context from static analysis
)

prompt: Clear instruction for what you want the AI to do
target: The symbol (function, class, etc.) being operated on - its source code will be provided to the AI
context: Additional information you want to provide to the AI, which you can gather using GraphSitter’s analysis tools

Codegen does not pass any context to the LLM by default. The model does not “understand” your codebase; it sees only the prompt, the target, and the context you provide.

The context parameter can include:

A single symbol (its source code will be provided)
A list of related symbols
A dictionary mapping descriptions to symbols/values
Nested combinations of the above
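To picture how a nested context might reach the model as text, here is a rough sketch that flattens symbols, lists, and dictionaries into indented lines. This is not Codegen's actual serialization, which the text above does not describe. FakeSymbol, its source attribute, and render_context are hypothetical stand-ins used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class FakeSymbol:
    """Hypothetical stand-in for a symbol; only the fields used below."""
    name: str
    source: str


def render_context(value, indent: int = 0) -> str:
    """Flatten a symbol, list, dict, or nested mix into indented plain text."""
    pad = "  " * indent
    if isinstance(value, dict):
        lines = []
        for label, item in value.items():
            lines.append(f"{pad}{label}:")          # description as a heading
            lines.append(render_context(item, indent + 1))
        return "\n".join(lines)
    if isinstance(value, (list, tuple)):
        return "\n".join(render_context(item, indent) for item in value)
    # Symbols contribute their source code; any other value is stringified.
    text = getattr(value, "source", None)
    if text is None:
        text = str(value)
    return "\n".join(pad + line for line in text.splitlines())


context = {
    "style": "Google docstring format",
    "helpers": [FakeSymbol("a", "def a():\n    pass")],
}
print(render_context(context))
```

Whatever the real format is, the practical point is the same: every symbol you add contributes its full source, so large contexts grow quickly.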

How Context Works

The AI doesn’t automatically know about your codebase. Instead, you provide relevant context in two steps:

Using GraphSitter’s static analysis to gather information:

function = codebase.get_function("process_data")
context = {
    "call_sites": function.call_sites,      # Where the function is called
    "dependencies": function.dependencies,   # What the function depends on
    "parent": function.parent,              # Class/module containing the function
    "docstring": function.docstring,        # Existing documentation
}

Passing this information to the AI:

result = codebase.ai(
    "Improve this function's implementation",
    target=function,
    context=context  # AI will see the gathered information
)

Common Use Cases

Code Generation

Generate new code or refactor existing code:

# Break up a large function
function = codebase.get_function("large_function")
new_code = codebase.ai(
    "Break this function into smaller, more focused functions",
    target=function
)
function.edit(new_code)

# Generate a test
my_function = codebase.get_function("my_function")
test_code = codebase.ai(
    f"Write a test for the function {my_function.name}",
    target=my_function
)
my_function.insert_after(test_code)
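LLM replies sometimes wrap code in a Markdown fence, and inserting that into a source file would break it. Whether codebase.ai(…) already removes such fences is not stated here, so the following sketch assumes it may not. The strip_code_fences helper is hypothetical and not part of Codegen.

```python
FENCE = "`" * 3  # three backticks, i.e. the Markdown code-fence marker


def strip_code_fences(text: str) -> str:
    """Return the body of a single fenced block, or the text unchanged."""
    lines = text.strip().splitlines()
    is_fenced = (
        len(lines) >= 2
        and lines[0].startswith(FENCE)      # opening fence, optional language tag
        and lines[-1].strip() == FENCE      # closing fence on its own line
    )
    if is_fenced:
        return "\n".join(lines[1:-1])
    return text


# Usage (assumes `function` and `new_code` from the example above):
# function.edit(strip_code_fences(new_code))
```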

Documentation

Generate and format documentation:

# Generate docstrings for a class
class_def = codebase.get_class("MyClass")
for method in class_def.methods:
    docstring = codebase.ai(
        "Generate a docstring describing this method",
        target=method,
        context={
            "class": class_def,
            "style": "Google docstring format"
        }
    )
    method.set_docstring(docstring)

Code Analysis and Improvement

Use AI to analyze and improve code:

# Improve function names
for function in codebase.functions:
    # Tolerate surrounding whitespace and trailing punctuation such as "No."
    if codebase.ai(
        "Does this function name clearly describe its purpose? Answer yes/no",
        target=function
    ).strip().lower().startswith("no"):
        new_name = codebase.ai(
            "Suggest a better name for this function",
            target=function,
            context={"call_sites": function.call_sites}
        )
        function.rename(new_name)
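A yes/no prompt does not guarantee a bare "yes" or "no" in the reply; answers such as "No." or "Yes, it does" are common. The sketch below shows one way to interpret such replies defensively. parse_yes_no is a hypothetical helper, not a Codegen function.

```python
import re
from typing import Optional


def parse_yes_no(answer: str) -> Optional[bool]:
    """Map a free-form reply to True (yes), False (no), or None (unclear)."""
    # Skip leading punctuation or whitespace, then require a whole word.
    match = re.match(r"\W*(yes|no)\b", answer, flags=re.IGNORECASE)
    if match is None:
        return None
    return match.group(1).lower() == "yes"


# Usage (assumes `codebase` and `function` from the example above):
# verdict = parse_yes_no(codebase.ai("... Answer yes/no", target=function))
# if verdict is False:
#     ...  # ask for a better name
```

Returning None for unclear replies lets the codemod skip a function rather than rename it on a misread answer.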

Contextual Modifications

Make changes informed by the context you supply:

# Refactor a class method
method = codebase.get_class("MyClass").get_method("target_method")
new_impl = codebase.ai(
    "Refactor this method to be more efficient",
    target=method,
    context={
        "parent_class": method.parent,
        "call_sites": method.call_sites,
        "dependencies": method.dependencies
    }
)
method.edit(new_impl)

Best Practices

Provide Relevant Context

# Good: Providing specific, relevant context
summary = codebase.ai(
    "Generate a summary of this method's purpose",
    target=method,
    context={
        "class": method.parent,              # Class containing the method
        "usages": list(method.usages),       # How the method is used
        "dependencies": method.dependencies,  # What the method depends on
        "style": "concise"
    }
)

# Bad: Missing context that could help the AI
summary = codebase.ai(
    "Generate a summary",
    target=method  # AI only sees the method's code
)

Gather Comprehensive Context

# Gather relevant information before AI call
def get_method_context(method):
    return {
        "class": method.parent,
        "call_sites": list(method.call_sites),
        "dependencies": list(method.dependencies),
        "related_methods": [m for m in method.parent.methods
                          if m.name != method.name]
    }

# Use gathered context in AI call
new_impl = codebase.ai(
    "Refactor this method to be more efficient",
    target=method,
    context=get_method_context(method)
)

Handle AI Limits

# Set custom AI request limits for large operations
codebase.set_session_options(max_ai_requests=200)
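Loops over many symbols can use up the request limit partway through. One option is to track usage yourself and stop cleanly before the limit is reached. The AIRequestBudget class below is a hypothetical sketch, not part of Codegen; its default of 150 mirrors the default limit listed under Limitations and Safety.

```python
class AIRequestBudget:
    """Count AI requests locally and refuse to exceed a fixed limit."""

    def __init__(self, limit: int = 150):
        self.limit = limit
        self.used = 0

    @property
    def remaining(self) -> int:
        return self.limit - self.used

    def spend(self, count: int = 1) -> bool:
        """Reserve `count` requests; return False and reserve nothing if over budget."""
        if count > self.remaining:
            return False
        self.used += count
        return True


# Usage (assumes `codebase` exists and the session limit was set to 200):
# budget = AIRequestBudget(limit=200)
# for function in codebase.functions:
#     if not budget.spend():
#         break  # stop before the session limit is hit
#     codebase.ai("Summarize this function", target=function)
```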

Review Generated Code

# Generate and review before applying
new_code = codebase.ai(
    "Optimize this function",
    target=function
)
print("Review generated code:")
print(new_code)
if input("Apply changes? (y/n): ").lower() == 'y':
    function.edit(new_code)
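Reviewing a diff is usually easier than reading the whole generated function. A minimal sketch using Python's standard difflib module follows. The render_diff helper is hypothetical, and the usage line assumes the symbol exposes its current code as function.source.

```python
import difflib


def render_diff(old_source: str, new_source: str, name: str = "function") -> str:
    """Return a unified diff between the current and the generated source."""
    diff = difflib.unified_diff(
        old_source.splitlines(keepends=True),
        new_source.splitlines(keepends=True),
        fromfile=f"{name} (current)",
        tofile=f"{name} (generated)",
    )
    return "".join(diff)


# Usage (assumes `function` and `new_code` from the example above):
# print(render_diff(function.source, new_code, function.name))
```

An empty result means the generated code is identical to the current code, so the edit can be skipped.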

Limitations and Safety

The AI doesn’t automatically know about your codebase - you must provide relevant context
AI-generated code should always be reviewed
AI requests are limited to 150 per codemod execution by default
- Use set_session_options(…) to adjust the limit:
```
codebase.set_session_options(max_ai_requests=200)
```

You can also use codebase.set_session_options(…) to raise the execution time limit and the number of transactions allowed in a session, which helps with larger or more complex codemods. Adjust the max_seconds and max_transactions parameters to suit your needs:

codebase.set_session_options(max_seconds=300, max_transactions=500)
