The `implement` operation is a core feature of the Perpetual tool, designed to automate code implementation based on user-provided instructions. This operation leverages Large Language Models (LLMs) to analyze your project, understand the context, and generate or modify code according to your specifications.

The `implement` operation works by identifying and processing sections of your code marked with `###IMPLEMENT###` comments. These comments indicate where new code should be generated or existing code should be modified. The operation follows a multi-stage process to ensure accurate and contextually appropriate code implementation:
- **Project Analysis**: The operation begins by analyzing your project structure and content. It uses the project index and annotations generated by the `annotate` operation to understand the overall context of your codebase.
- **Target File Identification**: Files containing `###IMPLEMENT###` comments are identified as targets for code implementation.
- **Context Gathering**: The operation collects relevant information from the target files and related project files to provide comprehensive context to the LLM.
- **Code Generation**: Using the gathered context and the instructions provided in the `###IMPLEMENT###` comments, the LLM generates or modifies code for each target file. It can also modify related files or even create new files if needed.
- **Integration**: The generated code is integrated into your project, replacing the `###IMPLEMENT###` comments and/or modifying other existing code as specified.
Throughout this process, the `implement` operation relies heavily on the project index and file annotations to make informed decisions about code implementation. This ensures that the generated code is consistent with your project's structure, coding style, and existing functionality.
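For example, an implementation point in a source file might look like this (a hypothetical Go snippet; the package name, comment text, and requirements are illustrative only):

```go
package validation

//###IMPLEMENT###
//Create a function to check user input:
//accept a string, trim surrounding whitespace, and return an error
//if the result is empty or longer than 256 characters.
```

When the operation runs, this comment block is replaced with generated code that follows the instructions and matches the surrounding project's style.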
To effectively use the `implement` operation, follow the typical workflow below; a minimal command-line sketch of the whole cycle follows the list.
1. **Project Setup**:
   - Create the basic structure of your project, including main files and directories.
   - Initialize your project for use with the Perpetual tool by running the `init` operation.
   - Create a local `.env` configuration file at `<project_root>/.perpetual/.env` and/or a global configuration file at `~/.config/Perpetual/.env` on Linux or `<User profile dir>\AppData\Roaming\Perpetual\.env` on Windows. Settings from the local project configuration file take precedence over global configuration settings.
2. **Marking Implementation Points**:
   - In your source files, use `###IMPLEMENT###` comments to indicate where you want code to be generated or modified.
   - Example:

     ```
     //###IMPLEMENT###
     //Create a function to check user input
     ```
3. **Running the Implement Operation**:
   - Execute the `implement` operation using the command: `Perpetual implement [flags]`
   - The operation will process all files with `###IMPLEMENT###` comments.
4. **Reviewing and Iterating**:
   - Review the generated code for accuracy and consistency.
   - If necessary, use the `stash` operation to revert changes: `Perpetual stash -r`
   - Modify your `###IMPLEMENT###` comments to provide more specific instructions if needed.
   - Re-run the `implement` operation to generate new code based on updated instructions.
5. **Finalizing**:
   - Once satisfied with the generated code, commit the changes to your version control system.
   - Repeat from step 2 for further implementations.
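A minimal session following this workflow might look like the sketch below (Git is assumed as the version control system; the commit message and marked instruction are placeholders, and the `init` operation's flags are described in its own documentation):

```sh
# Step 1: initialize the project for Perpetual (see the init operation docs for its flags)
Perpetual init

# Step 2: add an ###IMPLEMENT### comment with instructions to a source file

# Step 3: run the implement operation
Perpetual implement

# Step 4: review the generated changes; revert them if they are not acceptable
Perpetual stash -r

# Step 5: once satisfied, commit the result
git add -A && git commit -m "Implement user input validation"
```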
The `implement` operation recognizes two special comments in your source files:

- `###IMPLEMENT###`: Marks sections for code implementation. You can provide detailed instructions after this comment.
- `###NOUPLOAD###`: Place this comment at the top of files containing sensitive or unneeded information. Files with this comment will not be sent to the LLM for processing during the `implement` operation. The file will still be exposed to the LLM by the `annotate` operation (always) or the `doc` operation (if using the `-f` flag).

It is important to note that while the `###NOUPLOAD###` comment prevents the full file content from being sent to the LLM during the `implement` operation, it does not provide complete protection against data exposure. The file will still be processed during the `annotate` operation, which may use a local LLM for generating annotations. This annotation process is necessary to create the project index, which helps the LLM understand the project structure and write new code in context. While the annotation may leak some contextual information about the file, this can be mitigated with special summarization instructions (see the `annotate` operation documentation for more details). Users should be aware of these limitations and take appropriate precautions when dealing with sensitive information. The comment was originally intended to reduce clogging of the LLM context with unrelated code. To ensure a file is never processed by the LLM, use `project.json` (see below).
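For example, a file can be excluded from direct upload by placing the comment at the very top (a hypothetical Go file; the names and contents are illustrative only):

```go
//###NOUPLOAD###

// Package secrets holds credentials used only at deployment time.
// The marker above keeps this file's full contents out of the
// implement operation's LLM requests; the file will still be
// summarized by the annotate operation.
package secrets

const apiToken = "<redacted>"
```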
Perpetual provides detailed logging of LLM interactions in the `<project_root>/.perpetual/.message_log.txt` file. This file contains an unformatted log of the actual messages exchanged between Perpetual and the LLM. The log provides a complete record of the communication, including any repeated messages, and can be useful if you need to understand the exact content of the messages sent to the LLM.
To run the `implement` operation, use the following command:

```sh
Perpetual implement [flags]
```
Supported flags:

- `-h`: Display help information about the `implement` operation.
- `-n`: No-annotate mode. Skip re-annotating changed files and use current annotations if any.
- `-p`: Enable the extended planning stage. Useful for larger modifications that may create new files. Disabled by default to save tokens.
- `-pr`: Enable planning with additional reasoning. May produce improved results for complex or abstractly described tasks, but can also lead to flawed reasoning and worsen the final outcome. This flag includes the `-p` flag.
- `-r <file>`: Manually request a specific file for the operation. If not specified, files are selected automatically.
- `-s`: Try to salvage incorrect filenames in Stage 1. Experimental; use in projects with a large number of files where the LLM tends to make more mistakes when generating lists of files to analyze.
- `-u`: Do not exclude unit-test source files from processing.
- `-x <file>`: Path to a user-supplied regex filter file for filtering out certain files from processing (use `project.json` inside the `<project_root>/.perpetual` directory as a reference).
- `-z`: When using the `-p` or `-pr` flags, do not enforce adding the initial source files to the file lists produced by planning.
- `-v`: Enable debug logging for more detailed output.
- `-vv`: Enable both debug and trace logging for maximum verbosity.
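For example (flag combinations are illustrative; the file path is a placeholder):

```sh
# Process all files marked with ###IMPLEMENT### comments, with extended planning
Perpetual implement -p

# Explicitly request a single file and include unit-test sources in processing
Perpetual implement -r internal/validation/input.go -u

# Planning with additional reasoning, plus debug logging
Perpetual implement -pr -v
```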
The `implement` operation can be fine-tuned using environment variables in the `.env` file. These variables allow you to customize the behavior of the LLM used for code implementation. Key configuration options include:
- **LLM Provider**:
  - `LLM_PROVIDER_OP_IMPLEMENT_STAGE1`, `LLM_PROVIDER_OP_IMPLEMENT_STAGE2`, `LLM_PROVIDER_OP_IMPLEMENT_STAGE3`, `LLM_PROVIDER_OP_IMPLEMENT_STAGE4`: Specify the LLM provider for each stage of the `implement` operation.
- **Model Selection**:
  - `ANTHROPIC_MODEL_OP_IMPLEMENT_STAGE1`, `ANTHROPIC_MODEL_OP_IMPLEMENT_STAGE2`, `ANTHROPIC_MODEL_OP_IMPLEMENT_STAGE3`: Anthropic models for each stage.
  - Similar variables exist for the OpenAI and Ollama providers (e.g., `OPENAI_MODEL_OP_IMPLEMENT_STAGE1`, `OLLAMA_MODEL_OP_IMPLEMENT_STAGE1`, etc.).
- **Token Limits**:
  - `ANTHROPIC_MAX_TOKENS_OP_IMPLEMENT_STAGE1`, `ANTHROPIC_MAX_TOKENS_OP_IMPLEMENT_STAGE2`, `ANTHROPIC_MAX_TOKENS_OP_IMPLEMENT_STAGE3`: Set maximum tokens for each stage.
  - Similar variables exist for the OpenAI and Ollama providers.
- **JSON Structured Output Mode**: JSON structured output mode is supported for Stages 1 and 3 with the OpenAI, Anthropic, and Ollama providers. Enabling it can provide faster responses and slightly lower costs, but note that not all models support or work reliably with JSON-structured output. To enable it, add the following options to your `.env` file:

  ```sh
  ANTHROPIC_FORMAT_OP_IMPLEMENT_STAGE1="json"
  ANTHROPIC_FORMAT_OP_IMPLEMENT_STAGE3="json"
  OPENAI_FORMAT_OP_IMPLEMENT_STAGE1="json"
  OPENAI_FORMAT_OP_IMPLEMENT_STAGE3="json"
  OLLAMA_FORMAT_OP_IMPLEMENT_STAGE1="json"
  OLLAMA_FORMAT_OP_IMPLEMENT_STAGE3="json"
  ```
- **Retry Settings**:
  - `ANTHROPIC_ON_FAIL_RETRIES_OP_IMPLEMENT_STAGE1`, `ANTHROPIC_ON_FAIL_RETRIES_OP_IMPLEMENT_STAGE2`, `ANTHROPIC_ON_FAIL_RETRIES_OP_IMPLEMENT_STAGE3`: Specify retry attempts for each stage.
  - Similar variables exist for the OpenAI and Ollama providers.
- **Temperature**:
  - `ANTHROPIC_TEMPERATURE_OP_IMPLEMENT_STAGE1`, `ANTHROPIC_TEMPERATURE_OP_IMPLEMENT_STAGE2`, `ANTHROPIC_TEMPERATURE_OP_IMPLEMENT_STAGE3`: Set the temperature for each stage.
  - Similar variables exist for the OpenAI and Ollama providers.
- **Other LLM Parameters**:
  - `TOP_K`, `TOP_P`, `SEED`, `REPEAT_PENALTY`, `FREQ_PENALTY`, `PRESENCE_PENALTY`: Can be set for each stage by appending `_OP_IMPLEMENT_STAGE1`, `_OP_IMPLEMENT_STAGE2`, or `_OP_IMPLEMENT_STAGE3` (see the sketch after this list).
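A sketch of how these per-stage parameter names compose is shown below. It assumes the same provider-prefixed naming pattern as the other variables above; the values are arbitrary, so verify the exact variable names against the `.env.example` file:

```sh
# Hypothetical per-stage sampling overrides for the Ollama provider
OLLAMA_TOP_K_OP_IMPLEMENT_STAGE1="40"
OLLAMA_TOP_P_OP_IMPLEMENT_STAGE1="0.9"
OLLAMA_REPEAT_PENALTY_OP_IMPLEMENT_STAGE2="1.1"
OLLAMA_SEED_OP_IMPLEMENT_STAGE3="42"
```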
Example configuration in the `.env` file:

```sh
LLM_PROVIDER="anthropic"
ANTHROPIC_MODEL_OP_IMPLEMENT_STAGE1="claude-3-haiku-20240307"
ANTHROPIC_MODEL_OP_IMPLEMENT_STAGE2="claude-3-5-sonnet-20240620"
ANTHROPIC_MODEL_OP_IMPLEMENT_STAGE3="claude-3-5-sonnet-20240620"
ANTHROPIC_MAX_TOKENS_OP_IMPLEMENT_STAGE1="4096"
ANTHROPIC_MAX_TOKENS_OP_IMPLEMENT_STAGE2="4096"
ANTHROPIC_MAX_TOKENS_OP_IMPLEMENT_STAGE3="4096"
ANTHROPIC_TEMPERATURE_OP_IMPLEMENT_STAGE1="0.5"
ANTHROPIC_TEMPERATURE_OP_IMPLEMENT_STAGE2="0.5"
ANTHROPIC_TEMPERATURE_OP_IMPLEMENT_STAGE3="0.5"
ANTHROPIC_ON_FAIL_RETRIES_OP_IMPLEMENT_STAGE1="3"
ANTHROPIC_ON_FAIL_RETRIES_OP_IMPLEMENT_STAGE2="3"
ANTHROPIC_ON_FAIL_RETRIES_OP_IMPLEMENT_STAGE3="3"
# Enable JSON-structured output mode
ANTHROPIC_FORMAT_OP_IMPLEMENT_STAGE1="json"
ANTHROPIC_FORMAT_OP_IMPLEMENT_STAGE3="json"
```
This configuration uses the Anthropic provider, with the Claude 3 Haiku model for Stage 1 (which collects context for Stages 2 and 3, so the cheaper model lowers costs) and the Claude 3.5 Sonnet model for Stages 2 and 3. It sets a maximum of 4096 tokens, uses a temperature of 0.5, and allows up to 3 retries on failure for each stage. As of this writing, Claude 3.5 Sonnet is recommended for all stages, but in the future it may be better to use an even more powerful model for Stage 3 when one becomes available.
Customization of LLM prompts for the `implement` operation is handled through the `.perpetual/op_implement.json` configuration file. This file is populated by the `init` operation, which sets up default language-specific prompts tailored to your project's needs. You may want to change it if you run into problems, but normally you should not do so unless you are adapting the prompts for a programming language or project type not supported by Perpetual.
Special Options:

- `code_tags_rx`: Regular expressions to identify code blocks in responses.
- `filename_embed_rx`: Regular expression to embed the filename into a file implementation request.
- `filename_tags`: Tags used to denote filenames in messages.
- `filename_tags_rx`: Regular expressions to parse filename tags.
- `implement_comments_rx`: Regular expressions to detect `###IMPLEMENT###` comments.
- `noupload_comments_rx`: Regular expressions to detect `###NOUPLOAD###` comments.
Note: Users should not modify these special options unless encountering specific problems, as they are critical for the correct parsing and handling of LLM responses.
- `stage1_output_key`, `stage1_output_schema`, `stage1_output_schema_desc`, `stage1_output_schema_name`: Parameters used if JSON structured output mode is enabled for Stage 1 of the operation.
- `stage3_output_key`, `stage3_output_schema`, `stage3_output_schema_desc`, `stage3_output_schema_name`: Parameters used if JSON structured output mode is enabled for Stage 3 of the operation.
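To illustrate the shape of these options, the snippet below shows hypothetical values for two of the regex settings. These are not the shipped defaults; if you ever need to adjust them, start from the file generated by the `init` operation:

```json
{
  "implement_comments_rx": ["^\\s*(//|#)\\s*###IMPLEMENT###.*$"],
  "noupload_comments_rx": ["^\\s*(//|#)\\s*###NOUPLOAD###.*$"]
}
```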
Global project configuration is handled through the `.perpetual/project.json` configuration file. It defines which source code files are targets for processing with Perpetual and which are not. You should update the paths and regexps used for project-file selection to fit your specific project requirements.
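As a rough illustration of the idea only (the field names here are hypothetical; use the `project.json` generated by the `init` operation as the authoritative reference for the real schema):

```json
{
  "whitelist_rx": ["(?i)^.*\\.go$"],
  "blacklist_rx": ["(?i)^vendor[\\\\/].*"]
}
```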
To get the most out of the `implement` operation, consider these best practices:
- **Clear Instructions**: Provide detailed and clear instructions in your `###IMPLEMENT###` comments. The more specific you are, the better the generated code will be (see the example after this list).
- **Incremental Implementation**: For complex features, break down the implementation into smaller, manageable tasks. This allows for easier review and iteration.
- **Regular Code Reviews**: Always review the generated code carefully. While the LLM is powerful, it may not always produce perfect code on the first try.
- **Version Control**: Use a version control system to track changes and easily revert if necessary. The `stash` operation can also help with this.
- **Consistent Coding Style**: Ensure your project has a consistent coding style. The LLM will attempt to match the style of existing code, so maintaining consistency helps produce better results.
- **Maintaining Good Project Architecture**: The better and easier to understand and maintain your architecture is, the better the results the LLM will provide. Use S.O.L.I.D. principles, split your code into smaller, more specialized units, and place each unit into a separate file.
- **Use Planning Flags**: For complex implementations that may require creating new files or making extensive changes, use the `-p` or `-pr` flags to enable more thorough planning.
- **Use of the `###NOUPLOAD###` Comment**: Use the `###NOUPLOAD###` comment in files containing sensitive information to prevent them from being directly processed by the LLM in the `implement` operation. Be aware of its limitations as described earlier in this document. For complete exclusion of files from processing, use the blacklist defined in `project.json`. The `###NOUPLOAD###` comment can also be used to prevent the LLM context from being clogged with unnecessary information. For example, you may want to avoid uploading implementations of some interfaces when the interface alone is sufficient to implement the task, or large files with text constants, to lower the risk of the LLM treating them as direct instructions. This is especially useful when using LLMs with smaller context windows.
- **Iterative Refinement**: If the initial implementation isn't satisfactory, refine your `###IMPLEMENT###` comments and re-run the operation. Each iteration can bring you closer to the desired result.
- **Fine-tune LLM Settings**: Experiment with different LLM settings in your `.env` file to find the configuration that works best for your project and coding style. See the `.env.example` file at `<project_root>/.perpetual/.env.example` for all config options.
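For example, a more detailed instruction might look like this (a hypothetical Go snippet; the function name and requirements are illustrative only):

```go
//###IMPLEMENT###
//Create a ValidateUserInput function:
//- accept a string and return (string, error)
//- trim leading and trailing whitespace
//- return an error if the trimmed string is empty or exceeds 256 characters
//- keep error messages short and suitable for unit tests
```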
The `implement` operation is divided into four main stages.
Stage 1 is responsible for analyzing the project and gathering context for the implementation. It performs the following tasks:

- Run the `annotate` operation to update annotations for changed project source-code files.
- Generate a project index containing file names and their annotations.
- Create a request for the files with `###IMPLEMENT###` comments, querying the LLM to identify which project source-code files are relevant to the implementation according to the project index.
- Return the list of files to review.
Stage 2 plans the implementation based on the context gathered in Stage 1. It includes:

- **Gather Source Code**: Collects source code from the relevant project files requested by the LLM in Stage 1.
- **Generate Reasoning**: If the planning mode includes reasoning (the `-pr` flag), it requests the LLM to generate a detailed work plan outlining the steps needed to implement the requested changes. This helps organize the implementation tasks and ensures comprehensive coverage of the requirements.
Stage 3 determines which files will be affected, based on the context gathered in Stages 1 and 2. It includes:

- Querying the LLM to determine which files will be modified or created as a result of implementing the code.
- Processing the LLM's response to extract the list of files to modify or create. This involves parsing the LLM's output to identify the relevant filenames and ensuring they align with the project's file structure and naming conventions.
Stage 4, the final stage, generates the actual code based on the planning from Stages 2 and 3. It includes:

- **Gather Source Code and Work Plan**: Uses the source code from the relevant project files and, if available, the work plan from Stage 2 as further instructions.
- **Iteratively Process Each File**: For each file that needs modification or creation:
  - Query the LLM to produce the implemented code.
  - Handle partial responses and continue generation if token limits are reached.
  - Parse and store the generated code for each file.
- **Integrate Generated Code**: Saves the generated code into the appropriate files, replacing the `###IMPLEMENT###` comments and ensuring that the code integrates with the existing codebase.
The `implement` operation includes robust error handling and retry mechanisms to ensure reliable code generation:

- **LLM Query Failures**: If an LLM query fails, the operation will retry up to the number of times specified in the `<PROVIDER>_ON_FAIL_RETRIES_OP_IMPLEMENT_STAGE<NUMBER>` environment variables.
- **Token Limit Handling**: If the LLM response reaches the token limit, the operation attempts to continue generating code from where it left off. This is particularly useful for large files or complex implementations, but the result heavily depends on the LLM's ability to follow the instructions. Currently (as of September 2024), it works best with the Anthropic Claude 3.5 Sonnet model and may also work well with the GPT-4 or GPT-4o models.
- **Invalid Responses**: The operation checks for properly formatted code blocks in the LLM responses (by default, it tries to detect Markdown-formatted code blocks, but you may customize this for other formats). If no valid code is found, it will retry the query.
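For example, retry counts can be raised for a less reliable provider or network connection (the values are illustrative; as noted earlier, similar variables exist for the OpenAI and Ollama providers):

```sh
OPENAI_ON_FAIL_RETRIES_OP_IMPLEMENT_STAGE1="5"
OPENAI_ON_FAIL_RETRIES_OP_IMPLEMENT_STAGE2="5"
OPENAI_ON_FAIL_RETRIES_OP_IMPLEMENT_STAGE3="5"
```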
The `implement` operation can consume significant time (when using a locally running LLM) or incur costs when using commercial LLM providers, especially for large projects or complex tasks. Consider the following to optimize performance:
- **Use Appropriate Models**: Choose LLM models and providers that balance capability and speed. For example, use a smaller model for Stages 1 and 3 and more powerful models for Stages 2 and 4. You may also try using small local models with Ollama for the `annotate` operation to save on the costs associated with automatically re-annotating changed files. A sketch of such a setup follows this list.
- **Do Not Use the `-p` or `-pr` Flags Unless Needed**: You can save significantly on LLM API calls, tokens, and costs by omitting these flags if you believe the implementation won't produce any new files or cause changes in other files not marked with `###IMPLEMENT###` comments.
- **Incremental Implementation**: For large projects, implement changes in smaller, manageable chunks rather than attempting to modify the entire codebase at once.
- **Use the `-u` Flag**: If your project contains unit-test source files that are relevant to the implementation task, use the `-u` flag to include them in processing (this disables the filter for such files). This provides additional context for the LLM and allows it to see and modify unit tests. However, be aware that including tests increases the amount of code the LLM needs to analyze, which may increase costs.
- **Custom File Filtering**: For more fine-grained control over which files are processed, use the `-x` flag with a custom regex filter file. This allows you to exclude specific files or file types that are not relevant to your current implementation task, reducing your costs.
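The following sketch shows one way such a cost-conscious setup might look, using only the per-stage model variables shown earlier in this document (check `.env.example` for the full list of variables, including any Stage 4 settings, and treat the model split as a starting point rather than a recommendation):

```sh
# Cheaper model for the context-gathering and file-selection stages
ANTHROPIC_MODEL_OP_IMPLEMENT_STAGE1="claude-3-haiku-20240307"
ANTHROPIC_MODEL_OP_IMPLEMENT_STAGE3="claude-3-haiku-20240307"
# More capable model for planning
ANTHROPIC_MODEL_OP_IMPLEMENT_STAGE2="claude-3-5-sonnet-20240620"
```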