code2logic

code2flow

A Python code flow analysis tool that extracts control flow graphs (CFG), data flow graphs (DFG), and call graphs through static analysis.
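To make "call graph extraction" concrete, here is a minimal sketch built on Python's standard `ast` module. It is an illustration of the general idea only, not code2flow's implementation: the `call_graph` helper and the sample source are hypothetical, and the sketch only records calls to plain names (method calls such as `obj.read()` are ignored).

```python
import ast
from typing import Dict, Set

# Hypothetical sample source, used only for this illustration.
SAMPLE = '''
def load(path):
    return open(path).read()

def main():
    data = load("input.txt")
    print(len(data))
'''


def call_graph(source: str) -> Dict[str, Set[str]]:
    """Map each function name to the plain names it calls."""
    graph: Dict[str, Set[str]] = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            callees = graph.setdefault(node.name, set())
            # Walk the function body and keep calls whose target is a bare name.
            for inner in ast.walk(node):
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    callees.add(inner.func.id)
    return graph


for name, callees in sorted(call_graph(SAMPLE).items()):
    print(name, "->", sorted(callees))
# load -> ['open']
# main -> ['len', 'load', 'print']
```

A real analyzer also has to resolve methods, imports, and nested scopes, which this sketch deliberately leaves out.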

Performance Optimization

For large projects (>1000 functions), use Fast Mode:

# Ultra-fast analysis (5-10x faster)
code2flow /path/to/project --fast

# Custom performance settings
code2flow /path/to/project \
    --parallel-workers 8 \
    --max-depth 3 \
    --skip-data-flow \
    --cache-dir ./.cache

Performance Tips

Technique              Speedup   Use Case
--fast mode            5-10x     Initial exploration
Parallel workers       2-4x      Multi-core machines
Caching                3-5x      Repeated analysis
Depth limiting         2-3x      Large codebases
Skip private methods   1.5-2x    Public API analysis

Benchmarks

Project Size   Functions   Time (fast)   Time (full)
Small (<100)   ~50         0.5s          2s
Medium (1K)    ~500        3s            15s
Large (10K)    ~2000       15s           120s
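The speedup implied by each benchmark row is the full-mode time divided by the fast-mode time. The short sketch below works that out from the timings in the table; the `speedup` helper is a hypothetical name used only here.

```python
# Timings copied from the Benchmarks table: (label, fast seconds, full seconds).
BENCHMARKS = [
    ("Small", 0.5, 2.0),
    ("Medium", 3.0, 15.0),
    ("Large", 15.0, 120.0),
]


def speedup(fast_seconds: float, full_seconds: float) -> float:
    """Return how many times faster the fast run is than the full run."""
    return full_seconds / fast_seconds


for label, fast, full in BENCHMARKS:
    print(f"{label}: {speedup(fast, full):.1f}x")
# Small: 4.0x
# Medium: 5.0x
# Large: 8.0x
```

By these numbers the gain from fast mode grows with project size, from about 4x on the small project to 8x on the large one.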

Features

Installation

# Install from source
pip install -e .

# Or with development dependencies
pip install -e ".[dev]"

Quick Start

# Analyze a Python project
code2flow /path/to/project

# With verbose output
code2flow /path/to/project -v

# Specify output directory and formats
code2flow /path/to/project -o ./analysis --format yaml,json,mermaid,png

# Use different analysis modes
code2flow /path/to/project -m static    # Fast static analysis only
code2flow /path/to/project -m hybrid     # Combined analysis (default)
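When the analysis is driven from a script, the same commands can be assembled in Python and passed to `subprocess`. The sketch below uses only the flags shown in this README (`-o`, `--format`, `-m`, `-v`, `--fast`); the helper names `build_command` and `run_analysis` are hypothetical, and it is an assumption that these flags can be freely combined in one invocation.

```python
import shlex
import subprocess
from typing import List, Optional, Sequence


def build_command(
    project: str,
    output_dir: Optional[str] = None,
    formats: Sequence[str] = (),
    mode: Optional[str] = None,
    verbose: bool = False,
    fast: bool = False,
) -> List[str]:
    """Assemble a code2flow command line as an argument list."""
    cmd = ["code2flow", project]
    if output_dir:
        cmd += ["-o", output_dir]
    if formats:
        # The README passes formats as one comma-separated value.
        cmd += ["--format", ",".join(formats)]
    if mode:
        cmd += ["-m", mode]
    if verbose:
        cmd.append("-v")
    if fast:
        cmd.append("--fast")
    return cmd


def run_analysis(cmd: List[str]) -> subprocess.CompletedProcess:
    """Run the command; requires code2flow to be installed and on PATH."""
    return subprocess.run(cmd, check=True)


cmd = build_command(
    "/path/to/project",
    output_dir="./analysis",
    formats=["yaml", "json"],
    mode="static",
)
print(shlex.join(cmd))
# code2flow /path/to/project -o ./analysis --format yaml,json -m static
```

Passing an argument list rather than a shell string avoids quoting problems when project paths contain spaces.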

Usage

Basic Analysis

code2flow /path/to/project

Analysis Modes

# Static analysis only (fastest)
code2flow /path/to/project -m static

# Dynamic analysis with tracing
code2flow /path/to/project -m dynamic

# Hybrid analysis (recommended)
code2flow /path/to/project -m hybrid

# Behavioral pattern focus
code2flow /path/to/project -m behavioral

# Reverse engineering ready
code2flow /path/to/project -m reverse

Custom Output

code2flow /path/to/project -o my_analysis

Output Files

File               Description
analysis.yaml      Complete structured analysis data
analysis.json      JSON format for programmatic use
flow.mmd           Full Mermaid flowchart (all nodes)
compact_flow.mmd   Compact flowchart - deduplicated nodes, grouped by function
calls.mmd          Function call graph
cfg.png            Control flow visualization
call_graph.png     Call graph visualization
llm_prompt.md      LLM-ready analysis summary
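Since analysis.json is meant for programmatic use, a first step is usually to load it and see what it contains. The sketch below makes no assumption about the schema, which this README does not document: it only lists the top-level keys and their value types. The helper names `load_analysis` and `summarize` are hypothetical.

```python
import json
from pathlib import Path
from typing import Any, List


def load_analysis(output_dir: str) -> Any:
    """Load analysis.json from a code2flow output directory."""
    path = Path(output_dir) / "analysis.json"
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


def summarize(data: Any) -> List[str]:
    """Describe the top-level structure without assuming a schema."""
    if isinstance(data, dict):
        return [f"{key}: {type(value).__name__}" for key, value in data.items()]
    return [f"<root>: {type(data).__name__}"]


# Example usage, assuming the analysis was written to ./analysis:
# for line in summarize(load_analysis("./analysis")):
#     print(line)
```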

Compact Flow Format

The compact_flow.mmd file is a smaller version of flow.mmd: nodes are deduplicated and grouped into one subgraph per function.

Example compact output:

flowchart TD
    %% Function subgraphs
    subgraph F12345["process_data"]
        N1["x = N"]  
        N2{"if x > 0"}
        N3[/"return x"/]
    end
    
    %% Edges reference deduplicated nodes
    N1 --> N2
    N2 -->|"true"| N3
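Because the compact format is plain text, its edges can be read back with a small parser. The following sketch handles only the two edge forms that appear in the example above (`A --> B` and `A -->|"label"| B`); it is not a full Mermaid parser, and `parse_edges` is a hypothetical helper name.

```python
import re
from typing import List, Optional, Tuple

# Matches `SRC --> DST` and `SRC -->|"label"| DST`, the forms used in the example.
EDGE_RE = re.compile(r'^\s*(\w+)\s*-->\s*(?:\|"?([^"|]*)"?\|\s*)?(\w+)\s*$')

SAMPLE = '''flowchart TD
    subgraph F12345["process_data"]
        N1["x = N"]
        N2{"if x > 0"}
        N3[/"return x"/]
    end
    N1 --> N2
    N2 -->|"true"| N3
'''


def parse_edges(mermaid_text: str) -> List[Tuple[str, str, Optional[str]]]:
    """Return (source, target, label) for each edge line; label may be None."""
    edges = []
    for line in mermaid_text.splitlines():
        match = EDGE_RE.match(line)
        if match:
            source, label, target = match.groups()
            edges.append((source, target, label))
    return edges


print(parse_edges(SAMPLE))
# [('N1', 'N2', None), ('N2', 'N3', 'true')]
```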

Understanding the Output

LLM Prompt Structure

The generated prompt includes:

Behavioral Patterns

Each pattern includes:

Reverse Engineering Guidelines

The analysis provides specific guidance for:

  1. Preserving call graph structure
  2. Implementing identified patterns
  3. Maintaining data dependencies
  4. Recreating state machines
  5. Preserving decision logic

Advanced Features

State Machine Detection

Automatically identifies:

Data Flow Tracking

Maps:

Dynamic Tracing

When using dynamic mode:

Integration with LLMs

The generated system_analysis_prompt.md is designed to be:

Example usage with an LLM:

"Based on the system analysis provided, implement this system in Go,
preserving all behavioral patterns and data flow characteristics."
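One way to put this together in a script is to append the instruction to the generated analysis text before sending it to a model. The sketch below only builds the prompt string; the helper names are hypothetical, the separator is an arbitrary choice, and the call to an actual LLM API is left out because it depends on the provider.

```python
from pathlib import Path

# Instruction text taken from the example above.
INSTRUCTION = (
    "Based on the system analysis provided, implement this system in Go, "
    "preserving all behavioral patterns and data flow characteristics."
)


def build_prompt(analysis_text: str, instruction: str = INSTRUCTION) -> str:
    """Place the analysis first, then a separator, then the instruction."""
    return f"{analysis_text.strip()}\n\n---\n\n{instruction}\n"


def build_prompt_from_file(prompt_path: str, instruction: str = INSTRUCTION) -> str:
    """Read the generated prompt file from disk and append the instruction."""
    analysis_text = Path(prompt_path).read_text(encoding="utf-8")
    return build_prompt(analysis_text, instruction)
```

Keeping the analysis ahead of the instruction lets the model read the full context before it sees the task.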

Limitations

Contributing

The analyzer is designed to be extensible. Key areas for enhancement:

License

Apache License 2.0 - see LICENSE for details.

Author

Created by Tom Sapletta - tom@sapletta.com