Edit Banana: The Complete Guide to Converting Static Charts into Editable Diagrams

Every researcher, analyst, and content creator has hit this problem: you find the perfect chart in a PDF, a research paper, or a screenshot, but it is a static image. You can't change the colors, update the data, or adjust the layout. It is a "pixel dead image" with no editable structure behind it. Edit Banana is designed to change that.
Built by BIT-DataLab (Beijing Institute of Technology), Edit Banana is an open-source framework that uses SAM 3 (Segment Anything Model 3) and multimodal large language models to convert static charts, diagrams, and figures into fully editable DrawIO (XML) and PowerPoint (PPTX) files — preserving layout logic, color matching, element hierarchy, and text accuracy.
With 3,800+ GitHub stars, 233 forks, and an active online demo, Edit Banana is at the forefront of the emerging field of chart derendering — the reverse engineering of visual data representations back into their editable components.
What Is Edit Banana?
Edit Banana is a Universal Content Re-Editor — its tagline says it all: "Make the Uneditable, Editable." The tool takes a static image (PNG/JPG) or PDF containing charts, flowcharts, diagrams, or statistical figures, and outputs an editable file where every element can be individually dragged, styled, and modified.
Key Stats
| Metric | Value |
|---|---|
| GitHub Stars | 3,800+ |
| Forks | 233 |
| Language | Python |
| License | AGPL-3.0 |
| Created | January 2026 |
| Last Updated | March 2026 |
| Contributors | 6 |
| Topics | ai, data, figure, llm, python, nanobanana |
| Organization | BIT-DataLab (Beijing Institute of Technology) |
| Online Demo | editbanana.anxin6.cn |
What Problem Does It Solve?
In academic research, business reporting, and content creation, charts and diagrams are routinely shared as images embedded in PDFs, slides, or papers. These static formats make it impossible to:
- Update data without recreating the entire chart
- Change styling (colors, fonts, line thickness)
- Extract underlying numerical data
- Adapt charts for different presentations or publications
- Modify individual elements like arrows, labels, or shapes
Edit Banana addresses these limitations by reconstructing the diagram from its pixel representation into a structured, editable format.
How It Works: The Architecture Pipeline
Edit Banana uses a five-stage pipeline:
Stage 1: Input Processing
The system accepts PNG/JPG images or PDF files containing charts, flowcharts, statistical figures, or diagrams.
Stage 2: Segmentation with SAM 3
Using a fine-tuned SAM 3 (Segment Anything Model 3) mask decoder, Edit Banana identifies and segments every visual element in the diagram:
- Shapes (rectangles, circles, diamonds)
- Lines and arrows
- Text regions
- Background areas
- Color fills and strokes
The segmentation preserves spatial relationships — where elements are positioned relative to each other, how they connect, and their hierarchical structure.
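One way to picture how hierarchy could be recovered from segmentation output is bounding-box containment: an element whose box lies entirely inside another is treated as its child. The sketch below is illustrative only and is not taken from the Edit Banana codebase; the function names and the `(x_min, y_min, x_max, y_max)` box format are assumptions.

```python
from typing import Dict, Optional, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def contains(outer: Box, inner: Box) -> bool:
    """Return True if `inner` lies entirely within `outer`."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def area(box: Box) -> float:
    """Area of a box, treating inverted boxes as empty."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def infer_parents(boxes: Dict[str, Box]) -> Dict[str, Optional[str]]:
    """Map each element id to the smallest strictly larger box that contains it.

    Hypothetical helper: elements with no containing box get None.
    """
    parents: Dict[str, Optional[str]] = {}
    for name, box in boxes.items():
        candidates = [
            other for other, other_box in boxes.items()
            if other != name
            and contains(other_box, box)
            and area(other_box) > area(box)
        ]
        # The tightest enclosing element is the most plausible direct parent.
        parents[name] = min(candidates, key=lambda k: area(boxes[k])) if candidates else None
    return parents


example = {
    "group": (0, 0, 100, 100),
    "node": (10, 10, 40, 40),
    "label": (15, 15, 30, 25),
    "other": (200, 0, 250, 50),
}
print(infer_parents(example))
```

Choosing the smallest enclosing box, rather than any enclosing box, keeps deeply nested elements attached to their direct container instead of the outermost group.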
Stage 3: Text Extraction (Parallel Processing)
A dual-engine text extraction pipeline runs in parallel:
- Local OCR (Tesseract): Detects text bounding boxes and extracts standard text content. Runs entirely offline with no API dependencies.
- Pix2Text: Specialized in mathematical formula recognition and LaTeX conversion, turning rendered formulas like ∫f(x)dx back into their LaTeX source code.
- Crop-Guided Strategy: Rather than passing the entire image to OCR, the system extracts individual text/formula regions as high-resolution crops, then sends these focused regions to the text engine for maximum accuracy.
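The crop-guided idea can be sketched in a few lines: pad each detected text box slightly, clamp it to the image bounds, and decide how much to enlarge the crop before recognition. This is a minimal sketch under stated assumptions, not the project's actual code; the function names, the 8-pixel padding, and the 48-pixel target height are all hypothetical.

```python
from typing import Tuple

# Assumed box format: (left, upper, right, lower) in pixels.
Box = Tuple[int, int, int, int]


def padded_crop_box(bbox: Box, image_size: Tuple[int, int], pad: int = 8) -> Box:
    """Grow a text box by `pad` pixels on each side, clamped to the image."""
    left, upper, right, lower = bbox
    width, height = image_size
    return (
        max(0, left - pad),
        max(0, upper - pad),
        min(width, right + pad),
        min(height, lower + pad),
    )


def upscale_factor(crop_box: Box, min_height: int = 48, max_factor: float = 4.0) -> float:
    """Scale needed so the crop is at least `min_height` pixels tall, capped."""
    crop_height = crop_box[3] - crop_box[1]
    if crop_height <= 0:
        raise ValueError("crop box has no height")
    return min(max_factor, max(1.0, min_height / crop_height))


box = padded_crop_box((5, 10, 60, 30), image_size=(64, 200))
print(box, upscale_factor(box))
```

The resulting tuple uses the same (left, upper, right, lower) order that Pillow's `Image.crop` expects, so a crop produced this way could be resized and handed to whichever OCR engine handles that region.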
Stage 4: Fixed Multi-Round VLM Scanning
A Multimodal LLM (Qwen-VL or GPT-4V) scans the diagram in multiple rounds to understand:
- Logical relationships between elements
- Flow direction (for flowcharts)
- Data relationships (for bar/line/pie charts)
- Element hierarchy and grouping
Stage 5: XML/PPTX Generation
The final stage merges spatial data from SAM 3 segmentation with text OCR results to generate:
- DrawIO XML: Directly importable into draw.io, the most popular free diagramming tool
- PPTX: PowerPoint files where each element is a separate, editable shape
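To make the DrawIO output concrete, here is a minimal sketch of writing shapes into the `mxGraphModel` structure that draw.io files use, built with Python's standard `xml.etree.ElementTree`. It is an illustration, not Edit Banana's generator: the `build_drawio` function and the shape dictionary keys (`text`, `x`, `y`, `width`, `height`, `fill`, `stroke`) are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def build_drawio(shapes: List[Dict]) -> str:
    """Serialize rectangles with labels as a draw.io XML document (hypothetical helper)."""
    mxfile = ET.Element("mxfile")
    diagram = ET.SubElement(mxfile, "diagram", name="Page-1")
    model = ET.SubElement(diagram, "mxGraphModel")
    root = ET.SubElement(model, "root")

    # draw.io expects two structural cells: the root (0) and the default layer (1).
    ET.SubElement(root, "mxCell", id="0")
    ET.SubElement(root, "mxCell", id="1", parent="0")

    for index, shape in enumerate(shapes, start=2):
        style = (
            "rounded=0;whiteSpace=wrap;html=1;"
            f"fillColor={shape['fill']};strokeColor={shape['stroke']};"
        )
        cell = ET.SubElement(
            root, "mxCell",
            id=str(index), value=shape["text"], style=style,
            vertex="1", parent="1",
        )
        # "as" is a Python keyword, so the attributes are passed as a dict.
        ET.SubElement(cell, "mxGeometry", {
            "x": str(shape["x"]),
            "y": str(shape["y"]),
            "width": str(shape["width"]),
            "height": str(shape["height"]),
            "as": "geometry",
        })

    return ET.tostring(mxfile, encoding="unicode")


xml_text = build_drawio([
    {"text": "Start", "x": 40, "y": 40, "width": 120, "height": 60,
     "fill": "#DAE8FC", "stroke": "#6C8EBF"},
])
print(xml_text)
```

Because each shape becomes its own `mxCell` with separate geometry and style, every element stays individually movable and restylable once the file is opened in draw.io.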
Key Features
🎯 High-Fidelity Reconstruction
The reconstruction preserves:
- Layout logic — exact spatial positioning of elements
- Color matching — extracted dominant colors applied to fills/strokes
- Element hierarchy — grouping, layering, and nesting
- Text accuracy — including mathematical formulae via LaTeX
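As a rough illustration of color matching, a dominant color can be estimated by grouping a region's pixels into coarse RGB buckets, taking the most populated bucket, and averaging its members. The sketch below uses only the standard library and is an assumption about how such a step might look, not the project's algorithm; `dominant_color` and the bucket size of 16 are hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0-255


def dominant_color(pixels: Iterable[Pixel], step: int = 16) -> str:
    """Return the average color of the most common coarse RGB bucket as #RRGGBB."""
    pixel_list = list(pixels)
    if not pixel_list:
        raise ValueError("no pixels given")

    def bucket(pixel: Pixel) -> Tuple[int, int, int]:
        # Coarse quantization so near-identical shades are counted together.
        return (pixel[0] // step, pixel[1] // step, pixel[2] // step)

    winner, _ = Counter(bucket(p) for p in pixel_list).most_common(1)[0]
    members = [p for p in pixel_list if bucket(p) == winner]

    # Average the winning bucket's members to recover a representative shade.
    red, green, blue = (
        round(sum(p[channel] for p in members) / len(members))
        for channel in range(3)
    )
    return f"#{red:02X}{green:02X}{blue:02X}"


region = [(250, 10, 10)] * 5 + [(252, 12, 8)] * 3 + [(0, 0, 255)] * 4
print(dominant_color(region))
```

Bucketing before counting makes the estimate tolerant of anti-aliasing and compression noise, which would otherwise split one visible fill color across many slightly different pixel values.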
📐 Individual Element Manipulation
Once converted, every element in the output can be:
- Dragged to new positions
- Restyled (colors, borders, shadows)
- Resized
- Deleted or duplicated
- Connected with new arrows/lines
🧮 Mathematical Formula Support
The Pix2Text integration means rendered equations are converted back to LaTeX source code — critical for academic papers where formula editing is common.
🖥️ Multiple Interfaces
- CLI: python main.py -i input/diagram.png
- FastAPI Server: server_pa.py for API-based access
- Web Demo: editbanana.anxin6.cn
👥 Multi-User Concurrency
The server supports concurrent users with:
- Global Lock mechanism for thread-safe GPU access
- LRU Cache to persist image embeddings across requests
- Credit system (10 free credits for new users) to prevent resource abuse
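The combination of a global lock and an LRU cache can be sketched as follows. This is a simplified illustration, not the server's implementation: the `EmbeddingCache` class, its capacity, and the `compute` callback are hypothetical, and a real server would compute embeddings on the GPU rather than with the stand-in function shown here.

```python
import threading
from collections import OrderedDict
from typing import Any, Callable, Hashable


class EmbeddingCache:
    """Thread-safe LRU cache whose lock also serializes the expensive computation."""

    def __init__(self, capacity: int = 8) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._items: "OrderedDict[Hashable, Any]" = OrderedDict()
        self._lock = threading.Lock()

    def get_or_compute(self, key: Hashable, compute: Callable[[Hashable], Any]) -> Any:
        with self._lock:
            if key in self._items:
                # Mark the entry as most recently used.
                self._items.move_to_end(key)
                return self._items[key]
            # Computing under the lock means only one job touches the GPU at a time.
            value = compute(key)
            self._items[key] = value
            if len(self._items) > self._capacity:
                # Evict the least recently used entry.
                self._items.popitem(last=False)
            return value


calls = []


def fake_embed(image_id: str) -> str:
    """Stand-in for a GPU embedding call; records how often it runs."""
    calls.append(image_id)
    return f"embedding:{image_id}"


cache = EmbeddingCache(capacity=2)
for image_id in ["a", "a", "b", "c", "a"]:
    cache.get_or_compute(image_id, fake_embed)
print(calls)
```

Holding one lock for both the cache and the computation is the simplest safe design; it trades throughput for predictability, since requests for different images wait on each other.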
Getting Started
Prerequisites
- Python 3.10+
- CUDA-capable GPU (highly recommended)
- Tesseract OCR engine
Installation
# Clone
git clone https://github.com/BIT-DataLab/Edit-Banana.git
cd Edit-Banana
# Create directories
mkdir -p input output sam3_output
# Download SAM 3 model weights
# From: https://modelscope.cn/models/facebook/sam3
# Place in: models/sam3.pt
# Install dependencies
pip install -r requirements.txt
# Install Tesseract (Ubuntu)
sudo apt install tesseract-ocr tesseract-ocr-chi-sim
# Configure
cp config/config.yaml.example config/config.yaml
Usage
# Process a single image
python main.py -i input/test_diagram.png
# Output XML saved to output/ directory
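To process several images, the documented single-image command can be wrapped in a small loop. The sketch below only assumes the `python main.py -i <path>` invocation shown above; the helper names and the suffix filter are hypothetical, and it is limited to image files because the CLI example only shows image input.

```python
import subprocess
import sys
from pathlib import Path
from typing import Iterable, List

# Assumption: restrict the batch to the image types named in the guide.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def build_commands(paths: Iterable[str], script: str = "main.py") -> List[List[str]]:
    """Build one `python main.py -i <path>` command per supported image."""
    selected = sorted(
        Path(p) for p in paths if Path(p).suffix.lower() in IMAGE_SUFFIXES
    )
    return [[sys.executable, script, "-i", str(path)] for path in selected]


def run_all(commands: List[List[str]]) -> None:
    """Run the commands one after another, stopping on the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


commands = build_commands(["input/b.PNG", "input/a.jpg", "input/notes.txt"])
for command in commands:
    print(" ".join(command))
# run_all(commands)  # uncomment inside the cloned repository
```

Running the jobs sequentially avoids several processes competing for the same GPU.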
Configuration (config.yaml)
Customize the pipeline:
- sam3: Score thresholds, NMS thresholds, max iteration loops
- paths: Input/output directories
- dominant_color: Color extraction sensitivity
Use Cases
📚 Academic Research
Researchers frequently need to re-create or modify charts from published papers for their own publications. Edit Banana eliminates the need to recreate charts from scratch — just feed in the original figure and get an editable version.
📊 Business Reporting
Analysts working with quarterly reports, market research decks, or competitor analyses often encounter charts locked in PDFs. Edit Banana extracts them into editable PowerPoint slides.
🎨 Content Creation
Bloggers, educators, and presenters can transform any chart they find into customizable assets — changing colors to match their brand, updating labels, or restructuring layouts.
🔬 Data Recovery
When the original data behind a chart is lost (common in legacy reports), Edit Banana can reconstruct the visual elements, making it possible to approximate and extract the underlying data.
Edit Banana vs Alternatives
This is a chart/diagram derendering tool. Here's how it compares:
| Feature | Edit Banana | Pic2Chart | ChartReader | WebPlotDigitizer | ChartDetective |
|---|---|---|---|---|---|
| Stars | 3.8K | N/A (SaaS) | 200+ | N/A (Web) | 100+ |
| Input | Image + PDF | Image | Image | Image + PDF | Vector charts |
| Output | DrawIO + PPTX | PowerPoint SVG | Data tables | Data extraction | Data points |
| Segmentation | ✅ SAM 3 | ❌ | ✅ Transformers | ❌ Manual | ❌ Interactive |
| LaTeX Support | ✅ Pix2Text | ❌ | ❌ | ❌ | ❌ |
| VLM Integration | ✅ Qwen-VL/GPT-4V | ❌ | ✅ | ❌ | ❌ |
| Self-hosted | ✅ | ❌ | ✅ | ❌ | ✅ |
| Editable Diagram | ✅ Full elements | ✅ Partial | ❌ Data only | ❌ Data only | ❌ Data only |
| Online Demo | ✅ | ✅ | ❌ | ✅ | ❌ |
When to Choose Each
- Edit Banana: When you need the full diagram reconstructed as an editable file with every element manipulable. Best for flowcharts, architecture diagrams, and complex figures.
- Pic2Chart: Quick online conversion of simple diagrams to PowerPoint SVG. No self-hosting option.
- ChartReader: When you need to extract numerical data from bar/line charts for analysis. Outputs data tables, not editable diagrams.
- WebPlotDigitizer: Industry standard for manual data extraction from XY plots. Best when precision matters and you're willing to do manual calibration.
- ChartDetective: Interactive tool for vector chart data extraction. Requires manual interaction.
Edit Banana's Unique Advantage
Among the tools compared here, Edit Banana is the only one that reconstructs the full diagram with every element editable. Pic2Chart produces partially editable output, and the other tools focus on data extraction, that is, getting numbers out of charts. Edit Banana instead rebuilds the visual representation itself, so every shape, line, text block, and color can be individually modified.
Frequently Asked Questions
What types of diagrams does it handle?
Flowcharts, architecture diagrams, statistical charts (bar, line, pie), UML diagrams, mind maps, and any visual diagram with distinct elements.
Do I need a GPU?
A CUDA-capable GPU is highly recommended for SAM 3 inference. CPU-only mode is possible but significantly slower.
How accurate is the text extraction?
Standard text is extracted with Tesseract OCR, and mathematical formulae are converted to LaTeX by Pix2Text. The crop-guided strategy gives each engine a high-resolution crop of a single text region instead of the whole image, which improves recognition.
Can I use this commercially?
The code is licensed under AGPL-3.0. If you distribute a modified version, or let users interact with one over a network, you must make its source available under the same license. For commercial use without this requirement, contact the team.
Is there an API?
Yes. The server_pa.py FastAPI server provides HTTP endpoints for programmatic access. The online demo at editbanana.anxin6.cn also accepts uploads.
GitHub vs Web version — which is better?
The web platform at editbanana.anxin6.cn has the most up-to-date features. The GitHub repository trails behind but provides full self-hosting capability.
Conclusion
Edit Banana is a notable step forward in chart derendering. By combining SAM 3 segmentation, multimodal LLMs, and dual-engine OCR in a single pipeline, it converts static chart images into editable diagrams with preserved layout, colors, and text, including mathematical formulae.
With 3,800+ stars and backing from Beijing Institute of Technology's Data Lab, this project fills a gap that researchers, analysts, and content creators have struggled with for years. No more recreating charts from scratch: upload an image and get back an editable DrawIO or PowerPoint file.
