Skip to content

🧬 GeneForge Enhancer Module

Overview

This module enables the generation of synthetic enhancer sequences that regulate gene expression in a cell-type-specific and transcription-factor-aware manner. Enhancers are short DNA sequences that act as switches to turn genes on/off or modulate their expression. This capability expands GeneForge beyond protein and RNA generation into regulatory genomic design.


💡 Functionalities

  • Generate synthetic enhancers from scratch (de novo).
  • Predict enhancer strength and tissue specificity.
  • Model transcription factor (TF) binding site composition.
  • Simulate the impact of enhancer variants on gene expression.
  • Export in FASTA, YAML (GeneForgeLang), and JSON formats.

🧩 Input Schema (YAML)

- enhancer:
    name: "EPO_Hematopoietic_Enhancer"
    target_gene: "EPO"
    cell_type: "hematopoietic_progenitor"
    species: "Mus musculus"
    factors: ["GATA1", "KLF1", "TAL1"]
    goal: "upregulate"
    model: "GeneForgeEnhancerGen-v1"
    validate_in_silico: true
    simulate_context: "blood_lineage"

📤 Output Example

{
  "sequence": "AGGTCAGGCTGATAACCTTGTAGGTCA...",
  "binding_sites": [
    {"TF": "GATA1", "pos": 17, "score": 0.93},
    {"TF": "KLF1", "pos": 45, "score": 0.88}
  ],
  "predicted_activity": 0.87,
  "tissue_specificity": "hematopoietic cells"
}

🛠️ Training Datasets

  • ENCODE
  • FANTOM5
  • VISTA Enhancer Browser
  • Cell-type-specific ATAC-seq/ChIP-seq datasets

🔬 Use Cases

  • Control expression of therapeutic genes
  • Modulate cell fate (e.g., hematopoietic vs neuronal)
  • Build gene circuits for synthetic biology
  • Test enhancer variants for disease-linked genes

📦 Integration in Repo

Place this module in the folder: enhancer_design/

  • Add a README inside enhancer_design/ explaining usage and schema.
  • Add inference and example notebooks under examples/.
  • Add the YAML parser in gene_tokenizer/yaml_parser.py
  • Register the enhancer model in model/gene_transformer.py

✅ Next Steps

  • Validate output using enhancer-promoter reporter assays.
  • Support enhancer-blocker-insulator relationships.
  • Cross-species enhancer translation using alignment modules.

👥 Contributors

GeneForge Team @ Fundación de Neurociencias