GeneForgeLang API Reference

This document provides comprehensive documentation for the GeneForgeLang (GFL) stable API. The API follows semantic versioning and provides both typed and untyped interfaces.

API Version: 2.0.0

Compatibility: Backward compatible with 0.1.x APIs

Package Version: 0.2.0+

Overview

The GFL API provides three main functions for working with genomic workflow specifications:

  • parse(): Convert GFL source code to AST
  • validate(): Check AST for semantic correctness
  • infer(): Run probabilistic reasoning on AST

parse() supports typed and untyped modes; validate() and infer() accept either AST form and offer detailed and simple result modes.

Core Functions

parse()

Parse GFL source code into an Abstract Syntax Tree (AST).

Signatures

# Typed mode (recommended)
def parse(text: str, *, typed: Literal[True]) -> GFLAST: ...

# Untyped mode (backward compatible)
def parse(text: str, *, typed: Literal[False] = False) -> Dict[str, Any]: ...

Parameters

  • text (str): GFL source code in YAML-style syntax
  • typed (bool, optional): If True, return typed GFLAST object; if False, return Dict[str, Any]. Default: False

Returns

  • Typed mode: GFLAST object with full type safety and IDE support
  • Untyped mode: Dict[str, Any] for backward compatibility

Raises

  • ValueError: If the input cannot be parsed
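
The paired signatures above follow the typing.overload pattern, in which a Literal flag selects the return type. The sketch below illustrates that pattern only. The names load and Doc and the toy "key: value" parsing are hypothetical stand-ins, not GFL's implementation.

```python
from dataclasses import dataclass
from typing import Any, Dict, Literal, Union, overload


@dataclass
class Doc:
    """Hypothetical typed result; stands in for a class such as GFLAST."""
    fields: Dict[str, str]


@overload
def load(text: str, *, typed: Literal[True]) -> Doc: ...
@overload
def load(text: str, *, typed: Literal[False] = False) -> Dict[str, Any]: ...


def load(text: str, *, typed: bool = False) -> Union[Doc, Dict[str, Any]]:
    """Toy parser for flat 'key: value' lines (not the GFL grammar)."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, sep, value = line.partition(":")
        if not sep:
            # Mirrors the documented behaviour of raising ValueError on bad input
            raise ValueError(f"cannot parse line: {line!r}")
        fields[key.strip()] = value.strip()
    return Doc(fields) if typed else fields


untyped = load("tool: CRISPR_cas9")            # type checkers infer Dict[str, Any]
typed = load("tool: CRISPR_cas9", typed=True)  # type checkers infer Doc
```

With overloads like these, a type checker can reject attribute access on the untyped result while allowing it on the typed one.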

Examples

from gfl.api import parse

# Untyped mode (backward compatible)
ast_dict = parse("""
experiment:
  tool: CRISPR_cas9
  type: gene_editing
  params:
    target_gene: TP53
""")

# Typed mode (recommended for new code)
ast_typed = parse("""
experiment:
  tool: CRISPR_cas9
  type: gene_editing
  params:
    target_gene: TP53
""", typed=True)

# IDE autocomplete works with typed mode!
print(ast_typed.experiment.tool)  # "CRISPR_cas9"
print(ast_typed.experiment.params.target_gene)  # "TP53"

validate()

Validate an AST for semantic correctness.

Signatures

# Detailed mode (recommended)
def validate(ast: Union[GFLAST, Dict[str, Any]], *, detailed: Literal[True]) -> ValidationResult: ...

# Simple mode (backward compatible)
def validate(ast: Union[GFLAST, Dict[str, Any]], *, detailed: Literal[False] = False) -> List[str]: ...

Parameters

  • ast: AST to validate (either GFLAST or Dict[str, Any])
  • detailed (bool, optional): If True, return detailed ValidationResult; if False, return List[str]. Default: False

Returns

  • Detailed mode: ValidationResult with categorized errors, warnings, and info
  • Simple mode: List[str] with error messages (empty list if valid)

Examples

from gfl.api import parse, validate

ast = parse("experiment:\n  tool: CRISPR_cas9\n  type: gene_editing")

# Backward compatible mode
errors = validate(ast)
if errors:
    print(f"Found {len(errors)} errors")
    for error in errors:
        print(f"  - {error}")

# Detailed mode (recommended)
result = validate(ast, detailed=True)
if not result.is_valid:
    print("Validation failed:")
    for error in result.errors:
        print(f"  ERROR: {error}")
    for warning in result.warnings:
        print(f"  WARNING: {warning}")

infer()

Run probabilistic post-processing with a provided model.

Signatures

# Detailed mode (recommended)
def infer(model, ast: Union[GFLAST, Dict[str, Any]], *, detailed: Literal[True]) -> InferenceResult: ...

# Simple mode (backward compatible)
def infer(model, ast: Union[GFLAST, Dict[str, Any]], *, detailed: Literal[False] = False) -> Dict[str, Any]: ...

Parameters

  • model: Model with predict(features: Dict[str, Any]) -> Dict[str, Any] method
  • ast: AST to run inference on
  • detailed (bool, optional): If True, return detailed InferenceResult; if False, return Dict[str, Any]. Default: False
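
The model parameter is described only by the method it must provide, so any object with a matching predict method should qualify. A minimal sketch of such an object follows. The names PredictModel and ConstantModel are hypothetical, and the keys in the returned dictionary are illustrative, since the source does not specify which keys infer() expects.

```python
from typing import Any, Dict, Protocol, runtime_checkable


@runtime_checkable
class PredictModel(Protocol):
    """Structural type for objects usable as `model` (name is hypothetical)."""

    def predict(self, features: Dict[str, Any]) -> Dict[str, Any]: ...


class ConstantModel:
    """Toy model that ignores feature values and returns a fixed answer."""

    def predict(self, features: Dict[str, Any]) -> Dict[str, Any]:
        # Illustrative keys only; not a documented GFL contract.
        return {"label": "unknown", "n_features": len(features)}


model = ConstantModel()
assert isinstance(model, PredictModel)  # structural check: a predict method exists
print(model.predict({"tool": "CRISPR_cas9", "type": "gene_editing"}))
```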

Returns

  • Detailed mode: InferenceResult with predictions, confidence, and metadata
  • Simple mode: Dict[str, Any] for backward compatibility

Examples

from gfl.api import parse, infer
from gfl.models.dummy import DummyModel

ast = parse("experiment:\n  tool: CRISPR_cas9\n  type: gene_editing")
model = DummyModel()

# Backward compatible mode
results = infer(model, ast)
print(results["predictions"])

# Detailed mode (recommended)
result = infer(model, ast, detailed=True)
print(f"Confidence: {result.confidence}")
print(f"Predictions: {result.predictions}")

Convenience Functions

parse_file()

Parse a GFL file and return its AST.

def parse_file(file_path: str, *, typed: bool = False) -> Union[GFLAST, Dict[str, Any]]: ...

Parameters

  • file_path (str): Path to GFL file
  • typed (bool, optional): If True, return GFLAST; if False, return Dict[str, Any]. Default: False

Examples

from gfl.api import parse_file

# Parse file in typed mode
ast = parse_file("experiment.gfl", typed=True)

validate_file()

Parse and validate a GFL file.

def validate_file(file_path: str, *, detailed: bool = False) -> Union[List[str], ValidationResult]: ...

Parameters

  • file_path (str): Path to GFL file
  • detailed (bool, optional): If True, return ValidationResult; if False, return List[str]. Default: False

Examples

from gfl.api import validate_file

# Validate file with detailed results
result = validate_file("experiment.gfl", detailed=True)
if result.is_valid:
    print("✓ File is valid")

get_api_info()

Get API version and compatibility information.

def get_api_info() -> Dict[str, str]: ...

Returns

Dictionary with API metadata:

  • version: Package version
  • api_version: API version (semantic versioning)
  • compatibility: Compatibility information
  • typed_support: Typed API support level
  • schema_version: JSON schema version
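
Because the API follows semantic versioning, a caller can guard on the major component of api_version. A minimal sketch, assuming the value is a dotted string such as "2.0.0". The helper require_api_major is hypothetical, and the sample dictionary stands in for the return value of get_api_info().

```python
from typing import Dict


def require_api_major(info: Dict[str, str], expected_major: int) -> None:
    """Raise RuntimeError if the API major version differs from expected_major."""
    api_version = info["api_version"]
    major = int(api_version.split(".")[0])
    if major != expected_major:
        raise RuntimeError(
            f"expected API major version {expected_major}, found {api_version}"
        )


# Sample values for illustration; real code would pass get_api_info().
sample_info = {"version": "0.2.0", "api_version": "2.0.0"}
require_api_major(sample_info, 2)  # passes silently
```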

Examples

from gfl.api import get_api_info

info = get_api_info()
print(f"GFL API Version: {info['api_version']}")

Type System

Core Types

GFLAST

The main typed AST representation.

@dataclass
class GFLAST:
    experiment: Optional[Experiment] = None
    analyze: Optional[Analysis] = None
    simulate: Optional[bool] = None
    branch: Optional[Branch] = None
    metadata: Dict[str, Any] = field(default_factory=dict)

Experiment

Represents an experimental design.

@dataclass
class Experiment:
    tool: str
    type: ExperimentType
    params: ExperimentParams = field(default_factory=ExperimentParams)
    strategy: Optional[str] = None

Analysis

Represents an analysis configuration.

@dataclass
class Analysis:
    strategy: AnalysisStrategy
    data: Optional[str] = None
    thresholds: AnalysisThresholds = field(default_factory=AnalysisThresholds)
    filters: List[str] = field(default_factory=list)
    operations: List[AnalysisOperation] = field(default_factory=list)
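
Since the typed nodes are dataclasses, the standard library's dataclasses.asdict can convert them to nested dictionaries, for instance for logging or JSON output. The sketch uses simplified stand-ins for the classes above (a plain str in place of ExperimentType, a dict in place of ExperimentParams). The resulting dictionary is not guaranteed to match what untyped parse() returns.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Experiment:
    # Simplified stand-in: the documented class uses ExperimentType and ExperimentParams.
    tool: str
    type: str
    params: Dict[str, Any] = field(default_factory=dict)
    strategy: Optional[str] = None


@dataclass
class GFLAST:
    # Simplified stand-in with only two of the documented fields.
    experiment: Optional[Experiment] = None
    metadata: Dict[str, Any] = field(default_factory=dict)


ast = GFLAST(
    experiment=Experiment("CRISPR_cas9", "gene_editing", {"target_gene": "TP53"})
)
as_dict = asdict(ast)  # recursively converts nested dataclasses to dicts
print(json.dumps(as_dict, indent=2))
```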

Validation Types

ValidationResult

Detailed validation results.

@dataclass
class ValidationResult:
    errors: List[ValidationError] = field(default_factory=list)
    warnings: List[ValidationError] = field(default_factory=list)
    info: List[ValidationError] = field(default_factory=list)

    @property
    def is_valid(self) -> bool:
        return len(self.errors) == 0

ValidationError

Individual validation message.

@dataclass
class ValidationError:
    message: str
    location: Optional[str] = None
    severity: Literal["error", "warning", "info"] = "error"
    code: Optional[str] = None
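
Because is_valid looks only at errors, a result that carries warnings alone still counts as valid. The sketch below reproduces the two dataclasses as shown so that it runs on its own. The summarize helper, the sample messages, and the code "E001" are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Literal, Optional


@dataclass
class ValidationError:
    message: str
    location: Optional[str] = None
    severity: Literal["error", "warning", "info"] = "error"
    code: Optional[str] = None


@dataclass
class ValidationResult:
    errors: List[ValidationError] = field(default_factory=list)
    warnings: List[ValidationError] = field(default_factory=list)
    info: List[ValidationError] = field(default_factory=list)

    @property
    def is_valid(self) -> bool:
        return len(self.errors) == 0


def summarize(result: ValidationResult) -> str:
    """Hypothetical helper: one-line summary of a validation result."""
    status = "valid" if result.is_valid else "invalid"
    return f"{status}: {len(result.errors)} error(s), {len(result.warnings)} warning(s)"


warn = ValidationError("strategy not set", location="experiment", severity="warning")
only_warnings = ValidationResult(warnings=[warn])
with_error = ValidationResult(errors=[ValidationError("missing tool", code="E001")])

print(summarize(only_warnings))  # valid: 0 error(s), 1 warning(s)
print(summarize(with_error))     # invalid: 1 error(s), 0 warning(s)
```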

Inference Types

InferenceResult

Detailed inference results.

@dataclass
class InferenceResult:
    predictions: Dict[str, Any]
    confidence: float
    metadata: Dict[str, Any] = field(default_factory=dict)
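
A caller will often want to act on a prediction only when confidence is high enough. A minimal sketch, assuming confidence lies in [0, 1] (the source does not state its range). The dataclass is copied from above so the block runs on its own, and accept with its 0.8 default threshold is a hypothetical helper.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class InferenceResult:
    predictions: Dict[str, Any]
    confidence: float
    metadata: Dict[str, Any] = field(default_factory=dict)


def accept(result: InferenceResult, threshold: float = 0.8) -> bool:
    """Return True when the result's confidence reaches the threshold."""
    return result.confidence >= threshold


# Sample values invented for the demonstration.
high = InferenceResult(predictions={"label": "edited"}, confidence=0.93)
low = InferenceResult(predictions={"label": "edited"}, confidence=0.41)
```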

Migration Guide

From 0.1.x to 0.2.x

The 0.2.x API is fully backward compatible with 0.1.x, but adds new typed interfaces.

Old Code (0.1.x)

from gfl.api import parse, validate, infer

ast = parse(gfl_text)
errors = validate(ast)
results = infer(model, ast)

New Code (0.2.x)

from gfl.api import parse, validate, infer

# Use typed API for better IDE support
ast = parse(gfl_text, typed=True)
result = validate(ast, detailed=True)
inference = infer(model, ast, detailed=True)

# Access with full type safety
if result.is_valid:
    print(f"Confidence: {inference.confidence}")

Error Handling

Common Exceptions

  • ValueError: Invalid input or configuration
  • FileNotFoundError: File not found (file functions)
  • ImportError: Missing optional dependencies

Best Practices

from gfl.api import parse, validate


def check_gfl(gfl_text: str) -> bool:
    """Return True if gfl_text parses and validates without errors."""
    try:
        ast = parse(gfl_text, typed=True)
        result = validate(ast, detailed=True)
    except ValueError as e:
        # parse() raises ValueError when the input cannot be parsed
        print(f"Parse error: {e}")
        return False

    if not result.is_valid:
        for error in result.errors:
            print(f"Validation error: {error}")
        return False

    return True

Performance Considerations

Memory Usage

  • Typed mode uses slightly more memory due to dataclass overhead
  • For large files, consider processing in chunks
  • Use lazy plugin loading to reduce startup time

Optimization Tips

from gfl.api import parse, validate

# Parse once and reuse the AST across several checks
ast = parse(text, typed=True)
result1 = validate(ast, detailed=True)
result2 = some_other_validation(ast)  # placeholder for any further check on the same AST

# Use untyped mode for simple scripts
if simple_use_case:  # placeholder condition
    ast = parse(text)  # Faster for one-off processing
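
When the same source text is parsed repeatedly, the parse step itself can be memoized. A minimal sketch using functools.lru_cache follows. The helper make_cached_parser is hypothetical, and fake_parse stands in for parse so that the block runs without GFL installed. Cached ASTs are shared between callers, so this is safe only if they are treated as read-only.

```python
from functools import lru_cache
from typing import Any, Callable, List


def make_cached_parser(
    parse_fn: Callable[[str], Any], maxsize: int = 128
) -> Callable[[str], Any]:
    """Wrap parse_fn so identical source text is parsed only once."""

    @lru_cache(maxsize=maxsize)
    def cached(text: str) -> Any:
        return parse_fn(text)

    return cached


calls: List[str] = []


def fake_parse(text: str) -> dict:
    # Stand-in for gfl.api.parse; records each real invocation.
    calls.append(text)
    return {"source_length": len(text)}


cached_parse = make_cached_parser(fake_parse)
first = cached_parse("experiment:\n  tool: CRISPR_cas9")
second = cached_parse("experiment:\n  tool: CRISPR_cas9")  # served from the cache
```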

Version Compatibility

GFL Version   API Version   Python   Features
0.2.x         2.0.0         3.10+    Full typed API, schema validation
0.1.x         1.0.0         3.9+     Basic API, dict-only

Support