GeneForgeLang API Reference¶

This reference documents the stable GeneForgeLang (GFL) API. The API follows semantic versioning and provides both typed and untyped interfaces.

API Version: 2.0.0¶

Compatibility: Backward compatible with 0.1.x APIs
Package Version: 0.2.0+

Overview¶

The GFL API provides three main functions for working with genomic workflow specifications:

parse(): Convert GFL source code to AST
validate(): Check AST for semantic correctness
infer(): Run probabilistic reasoning on AST

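parse() offers typed and untyped modes, while validate() and infer() offer detailed and simple modes. In each case the second mode is the default and keeps the 0.1.x return types.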

Core Functions¶

parse()¶

Parse GFL source code into an Abstract Syntax Tree (AST).

Signatures¶

# Typed mode (recommended)
def parse(text: str, *, typed: Literal[True]) -> GFLAST: ...

# Untyped mode (backward compatible)
def parse(text: str, *, typed: Literal[False] = False) -> Dict[str, Any]: ...

Parameters¶

text (str): GFL source code in YAML-style syntax
typed (bool, optional): If True, return typed GFLAST object; if False, return Dict[str, Any]. Default: False

Returns¶

Typed mode: GFLAST object with full type safety and IDE support
Untyped mode: Dict[str, Any] for backward compatibility
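The paired signatures above follow the pattern that `typing.overload` expresses when a return type depends on a boolean flag. The following is a minimal, self-contained sketch of that pattern, not GFL's implementation: `ToyAST` and `toy_parse` are hypothetical stand-ins, and the "parsing" is a trivial key/value split.

```python
from dataclasses import dataclass
from typing import Any, Dict, Literal, Union, overload


@dataclass
class ToyAST:
    """Hypothetical stand-in for GFLAST; holds the parsed fields."""
    fields: Dict[str, Any]


@overload
def toy_parse(text: str, *, typed: Literal[True]) -> ToyAST: ...
@overload
def toy_parse(text: str, *, typed: Literal[False] = False) -> Dict[str, Any]: ...


def toy_parse(text: str, *, typed: bool = False) -> Union[ToyAST, Dict[str, Any]]:
    # Trivial "parser": one "key: value" pair per non-empty line.
    fields: Dict[str, Any] = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"cannot parse line: {line!r}")
        fields[key.strip()] = value.strip()
    return ToyAST(fields) if typed else fields


as_dict = toy_parse("tool: CRISPR_cas9")               # checkers infer Dict[str, Any]
as_typed = toy_parse("tool: CRISPR_cas9", typed=True)  # checkers infer ToyAST
```

With overloads like these, a static type checker can pick the return type from the literal value of `typed`, which is what makes attribute access on the typed result checkable.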

Raises¶

ValueError: If the input cannot be parsed

Examples¶

from gfl.api import parse

# Untyped mode (backward compatible)
ast_dict = parse("""
experiment:
  tool: CRISPR_cas9
  type: gene_editing
  params:
    target_gene: TP53
""")

# Typed mode (recommended for new code)
ast_typed = parse("""
experiment:
  tool: CRISPR_cas9
  type: gene_editing
  params:
    target_gene: TP53
""", typed=True)

# IDE autocomplete works with typed mode!
print(ast_typed.experiment.tool)  # "CRISPR_cas9"
print(ast_typed.experiment.params.target_gene)  # "TP53"

validate()¶

Validate an AST for semantic correctness.

Signatures¶

# Detailed mode (recommended)
def validate(ast: Union[GFLAST, Dict[str, Any]], *, detailed: Literal[True]) -> ValidationResult: ...

# Simple mode (backward compatible)
def validate(ast: Union[GFLAST, Dict[str, Any]], *, detailed: Literal[False] = False) -> List[str]: ...

Parameters¶

ast: AST to validate (either GFLAST or Dict[str, Any])
detailed (bool, optional): If True, return detailed ValidationResult; if False, return List[str]. Default: False

Returns¶

Detailed mode: ValidationResult with categorized errors, warnings, and info
Simple mode: List[str] with error messages (empty list if valid)

Examples¶

from gfl.api import parse, validate

ast = parse("experiment:\n  tool: CRISPR_cas9\n  type: gene_editing")

# Backward compatible mode
errors = validate(ast)
if errors:
    print(f"Found {len(errors)} errors")
    for error in errors:
        print(f"  - {error}")

# Detailed mode (recommended)
result = validate(ast, detailed=True)
if not result.is_valid:
    print("Validation failed:")
    for error in result.errors:
        print(f"  ERROR: {error}")
    for warning in result.warnings:
        print(f"  WARNING: {warning}")

infer()¶

Run probabilistic post-processing with a provided model.

Signatures¶

# Detailed mode (recommended)
def infer(model, ast: Union[GFLAST, Dict[str, Any]], *, detailed: Literal[True]) -> InferenceResult: ...

# Simple mode (backward compatible)
def infer(model, ast: Union[GFLAST, Dict[str, Any]], *, detailed: Literal[False] = False) -> Dict[str, Any]: ...

Parameters¶

model: Model with predict(features: Dict[str, Any]) -> Dict[str, Any] method
ast: AST to run inference on
detailed (bool, optional): If True, return detailed InferenceResult; if False, return Dict[str, Any]. Default: False
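The only requirement stated for `model` is a `predict(features)` method that takes and returns a dictionary. As a rough sketch, that interface can be written down as a `typing.Protocol` and satisfied by any class with a matching method. `SupportsPredict`, `ConstantModel`, and the output keys are hypothetical names chosen for illustration, not part of GFL.

```python
from typing import Any, Dict, Protocol, runtime_checkable


@runtime_checkable
class SupportsPredict(Protocol):
    """Structural type for the documented model interface."""

    def predict(self, features: Dict[str, Any]) -> Dict[str, Any]: ...


class ConstantModel:
    """Hypothetical model that returns the same label for any input."""

    def __init__(self, label: str, confidence: float) -> None:
        self.label = label
        self.confidence = confidence

    def predict(self, features: Dict[str, Any]) -> Dict[str, Any]:
        # The output keys are assumptions; use whatever your pipeline expects.
        return {
            "label": self.label,
            "confidence": self.confidence,
            "n_features": len(features),
        }


model = ConstantModel(label="edited", confidence=0.5)
output = model.predict({"tool": "CRISPR_cas9", "type": "gene_editing"})
```

Because the protocol is structural, `ConstantModel` does not need to inherit from anything. Note that `isinstance` against a runtime-checkable protocol only confirms that a `predict` attribute exists, not that its signature matches.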

Returns¶

Detailed mode: InferenceResult with predictions, confidence, and metadata
Simple mode: Dict[str, Any] for backward compatibility

Examples¶

from gfl.api import parse, infer
from gfl.models.dummy import DummyModel

ast = parse("experiment:\n  tool: CRISPR_cas9\n  type: gene_editing")
model = DummyModel()

# Backward compatible mode
results = infer(model, ast)
print(results["predictions"])

# Detailed mode (recommended)
result = infer(model, ast, detailed=True)
print(f"Confidence: {result.confidence}")
print(f"Predictions: {result.predictions}")

Convenience Functions¶

parse_file()¶

Parse a GFL file and return its AST.

def parse_file(file_path: str, *, typed: bool = False) -> Union[GFLAST, Dict[str, Any]]: ...

Parameters¶

file_path (str): Path to GFL file
typed (bool, optional): If True, return GFLAST; if False, return Dict[str, Any]. Default: False

Examples¶

from gfl.api import parse_file

# Parse file in typed mode
ast = parse_file("experiment.gfl", typed=True)

validate_file()¶

Parse and validate a GFL file.

def validate_file(file_path: str, *, detailed: bool = False) -> Union[List[str], ValidationResult]: ...

Parameters¶

file_path (str): Path to GFL file
detailed (bool, optional): If True, return ValidationResult; if False, return List[str]. Default: False
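A file-level validator is often applied to a whole directory of specifications. The sketch below shows one way to do that with `pathlib`. To keep it runnable without GFL installed, the validator is passed in as a parameter; `validate_tree` is a hypothetical helper, and the lambda at the end is a stand-in for a real validator that returns a list of error strings.

```python
import tempfile
from pathlib import Path
from typing import Callable, Dict, List


def validate_tree(root: str, validate_file: Callable[[str], List[str]]) -> Dict[str, List[str]]:
    """Run `validate_file` on every *.gfl file under `root`; map path -> errors."""
    report: Dict[str, List[str]] = {}
    for path in sorted(Path(root).rglob("*.gfl")):
        try:
            report[str(path)] = validate_file(str(path))
        except ValueError as exc:
            # This reference lists ValueError for input that cannot be parsed.
            report[str(path)] = [f"parse error: {exc}"]
    return report


# Demonstration on a temporary directory with one GFL file and one unrelated file.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "ok.gfl").write_text("experiment:\n  tool: CRISPR_cas9\n")
    Path(tmp, "notes.txt").write_text("ignored")
    report = validate_tree(tmp, lambda path: [])  # stand-in validator: no errors
```

In real use, the second argument would be a function with the simple-mode shape described above, one that accepts a path and returns a list of messages.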

Examples¶

from gfl.api import validate_file

# Validate file with detailed results
result = validate_file("experiment.gfl", detailed=True)
if result.is_valid:
    print("✓ File is valid")

get_api_info()¶

Get API version and compatibility information.

def get_api_info() -> Dict[str, str]: ...

Returns¶

Dictionary with API metadata:

version: Package version
api_version: API version (semantic versioning)
compatibility: Compatibility information
typed_support: Typed API support level
schema_version: JSON schema version
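Since `api_version` follows semantic versioning, a caller can guard against an incompatible installation by comparing version components. The following is a minimal sketch under the assumption that the string has the plain `MAJOR.MINOR.PATCH` form; `parse_semver` and `is_api_compatible` are hypothetical helpers, and the `info` dictionary is a hand-written stand-in for the value returned by `get_api_info()`.

```python
from typing import Dict, Tuple


def parse_semver(version: str) -> Tuple[int, int, int]:
    """Split 'MAJOR.MINOR.PATCH' into integers; raises ValueError if malformed."""
    major, minor, patch = version.split(".")
    return int(major), int(minor), int(patch)


def is_api_compatible(info: Dict[str, str], required: str) -> bool:
    """True if the major versions match and the API is at least `required`."""
    have = parse_semver(info["api_version"])
    need = parse_semver(required)
    return have[0] == need[0] and have >= need


# Stand-in for the dictionary returned by get_api_info(); values are illustrative.
info = {"version": "0.2.0", "api_version": "2.0.0"}
```

Pre-release or build suffixes such as `2.0.0-rc1` are not handled here and would raise `ValueError`.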

Examples¶

from gfl.api import get_api_info

info = get_api_info()
print(f"GFL API Version: {info['api_version']}")

Type System¶

Core Types¶

GFLAST¶

The main typed AST representation.

@dataclass
class GFLAST:
    experiment: Optional[Experiment] = None
    analyze: Optional[Analysis] = None
    simulate: Optional[bool] = None
    branch: Optional[Branch] = None
    metadata: Dict[str, Any] = field(default_factory=dict)

Experiment¶

Represents an experimental design.

@dataclass
class Experiment:
    tool: str
    type: ExperimentType
    params: ExperimentParams = field(default_factory=ExperimentParams)
    strategy: Optional[str] = None
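Because `experiment` on `GFLAST` and `strategy` on `Experiment` are both `Optional`, code that reads a typed AST should check for `None` before accessing nested attributes. The sketch below uses simplified local copies of the two dataclasses so that it runs on its own: `type` and `params` are reduced to `str` and `dict`, the other `GFLAST` fields are omitted, and `describe` is a hypothetical helper.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Experiment:
    tool: str
    type: str  # simplified: the documented field is an ExperimentType
    params: Dict[str, Any] = field(default_factory=dict)  # simplified ExperimentParams
    strategy: Optional[str] = None


@dataclass
class GFLAST:
    experiment: Optional[Experiment] = None
    metadata: Dict[str, Any] = field(default_factory=dict)


def describe(ast: GFLAST) -> str:
    """Summarize the experiment block, tolerating its absence."""
    if ast.experiment is None:
        return "no experiment block"
    exp = ast.experiment
    strategy = exp.strategy or "default strategy"
    return f"{exp.tool} ({exp.type}), {strategy}"


empty = describe(GFLAST())
filled = describe(GFLAST(experiment=Experiment("CRISPR_cas9", "gene_editing")))
```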

Analysis¶

Represents an analysis configuration.

@dataclass
class Analysis:
    strategy: AnalysisStrategy
    data: Optional[str] = None
    thresholds: AnalysisThresholds = field(default_factory=AnalysisThresholds)
    filters: List[str] = field(default_factory=list)
    operations: List[AnalysisOperation] = field(default_factory=list)

Validation Types¶

ValidationResult¶

Detailed validation results.

@dataclass
class ValidationResult:
    errors: List[ValidationError] = field(default_factory=list)
    warnings: List[ValidationError] = field(default_factory=list)
    info: List[ValidationError] = field(default_factory=list)

    @property
    def is_valid(self) -> bool:
        return len(self.errors) == 0

ValidationError¶

Individual validation message.

@dataclass
class ValidationError:
    message: str
    location: Optional[str] = None
    severity: Literal["error", "warning", "info"] = "error"
    code: Optional[str] = None
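To illustrate how the two validation types fit together, the sketch below copies the dataclasses shown above into a standalone script and renders all messages as text. `format_report` is a hypothetical helper, and the messages and locations are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Literal, Optional


# Local copies of the documented dataclasses so the sketch runs without GFL.
@dataclass
class ValidationError:
    message: str
    location: Optional[str] = None
    severity: Literal["error", "warning", "info"] = "error"
    code: Optional[str] = None


@dataclass
class ValidationResult:
    errors: List[ValidationError] = field(default_factory=list)
    warnings: List[ValidationError] = field(default_factory=list)
    info: List[ValidationError] = field(default_factory=list)

    @property
    def is_valid(self) -> bool:
        return len(self.errors) == 0


def format_report(result: ValidationResult) -> List[str]:
    """Render every message as 'SEVERITY [location] message'."""
    lines = []
    for item in result.errors + result.warnings + result.info:
        where = f" [{item.location}]" if item.location else ""
        lines.append(f"{item.severity.upper()}{where} {item.message}")
    return lines


result = ValidationResult(
    errors=[ValidationError("missing required field 'type'", location="experiment")],
    warnings=[ValidationError("unknown parameter", severity="warning")],
)
report = format_report(result)
warnings_only = ValidationResult(warnings=result.warnings)
```

Note that `is_valid` depends only on `errors`, so a result that carries warnings or info messages alone still counts as valid.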

Inference Types¶

InferenceResult¶

Detailed inference results.

@dataclass
class InferenceResult:
    predictions: Dict[str, Any]
    confidence: float
    metadata: Dict[str, Any] = field(default_factory=dict)
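One common use of the `confidence` field is to gate downstream steps on a threshold. The sketch below assumes `confidence` lies between 0 and 1, which this reference does not state. The `accept` helper and the 0.8 default are arbitrary choices for illustration, and the dataclass is a local copy so the script runs on its own.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class InferenceResult:  # local copy of the documented dataclass
    predictions: Dict[str, Any]
    confidence: float
    metadata: Dict[str, Any] = field(default_factory=dict)


def accept(result: InferenceResult, min_confidence: float = 0.8) -> Optional[Dict[str, Any]]:
    """Return the predictions only if confidence meets the threshold, else None."""
    if result.confidence >= min_confidence:
        return result.predictions
    return None


# Prediction contents are invented for the demonstration.
high = InferenceResult(predictions={"efficiency": "high"}, confidence=0.9)
low = InferenceResult(predictions={"efficiency": "high"}, confidence=0.4)
accepted = accept(high)
rejected = accept(low)
```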

Migration Guide¶

From 0.1.x to 0.2.x¶

The 0.2.x API is fully backward compatible with 0.1.x, but adds new typed interfaces.

Old Code (0.1.x)¶

from gfl.api import parse, validate, infer

ast = parse(gfl_text)
errors = validate(ast)
results = infer(model, ast)

New Code (0.2.x - Recommended)¶

from gfl.api import parse, validate, infer

# Use typed API for better IDE support
ast = parse(gfl_text, typed=True)
result = validate(ast, detailed=True)
inference = infer(model, ast, detailed=True)

# Access with full type safety
if result.is_valid:
    print(f"Confidence: {inference.confidence}")

Error Handling¶

Common Exceptions¶

ValueError: Invalid input or configuration
FileNotFoundError: File not found (file functions)
ImportError: Missing optional dependencies

Best Practices¶

from gfl.api import parse, validate


def check_source(gfl_text: str) -> bool:
    """Return True if gfl_text parses and validates without errors."""
    try:
        ast = parse(gfl_text, typed=True)
        result = validate(ast, detailed=True)

        if not result.is_valid:
            for error in result.errors:
                print(f"Validation error: {error}")
            return False

    except ValueError as e:
        # parse() raises ValueError when the input cannot be parsed
        print(f"Parse error: {e}")
        return False

    return True

Performance Considerations¶

Memory Usage¶

Typed mode uses slightly more memory due to dataclass overhead
For large files, consider processing in chunks
Use lazy plugin loading to reduce startup time

Optimization Tips¶

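from gfl.api import parse, validate

# Parse once and reuse the AST across checks instead of re-parsing
ast = parse(text, typed=True)
result1 = validate(ast, detailed=True)
result2 = some_other_validation(ast)  # placeholder for any additional check of your own

# Use untyped mode for simple scripts
if simple_use_case:  # placeholder condition
    ast = parse(text)  # Faster for one-off processing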
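When the same source text is parsed repeatedly, the caching tip can be applied with the standard library's `functools.lru_cache`. This is a sketch rather than a GFL feature: `slow_parse` is a hypothetical stand-in for an expensive parse call, and `cached_parse` is the memoized wrapper.

```python
from functools import lru_cache
from typing import Any, Dict


def slow_parse(text: str) -> Dict[str, Any]:
    """Hypothetical stand-in for an expensive parse call."""
    return {"source_length": len(text)}


@lru_cache(maxsize=128)
def cached_parse(text: str) -> Dict[str, Any]:
    # Strings are hashable, so the source text itself serves as the cache key.
    return slow_parse(text)


source = "experiment:\n  tool: CRISPR_cas9"
first = cached_parse(source)
second = cached_parse(source)  # served from the cache, no second parse
stats = cached_parse.cache_info()
```

The cache hands back the same object on every hit, so callers that mutate the result should work on a copy.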

Version Compatibility¶

GFL Version	API Version	Python	Features
0.2.x	2.0.0	3.10+	Full typed API, schema validation
0.1.x	1.0.0	3.9+	Basic API, dict-only
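The Python column of the table can be turned into a start-up guard. The sketch below copies the two minimum versions from the table; `MIN_PYTHON` and `python_supported` are hypothetical names, not part of GFL.

```python
import sys
from typing import Dict, Optional, Tuple

# Minimum Python versions copied from the compatibility table.
MIN_PYTHON: Dict[str, Tuple[int, int]] = {"0.2.x": (3, 10), "0.1.x": (3, 9)}


def python_supported(gfl_series: str, python: Optional[Tuple[int, int]] = None) -> bool:
    """Check an interpreter version (default: the running one) against the table."""
    if python is None:
        python = sys.version_info[:2]
    return python >= MIN_PYTHON[gfl_series]


ok_on_running_interpreter = python_supported("0.2.x")
```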

Support¶

Issues: GitHub Issues
Documentation: README.md
Examples: examples/