GeneForgeLang API Reference

This document provides comprehensive documentation for the GeneForgeLang (GFL) stable API. The API follows semantic versioning and provides both typed and untyped interfaces.

API Version: 2.0.0

Compatibility: Backward compatible with 0.1.x APIs Package Version: 0.2.0+

Overview

The GFL API provides three main functions for working with genomic workflow specifications:

parse(): Convert GFL source code to AST
validate(): Check AST for semantic correctness
infer(): Run probabilistic reasoning on AST

All functions support both typed and untyped modes for maximum flexibility.

Core Functions

parse()

Parse GFL source code into an Abstract Syntax Tree (AST).

Signatures

# Typed mode (recommended)
def parse(text: str, *, typed: Literal[True]) -> GFLAST: ...

# Untyped mode (backward compatible)
def parse(text: str, *, typed: Literal[False] = False) -> Dict[str, Any]: ...

Parameters

text (str): GFL source code in YAML-style syntax
typed (bool, optional): If True, return typed GFLAST object; if False, return Dict[str, Any]. Default: False

Returns

Typed mode: GFLAST object with full type safety and IDE support
Untyped mode: Dict[str, Any] for backward compatibility

Raises

ValueError: If the input cannot be parsed

Examples

from gfl.api import parse

# Untyped mode (backward compatible)
ast_dict = parse("""
experiment:
  tool: CRISPR_cas9
  type: gene_editing
  params:
    target_gene: TP53
""")

# Typed mode (recommended for new code)
ast_typed = parse("""
experiment:
  tool: CRISPR_cas9
  type: gene_editing
  params:
    target_gene: TP53
""", typed=True)

# IDE autocomplete works with typed mode!
print(ast_typed.experiment.tool)  # "CRISPR_cas9"
print(ast_typed.experiment.params.target_gene)  # "TP53"

validate()

Validate an AST for semantic correctness.

Signatures

# Detailed mode (recommended)
def validate(ast: Union[GFLAST, Dict[str, Any]], *, detailed: Literal[True]) -> ValidationResult: ...

# Simple mode (backward compatible)
def validate(ast: Union[GFLAST, Dict[str, Any]], *, detailed: Literal[False] = False) -> List[str]: ...

Parameters

ast: AST to validate (either GFLAST or Dict[str, Any])
detailed (bool, optional): If True, return detailed ValidationResult; if False, return List[str]. Default: False

Returns

Detailed mode: ValidationResult with categorized errors, warnings, and info
Simple mode: List[str] with error messages (empty list if valid)

Examples

from gfl.api import parse, validate

ast = parse("experiment:\n  tool: CRISPR_cas9\n  type: gene_editing")

# Backward compatible mode
errors = validate(ast)
if errors:
    print(f"Found {len(errors)} errors")
    for error in errors:
        print(f"  - {error}")

# Detailed mode (recommended)
result = validate(ast, detailed=True)
if not result.is_valid:
    print("Validation failed:")
    for error in result.errors:
        print(f"  ERROR: {error}")
    for warning in result.warnings:
        print(f"  WARNING: {warning}")

infer()

Run probabilistic post-processing with a provided model.

Signatures

# Detailed mode (recommended)
def infer(model, ast: Union[GFLAST, Dict[str, Any]], *, detailed: Literal[True]) -> InferenceResult: ...

# Simple mode (backward compatible)
def infer(model, ast: Union[GFLAST, Dict[str, Any]], *, detailed: Literal[False] = False) -> Dict[str, Any]: ...

Parameters

model: Model with predict(features: Dict[str, Any]) -> Dict[str, Any] method
ast: AST to run inference on
detailed (bool, optional): If True, return detailed InferenceResult; if False, return Dict[str, Any]. Default: False

Returns

Detailed mode: InferenceResult with predictions, confidence, and metadata
Simple mode: Dict[str, Any] for backward compatibility

Examples

from gfl.api import parse, infer
from gfl.models.dummy import DummyModel

ast = parse("experiment:\n  tool: CRISPR_cas9\n  type: gene_editing")
model = DummyModel()

# Backward compatible mode
results = infer(model, ast)
print(results["predictions"])

# Detailed mode (recommended)
result = infer(model, ast, detailed=True)
print(f"Confidence: {result.confidence}")
print(f"Predictions: {result.predictions}")

Convenience Functions

parse_file()

Parse GFL file and return AST.

def parse_file(file_path: str, *, typed: bool = False) -> Union[GFLAST, Dict[str, Any]]:

Parameters

file_path (str): Path to GFL file
typed (bool, optional): If True, return GFLAST; if False, return Dict[str, Any]. Default: False

Examples

from gfl.api import parse_file

# Parse file in typed mode
ast = parse_file("experiment.gfl", typed=True)

validate_file()

Parse and validate GFL file.

def validate_file(file_path: str, *, detailed: bool = False) -> Union[List[str], ValidationResult]:

Parameters

file_path (str): Path to GFL file
detailed (bool, optional): If True, return ValidationResult; if False, return List[str]. Default: False

Examples

from gfl.api import validate_file

# Validate file with detailed results
result = validate_file("experiment.gfl", detailed=True)
if result.is_valid:
    print("✓ File is valid")

get_api_info()

Get API version and compatibility information.

def get_api_info() -> Dict[str, str]:

Returns

Dictionary with API metadata: - version: Package version - api_version: API version (semantic versioning) - compatibility: Compatibility information - typed_support: Typed API support level - schema_version: JSON schema version

Examples

from gfl.api import get_api_info

info = get_api_info()
print(f"GFL API Version: {info['api_version']}")

Type System

Core Types

GFLAST

The main typed AST representation.

@dataclass
class GFLAST:
    experiment: Optional[Experiment] = None
    analyze: Optional[Analysis] = None
    simulate: Optional[bool] = None
    branch: Optional[Branch] = None
    metadata: Dict[str, Any] = field(default_factory=dict)

Experiment

Represents an experimental design.

@dataclass
class Experiment:
    tool: str
    type: ExperimentType
    params: ExperimentParams = field(default_factory=ExperimentParams)
    strategy: Optional[str] = None

Analysis

Represents an analysis configuration.

@dataclass
class Analysis:
    strategy: AnalysisStrategy
    data: Optional[str] = None
    thresholds: AnalysisThresholds = field(default_factory=AnalysisThresholds)
    filters: List[str] = field(default_factory=list)
    operations: List[AnalysisOperation] = field(default_factory=list)

Validation Types

ValidationResult

Detailed validation results.

@dataclass
class ValidationResult:
    errors: List[ValidationError] = field(default_factory=list)
    warnings: List[ValidationError] = field(default_factory=list)
    info: List[ValidationError] = field(default_factory=list)

    @property
    def is_valid(self) -> bool:
        return len(self.errors) == 0

ValidationError

Individual validation message.

@dataclass
class ValidationError:
    message: str
    location: Optional[str] = None
    severity: Literal["error", "warning", "info"] = "error"
    code: Optional[str] = None

Inference Types

InferenceResult

Detailed inference results.

@dataclass
class InferenceResult:
    predictions: Dict[str, Any]
    confidence: float
    metadata: Dict[str, Any] = field(default_factory=dict)

Migration Guide

From 0.1.x to 0.2.x

The 0.2.x API is fully backward compatible with 0.1.x, but adds new typed interfaces.

Old Code (0.1.x)

from gfl.api import parse, validate, infer

ast = parse(gfl_text)
errors = validate(ast)
results = infer(model, ast)

New Code (0.2.x - Recommended)

from gfl.api import parse, validate, infer

# Use typed API for better IDE support
ast = parse(gfl_text, typed=True)
result = validate(ast, detailed=True)
inference = infer(model, ast, detailed=True)

# Access with full type safety
if result.is_valid:
    print(f"Confidence: {inference.confidence}")

Error Handling

Common Exceptions

ValueError: Invalid input or configuration
FileNotFoundError: File not found (file functions)
ImportError: Missing optional dependencies

Best Practices

from gfl.api import parse, validate
from gfl.types import ValidationError

try:
    ast = parse(gfl_text, typed=True)
    result = validate(ast, detailed=True)

    if not result.is_valid:
        for error in result.errors:
            print(f"Validation error: {error}")
        return False

except ValueError as e:
    print(f"Parse error: {e}")
    return False

Performance Considerations

Memory Usage

Typed mode uses slightly more memory due to dataclass overhead
For large files, consider processing in chunks
Use lazy plugin loading to reduce startup time

Optimization Tips

# Cache parsed ASTs for repeated validation
ast = parse(text, typed=True)
result1 = validate(ast, detailed=True)
result2 = some_other_validation(ast)

# Use untyped mode for simple scripts
if simple_use_case:
    ast = parse(text)  # Faster for one-off processing

Version Compatibility

GFL Version	API Version	Python	Features
0.2.x	2.0.0	3.10+	Full typed API, schema validation
0.1.x	1.0.0	3.9+	Basic API, dict-only

Support

Issues: GitHub Issues
Documentation: README.md
Examples: examples/