Streaming Parser

The streaming parser allows you to process large GDSII files without loading the entire file into memory. This is essential for working with multi-gigabyte layout files.

Overview

Instead of reading the entire file at once, the streaming parser:

  • Reads the file header and metadata
  • Processes structures one at a time
  • Uses callbacks to handle each structure
  • Minimizes memory usage

Basic Usage

#![allow(unused)]
fn main() {
use laykit::{StreamingGDSIIReader, StatisticsCollector};
use std::fs::File;
use std::io::BufReader;

// Open file
let file = File::open("large_design.gds")?;
let reader = BufReader::new(file);

// Create streaming reader
let mut streaming_reader = StreamingGDSIIReader::new(reader)?;

// Access file metadata
println!("Library: {}", streaming_reader.library_name());
println!("Units: {:?}", streaming_reader.units());

// Process structures with callback
let mut stats = StatisticsCollector::new();
streaming_reader.process_structures(&mut stats)?;

println!("Processed {} structures", stats.structure_count);
}

Callback Interface

Implement the StructureCallback trait to process structures:

#![allow(unused)]
fn main() {
use laykit::{StructureCallback, GDSStructure};

struct MyCallback {
    count: usize,
}

impl StructureCallback for MyCallback {
    fn on_structure(&mut self, structure: &GDSStructure) 
        -> Result<(), Box<dyn std::error::Error>> 
    {
        println!("Processing: {}", structure.name);
        println!("  Elements: {}", structure.elements.len());
        self.count += 1;
        Ok(())
    }
}

// Use it
let mut callback = MyCallback { count: 0 };
streaming_reader.process_structures(&mut callback)?;
}

Built-in Collectors

Statistics Collector

Collect basic statistics about the file:

#![allow(unused)]
fn main() {
use laykit::StatisticsCollector;

let mut stats = StatisticsCollector::new();
streaming_reader.process_structures(&mut stats)?;

println!("Structures: {}", stats.structure_count);
println!("Total elements: {}", stats.element_count);
}

Structure Name Collector

Extract all structure names:

#![allow(unused)]
fn main() {
use laykit::StructureNameCollector;

let mut collector = StructureNameCollector::new();
streaming_reader.process_structures(&mut collector)?;

for name in &collector.names {
    println!("Structure: {}", name);
}
}

Advanced Usage

Filtering Structures

Process only specific structures:

#![allow(unused)]
fn main() {
struct FilterCallback {
    pattern: String,
    matches: Vec<String>,
}

impl StructureCallback for FilterCallback {
    fn on_structure(&mut self, structure: &GDSStructure) 
        -> Result<(), Box<dyn std::error::Error>> 
    {
        if structure.name.contains(&self.pattern) {
            self.matches.push(structure.name.clone());
            println!("Found: {}", structure.name);
        }
        Ok(())
    }
}
}

Extracting Specific Data

Extract specific information while streaming:

#![allow(unused)]
fn main() {
struct LayerAnalyzer {
    layer_counts: HashMap<i16, usize>,
}

impl StructureCallback for LayerAnalyzer {
    fn on_structure(&mut self, structure: &GDSStructure) 
        -> Result<(), Box<dyn std::error::Error>> 
    {
        for element in &structure.elements {
            match element {
                GDSElement::Boundary(b) => {
                    *self.layer_counts.entry(b.layer).or_insert(0) += 1;
                }
                GDSElement::Path(p) => {
                    *self.layer_counts.entry(p.layer).or_insert(0) += 1;
                }
                _ => {}
            }
        }
        Ok(())
    }
}
}

Early Termination

Stop processing when a condition is met:

#![allow(unused)]
fn main() {
struct FindStructure {
    target: String,
    found: bool,
}

impl StructureCallback for FindStructure {
    fn on_structure(&mut self, structure: &GDSStructure) 
        -> Result<(), Box<dyn std::error::Error>> 
    {
        if structure.name == self.target {
            self.found = true;
            // Return error to stop processing
            return Err("Found target structure".into());
        }
        Ok(())
    }
}
}

Memory Considerations

Memory Usage

The streaming parser uses minimal memory:

  • File metadata: ~1 KB
  • Current structure: Depends on structure size
  • Callback data: Depends on your implementation

Total memory: Typically < 10 MB for callback overhead + largest single structure

Comparison

Approach1 GB File Memory10 GB File Memory
Full Load~2 GB~20 GB
Streaming~50 MB~50 MB

Performance

The streaming parser is slightly slower than loading the full file but uses dramatically less memory:

File SizeFull LoadStreamingMemory Saved
100 MB0.5s0.7s~150 MB
1 GB5.0s7.0s~1.5 GB
10 GB50s70s~15 GB

Limitations

Current streaming implementation:

  • ✅ Reads file headers
  • ✅ Processes structures sequentially
  • ✅ Minimal memory usage
  • ⚠️ Element parsing is simplified (structure-level only)
  • ⚠️ Cannot seek backward (forward-only)
  • ⚠️ GDSII only (OASIS streaming coming in v0.2.0)

When to Use Streaming

Use streaming when:

  • Working with files > 500 MB
  • Memory is limited
  • Only need specific information
  • Processing files sequentially

Use full loading when:

  • Files < 100 MB
  • Need random access to structures
  • Modifying the file
  • Performance is critical and memory is available

Future Enhancements

Planned for v0.2.0:

  • Full element parsing in streaming mode
  • OASIS streaming support
  • Parallel structure processing
  • Seeking/indexing support