Document Understanding
Machine Learning Engineer | Document Understanding | PhD
I build reliable machine learning systems for documents, handwriting, historical records, and other high-variance data that standard automation handles poorly.
Profile
My background combines PhD research in document understanding, enterprise machine learning experience at Ancestry, and hands-on engineering across Python, PyTorch, data pipelines, and web systems. I am strongest where model behavior, data quality, and business constraints all matter.
Professional Focus
OCR, handwriting recognition, layout analysis, classification, extraction, and evaluation for records that are noisy, old, scanned, or inconsistent.
Training workflows, inference services, model diagnostics, data curation, and pragmatic quality gates for systems that need to keep working.
Turning papers, prototypes, and ambiguous technical requirements into usable software without losing the assumptions that make the work valid.
Selected Work
Dissertation work focused on invariance, variance, and robustness in historical document understanding.
Offline handwriting recognition implementation using neural sequence modeling and CTC-style training.
Research on combining semantic segmentation masks and embeddings for census-style form classification.
Consulting
I also work with teams that need focused help integrating machine learning, LLM workflows, automation, and document intelligence into messy real-world operations.