MultimodalVA
MultimodalVA is a Python package I developed for cause-of-death classification using verbal autopsy data. The package supports text-only, tabular-only, and ensemble workflows so researchers can work with narrative responses, structured questionnaire data, or both together in a unified modeling pipeline.
The package was built to make verbal autopsy modeling more reproducible and more extensible. It includes end-to-end training and prediction workflows, hyperparameter optimization support, evaluation tools, and multiple ensemble strategies for combining modalities.
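To make "combining modalities" concrete, here is a minimal sketch of one common ensemble strategy, weighted soft voting over class probabilities. It is illustrative only and does not show MultimodalVA's actual API: the function names `soft_vote` and `predict_cause`, the `text_weight` parameter, and the cause labels are all assumptions made for this example.

```python
from typing import List, Sequence

# Hypothetical cause-of-death labels, for illustration only.
CAUSES = ["cardiovascular", "infectious", "injury"]


def soft_vote(
    text_probs: Sequence[float],
    tabular_probs: Sequence[float],
    text_weight: float = 0.5,
) -> List[float]:
    """Fuse two per-class probability vectors by weighted averaging.

    text_weight is the share given to the text model; the tabular
    model receives the remaining (1 - text_weight).
    """
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must be between 0 and 1")
    if len(text_probs) != len(tabular_probs):
        raise ValueError("both models must score the same set of classes")

    fused = [
        text_weight * t + (1.0 - text_weight) * s
        for t, s in zip(text_probs, tabular_probs)
    ]
    # Renormalize in case the inputs did not sum exactly to 1.
    total = sum(fused)
    return [p / total for p in fused]


def predict_cause(fused_probs: Sequence[float], causes: Sequence[str]) -> str:
    """Return the label with the highest fused probability."""
    best = max(range(len(fused_probs)), key=lambda i: fused_probs[i])
    return causes[best]


# Example: the narrative model and the questionnaire model disagree.
text_probs = [0.6, 0.3, 0.1]      # assumed output of a text classifier
tabular_probs = [0.2, 0.7, 0.1]   # assumed output of a tabular classifier

fused = soft_vote(text_probs, tabular_probs, text_weight=0.25)
prediction = predict_cause(fused, CAUSES)
print(prediction)
```

In a setup like this, the weight would normally be tuned on held-out data rather than fixed by hand, which is one place hyperparameter optimization fits naturally.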
Package scope
- Text classification for verbal autopsy narratives
- Tabular classification for structured symptom-response data
- Ensemble and multimodal pipelines for combining text and tabular information
- Shared utilities for splitting data, training models, scoring predictions, and visualizing results
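As a rough illustration of what shared splitting and scoring utilities can look like, the sketch below implements a stratified train/test split and a macro-averaged F1 score in plain Python. The helper names `stratified_split` and `macro_f1`, their parameters, and the toy labels are hypothetical and are not taken from the package.

```python
import random
from collections import defaultdict
from typing import Hashable, List, Sequence, Tuple


def stratified_split(
    labels: Sequence[Hashable],
    test_fraction: float = 0.2,
    seed: int = 0,
) -> Tuple[List[int], List[int]]:
    """Split record indices so each cause keeps roughly the same share in both sets."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    train, test = [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        # Hold out at least one record per cause when the cause has more than one.
        n_test = max(1, round(len(indices) * test_fraction)) if len(indices) > 1 else 0
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1, so rare causes count as much as common ones."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with an imbalanced label distribution.
labels = ["A"] * 10 + ["B"] * 5
train_idx, test_idx = stratified_split(labels, test_fraction=0.2, seed=42)
print(len(train_idx), len(test_idx))

score = macro_f1(["A", "A", "B"], ["A", "B", "B"])
print(round(score, 3))
```

Stratifying by cause matters here because cause-of-death distributions are typically imbalanced, and a purely random split can leave rare causes out of the test set entirely.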
Why it matters
Verbal autopsy data often contain both structured symptom questions and free-text narratives. MultimodalVA is designed to bring those two information sources together in a practical research workflow, making it easier to experiment with models that use both the narrative and the questionnaire responses instead of relying on only one of them.
Current focus
The package continues to support my broader work on AI for global health and verbal autopsy analysis, including model development, benchmarking, and more accessible tooling for research collaborators.
