MultimodalVA

Published: January 01, 2025

MultimodalVA is a Python package I developed for cause-of-death classification using verbal autopsy data. The package supports text-only, tabular-only, and ensemble workflows so researchers can work with narrative responses, structured questionnaire data, or both together in a unified modeling pipeline.

The package was built to make verbal autopsy modeling more reproducible and more extensible. It includes end-to-end training and prediction workflows, hyperparameter optimization support, evaluation tools, and multiple ensemble strategies for combining modalities.

Package scope

Text classification for verbal autopsy narratives
Tabular classification for structured symptom-response data
Ensemble and multimodal pipelines for combining text and tabular information
Shared utilities for splitting data, training models, scoring predictions, and visualizing results
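
To make the idea of combining modalities concrete, here is a minimal sketch of one common ensemble strategy, late fusion by weighted averaging of class probabilities. It is not MultimodalVA's actual API: the function names `late_fusion` and `predict_cause`, the `text_weight` parameter, and the example cause labels and probabilities are all hypothetical and chosen only for illustration.

```python
from typing import Dict


def late_fusion(
    text_probs: Dict[str, float],
    tabular_probs: Dict[str, float],
    text_weight: float = 0.5,
) -> Dict[str, float]:
    """Weighted average of per-cause probabilities from two models.

    Hypothetical helper for illustration; not part of MultimodalVA.
    """
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must be between 0 and 1")
    if set(text_probs) != set(tabular_probs):
        raise ValueError("both models must score the same set of causes")

    fused = {
        cause: text_weight * text_probs[cause]
        + (1.0 - text_weight) * tabular_probs[cause]
        for cause in text_probs
    }
    # Renormalize so the fused scores sum to 1 even if the inputs do not.
    total = sum(fused.values())
    if total <= 0.0:
        raise ValueError("fused scores must have a positive sum")
    return {cause: score / total for cause, score in fused.items()}


def predict_cause(fused_probs: Dict[str, float]) -> str:
    """Return the cause with the highest fused probability."""
    return max(fused_probs, key=fused_probs.get)


# Made-up probabilities for a single death record (illustrative only).
text_probs = {"cardiovascular": 0.6, "respiratory": 0.3, "injury": 0.1}
tabular_probs = {"cardiovascular": 0.2, "respiratory": 0.7, "injury": 0.1}

equal_weight = late_fusion(text_probs, tabular_probs, text_weight=0.5)
text_heavy = late_fusion(text_probs, tabular_probs, text_weight=0.9)

print(predict_cause(equal_weight))
print(predict_cause(text_heavy))
```

In this sketch the weight controls how much the narrative model is trusted relative to the questionnaire model. In practice it would typically be tuned on held-out data rather than fixed by hand.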

Why it matters

Verbal autopsy data often combines structured symptom questions with free-text narratives. MultimodalVA brings those two information sources together in a practical research workflow, making it easier to experiment with models that draw on both the questionnaire responses and the narrative rather than on one alone.

Current focus

The package continues to support my broader work on AI for global health and verbal autopsy analysis, including model development, benchmarking, and more accessible tooling for research collaborators.

Yue Chu
