Metadata & Data Documentation

Metadata
Data Documentation
Variable Registry
Mplus
CDISC-style Naming
Longitudinal Data
Reproducibility
Obstetric HIV
Severe Maternal Outcomes
Documents the PhD data architecture by defining the core variable registry, source tables, join logic, Mplus-ready four-character aliases, and stream-specific use of data for reproducible obstetric HIV analysis.
Authors

Marothi Peter Letsoalo

Danielle Jade Roberts

Nonhlanhla Yende-Zuma

Date Updated

March 25, 2026

Keywords

PhDDesk

Focus

Structured metadata and data documentation for reproducible obstetric HIV phenotype modeling, longitudinal inference, and dynamic prediction.

Overview

This documentation stream formalizes the data architecture that supports the PhD analytical program after the Proposal Abstract stage. The objective is to move from conceptual aims to an explicit variable registry, join-key strategy, and analytical-use map so that each downstream model is tied to a documented data object rather than an informal extraction workflow.

The current metadata build is organized around six linked assets:

  1. The source eCRF data dictionary for the pregnancy cohort.
  2. A complete variable shortlist spanning social, clinical, treatment, and outcome domains.
  3. A core analytical shortlist restricted to variables with direct methodological use across the PhD aims.
  4. An aim-linked use map that records why each retained variable is needed and where it enters the statistical pipeline.
  5. A synthetic ADaM contract that fixes the prototype analysis datasets and model-input files before live data are linked.
  6. A shortlist-to-ADaM derivation map that shows how the prototype social-risk indicators are constructed from retained baseline source fields.

The core registry has been mapped to the three main data-analytic streams that follow the methodological audit. A variable does not have a single static role across the dissertation: some variables act as baseline confounders in one stream, time-varying predictors in another, and endpoint-support variables in the translational prediction stage. The prototype data layer also records the origin of each derived analysis variable, rather than allowing it to appear only inside downstream model code.
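
One way to make stream-specific roles explicit is a lookup keyed by source field and analysis dataset. The sketch below is illustrative only: the `ROLE_MAP` structure and the `roles_for` helper are assumptions, not part of the project code, and the four entries are copied from the Synthetic ADaM Contract table on this page.

```python
# Hypothetical role lookup: (source field, ADaM dataset) -> (analysis name, role).
# The entries mirror rows of the Synthetic ADaM Contract; the structure is an assumption.
ROLE_MAP = {
    ("cd4_count", "ADSL"): ("CD4_BASE", "baseline_covariate"),
    ("cd4_count", "ADLB"): ("CD4_CNT", "biomarker"),
    ("bmi", "ADSL"): ("BASE_BMI", "baseline_covariate"),
    ("bmi", "ADLB"): ("BMI_ANT", "biomarker"),
}


def roles_for(source_field):
    """Return every documented (dataset, analysis name, role) for one source field."""
    return sorted(
        (dataset, name, role)
        for (field, dataset), (name, role) in ROLE_MAP.items()
        if field == source_field
    )


for dataset, name, role in roles_for("cd4_count"):
    print(dataset, name, role)
```

Keying on the pair rather than on the source field alone keeps the baseline and longitudinal uses of the same measurement from overwriting each other.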

Documentation Design

The metadata layer is being built around four principles:

  1. Variable provenance must be explicit. Every retained field should be traceable to its source table, join key, and intended analytical role.
  2. Aim-specific use must be documented. The registry should show whether a variable is used for phenotype construction, confounding control, trajectory modeling, endpoint definition, or dynamic prediction support.
  3. Missingness and timing cannot be treated as afterthoughts. Because the cohort contains irregular visit schedules and partial biomarker capture, the documentation needs to record missing-data burden and temporal anchors as part of the analysis design.
  4. Derived variables must be traceable to retained source fields. The prototype ADaM layer should expose the derivation logic for social-risk indicators and other constructed variables instead of hiding them inside analysis scripts.
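
The provenance principles above can be checked mechanically once each registry row is held as a structured record. The following is a minimal sketch, assuming a hypothetical `RegistryEntry` record and `provenance_gaps` helper; the field names are illustrative and are not the project's actual registry schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegistryEntry:
    """One retained variable. Field names are illustrative assumptions."""
    source_table: str
    variable: str
    join_keys: tuple
    role: str
    derived_from: tuple = ()  # source fields; expected for derived variables only


def provenance_gaps(entry):
    """List the provenance elements that are missing from an entry."""
    gaps = []
    if not entry.source_table:
        gaps.append("source_table")
    if not entry.join_keys:
        gaps.append("join_keys")
    if not entry.role:
        gaps.append("role")
    # A derived indicator must name the retained source fields it is built from.
    if entry.role == "derived_social_indicator" and not entry.derived_from:
        gaps.append("derived_from")
    return gaps


complete = RegistryEntry("HIVM", "cd4_count", ("usubjid", "visitnum"), "biomarker")
incomplete = RegistryEntry("ADSL", "HOUS", ("usubjid",), "derived_social_indicator")

print(provenance_gaps(complete))
print(provenance_gaps(incomplete))
```

A check of this kind could run whenever the registry files are regenerated, so that an undocumented derived variable is caught before it reaches a model script.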

Mplus Naming Layer

The metadata build now includes a provisional Mplus naming layer for core variables. These aliases are short and syntax-friendly, which makes them easier to manage in model scripts, while retaining a CDISC-style domain logic.

The current naming proposal follows four rules:

  1. All analysis names are uppercase.
  2. Each suggested alias is limited to a maximum of 4 characters.
  3. The first two characters identify the source-domain family.
  4. The last two characters summarize the variable concept and remain provisional until the analysis dataset structure is finalized.
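
The first three naming rules are simple enough to validate automatically. The sketch below assumes a hypothetical `DOMAIN_PREFIX` table (its four entries follow the prefixes used in the observed variable maps) and two hypothetical helpers; it does not check the provisional suffix rule.

```python
# Source table -> two-character prefix. Only a few domains are listed here;
# the mapping is an illustrative assumption based on the aliases on this page.
DOMAIN_PREFIX = {"SOCDEM": "SD", "HIVM": "HI", "VIT": "VT", "ARV": "AR"}


def alias_problems(domain, alias):
    """Return the naming rules that an alias violates (empty list if none)."""
    problems = []
    if alias != alias.upper():
        problems.append("not uppercase")
    if len(alias) > 4:
        problems.append("longer than 4 characters")
    prefix = DOMAIN_PREFIX.get(domain)
    if prefix is not None and not alias.startswith(prefix):
        problems.append(f"prefix is not {prefix}")
    return problems


def duplicate_aliases(aliases):
    """Return aliases that occur more than once, sorted alphabetically."""
    seen, dupes = set(), set()
    for alias in aliases:
        if alias in seen:
            dupes.add(alias)
        seen.add(alias)
    return sorted(dupes)


print(alias_problems("SOCDEM", "SDAG"))
print(alias_problems("HIVM", "cd4count"))
print(duplicate_aliases(["SDAG", "HIVL", "SDAG"]))
```

A uniqueness check is worth running alongside the rule check, because four-character names leave little room and collisions become more likely as the registry grows.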

Current Documentation Scope

| Documentation asset | Current role |
|---|---|
| db_ecrf_data_dictionary.csv | Source-level variable definitions and field naming reference. |
| maternal_health_variable_shortlist.csv | Broad candidate registry for maternal-health analysis planning. |
| maternal_health_variable_shortlist_core.csv | Core analytical variable set for the PhD program. |
| maternal_health_core_use_by_objective.qmd | Aim-linked documentation of variable use and methodological justification. |

Analytical Coverage

| Research stream | Core variables | Scope |
|---|---|---|
| Social Risk Phenotypes and the Maternal Burden of Obstetric HIV: A Causal Mixture Modeling Approach | 27 | Baseline social structure, reproductive history, STI context, and treatment covariates used for latent social-risk phenotyping and causal mixture modeling. |
| Multivariate Maternal Risk Phenotyping: Handling Time-Unstructuredness in Growth Mixture Models | 17 | Physiological and HIV-related markers used to characterize multivariate maternal risk trajectories under irregular follow-up. |
| Multidimensional Latent Burden and Translational Stratification: A Dynamic Prediction Framework | 62 | Outcome, timing, biomarker, follow-up, and support variables used for severe maternal outcome definition and dynamic prediction. |

Preliminary Mplus Alias Scheme

| Item | Current rule |
|---|---|
| Naming style | CDISC-style uppercase alias suggestion |
| Maximum length | 4 characters |
| Structure | 2-character source-domain prefix + 2-character analysis suffix |
| Intended use | Mplus-ready short names for modeling datasets and syntax files |
| Status | Provisional naming layer to be refined with the evolving analysis data model |


Synthetic ADaM Contract

| Dataset | Variable | Label | Role | Source anchor | Notes |
|---|---|---|---|---|---|
| ADLB | BMI_ANT | Antenatal BMI | biomarker | bmi | Longitudinal anthropometry. |
| ADLB | CD4_CNT | CD4 count | biomarker | cd4_count | Longitudinal immune marker. |
| ADLB | DAYS_SINCE_ANC1 | Days since first ANC | time_var | visit chronology | Clinical timescale. |
| ADLB | GEST_DAYS_CORRECTED | Corrected gestational age (days) | time_var | edddt / last_menstrual_period | Gold-standard gestational timescale. |
| ADLB | HGB_LVL | Hemoglobin | biomarker | haemoglobin | Longitudinal hematology marker. |
| ADLB | LOG_VL | Log viral load | biomarker | viral_load | Longitudinal HIV biomarker. |
| ADSL | BASE_AGE | Maternal age at enrollment | baseline_covariate | age | Baseline demographic covariate. |
| ADSL | BASE_BMI | Baseline BMI | baseline_covariate | bmi | Baseline physiologic covariate. |
| ADSL | CD4_BASE | Baseline CD4 count | baseline_covariate | cd4_count | Baseline HIV disease activity marker. |
| ADSL | ECON | Economic strain indicator | derived_social_indicator | highest_education + number_of_rooms + work_status | Derived from shortlisted fields: highest_education, number_of_rooms, work_status. |
| ADSL | ECTV | Time-varying material strain signal | derived_social_indicator | use_family_planning | Derived from shortlisted fields: use_family_planning. |
| ADSL | EST_GA_AT_ANC1 | Estimated gestational age at first ANC (days) | timing_anchor | est_gestational_age_wks | Clinical timeline anchor. |
| ADSL | GRAVIDITY | Prior pregnancies | baseline_covariate | total_number_pregnancies | Baseline reproductive history. |
| ADSL | HOUS | Housing constraint indicator | derived_social_indicator | kind_of_toilet + number_of_rooms + people_sleep_in_room + tap_water | Derived from shortlisted fields: kind_of_toilet, number_of_rooms, people_sleep_in_room, tap_water. |
| ADSL | MOM_ID | Analysis mother identifier | subject_key | usubjid | Anchored to maternal_health_variable_shortlist_core.csv |
| ADSL | PART | Partner-context risk indicator | derived_social_indicator | marital_status + partner_hiv_status + partner_other_sexual_relations | Derived from shortlisted fields: marital_status, partner_hiv_status, partner_other_sexual_relations. |
| ADSL | PRTV | Time-varying partner-context signal | derived_social_indicator | partner_other_sexual_relations | Derived from shortlisted fields: partner_other_sexual_relations. |
| ADSL | SITE_COMPLEX | Facility complexity | design_covariate | siteid/sitenm | Ordinal site complexity proxy. |
| ADSL | SITE_COUNTRY | Country | design_covariate | country | Site-level context. |
| ADSL | SITE_ID | Site identifier | design_covariate | siteid | Primary clustering variable for multilevel mixture models. |
| ADSL | SITE_VL_SUPP | Site-level viral suppression rate | design_covariate | siteid | Continuous contextual covariate. |
| ADSL | SOC_RISK_CLASS | Synthetic latent social risk class | simulation_latent_truth | Derived | Used only for synthetic data generation and fallback prediction. |
| ADSL | TRNS | Transport barrier indicator | derived_social_indicator | travel_time_to_clinic | Derived from shortlisted fields: travel_time_to_clinic. |
| ADSL_RAW | highest_education | Your highest level of education | shortlist_source_field | highest_education | Shortlisted baseline source retained in the synthetic prototype (SOCDEM). |
| ADSL_RAW | kind_of_toilet | What kind of toilet do you have | shortlist_source_field | kind_of_toilet | Shortlisted baseline source retained in the synthetic prototype (SOCDEM). |
| ADSL_RAW | marital_status | Marital status | shortlist_source_field | marital_status | Shortlisted baseline source retained in the synthetic prototype (SOCDEM). |
| ADSL_RAW | number_of_rooms | Number of rooms house has | shortlist_source_field | number_of_rooms | Shortlisted baseline source retained in the synthetic prototype (SOCDEM). |
| ADSL_RAW | partner_hiv_status | What is your partner hiv status | shortlist_source_field | partner_hiv_status | Shortlisted baseline source retained in the synthetic prototype (SOCDEM). |
| ADSL_RAW | partner_other_sexual_relations | Partner has other sexual relations | shortlist_source_field | partner_other_sexual_relations | Shortlisted baseline source retained in the synthetic prototype (SRHH). |
| ADSL_RAW | people_sleep_in_room | How many people sleep in your room | shortlist_source_field | people_sleep_in_room | Shortlisted baseline source retained in the synthetic prototype (SOCDEM). |
| ADSL_RAW | tap_water | Tap water in the premises | shortlist_source_field | tap_water | Shortlisted baseline source retained in the synthetic prototype (SOCDEM). |
| ADSL_RAW | travel_time_to_clinic | Travel time from home to clinic | shortlist_source_field | travel_time_to_clinic | Shortlisted baseline source retained in the synthetic prototype (SOCDEM). |
| ADSL_RAW | use_family_planning | Did you use family planning methods | shortlist_source_field | use_family_planning | Shortlisted baseline source retained in the synthetic prototype (SRHH). |
| ADSL_RAW | work_status | Do you work | shortlist_source_field | work_status | Shortlisted baseline source retained in the synthetic prototype (SOCDEM). |
| ADTTE | PTB_STATUS | Preterm birth status | secondary_outcome | delivery outcomes | Synthetic secondary outcome. |
| ADTTE | SGA_STATUS | Small-for-gestational-age status | secondary_outcome | delivery outcomes | Synthetic secondary outcome. |
| ADTTE | SMO_DAYS | Days to severe maternal outcome or censoring | time_to_event | delivery/adverse event timing | Event timescale for Aim 4. |
| ADTTE | SMO_EVENT | Severe maternal outcome flag | time_to_event | pregnancy_outcome / adverse events | Primary dynamic prediction event indicator. |
| MPLUS_AIM2 | aim2_social_mplus.dat | Wide categorical social indicator export | model_input | ADSL | Written from the centralized synthetic ADaM layer. |
| MPLUS_AIM3 | aim3_logvl_wide.dat | Wide TSCORES export for LOG_VL | model_input | ADLB | Written from the centralized synthetic ADaM layer. |

Current PME prototype data contract generated from the centralized synthetic ADaM builder.
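
The wide TSCORES export listed in the contract pairs each repeated measurement with its own observation time. Mplus reads free-format data files without a header row, so the column order must match the NAMES list in the input file. The sketch below is a minimal illustration under stated assumptions: the `to_wide` and `format_rows` helpers, the `-999` missing code, and the use of days since each mother's first recorded visit as a stand-in for DAYS_SINCE_ANC1 are all hypothetical, and the project's own coding of missing time scores may differ.

```python
from datetime import date

MISSING = -999  # assumed numeric missing code; declare the same code in the Mplus input


def to_wide(records, n_occasions):
    """Pivot long (mom_id, visit_date, log_vl) records into one wide row per mother.

    Each row is [MOM_ID, Y1..Yk, T1..Tk], where T is the number of days since
    that mother's first recorded visit.
    """
    by_mother = {}
    for mom_id, visit_date, log_vl in records:
        by_mother.setdefault(mom_id, []).append((visit_date, log_vl))

    rows = []
    for mom_id in sorted(by_mother):
        # Order visits by date and keep at most n_occasions of them.
        visits = sorted(by_mother[mom_id], key=lambda v: v[0])[:n_occasions]
        first_visit = visits[0][0]
        values = [MISSING if v is None else v for _, v in visits]
        times = [(d - first_visit).days for d, _ in visits]
        pad = [MISSING] * (n_occasions - len(visits))
        rows.append([mom_id] + values + pad + times + pad)
    return rows


def format_rows(rows):
    """Render rows as space-delimited text with no header line."""
    return "\n".join(" ".join(str(x) for x in row) for row in rows)


records = [
    (2, date(2024, 1, 10), 4.2),
    (1, date(2024, 1, 1), 3.0),
    (1, date(2024, 1, 15), 2.5),
]
print(format_rows(to_wide(records, 2)))
```

Keeping the observation times in the same row as the measurements lets irregular visit schedules be modeled directly instead of forcing visits into fixed occasions.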

Shortlist to ADaM Social-Derivation Map

| Derived indicator | Source variable | Source label | Source table | Aim scope | Derivation logic | Role |
|---|---|---|---|---|---|---|
| ECON - Economic strain indicator | highest_education | Your highest level of education | SOCDEM | aim_2;aim_4 | Score lower educational attainment as structural economic vulnerability. | Latent social phenotype indicator |
| ECON - Economic strain indicator | number_of_rooms | Number of rooms house has | SOCDEM | aim_2;aim_4 | Score very low room count as constrained household resources. | Latent social phenotype indicator |
| ECON - Economic strain indicator | work_status | Do you work | SOCDEM | aim_2;aim_4 | Score unemployment/household work as material strain. | Latent social phenotype indicator |
| ECTV - Time-varying material strain signal | use_family_planning | Did you use family planning methods | SRHH | aim_2;aim_4 | Use family-planning non-use as a proxy support signal layered onto baseline material strain. | Time-varying support flag |
| HOUS - Housing constraint indicator | kind_of_toilet | What kind of toilet do you have | SOCDEM | aim_2;aim_4 | Flag lower-quality sanitation as household infrastructure strain. | Latent social phenotype indicator |
| HOUS - Housing constraint indicator | number_of_rooms | Number of rooms house has | SOCDEM | aim_2;aim_4 | Combine low room count with crowding to define housing constraint. | Latent social phenotype indicator |
| HOUS - Housing constraint indicator | people_sleep_in_room | How many people sleep in your room | SOCDEM | aim_2;aim_4 | Combine sleeping density with rooms to define crowding burden. | Latent social phenotype indicator |
| HOUS - Housing constraint indicator | tap_water | Tap water in the premises | SOCDEM | aim_2;aim_4 | Flag absent on-premises tap water as household infrastructure strain. | Latent social phenotype indicator |
| PART - Partner-context risk indicator | marital_status | Marital status | SOCDEM | aim_2;aim_4 | Score unstable partner context as relational vulnerability. | Latent social phenotype indicator |
| PART - Partner-context risk indicator | partner_hiv_status | What is your partner hiv status | SOCDEM | aim_2;aim_4 | Score partner HIV-positive status as partner-context vulnerability. | Latent social phenotype indicator |
| PART - Partner-context risk indicator | partner_other_sexual_relations | Partner has other sexual relations | SRHH | aim_2;aim_4 | Score known partner concurrency as partner-context vulnerability. | Latent social phenotype indicator |
| PRTV - Time-varying partner-context signal | partner_other_sexual_relations | Partner has other sexual relations | SRHH | aim_2;aim_4 | Carry partner-context instability into a time-varying support flag. | Time-varying support flag |
| TRNS - Transport barrier indicator | travel_time_to_clinic | Travel time from home to clinic | SOCDEM | aim_2;aim_4 | Flag long clinic travel times as access-to-care barriers. | Latent social phenotype indicator |

The social-risk indicators used in the PME prototype are now derived explicitly from shortlisted baseline fields rather than simulated directly in the aim scripts.
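
As an illustration of what an explicit derivation could look like, the sketch below builds a binary HOUS flag from its four source fields, following the derivation logic in the map. The category coding, both thresholds, and the rule that any single flagged component sets the indicator are hypothetical placeholders; the page does not state the cut-offs or the combination rule used in the prototype.

```python
IMPROVED_TOILETS = {"flush"}  # hypothetical category coding
LOW_ROOM_COUNT = 1            # hypothetical threshold (rooms in the house)
CROWDED_SLEEPING = 3          # hypothetical threshold (people per sleeping room)


def housing_constraint(kind_of_toilet, number_of_rooms, people_sleep_in_room, tap_water):
    """Return 1 if any housing-strain component is flagged, 0 if none is.

    Missing inputs (None) are skipped; the result is None when no component
    can be evaluated at all.
    """
    flags = []
    if kind_of_toilet is not None:
        flags.append(kind_of_toilet not in IMPROVED_TOILETS)
    if number_of_rooms is not None and people_sleep_in_room is not None:
        # Crowding combines a low room count with high sleeping density.
        flags.append(
            number_of_rooms <= LOW_ROOM_COUNT
            and people_sleep_in_room >= CROWDED_SLEEPING
        )
    if tap_water is not None:
        flags.append(not tap_water)
    if not flags:
        return None
    return int(any(flags))


print(housing_constraint("flush", 4, 2, True))
print(housing_constraint("pit", 4, 2, True))
print(housing_constraint("flush", 1, 4, True))
```

Holding the thresholds as named constants keeps them visible in the documentation layer, so a change to a cut-off is a reviewed metadata change rather than an edit buried in an aim script.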

Derived Analytical Objects

| Object group | Derived object | Analytical role | Mplus name | Purpose |
|---|---|---|---|---|
| Social risk phenotyping | Social risk phenotype class | Latent baseline social-vulnerability phenotype | XSRP | Represents the derived latent social-risk grouping used in causal mixture analyses for obstetric HIV. |
| Social risk phenotyping | Posterior social-risk probability | Class-membership uncertainty summary for social-risk models | PSRP | Retains posterior uncertainty so social-risk class assignment is not treated as fixed in downstream estimation. |
| Maternal burden phenotyping | Maternal burden phenotype class | Latent longitudinal physiological-burden phenotype | XMBP | Represents the derived multivariate maternal-risk trajectory grouping under irregular longitudinal follow-up. |
| Outcome endpoint | Severe maternal outcome endpoint | Binary downstream maternal endpoint | YSMO | Defines the derived severe maternal outcome target used for modeling and translational interpretation. |
| Prediction objects | Dynamic landmark risk estimate | Updated individualized risk at clinical prediction landmarks | RDLM | Stores updated patient-level risk predictions generated from the joint longitudinal-event modeling framework. |
| Uncertainty objects | BCH / posterior uncertainty weights | Uncertainty control for latent-class and mixture-model estimation | WBCH | Carries BCH-style or posterior-probability uncertainty adjustments into distal outcome and prediction models. |

Observed Variable Maps

Social Risk Phenotypes and the Maternal Burden of Obstetric HIV: A Causal Mixture Modeling Approach

| Domain group | Source | Original variable | Mplus name | Label | Role in analysis | Join keys | Justification |
|---|---|---|---|---|---|---|---|
| ARV Treatment | ARV | drg1 | ARD1 | Drug 1 | treatment_exposure | usubjid\|dsta | Primary regimen component for therapy exposure grouping. |
| ARV Treatment | ARV | dsta | ARDS | Date started | treatment_exposure | usubjid\|dsta | Treatment exposure start needed for ART-time alignment. |
| ARV Treatment | ARV | dsto | ARDE | Date stopped | treatment_exposure | usubjid\|dsta | Treatment exposure stop needed for treatment dynamics and censoring. |
| Clinical STI | SRHH | abnormal_vaginal_discharge | SRAV | Last 3 months-abnormal vag discharge | clinical_covariate | usubjid\|visitnum | Clinical STI symptom indicator; model separately from social exposure constructs. |
| Clinical STI | SRHH | genital_herpes | SRGH | In the last 3 months-genital herpes | clinical_covariate | usubjid\|visitnum | Clinical STI diagnosis indicator; model separately from social exposure constructs. |
| Clinical STI | SRHH | syphilis | SRSY | In the last 3 months-syphilis | clinical_covariate | usubjid\|visitnum | Clinical STI diagnosis indicator; model separately from social exposure constructs. |
| Obstetric History | OBSH | number_abortions | OBAB | Number of abortions/miscarriages | baseline_covariate | usubjid\|visitnum | Baseline reproductive history risk profile. |
| Obstetric History | OBSH | number_premature_births | OBPB | Premature births | baseline_covariate | usubjid\|visitnum | Baseline reproductive history risk profile. |
| Obstetric History | OBSH | number_stillbirths | OBST | Number stillbirths >20 weeks | baseline_covariate | usubjid\|visitnum | Baseline reproductive history risk profile. |
| Obstetric History | OBSH | outcome_last_pregnancy | OBOP | Outcome of the last pregnancy | baseline_covariate | usubjid\|visitnum | Baseline reproductive history risk profile. |
| Obstetric History | OBSH | total_number_pregnancies | OBTP | Total number of pregnancies | baseline_covariate | usubjid\|visitnum | Baseline reproductive history risk profile. |
| Sexual Reproductive | SRHH | currently_pregnant | SRCP | Currently pregnant | covariate | usubjid\|visitnum | Sexual/reproductive and STI vulnerability profile. |
| Site Context | CNTRYST | country | CNCO | Country | design_covariate | siteid | Country-level context for health-system and policy heterogeneity adjustment. |
| Site Context | CNTRYST | siteid | CNSI | Site ID | design_covariate | siteid | Primary site-level clustering and stratification covariate across aims. |
| Site Context | CNTRYST | sitenm | CNS1 | Site Name | design_covariate | siteid | Readable site label for outputs and quality control checks. |
| Social | SRHH | partner_other_sexual_relations | SRPO | Partner has other sexual relations | covariate | usubjid\|visitnum | Social relationship-risk context indicator; keep in social exposure domain. |
| Social | SRHH | use_family_planning | SRUF | Did you use family planning methods | covariate | usubjid\|visitnum | Behavior/access indicator in reproductive social context; model with social determinants. |
| Social Baseline | SOCDEM | age | SDAG | Age | covariate | usubjid\|visitnum | Baseline demographic confounder for all models. |
| Social Baseline | SOCDEM | highest_education | SDHE | Your highest level of education | covariate | usubjid\|visitnum | Socioeconomic confounder and social phenotype component. |
| Social Baseline | SOCDEM | kind_of_toilet | SDTO | What kind of toilet do you have | covariate | usubjid\|visitnum | Baseline social determinants and access profile. |
| Social Baseline | SOCDEM | marital_status | SDMS | Marital status | covariate | usubjid\|visitnum | Baseline social determinants and access profile. |
| Social Baseline | SOCDEM | number_of_rooms | SDRO | Number of rooms house has | covariate | usubjid\|visitnum | Baseline social determinants and access profile. |
| Social Baseline | SOCDEM | partner_hiv_status | SDPH | What is your partner hiv status | covariate | usubjid\|visitnum | Baseline social determinants and access profile. |
| Social Baseline | SOCDEM | people_sleep_in_room | SDPS | How many people sleep in your room | covariate | usubjid\|visitnum | Baseline social determinants and access profile. |
| Social Baseline | SOCDEM | tap_water | SDTW | Tap water in the premises | covariate | usubjid\|visitnum | Baseline social determinants and access profile. |
| Social Baseline | SOCDEM | travel_time_to_clinic | SDTT | Travel time from home to clinic | covariate | usubjid\|visitnum | Access-to-care proxy affecting both exposure and outcomes. |
| Social Baseline | SOCDEM | work_status | SDWS | Do you work | covariate | usubjid\|visitnum | Economic vulnerability indicator for social phenotype classes. |

Baseline social structure, reproductive history, STI context, and treatment covariates used for latent social-risk phenotyping and causal mixture modeling.
Multivariate Maternal Risk Phenotyping: Handling Time-Unstructuredness in Growth Mixture Models

| Domain group | Source | Original variable | Mplus name | Label | Role in analysis | Join keys | Justification |
|---|---|---|---|---|---|---|---|
| ARV Treatment | ARV | drg1 | ARD1 | Drug 1 | treatment_exposure | usubjid\|dsta | Primary regimen component for therapy exposure grouping. |
| ARV Treatment | ARV | dsta | ARDS | Date started | treatment_exposure | usubjid\|dsta | Treatment exposure start needed for ART-time alignment. |
| ARV Treatment | ARV | dsto | ARDE | Date stopped | treatment_exposure | usubjid\|dsta | Treatment exposure stop needed for treatment dynamics and censoring. |
| Biochemistry | LFT | creatinine1 | LFCR | Creatinine | biomarker | usubjid\|visitnum | Core renal dysfunction marker relevant to severe maternal outcomes. |
| Biochemistry | LFT | total_bilirubin | LFTB | Total bilirubin | biomarker | usubjid\|visitnum | Core hepatic dysfunction marker relevant to severe maternal outcomes. Methodologically essential despite higher missingness; use missing-data strategy. |
| Hematology | FBC | haemoglobin | FBHB | Haemoglobin | biomarker | usubjid\|visitnum | Core anemia marker for maternal risk phenotyping. |
| Hematology | FBC | platelet_count | FBPL | Platelet count | biomarker | usubjid\|visitnum | Core coagulation/hematologic risk marker. |
| HIV Markers | HIVM | cd4_count | HIC4 | Cd4 count | biomarker | usubjid\|visitnum | Core immune status marker for risk adjustment and phenotyping. Methodologically essential despite higher missingness; use missing-data strategy. |
| HIV Markers | HIVM | viral_load | HIVL | Viral load copies | biomarker | usubjid\|visitnum | Core HIV disease activity marker for maternal risk stratification. Methodologically essential despite higher missingness; use missing-data strategy. |
| Site Context | CNTRYST | country | CNCO | Country | design_covariate | siteid | Country-level context for health-system and policy heterogeneity adjustment. |
| Site Context | CNTRYST | siteid | CNSI | Site ID | design_covariate | siteid | Primary site-level clustering and stratification covariate across aims. |
| Site Context | CNTRYST | sitenm | CNS1 | Site Name | design_covariate | siteid | Readable site label for outputs and quality control checks. |
| Vitals | VIT | bmi | VTBM | Bmi | biomarker | usubjid\|visitnum | Core maternal physiologic trajectory measures. |
| Vitals | VIT | diastolic_bp | VTDB | Diastolic bp | biomarker | usubjid\|visitnum | Core maternal hemodynamic marker linked to obstetric risk. |
| Vitals | VIT | systolic_bp | VTSB | Systolic bp | biomarker | usubjid\|visitnum | Core maternal hemodynamic marker linked to obstetric risk. |
| Vitals | VIT | weight | VTWT | Weight | biomarker | usubjid\|visitnum | Core maternal physiologic trajectory measures. |
| WHO Stage | WHO | who_clinical_classification | WHWC | Who clinical classification | disease_severity_covariate | usubjid\|visitnum | Global HIV clinical severity summary for baseline adjustment. |

Physiological and HIV-related markers used to characterize multivariate maternal risk trajectories under irregular follow-up.
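
Several biomarkers in this stream are retained despite higher missingness, and follow-up is irregular, so both properties are worth summarizing before modeling. The sketch below is a minimal illustration; the `missing_burden` and `visit_gaps` helpers and the example records are hypothetical and are not drawn from the cohort.

```python
def missing_burden(rows, variables):
    """Share of records with a missing (None or absent) value, per variable."""
    if not rows:
        return {v: None for v in variables}
    return {
        v: sum(1 for r in rows if r.get(v) is None) / len(rows)
        for v in variables
    }


def visit_gaps(days):
    """Gaps in days between consecutive visits for one participant."""
    ordered = sorted(days)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


# Illustrative visit-level records, not real data.
rows = [
    {"cd4_count": 350, "haemoglobin": 11.2},
    {"cd4_count": None, "haemoglobin": 10.4},
    {"haemoglobin": 12.0},
    {"cd4_count": 410, "haemoglobin": 9.8},
]
print(missing_burden(rows, ["cd4_count", "haemoglobin"]))
print(visit_gaps([0, 28, 35, 90]))
```

Recording these summaries in the registry, rather than recomputing them ad hoc, ties the choice of missing-data strategy to a documented burden for each variable.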
Multidimensional Latent Burden and Translational Stratification: A Dynamic Prediction Framework

| Domain group | Source | Original variable | Mplus name | Label | Role in analysis | Join keys | Justification |
|---|---|---|---|---|---|---|---|
| Adverse Events | AE | ae_severity | AESE | Adverse event severity | outcome_support | usubjid\|vstdt | Supports severe-event capture beyond delivery record fields. |
| Adverse Events | AE | serious_adverse_event | AESA | Is this a serious adverse event | outcome_support | usubjid\|vstdt | Captures severe intercurrent maternal morbidity events. |
| ARV Treatment | ARV | drg1 | ARD1 | Drug 1 | treatment_exposure | usubjid\|dsta | Primary regimen component for therapy exposure grouping. |
| ARV Treatment | ARV | dsta | ARDS | Date started | treatment_exposure | usubjid\|dsta | Treatment exposure start needed for ART-time alignment. |
| ARV Treatment | ARV | dsto | ARDE | Date stopped | treatment_exposure | usubjid\|dsta | Treatment exposure stop needed for treatment dynamics and censoring. |
| Biochemistry | LFT | creatinine1 | LFCR | Creatinine | biomarker | usubjid\|visitnum | Core renal dysfunction marker relevant to severe maternal outcomes. |
| Biochemistry | LFT | total_bilirubin | LFTB | Total bilirubin | biomarker | usubjid\|visitnum | Core hepatic dysfunction marker relevant to severe maternal outcomes. Methodologically essential despite higher missingness; use missing-data strategy. |
| Clinical STI | SRHH | abnormal_vaginal_discharge | SRAV | Last 3 months-abnormal vag discharge | clinical_covariate | usubjid\|visitnum | Clinical STI symptom indicator; model separately from social exposure constructs. |
| Clinical STI | SRHH | genital_herpes | SRGH | In the last 3 months-genital herpes | clinical_covariate | usubjid\|visitnum | Clinical STI diagnosis indicator; model separately from social exposure constructs. |
| Clinical STI | SRHH | syphilis | SRSY | In the last 3 months-syphilis | clinical_covariate | usubjid\|visitnum | Clinical STI diagnosis indicator; model separately from social exposure constructs. |
| Delivery Outcomes | LDR | deldt | LDDD | Delivery date | outcome | usubjid\|visitnum | Defines pregnancy and delivery endpoints. |
| Delivery Outcomes | LDR | gestational_diabetes | LDGD | Gestational diabetes | outcome | usubjid\|visitnum | Defines pregnancy and delivery endpoints. |
| Delivery Outcomes | LDR | high_blood_pressure | LDHB | High blood pressure | outcome | usubjid\|visitnum | Defines pregnancy and delivery endpoints. |
| Delivery Outcomes | LDR | obstetrics_complications | LDOC | Any obstetric complications | outcome | usubjid\|visitnum | Defines pregnancy and delivery endpoints. |
| Delivery Outcomes | LDR | pregnancy_outcome | LDPO | 2 indicate pregnancy outcome | outcome | usubjid\|visitnum | Primary maternal outcome definition for SMO trajectory and endpoint modeling. |
| Delivery Outcomes | LDR | type_delivery | LDTD | Type of delivery | outcome | usubjid\|visitnum | Defines pregnancy and delivery endpoints. |
| Delivery Outcomes | LDR | vaginal_bleeding | LDVB | Vaginal bleeding | outcome | usubjid\|visitnum | Defines pregnancy and delivery endpoints. |
| Hematology | FBC | haemoglobin | FBHB | Haemoglobin | biomarker | usubjid\|visitnum | Core anemia marker for maternal risk phenotyping. |
| Hematology | FBC | platelet_count | FBPL | Platelet count | biomarker | usubjid\|visitnum | Core coagulation/hematologic risk marker. |
| HIV Markers | HIVM | cd4_count | HIC4 | Cd4 count | biomarker | usubjid\|visitnum | Core immune status marker for risk adjustment and phenotyping. Methodologically essential despite higher missingness; use missing-data strategy. |
| HIV Markers | HIVM | viral_load | HIVL | Viral load copies | biomarker | usubjid\|visitnum | Core HIV disease activity marker for maternal risk stratification. Methodologically essential despite higher missingness; use missing-data strategy. |
| Hospital Admissions | HOSP | admdt | HOAD | Admission date | outcome_support | usubjid\|admdt | Hospitalization timing signal for acute decompensation episodes. |
| Hospital Admissions | HOSP | diag | HODG | Diagnosis | outcome_support | usubjid\|admdt | Captures acute clinical deterioration episodes. |
| Maternal Followup | MAFU | admitted_hospital | MFAH | Been admitted to hospital | time_varying_covariate | usubjid\|visitnum | Time-varying maternal clinical status and symptoms. |
| Maternal Followup | MAFU | current_health_status | MFHS | General health status | time_varying_covariate | usubjid\|visitnum | Time-varying maternal clinical status and symptoms. |
| Maternal Followup | MAFU | current_pregnancy_status | MFPS | Current pregnancy status | time_varying_covariate | usubjid\|visitnum | Time-varying maternal clinical status and symptoms. |
| Obstetric History | OBSH | number_abortions | OBAB | Number of abortions/miscarriages | baseline_covariate | usubjid\|visitnum | Baseline reproductive history risk profile. |
| Obstetric History | OBSH | number_premature_births | OBPB | Premature births | baseline_covariate | usubjid\|visitnum | Baseline reproductive history risk profile. |
| Obstetric History | OBSH | number_stillbirths | OBST | Number stillbirths >20 weeks | baseline_covariate | usubjid\|visitnum | Baseline reproductive history risk profile. |
| Obstetric History | OBSH | outcome_last_pregnancy | OBOP | Outcome of the last pregnancy | baseline_covariate | usubjid\|visitnum | Baseline reproductive history risk profile. |
| Obstetric History | OBSH | total_number_pregnancies | OBTP | Total number of pregnancies | baseline_covariate | usubjid\|visitnum | Baseline reproductive history risk profile. |
| Pregnancy Followup | ANCF | admitted_hospital | ANAH | Admitted to hospital since last visit | time_varying_covariate | usubjid\|visitnum | Antenatal follow-up status and interim clinical changes. |
| Pregnancy Followup | ANCF | change_in_arv_regimen | ANCA | Change in arv regimen since last visit | time_varying_covariate | usubjid\|visitnum | Antenatal follow-up status and interim clinical changes. |
| Pregnancy Followup | ANCF | current_pregnancy_status | ANPS | Current pregnancy status | time_varying_covariate | usubjid\|visitnum | Antenatal follow-up status and interim clinical changes. |
| Pregnancy Registration | PRF | conceive_on_arvs | PRCA | Conceive while taking arvs | pregnancy_timing_covariate | usubjid\|visitnum | Pregnancy registration and gestational timing anchors. |
| Pregnancy Registration | PRF | edddt | PRED | Expected delivery date | pregnancy_timing_covariate | usubjid\|visitnum | Pregnancy time anchor for gestational alignment and dynamic prediction windows. |
| Pregnancy Registration | PRF | est_gestational_age_wks | PREG | Estimated gestational age-weeks | pregnancy_timing_covariate | usubjid\|visitnum | Core gestational timing covariate for longitudinal models. |
| Pregnancy Registration | PRF | last_menstrual_period | PRMP | Date of last menstrual period | pregnancy_timing_covariate | usubjid\|visitnum | Pregnancy registration and gestational timing anchors. |
| Sexual Reproductive | SRHH | currently_pregnant | SRCP | Currently pregnant | covariate | usubjid\|visitnum | Sexual/reproductive and STI vulnerability profile. |
| Site Context | CNTRYST | country | CNCO | Country | design_covariate | siteid | Country-level context for health-system and policy heterogeneity adjustment. |
| Site Context | CNTRYST | siteid | CNSI | Site ID | design_covariate | siteid | Primary site-level clustering and stratification covariate across aims. |
| Site Context | CNTRYST | sitenm | CNS1 | Site Name | design_covariate | siteid | Readable site label for outputs and quality control checks. |
| Social | SRHH | partner_other_sexual_relations | SRPO | Partner has other sexual relations | covariate | usubjid\|visitnum | Social relationship-risk context indicator; keep in social exposure domain. |
| Social | SRHH | use_family_planning | SRUF | Did you use family planning methods | covariate | usubjid\|visitnum | Behavior/access indicator in reproductive social context; model with social determinants. |
| Social Baseline | SOCDEM | age | SDAG | Age | covariate | usubjid\|visitnum | Baseline demographic confounder for all models. |
| Social Baseline | SOCDEM | highest_education | SDHE | Your highest level of education | covariate | usubjid\|visitnum | Socioeconomic confounder and social phenotype component. |
| Social Baseline | SOCDEM | kind_of_toilet | SDTO | What kind of toilet do you have | covariate | usubjid\|visitnum | Baseline social determinants and access profile. |
| Social Baseline | SOCDEM | marital_status | SDMS | Marital status | covariate | usubjid\|visitnum | Baseline social determinants and access profile. |
| Social Baseline | SOCDEM | number_of_rooms | SDRO | Number of rooms house has | covariate | usubjid\|visitnum | Baseline social determinants and access profile. |
| Social Baseline | SOCDEM | partner_hiv_status | SDPH | What is your partner hiv status | covariate | usubjid\|visitnum | Baseline social determinants and access profile. |
| Social Baseline | SOCDEM | people_sleep_in_room | SDPS | How many people sleep in your room | covariate | usubjid\|visitnum | Baseline social determinants and access profile. |
| Social Baseline | SOCDEM | tap_water | SDTW | Tap water in the premises | covariate | usubjid\|visitnum | Baseline social determinants and access profile. |
| Social Baseline | SOCDEM | travel_time_to_clinic | SDTT | Travel time from home to clinic | covariate | usubjid\|visitnum | Access-to-care proxy affecting both exposure and outcomes. |
| Social Baseline | SOCDEM | work_status | SDWS | Do you work | covariate | usubjid\|visitnum | Economic vulnerability indicator for social phenotype classes. |
| Termination Death | TERM | cause_of_death | TRCD | Cause of death | outcome | usubjid\|visitnum | Captures terminal outcomes and death attribution. |
| Termination Death | TERM | dthdt | TRDT | Date of death | outcome | usubjid\|visitnum | Maternal death timing anchor for severe outcome endpoint. |
| Termination Death | TERM | reason_for_termination | TRRF | Reason for termination | outcome | usubjid\|visitnum | Captures terminal outcomes and death attribution. |
| Vitals | VIT | bmi | VTBM | Bmi | biomarker | usubjid\|visitnum | Core maternal physiologic trajectory measures. |
| Vitals | VIT | diastolic_bp | VTDB | Diastolic bp | biomarker | usubjid\|visitnum | Core maternal hemodynamic marker linked to obstetric risk. |
| Vitals | VIT | systolic_bp | VTSB | Systolic bp | biomarker | usubjid\|visitnum | Core maternal hemodynamic marker linked to obstetric risk. |
| Vitals | VIT | weight | VTWT | Weight | biomarker | usubjid\|visitnum | Core maternal physiologic trajectory measures. |
| WHO Stage | WHO | who_clinical_classification | WHWC | Who clinical classification | disease_severity_covariate | usubjid\|visitnum | Global HIV clinical severity summary for baseline adjustment. |

Outcome, timing, biomarker, follow-up, and support variables used for severe maternal outcome definition and dynamic prediction.
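
The join keys in the maps above are written as pipe-delimited strings such as `usubjid|visitnum`. A join layer could parse those strings and apply them directly, as in the sketch below. The `parse_join_keys` and `left_join` helpers and the example rows are hypothetical; a production pipeline would more likely rely on a data-frame library.

```python
def parse_join_keys(spec):
    """Split a registry join-key string such as 'usubjid|visitnum' into a tuple."""
    return tuple(spec.split("|"))


def left_join(left, right, spec):
    """Left-join two lists of dict rows on the keys named in a registry spec.

    Unmatched left rows are kept as they are; on a column-name clash the
    left-hand value wins.
    """
    keys = parse_join_keys(spec)
    index = {}
    for row in right:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)

    joined = []
    for row in left:
        matches = index.get(tuple(row[k] for k in keys), [{}])
        for match in matches:
            joined.append({**match, **row})
    return joined


# Illustrative rows, not real data.
visits = [{"usubjid": "A", "visitnum": 1}, {"usubjid": "A", "visitnum": 2}]
hivm = [{"usubjid": "A", "visitnum": 1, "cd4_count": 350}]
print(left_join(visits, hivm, "usubjid|visitnum"))
```

Reading the keys from the registry, instead of hard-coding them in each script, keeps the documented join logic and the executed join logic from drifting apart.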

Immediate Data Domains

The current metadata work is centered on the domains that directly support the proposed framework:

  • Baseline social and demographic structure for latent social risk phenotypes.
  • Obstetric and reproductive history for baseline maternal vulnerability profiling.
  • ART exposure and treatment timing for longitudinal alignment.
  • Vital signs, hematology, renal, hepatic, and HIV markers for physiological burden phenotyping.
  • Delivery, hospitalization, adverse event, and termination records for severe maternal outcome definition and support.

Next Documentation Steps

  1. Finalize a controlled registry for severe maternal outcome derivation variables and supporting event fields.
  2. Document gestational and clinical time anchors used for irregular longitudinal alignment.
  3. Add a formal join-specification layer for participant, visit, admission, and delivery records.
  4. Extend the registry with derived-variable documentation for phenotype classes, dynamic prediction inputs, and final endpoint-construction rules once the live data pipeline is linked.


ORCID: 0000-0003-2170-6312
