Publication Data Model
How publication efficacy data is stored, from abstract text through LLM extraction to the clinical evidence report.
Last updated: 2026-04-03
Entity Relationship Overview

    Publication (root)
    │
    ├── trial_arms                      Treatment groups: each arm is a group of patients who
    │   │                               received the same treatment (e.g., "8.0 mg/kg",
    │   │                               "Pembrolizumab + Chemo"). Includes an "All Arms" entry
    │   │                               for pooled results. Created by extract_interventions.
    │   │
    │   └── trial_arm_interventions     What drugs/interventions were given in this arm, at what
    │                                   dose. Each intervention has its own drug_id, dose fields,
    │                                   and intervention_role (investigational, combination,
    │                                   comparator, supportive).
    │
    ├── trial_disease_details           What diseases this publication studies: disease name,
    │   │                               stage, subtype, risk, treatment setting. Linked to the
    │   │                               diseases table via disease_id.
    │   │
    │   └── trial_disease_biomarkers    Biomarkers associated with the disease context (e.g.,
    │                                   EGFR mutation for NSCLC). Linked to biomarkers table.
    │
    ├── trial_endpoints                 What endpoints were measured (ORR, PFS, OS, DOR...).
    │                                   Definitions only; no values here.
    │
    ├── trial_subgroups                 Who was studied: patient populations (disease,
    │   │                               biomarker, dose cohort, demographics).
    │   │
    │   ├── trial_subgroup_biomarkers   Structured biomarker details for biomarker-tagged
    │   │                               subgroups (name, value, numeric threshold).
    │   │
    │   └── trial_outcome_measures      What was measured for this subgroup: the intersection
    │       │                           of a subgroup × endpoint. Defines WHAT we're looking at
    │       │                           (e.g., "ORR for Squamous, confirmed, percentage, primary").
    │       │                           Holds metadata: outcome_type, measure_unit, confirmed,
    │       │                           time_point. No actual result values.
    │       │
    │       └── trial_arm_outcomes      The actual numbers: per-arm results within an outcome
    │                                   measure. Each row is one arm's result (e.g., "8.0 mg/kg
    │                                   arm: N=32, ORR=15.6%"). Linked to trial_arms via
    │                                   trial_arm_id FK. Holds measure_value, N, p_value,
    │                                   hazard_ratio, odds_ratio.
    │
    ├── adverse_events                  Safety endpoints (neutropenia, ILD, etc.).
    │   │                               Holds grade_category, standardized_name.
    │   │
    │   └── trial_arm_outcomes          Per-arm safety results (same table as efficacy arm
    │                                   outcomes, linked via adverse_event_id instead of
    │                                   trial_outcome_measure_id).
    │
    └── publication_interventions       LEGACY (News only). Study-level drug records. Still used
                                        by NewsTrialMention. No longer created for Publications;
                                        replaced by trial_arms + trial_arm_interventions.

Polymorphic ownership
Several tables use polymorphic source_type + source_id columns instead of a direct publication_id foreign key. This allows the same table to store data sourced from publications OR clinical trial registries:
| Table | Polymorphic columns | Typical source_type |
|---|---|---|
| trial_subgroups | source_type, source_id | 'Publication' |
| trial_endpoints | source_type, source_id | 'Publication' |
| trial_outcome_measures | source_type, source_id | 'Publication' |
| trial_arms | source_type, source_id | 'Publication' |
| trial_disease_details | source_type, source_id | 'Publication' |
| adverse_events | source_type, source_id | 'Publication' |
| publication_interventions | source_type, source_id | 'Publication' or 'NewsTrialMention' |
To query all subgroups for a publication: WHERE source_type = 'Publication' AND source_id = <pub_id>.
trial_arm_outcomes is NOT polymorphic — it belongs directly to a trial_outcome_measure (via trial_outcome_measure_id) or an adverse_event (via adverse_event_id).
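The lookup rules above can be sketched in plain Ruby. This is only an illustration over in-memory hashes, not the application's ActiveRecord code: the helper names rows_for_source and arm_outcome_parent and the sample rows are hypothetical, while the column names come from the table above.

```ruby
# Hypothetical in-memory rows standing in for trial_subgroups. Only the
# source_type / source_id column names come from the documentation.
subgroups = [
  { id: 1, source_type: "Publication", source_id: 190_656, subgroup_value: "Squamous" },
  { id: 2, source_type: "Publication", source_id: 190_656, subgroup_value: "Adenocarcinoma" },
  { id: 3, source_type: "Publication", source_id: 42, subgroup_value: "Overall" }
]

# Equivalent of: WHERE source_type = ? AND source_id = ?
def rows_for_source(rows, source_type, source_id)
  rows.select { |row| row[:source_type] == source_type && row[:source_id] == source_id }
end

# trial_arm_outcomes is not polymorphic: its parent is reached through one of
# two foreign keys. Returns nil when neither key is set.
def arm_outcome_parent(row)
  if row[:trial_outcome_measure_id]
    [:trial_outcome_measure, row[:trial_outcome_measure_id]]
  elsif row[:adverse_event_id]
    [:adverse_event, row[:adverse_event_id]]
  end
end

p rows_for_source(subgroups, "Publication", 190_656).map { |row| row[:subgroup_value] }
p arm_outcome_parent({ adverse_event_id: 7 })
```

Filtering on both columns matters because source_id values are only unique within one source_type.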
Two paths to disease and biomarker data
Disease and biomarker information is stored at two different levels, serving different purposes:
Path 1: Publication-level — “What does this study cover?”

    Publication → trial_disease_details → diseases
                                        → trial_disease_biomarkers → biomarkers

Populated by the extract_diseases pipeline step. Describes the study’s overall disease context (e.g., “EGFR-mutant advanced NSCLC”). Used for filtering and categorization.
Path 2: Subgroup-level — “What population does this specific result apply to?”

    Publication → trial_subgroups → diseases (via disease_id)
                                  → trial_subgroup_biomarkers → biomarkers

Populated by classify_publications + post_process. These are the data-carrying entities that link down to outcome measures and arm results.
The two levels can diverge. A study might cover “advanced NSCLC” at the disease detail level but report results for “Squamous” and “Adenocarcinoma → AGA-negative” subgroups. A basket trial studying “solid tumors” might have disease-specific subgroups for NSCLC, CRC, and breast cancer.
Core Tables
publications
The root entity. Key fields for the data model:
| Column | Purpose |
|---|---|
| abstract | Source text; ground truth for all extraction |
| llm_data (JSONB) | Raw LLM extraction output before materialization |
| llm_data_processed (bool) | Whether post_process has materialized llm_data into child tables |
| result (bool) | Whether this publication reports clinical results |
| clinical_trial_id | FK to linked clinical trial (nullable) |
| total_number_of_participants | Denormalized from llm_data |
| trial_outcome | positive / negative / unclear |
trial_subgroups
A patient population or cohort within a publication. Subgroups can represent different things depending on subgroup_type:
| subgroup_type | Example subgroup_value | What it means |
|---|---|---|
| disease | NSCLC → Squamous cell carcinoma | Histology/disease subpopulation |
| biomarker | PD-L1 TPS ≥50% | Biomarker-selected subgroup |
| dose | Dose 100-300mg | Dose-defined cohort |
| overall | Overall | Full study population |

Key columns:
| Column | Purpose |
|---|---|
| subgroup_value | Human-readable label (hierarchical with →) |
| subgroup_type | Semantic category |
| number_of_participants | N for this subgroup (from abstract) |
| population_role | Denominator semantics: overall, partition, selected_subset, etc. |
| tags (JSONB) | Semantic dimension tags |
| dose_value, dose_min, dose_max | Dose fields; populated only when the subgroup itself IS a dose cohort |
| dose_units, dose_frequency, rp2d | Additional dose context |
| data_cutoff_date | When results were cut off |
| treatment_lines (JSONB) | Prior therapy context |
| min_prior_lines, max_prior_lines | Sanitized treatment line counts |
| disease_id | FK to matched disease entity |
trial_endpoints
Endpoint definitions extracted from the publication:
| Column | Purpose |
|---|---|
| endpoint_name | Full name (e.g., “Overall Survival”) |
| abbreviation | Short form (e.g., “OS”) |
| endpoint_id | FK to master endpoints table |
trial_outcome_measures
The intersection of a subgroup and an endpoint, e.g. “ORR for the Squamous subgroup”:
| Column | Purpose |
|---|---|
| trial_subgroup_id | FK to trial_subgroups |
| trial_endpoint_id | FK to trial_endpoints |
| outcome_type | primary / secondary / exploratory |
| measure_unit | percentage, months, count |
| confirmed (bool, nullable) | Confirmed vs unconfirmed response (e.g., cORR vs ORR) |
| p_value, hazard_ratio, odds_ratio | Outcome-level statistics |
| time_point | For landmark analyses (e.g., “12 months”) |
trial_arm_outcomes
Per-arm results within an outcome measure, e.g. “ORR for Squamous in the 8.0 mg/kg arm”:
| Column | Purpose |
|---|---|
| trial_outcome_measure_id | FK to trial_outcome_measures |
| arm_name | Arm label (e.g., “8.0 mg/kg”, “Pembrolizumab + Chemo”) |
| arm_type | investigational, control, active_comparator, placebo_comparator |
| number_of_participants | N for this arm |
| measure_value | The result value (e.g., “33.3”, “Not Reached”) |
| study_plan_arm_id | FK to registry study_plan_arms (nullable) |
| p_value, hazard_ratio, odds_ratio | Arm-level statistics |
No dose columns. Arm-specific dose is only captured in the arm_name string. See Dose Data below.
Subgroup Tags and Population Roles
Each trial_subgroup has two classification fields set during LLM extraction:
Tags (multi-select)
A subgroup can have multiple tags. For example, “EGFR-mutant NSCLC” would get ["overall", "biomarker", "disease"].
| Tag | Description | Count |
|---|---|---|
| disease | Disease type, histology, subtype (NSCLC, AML, DLBCL) | 63,553 |
| population | Specific analysis populations (per-protocol, safety-evaluable, responders), NOT the unsliced overall | 62,262 |
| biomarker | Mutations, expression markers, molecular subtypes (EGFR, PD-L1, HER2, TMB) | 44,559 |
| overall | Top-line study population. A disease-specific cohort can still be “overall” when it is the single top-line cohort being reported | 33,529 |
| treatment_arm | Treatment arms, regimen groupings | 23,076 |
| dose | Dose levels, cohorts, schedules | 17,168 |
| prior_therapy | Specific prior treatments (prior platinum, prior IO) | 15,436 |
| stage | Disease stage (early, advanced, metastatic) | 14,755 |
| other | Only if no other tag fits | 13,378 |
| risk_group | Cytogenetic risk, IMDC risk, prognostic groups | 6,947 |
| line_of_therapy | Treatment line (1L, 2L+, treatment-naive) | 6,443 |
| age | Age demographic splits | 5,913 |
| response_status | Subgroups defined by achieved response (responders, CR, PR, SD, PD, pCR, MRD-negative) | 2,396 |
| gender | Sex/gender splits | 1,824 |
| geography | Region/country splits | 1,604 |
| race_ethnicity | Race/ethnicity demographic splits | 1,154 |
| performance_status | ECOG PS, KPS | 1,059 |
Defined in TrialSubgroup::SUBGROUP_TAGS (app/models/trial_subgroup.rb).
Population Role (single-select, nullable)
Clarifies what the subgroup’s N represents as a denominator:
| Role | Description |
|---|---|
| overall | The unsliced top-line population for the full reported cohort |
| analysis_population | ITT, mITT, safety, evaluable, assessable, tested, treated populations |
| partition | An ordinary subgroup bucket: dose cohort, treatment arm, age band, sex split, stage bucket. Disease is “partition” only when it is one bucket among multiple disease cohorts side by side |
| selected_subset | A filtered subset defined by a qualifying condition (biomarker-positive, prior-therapy-exposed, condition-present) |
| response_subset | A subgroup defined by achieved response status |
Defined in TrialSubgroup::SUBGROUP_POPULATION_ROLES (app/models/trial_subgroup.rb).
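As a rough illustration of how the two fields constrain a subgroup, the sketch below checks a tag list and a population role against the values listed in the tables above. The constants here are stand-ins copied from those tables; the exact contents of the real TrialSubgroup constants, and the helper name subgroup_classification_errors, are assumptions.

```ruby
# Stand-ins mirroring the tables above. The authoritative lists are
# TrialSubgroup::SUBGROUP_TAGS and TrialSubgroup::SUBGROUP_POPULATION_ROLES;
# their exact contents are assumed to match the documented values.
SUBGROUP_TAGS = %w[
  disease population biomarker overall treatment_arm dose prior_therapy stage
  other risk_group line_of_therapy age response_status gender geography
  race_ethnicity performance_status
].freeze

SUBGROUP_POPULATION_ROLES = %w[
  overall analysis_population partition selected_subset response_subset
].freeze

# Hypothetical helper: returns a list of problems, empty when the pair is valid.
# Tags are multi-select; population_role is single-select and may be nil.
def subgroup_classification_errors(tags:, population_role:)
  errors = []
  unknown = Array(tags) - SUBGROUP_TAGS
  errors << "unknown tags: #{unknown.join(', ')}" unless unknown.empty?
  unless population_role.nil? || SUBGROUP_POPULATION_ROLES.include?(population_role)
    errors << "unknown population_role: #{population_role}"
  end
  errors
end

p subgroup_classification_errors(tags: %w[overall biomarker disease], population_role: "overall")
p subgroup_classification_errors(tags: %w[histology], population_role: "cohort")
```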
trial_disease_details
Disease context for the publication: what disease(s) were studied and their clinical characteristics.
| Column | Purpose |
|---|---|
| disease_name | Extracted disease name |
| disease_id | FK to matched diseases entity |
| subtypes (JSONB) | Disease subtypes (e.g., adenocarcinoma, squamous) |
| stages (JSONB) | Disease stages (e.g., advanced, metastatic, stage III) |
| extents (JSONB) | Disease extent descriptors |
| statuses (JSONB) | Disease status (e.g., relapsed, refractory) |
| risks (JSONB) | Risk classifications (e.g., high-risk cytogenetics) |
| treatment_settings (JSONB) | Treatment setting context |
| number_of_prior_treatment_lines | Prior therapy line count |
trial_disease_biomarkers
Biomarkers associated with a disease context. Belongs to trial_disease_details.
| Column | Purpose |
|---|---|
| trial_disease_detail_id | FK to trial_disease_details |
| biomarker_id | FK to matched biomarkers entity (nullable) |
| biomarker_name | Extracted biomarker name (e.g., “EGFR”) |
| value | Biomarker status (e.g., “mutated”, “positive”) |
| numeric_value | Threshold if applicable (e.g., “50” for TPS ≥50%) |
| alternatives_names (JSONB) | Alternative names for matching |
publication_interventions (DEPRECATED for Publications — News only)
Drug/intervention records. No longer created for Publication sources; replaced by trial_arms + trial_arm_interventions. Still used by NewsTrialMention through the News pipeline.
| Column | Purpose |
|---|---|
| intervention_name | Drug name |
| drug_id | FK to matched drug entity |
| intervention_role | investigational, comparator, combination, supportive |
| intervention_type | drug, biological, procedure |
| dose | Free-text dose string from abstract |
| dose_evidence (JSONB) | Structured dose extraction (see below) |
| study_plan_arm_id | FK to study_plan_arms (always NULL in practice) |
Dose Data and Its Gaps
Dose information exists at three levels, but there is a structural gap at the arm level.
Level 1: Study-level dose (publication_interventions)
extract_dose_evidence runs a separate LLM pass over each publication_intervention to populate the dose_evidence JSONB:

    {
      "single_dose": "400 mg",
      "dose_min": "8.0 mg/kg",
      "dose_max": "10.0 mg/kg",
      "rp2d": "8.0 mg/kg",
      "dose_units": "mg/kg",
      "dose_frequency": "Q3W",
      "dose_context_type": "weight_based",
      "confidence": 0.95
    }

There is one publication_intervention (PI) per drug per publication, so this captures the study-level dose range. For a multi-dose-arm study like “8.0 mg/kg vs 10.0 mg/kg”, it records dose_min=8.0 and dose_max=10.0: the range, not per-arm values.
Level 2: Subgroup-level dose (trial_subgroups)
When a subgroup IS a dose cohort (e.g., “Dose 100-300mg”), the LLM extraction populates dose_min, dose_max, dose_value on the trial_subgroup record. Roughly 1,200 of ~200k subgroups have these fields set.
Level 3: Arm-level dose — THE GAP
trial_arm_outcomes has NO dose columns. When a publication reports efficacy by dose arm (e.g., “8.0 mg/kg arm: ORR 15.6%” and “10.0 mg/kg arm: ORR 26.9%”), the dose is only captured in arm_name as an unstructured string.
The classify_publications LLM extraction already identifies each arm by its dose name, but the arm schema only has name, arm_type, measure_value, and number_of_participants. It has no dose fields.
How the view resolves dose (COALESCE chain)
The vw_publication_efficacy_data view uses a fallback chain to populate dose_min/dose_max/single_dose on each row:
1. trial_subgroups.dose_min/dose_max (subgroup is a dose cohort)
2. trial_subgroups.dose_value (single-dose subgroup, formatted with units)
3. publication_interventions.dose_evidence (study-level fallback, with guards)

The pub-level fallback is gated:
- Skipped for control/comparator arms (Issue 31)
- Skipped for escalation/range/rp2d context types (Issue 35)
- single_dose only falls back when pub_dose_min = pub_dose_max (single-dose study)

The problem: For multi-dose-arm studies where subgroups are disease-defined (not dose-defined), the COALESCE chain falls through to study-level dose, which propagates the full dose range to every arm row. The “8.0 mg/kg” arm shows dose_min=8.0, dose_max=10.0, which is misleading because that arm only received 8.0.
What this means in practice
For a study like ARTEMIS-001 (pub 190656) with:
- Subgroups: Squamous, Adenocarcinoma (disease-defined)
- Arms: 8.0 mg/kg, 10.0 mg/kg (dose-defined)
Every row in the view gets the same dose_min=8.0, dose_max=10.0 regardless of which arm it belongs to. The arm_name has the correct dose but it’s a text string, not queryable as structured data.
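The fallback behaviour described in this section can be approximated in Ruby. This is a simplified sketch of the chain, not the view's SQL: resolve_dose, the row-level (rather than per-column) short-circuiting, and the mapping of arm_type values to "control/comparator" are all assumptions made for illustration.

```ruby
# Context types for which the study-level fallback is skipped (Issue 35).
SKIPPED_CONTEXT_TYPES = %w[escalation range rp2d].freeze
# Arm types treated as control/comparator (Issue 31); this mapping is assumed.
CONTROL_ARM_TYPES = %w[control active_comparator placebo_comparator].freeze

# Hypothetical re-statement of the COALESCE chain for one view row.
# Note that nothing arm-specific other than arm_type feeds the result.
def resolve_dose(subgroup:, arm_type:, pub_dose:)
  # 1. Subgroup is a dose cohort with an explicit range.
  if subgroup[:dose_min] || subgroup[:dose_max]
    return { dose_min: subgroup[:dose_min], dose_max: subgroup[:dose_max],
             single_dose: nil, source: :subgroup_range }
  end
  # 2. Single-dose subgroup, formatted with units.
  if subgroup[:dose_value]
    formatted = [subgroup[:dose_value], subgroup[:dose_units]].compact.join(" ")
    return { dose_min: nil, dose_max: nil, single_dose: formatted, source: :subgroup_value }
  end
  # 3. Study-level fallback, subject to the documented guards.
  empty = { dose_min: nil, dose_max: nil, single_dose: nil, source: :none }
  return empty if pub_dose.nil?
  return empty if CONTROL_ARM_TYPES.include?(arm_type)
  return empty if SKIPPED_CONTEXT_TYPES.include?(pub_dose[:dose_context_type])

  single_dose_study = pub_dose[:dose_min] == pub_dose[:dose_max]
  { dose_min: pub_dose[:dose_min], dose_max: pub_dose[:dose_max],
    single_dose: single_dose_study ? pub_dose[:single_dose] : nil, source: :publication }
end

# Disease-defined subgroup with dose-defined arms: both arms get the same range.
pub_dose = { dose_min: "8.0 mg/kg", dose_max: "10.0 mg/kg", dose_context_type: "weight_based" }
squamous = { subgroup_value: "Squamous" }
["8.0 mg/kg", "10.0 mg/kg"].each do |arm_name|
  resolved = resolve_dose(subgroup: squamous, arm_type: "investigational", pub_dose: pub_dose)
  puts "#{arm_name}: #{resolved[:dose_min]} to #{resolved[:dose_max]} (#{resolved[:source]})"
end
```

Because the arm name never enters the chain, both arms print the same 8.0 to 10.0 mg/kg range, which is the misleading result described above.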
Fixing the gap
The fix requires adding dose fields to the arm extraction and storage:
- Add dose fields to the arm schema in classify_publications (so the LLM extracts per-arm dose)
- Add dose columns to trial_arm_outcomes (to store it)
- Update post_process.rb to persist arm-level dose during materialization
- Update the view to use arm-level dose as the first COALESCE choice
Data Flow: Abstract to Report
Pipeline Steps
The publications workflow (app/workflows/publications_workflow.rb) runs these steps in order:

     1. extract_trial_identifiers          Find NCT IDs and registry links in abstract
     2. web_search_identifiers             Web search for missing trial IDs (disabled)
     3. relink_to_clinical_trials          Match publications to clinical trials
     4. therapeutic_area_filter            Filter to target therapeutic areas
     5. extract_interventions              LLM: extract arms and their interventions → trial_arms + trial_arm_interventions
     6. link_publication_drugs             Match intervention names to drug entities
     7. tag_investigational_interventions  Classify intervention roles
     8. extract_subgroups                  LLM: identify subgroups and endpoints from abstract
     9. extract_dose_evidence              LLM: extract structured dose per intervention
    10. classify_publications              LLM: full efficacy/safety extraction → llm_data
    11. extract_diseases                   LLM: identify diseases
    12. post_process_publications          Materialize llm_data → normalized tables
    13. classify_intent                    LLM: classify publication intent
    14. extract_treatment_lines            LLM: extract prior therapy context
    15. standardize_adverse_events         Normalize AE names
    16. classify_adverse_events            LLM: classify AEs

What happens at each key step
extract_subgroups (step 8): Identifies what subgroups and endpoints exist in the abstract. Stores in llm_data['subgroup_endpoints']. This runs BEFORE classify_publications to guide extraction.
extract_dose_evidence (step 9): Separate LLM pass per publication_intervention. Produces structured dose_evidence JSONB on the PI record. This is study-level dose per drug.
classify_publications (step 10): The main extraction. Takes the abstract plus known subgroups/endpoints and extracts:

    {
      "subgroup_outcome_measures": [
        {
          "type": "disease",
          "value": "NSCLC → Squamous cell carcinoma",
          "number_of_participants": null,
          "outcome_measures": [
            {
              "endpoint": "Overall Response Rate",
              "endpoint_abbreviation": "ORR",
              "confirmed": true,
              "arms": [
                {
                  "name": "8.0 mg/kg",
                  "arm_type": "investigational",
                  "measure_value": 15.6,
                  "number_of_participants": 32
                },
                {
                  "name": "10.0 mg/kg",
                  "arm_type": "investigational",
                  "measure_value": 26.9,
                  "number_of_participants": 26
                }
              ]
            }
          ]
        }
      ]
    }

Note: arms have name and measure_value but no structured dose fields.
post_process_publications (step 12): Materializes llm_data into normalized tables:
- subgroup_outcome_measures entries → trial_subgroups rows
- Each outcome_measures entry → trial_outcome_measure row (linked to subgroup + endpoint)
- Each arms entry → trial_arm_outcome row (linked to outcome measure)
- Guards: N=0 → nil (zero-sentinel), all-zero percentage endpoints with nil N → nil
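The materialization steps above can be sketched as a plain Ruby walk over the llm_data structure. This is an illustrative approximation, not the contents of post_process.rb: the materialize and sanitize_n helpers, the temporary sequential ids, and the returned row shapes are hypothetical. Endpoint linking and the all-zero percentage guard are left out to keep the sketch short.

```ruby
require "json"

# Zero-sentinel guard from the list above: an N of 0 is stored as nil.
def sanitize_n(n)
  n.nil? || n.zero? ? nil : n
end

# Flattens llm_data into one hash per future table row. JSON key names follow
# the documented example; ids are temporary counters invented for this sketch.
def materialize(llm_data)
  subgroups = []
  measures = []
  arm_outcomes = []

  llm_data.fetch("subgroup_outcome_measures", []).each do |sg|
    subgroup_id = subgroups.size + 1
    subgroups << { id: subgroup_id, subgroup_type: sg["type"], subgroup_value: sg["value"],
                   number_of_participants: sanitize_n(sg["number_of_participants"]) }

    sg.fetch("outcome_measures", []).each do |om|
      measure_id = measures.size + 1
      measures << { id: measure_id, trial_subgroup_id: subgroup_id,
                    endpoint_abbreviation: om["endpoint_abbreviation"], confirmed: om["confirmed"] }

      om.fetch("arms", []).each do |arm|
        arm_outcomes << { trial_outcome_measure_id: measure_id, arm_name: arm["name"],
                          arm_type: arm["arm_type"], measure_value: arm["measure_value"],
                          number_of_participants: sanitize_n(arm["number_of_participants"]) }
      end
    end
  end

  { trial_subgroups: subgroups, trial_outcome_measures: measures, trial_arm_outcomes: arm_outcomes }
end

sample = JSON.parse(<<~JSON)
  { "subgroup_outcome_measures": [ { "type": "disease", "value": "Squamous",
      "number_of_participants": null, "outcome_measures": [ { "endpoint_abbreviation": "ORR",
        "confirmed": true, "arms": [
          { "name": "8.0 mg/kg", "arm_type": "investigational", "measure_value": 15.6, "number_of_participants": 32 },
          { "name": "10.0 mg/kg", "arm_type": "investigational", "measure_value": 26.9, "number_of_participants": 26 }
        ] } ] } ] }
JSON
rows = materialize(sample)
puts rows[:trial_arm_outcomes].map { |a| "#{a[:arm_name]}: #{a[:measure_value]} (N=#{a[:number_of_participants]})" }
```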
View materialization
vw_publication_efficacy_data_v22 joins everything together:

    trial_subgroups (with dose, treatment lines)
      ← trial_outcome_measures (subgroup × endpoint)
        ← trial_arm_outcomes (per-arm results)
          ← drug_interventions (drug/technology from publication_interventions or registry)
          ← pub_dose_lookup (dose_evidence from publication_interventions)

The view outputs one row per: publication × subgroup × endpoint × arm × drug, with resolved dose fields via COALESCE fallback.
Query layer
Tpp::ClinicalEvidenceQuery filters the view by disease and technology, enriches with biomarker data, and groups results by drug for the clinical evidence report.
Common Modeling Patterns
Dose as subgroup vs dose as arm
The same dose split can be modeled two ways, depending on how the abstract presents data:
Dose as subgroup (when the abstract reports each dose cohort independently):

    trial_subgroup: "8.0 mg/kg cohort" (dose_value = "8.0", dose_units = "mg/kg")
      └── trial_outcome_measure: ORR
            └── trial_arm_outcome: arm_name = "HS-20093"

    trial_subgroup: "10.0 mg/kg cohort" (dose_value = "10.0")
      └── trial_outcome_measure: ORR
            └── trial_arm_outcome: arm_name = "HS-20093"

Dose as arm (when the abstract cross-tabulates dose × subgroup):

    trial_subgroup: "Squamous cell carcinoma" (dose fields = NULL)
      └── trial_outcome_measure: ORR
            ├── trial_arm_outcome: arm_name = "8.0 mg/kg"   (no structured dose)
            └── trial_arm_outcome: arm_name = "10.0 mg/kg"  (no structured dose)

The LLM picks whichever matches the abstract structure. In the second case, per-arm dose is lost as structured data.
Hierarchical subgroups
Subgroup values use → as a hierarchy separator:
- NSCLC (parent)
- NSCLC → Adenocarcinoma (child)
- NSCLC → Adenocarcinoma → AGA-negative (grandchild)
Each level is a separate trial_subgroup record. The parent serves as “Overall” for its children.
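A small Ruby sketch shows how a hierarchical subgroup_value could be split into levels and mapped to its parent. The helper names (subgroup_levels, parent_value, lineage) are hypothetical and only illustrate the convention; they are not taken from the codebase.

```ruby
HIERARCHY_SEPARATOR = "→"

# Splits a subgroup_value into its hierarchy levels, trimming whitespace.
def subgroup_levels(value)
  value.split(HIERARCHY_SEPARATOR).map(&:strip)
end

# The parent label is the value with its last level removed; nil for a root.
def parent_value(value)
  levels = subgroup_levels(value)
  levels.size > 1 ? levels[0...-1].join(" #{HIERARCHY_SEPARATOR} ") : nil
end

# Every label that would need its own trial_subgroup record, root first.
def lineage(value)
  levels = subgroup_levels(value)
  (1..levels.size).map { |n| levels.first(n).join(" #{HIERARCHY_SEPARATOR} ") }
end

puts lineage("NSCLC → Adenocarcinoma → AGA-negative")
p parent_value("NSCLC → Adenocarcinoma")
```

Looking up parent_value for a child gives the label of the record that acts as its “Overall”.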
Confirmed vs unconfirmed response
When a publication reports both confirmed ORR (cORR) and unconfirmed ORR:
- Two trial_outcome_measure records are created for the same subgroup + endpoint
- Distinguished by confirmed = true vs confirmed = false
- Issue 27: the query layer previously picked the wrong one via max_by(number_of_participants)
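To illustrate why selecting by N is fragile, the sketch below compares max_by with an explicit lookup on the confirmed flag. The sample records, their N values, and the pick_measure helper are invented for illustration; the actual fix for Issue 27 is not described here and may differ.

```ruby
# Hypothetical candidates: two outcome measures for the same subgroup +
# endpoint. It is assumed here that each candidate carries an N.
measures = [
  { id: 1, endpoint: "ORR", confirmed: false, number_of_participants: 40 },
  { id: 2, endpoint: "ORR", confirmed: true,  number_of_participants: 38 }
]

# Selecting by largest N ignores the flag entirely, so the result depends on
# which record happens to have more participants.
by_n = measures.max_by { |m| m[:number_of_participants] }

# Hypothetical alternative: the caller states which variant it wants.
# Returns nil when no record has the requested flag.
def pick_measure(candidates, confirmed:)
  candidates.find { |m| m[:confirmed] == confirmed }
end

puts "max_by picked id=#{by_n[:id]} (confirmed=#{by_n[:confirmed]})"
puts "explicit pick id=#{pick_measure(measures, confirmed: true)[:id]}"
```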
Arms as First-Class Entities (implemented 2026-04-02)
The result fact
A publication’s efficacy result is: subgroup × endpoint × arm = value, where each dimension is a first-class entity linked by FK.
How it works
1. extract_interventions creates trial_arms (with IDs) + trial_arm_interventions (drugs, dose per arm)
2. classify_publications receives arm IDs in the prompt, assigns them to each outcome
3. post_process reads arm_data['id'] as trial_arm_id: direct FK, no name matching
An “All Arms” entry is always created for pooled results.
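The ID-based linking can be sketched as follows. This is an illustration over in-memory hashes, not the post_process implementation: link_arm_outcomes, the sample ids, and the choice to raise on an unknown arm id are assumptions. The arm and drug labels reuse examples from this document.

```ruby
# Hypothetical helper: resolves each LLM arm entry to a trial_arm by id and
# attaches the per-arm doses found in trial_arm_interventions.
def link_arm_outcomes(arm_data_list, trial_arms, interventions)
  arms_by_id = trial_arms.to_h { |arm| [arm[:id], arm] }
  interventions_by_arm = interventions.group_by { |i| i[:trial_arm_id] }

  arm_data_list.map do |arm_data|
    arm_id = arm_data["id"]
    # Assumption: an id the extraction never issued is treated as an error.
    raise ArgumentError, "unknown arm id #{arm_id.inspect}" unless arms_by_id.key?(arm_id)

    doses = interventions_by_arm.fetch(arm_id, []).map do |i|
      [i[:single_dose], i[:dose_units]].compact.join(" ")
    end
    { trial_arm_id: arm_id, arm_name: arms_by_id[arm_id][:name],
      measure_value: arm_data["measure_value"], doses: doses }
  end
end

# Sample records as extract_interventions might produce them (ids invented).
trial_arms = [
  { id: 101, name: "8.0 mg/kg" },
  { id: 102, name: "10.0 mg/kg" },
  { id: 103, name: "All Arms" }
]
trial_arm_interventions = [
  { trial_arm_id: 101, intervention_name: "HS-20093", single_dose: "8.0", dose_units: "mg/kg" },
  { trial_arm_id: 102, intervention_name: "HS-20093", single_dose: "10.0", dose_units: "mg/kg" }
]
llm_arms = [{ "id" => 101, "measure_value" => 15.6 }, { "id" => 102, "measure_value" => 26.9 }]

link_arm_outcomes(llm_arms, trial_arms, trial_arm_interventions).each do |row|
  puts "#{row[:arm_name]}: value=#{row[:measure_value]}, doses=#{row[:doses].inspect}"
end
```

With the id as the join key, each outcome resolves to its own arm's dose instead of inheriting a study-level range.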
trial_arms
| Column | Purpose |
|---|---|
| name | Arm label (e.g., “8.0 mg/kg”, “Pembrolizumab + Chemo”, “All Arms”) |
| arm_type | investigational, control, active_comparator, placebo_comparator, combination |
| number_of_participants | Arm-level N |
| position | Preserves LLM output ordering |
No clinical_trial_id or study_plan_arm_id — trial_arms are self-contained publication entities.
trial_arm_interventions
| Column | Purpose |
|---|---|
| trial_arm_id | FK to trial_arms |
| drug_id | FK to matched drug entity (nullable) |
| ncit_concept_id | FK to NCI Thesaurus concept (nullable) |
| intervention_name | Drug/intervention name |
| intervention_type | drug, biological, procedure, device, other |
| intervention_role | investigational, combination, comparator, supportive (per arm, not per publication) |
| dose, dose_min, dose_max, single_dose, rp2d | Structured dose fields |
| dose_units, dose_frequency, dose_context_type | Dose context |
| dose_evidence (JSONB) | Full dose extraction audit trail |
What this replaced
| Old approach | New approach |
|---|---|
| publication_interventions: one record per drug per pub, study-level dose | trial_arm_interventions: one record per drug per arm, arm-level dose |
| Drug-arm linkage via name-substring matching in 600-line SQL view | Direct FK: trial_arm_outcomes.trial_arm_id → trial_arms, which own trial_arm_interventions |
| intervention_role per publication (same drug, one role) | intervention_role per arm (same drug can be “combination” in one arm, “comparator” in another) |
| study_plan_arms from registry passed to LLM | trial_arms from our own extraction passed to LLM |
publication_interventions (legacy)
The table still exists and is used by NewsTrialMention (News pipeline). No longer created for Publications. The efficacy view v23 reads from trial_arm_interventions instead.
Remaining work
- Production backfill: Run extract_interventions → classify_publications → post_process on all target-scope publications to get ID-based linking
- Legacy data: ~43k pre-pipeline pubs have trial_arms created from arm outcomes (no interventions). These need reprocessing through extract_interventions to get drug/dose data
- Issue 50: DrugLinker false-matches non-drug interventions; needs an intervention_type guard
Key Files
| Purpose | Path |
|---|---|
| Publication model | app/models/publication.rb |
| TrialArm model | app/models/trial_arm.rb |
| TrialArmIntervention model | app/models/trial_arm_intervention.rb |
| TrialSubgroup model | app/models/trial_subgroup.rb |
| TrialOutcomeMeasure model | app/models/trial_outcome_measure.rb |
| TrialArmOutcome model | app/models/trial_arm_outcome.rb |
| PublicationIntervention model (legacy, News only) | app/models/publication_intervention.rb |
| Intervention extraction | app/tasks/publications_llm_classification/intervention_extraction.rb |
| Trial arm materializer | app/tasks/publications_llm_classification/trial_arm_materializer.rb |
| Subgroup extraction | app/tasks/publications_llm_classification/subgroup_extraction.rb |
| Main LLM extraction | app/tasks/publications_llm_classification/task.rb |
| Extraction schema | app/tasks/publications_llm_classification/details.rb |
| Dose evidence extraction | app/tasks/publications_llm_classification/dose_evidence_extraction.rb |
| Post-process materialization | app/tasks/publications_llm_classification/post_process.rb |
| Efficacy view (latest) | db/views/vw_publication_efficacy_data_v23.sql |
| Clinical evidence query | app/queries/tpp/clinical_evidence_query.rb |
| Pipeline workflow | app/workflows/publications_workflow.rb |
| Backfill task | lib/tasks/one_off/backfill_trial_arms.thor |
| Issues tracker | docs/publication_issues_tracker.md |