Audit Triage 2026-03-30

Audit jobs: 1668 (CRC+ADC, disease 4345) and 1669 (disease 4174), both running today. Total open issues: 61 (IDs 8437–8497, audit still running)


Finding 1: Hierarchical subgroup rows lose N from flat counterparts

Pubs affected: 48926, 67379, 200353 (9 issues: 8462-8464, 8467-8472)

Root cause: The view vw_publication_efficacy_data contains both flat subgroups (e.g. IHC3+) with correct N and hierarchical copies (e.g. RAS wild-type mCRC → Cohort A → IHC3+) with N=null. ClinicalEvidenceQuery picks the hierarchical rows (disease grouping), so N comes through as null. The audit flags these as N=0.

This is a variant of tracker Issue 26 (parent N propagation) but in reverse — the child subgroup has the data, the hierarchical copy doesn’t carry it.

Example (pub 48926):

  • IHC3+ (flat): N=40, ORR=57.5
  • RAS wild-type mCRC → Cohort A → IHC3+ (hierarchical): N=null, ORR=57.5

Fix direction: Either the view needs to propagate N from flat to hierarchical rows, or the query needs to prefer flat rows when hierarchical ones have null N.
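The second option (prefer the flat row's N when the hierarchical copy has null N) could look roughly like the sketch below. It assumes rows arrive as plain hashes with hypothetical keys `:subgroup_path` (an array of subgroup names, a single element for a flat row) and `:n`; neither name is taken from vw_publication_efficacy_data or ClinicalEvidenceQuery. Matching on the leaf name alone is a simplification and would misfire if two branches share a leaf name.

```ruby
# Hypothetical sketch: backfill null N on hierarchical rows from the flat row
# that shares the same leaf subgroup name. Key names are assumptions.
def backfill_hierarchical_n(rows)
  # Index flat rows (path length 1) that have a known N by subgroup name.
  flat_n = rows
    .select { |r| r[:subgroup_path].length == 1 && !r[:n].nil? }
    .to_h { |r| [r[:subgroup_path].first, r[:n]] }

  rows.map do |row|
    # Only touch hierarchical rows whose N is missing.
    next row unless row[:n].nil? && row[:subgroup_path].length > 1

    leaf = row[:subgroup_path].last
    flat_n.key?(leaf) ? row.merge(n: flat_n[leaf]) : row
  end
end

# Rows from the pub 48926 example above.
rows = [
  { subgroup_path: ["IHC3+"], n: 40, orr: 57.5 },
  { subgroup_path: ["RAS wild-type mCRC", "Cohort A", "IHC3+"], n: nil, orr: 57.5 }
]
fixed = backfill_hierarchical_n(rows)
```

Rows with no flat counterpart are left with N=null, which keeps "N unstated in abstract" distinguishable from "N lost in the view".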


Finding 2: TTP→PFS (Issue 32) — query fix timing

Pubs affected: 29735, 29737, 29738, 74193 (5 issues: 8454-8457, 8473)

Status: Query fix deployed today while audit was running. Should resolve on next run.


Finding 3: Biomarker backfill exclusion too coarse (Issue 38 gap)

Pubs affected: 29704 (10 issues: 8441-8450), likely 29700 (2 issues: 8439-8440)

Root cause: The backfill’s candidate_publication_ids query (lines 215-219 of backfill_biomarker_secondary_subgroups.thor) excludes any pub that already has a TrialSubgroup tagged "biomarker". Pub 29704 has 12 MR-based biomarker subgroups, so it was excluded, even though the abstract also reports efficacy by TMB, KRAS, BRAF, HER2 amplification, and KRAS G12C subgroups that were never extracted.

Fix direction: The exclusion needs to be more granular — check whether specific biomarker subgroups are missing, not just whether any biomarker subgroup exists. Alternatively, force-screen these pubs by passing --publication_ids.
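A more granular check could compare the biomarkers the abstract reports against the ones already extracted, as in the sketch below. This is not the real candidate_publication_ids query: the method names are hypothetical, and it assumes some upstream step can supply the list of biomarkers mentioned in the abstract.

```ruby
require "set"

# Hypothetical sketch: decide backfill candidacy per biomarker instead of
# "has any biomarker subgroup". Names and data shapes are assumptions.
def missing_biomarkers(expected, extracted)
  expected.to_set - extracted.to_set
end

def backfill_candidate?(expected, extracted)
  !missing_biomarkers(expected, extracted).empty?
end

# Biomarkers the abstract reports efficacy for (from the pub 29704 notes above).
expected  = ["MR", "TMB", "KRAS", "BRAF", "HER2 amplification", "KRAS G12C"]
# Biomarkers already covered by existing subgroups (MR-based only).
extracted = ["MR"]

missing = missing_biomarkers(expected, extracted)
```

With the current coarse rule this pub is skipped because `extracted` is non-empty; with the per-biomarker rule it stays a candidate until `missing` is empty.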


Finding 4: ORR=0% nulled by zero-sentinel forward fix (pub 31990)

Issues: 8458-8459 (IHC2+/ISH- and IHC1+ subgroups, ORR=null in view)

Root cause: The view shows measure_value: null for ORR on these subgroups. The abstract states 0% ORR (no responses). The post_process.rb guard that converts measure_value=0→nil when all arms have 0 for a percentage endpoint is killing real 0% ORR values. This is a regression from the Issue 8 forward fix.

Fix direction: The guard should only null out 0 when the abstract doesn’t explicitly state 0%. Or: never null out measure_value when the endpoint is ORR and the subgroup has other non-null efficacy data (PFS, OS exist).
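Both fix directions can be expressed as extra early exits in front of the existing "all arms are 0" check. The sketch below is not the actual post_process.rb code: the field names (`:endpoint`, `:measure_value`, `:abstract_states_zero`) and the idea that an upstream step sets `:abstract_states_zero` are assumptions.

```ruby
# Hypothetical sketch of a narrower zero-sentinel guard. Returns true only
# when a 0 still looks like a sentinel after the two exceptions below.
# sibling_outcomes: other efficacy endpoints (e.g. PFS, OS) for the same subgroup.
def null_out_zero?(outcome, sibling_outcomes)
  return false unless outcome[:measure_value] == 0

  # Exception 1: the abstract explicitly states 0% / "no responses".
  return false if outcome[:abstract_states_zero]

  # Exception 2: an ORR of 0 next to other non-null efficacy data is
  # treated as a real measurement, not a placeholder.
  has_other_data = sibling_outcomes.any? { |o| !o[:measure_value].nil? }
  return false if outcome[:endpoint] == "ORR" && has_other_data

  true
end
```

For pub 31990 either exception would keep the 0% ORR, since the abstract states that no responses occurred in those cohorts.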


Finding 5: More hierarchical N issues (pubs 49899, 49900)

Pub 49899 (issues 8475-8477): CRC → Dose ≥2.4 mg/kg has N=40, same as parent CRC. Child subgroup inherits parent N instead of extracting subset N. Same root cause as Finding 1 / Issue 26.

Pub 49900 (issue 8478): for 3L mCRC → 2.4 mg/kg, the audit flags a patient_number_safety mismatch. Needs an abstract check to verify.


| Issue ID | Pub | Type | Notes |
|---|---|---|---|
| 8438 | 29700 | missing_endpoint (DoR) | DoR exists for 2.4 mg/kg arm but not 3.0 mg/kg. Either not in abstract or extraction gap. |
| 8437 | 29699 | incorrect dose_max | View shows dose_max=6.0 mg/kg for CRC. Audit says abstract max is 3.0 mg/kg Q3W for mCRC. Issue 35 (dose extraction). |
| 8461 | 48903 | incorrect dose_max | View shows dose_max=8.0 for Low HER2 subgroup. Audit says 6.4 mg/kg (Part 2 doses). Issue 35. |
| 8479 | 51436 | incorrect dose_min | Dose extraction issue. Issue 35. |
| 8451-8452 | 116843 | safety cross-contamination | Safety N=30 but abstract says 26; discontinuation 3% belongs to 2.4 mg/kg arm not 2.0. Safety data bleed between dose arms. |
| 8465-8466 | 71927 | patient_count = N not responders | View: N=20, ORR=20%. Audit says patient_count should be 4 (responders), not 20 (denominator). Query/extraction puts N into patient_count field. |
| 8453 | 235204 | N mismatch | View: N=23 for SD→MR positive (methylation). Audit says abstract states 31. Extraction error. |
| 8474 | 74193 | N mismatch | View: N=3 for HER2 retained in ctDNA. Audit says 2. Extraction error. |
| 8460 | 48880 | spurious_row | “Overall (All Arms)” row with only OS. Dose-level arms (5.4, 6.4 mg/kg) have detailed data. Spurious rollup row. |

Final summary by root cause (all 61 issues triaged)

| Root cause | Issues | Count | Systemic? | Action needed |
|---|---|---|---|---|
| Dose extraction (Issue 29/35) | 8437, 8461, 8479, 8480-8481, 8486-8488, 8490-8491, 8493-8494 | 12 | YES | No fix ever applied; biggest unfixed issue |
| Biomarker backfill exclusion too coarse (Issue 38) | 8441-8450, 8482-8484 | 13 | YES | Fix candidate query exclusion for partial coverage |
| Hierarchical N loss (Finding 1 / Issue 26) | 8462-8464, 8467-8472, 8475-8477, 8489, 8492 | 14 | YES | Fix view or query N propagation |
| TTP→PFS (Issue 32) | 8454-8457, 8473, 8485 | 6 | NO | Query fix deployed today; will clear on re-audit |
| ORR=0% regression (Issue 8) | 8458-8459 | 2 | YES | Fix post_process.rb guard |
| Safety cross-contamination | 8451-8452, 8478 | 3 | YES | Safety data bleeds between dose arms |
| False positive: patient_count semantics | 8465-8466, 8496 | 3 | NO | Audit LLM misunderstanding; fix audit prompt |
| False positive: narrative subgroups | 8439-8440 | 2 | NO | Audit LLM error; fix audit prompt |
| Immature → Not Reached (Issue 34) | 8495 | 1 | YES | No fix ever applied |
| Spurious rollup row | 8460 | 1 | MAYBE | Investigate how Overall rows are synthesized |
| NEW: tumor shrinkage ≠ ORR | 8497 | 1 | MAYBE | New extraction pattern; needs scale investigation |
| One-off extraction errors | 8438, 8453, 8474 | 3 | NO | Individual pub fixes |
| TOTAL | | 61 | | True: 50, False positive: 5, Query-fix timing: 6 |
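Because the issue lists in this table are maintained by hand, a quick tally helps confirm that every ID from 8437 to 8497 is assigned to exactly one root cause. The sketch below is a throwaway check, not part of the audit tooling; the helper name `expand_ids` is hypothetical and the ID lists are copied from the Issues column above.

```ruby
# Expand an ID list such as "8462-8464, 8467-8472" into individual issue IDs.
def expand_ids(spec)
  spec.split(",").flat_map do |part|
    lo, hi = part.strip.split("-").map(&:to_i)
    (lo..(hi || lo)).to_a # a single ID has no upper bound, so reuse lo
  end
end

issues_by_root_cause = {
  "Dose extraction (Issue 29/35)" => "8437, 8461, 8479, 8480-8481, 8486-8488, 8490-8491, 8493-8494",
  "Biomarker backfill exclusion (Issue 38)" => "8441-8450, 8482-8484",
  "Hierarchical N loss (Finding 1 / Issue 26)" => "8462-8464, 8467-8472, 8475-8477, 8489, 8492",
  "TTP->PFS (Issue 32)" => "8454-8457, 8473, 8485",
  "ORR=0% regression (Issue 8)" => "8458-8459",
  "Safety cross-contamination" => "8451-8452, 8478",
  "False positive: patient_count semantics" => "8465-8466, 8496",
  "False positive: narrative subgroups" => "8439-8440",
  "Immature -> Not Reached (Issue 34)" => "8495",
  "Spurious rollup row" => "8460",
  "Tumor shrinkage vs ORR" => "8497",
  "One-off extraction errors" => "8438, 8453, 8474"
}

all_ids = issues_by_root_cause.values.flat_map { |spec| expand_ids(spec) }
counts  = issues_by_root_cause.transform_values { |spec| expand_ids(spec).length }

# True when no ID is missing and none is assigned twice.
complete = all_ids.sort == (8437..8497).to_a
```

Per-row counts can be read from `counts` and compared against the Count column.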

New issues from continued audit (8480–8497)

18 additional issues arrived as the audit continued. Almost all map to known patterns:

  • Dose extraction (Issue 35): +9 issues (pubs 70960, 135119, 134450). Same pattern: dose escalation range extracted instead of efficacy population dose.
  • Biomarker subgroups (Issue 38 gap): +3 issues (pub 72043: HER2 IHC2+, IHC1+, mutation/amplification subgroups missing). Same backfill exclusion gap.
  • TTP→PFS (Issue 32): +1 (pub 73299). Query fix timing.
  • Zero-sentinel / hierarchical N (Issue 8 + Finding 1): +2 (pub 134450 CRC+SCCHN). Backfill cleared N=0 from trial_arm_outcomes, but the view’s hierarchical rows show N=null. This pub was explicitly listed as an Issue 8 example — the zero-sentinel was fixed but hierarchical N loss (Finding 1) now surfaces it differently.
  • Immature → Not Reached (Issue 34): +1 (pub 114571). No fix ever applied.
  • patient_count confusion: +1 (pub 135414).

Genuinely new: Tumor shrinkage rate confused with ORR

Issue 8497 (pub 162304): ORR reported as 35% but abstract says ~1.5% (1/66). The 35% figure is “any tumor reduction” — a different metric measuring any shrinkage, not RECIST objective response. The LLM conflated tumor shrinkage/reduction rate with ORR.

This is a new extraction pattern not covered by existing tracker issues. Needs investigation to determine scale — how many pubs report non-RECIST response metrics (tumor shrinkage rate, disease control rate misattributed as ORR, etc.)?
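One cheap way to estimate scale is to recompute ORR from responder and denominator counts wherever both are available and flag rows where the stored percentage disagrees. The sketch below shows the arithmetic only; the method names and the 1-point tolerance are assumptions, and it presumes a responder count can be obtained separately from the stored ORR.

```ruby
# Hypothetical sanity check: compare a stored ORR (%) with responders / N.
def orr_percent(responders, evaluable)
  (100.0 * responders / evaluable).round(1)
end

def orr_consistent?(reported_orr, responders, evaluable, tolerance: 1.0)
  (reported_orr - orr_percent(responders, evaluable)).abs <= tolerance
end

# Pub 162304: 1 RECIST responder out of 66, but ORR stored as 35%.
recomputed = orr_percent(1, 66)
flagged    = !orr_consistent?(35.0, 1, 66)
```

The same check passes for pub 71927, where ORR=20% with a denominator of 20 corresponds to 4 responders.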


All 10 missing_subgroup — genomic biomarker subgroups (TMB, KRAS, BRAF, HER2 amp) present in abstract table but never extracted.

  • Classification: True issue — subgroup identification. Issue 38 backfill gap (Finding 3).
  • 8489, 8492: N=null on hierarchical child rows (CRC, SCCHN). True issue — Finding 1 (hierarchical N loss).
  • 8490-8491, 8493-8494: dose_min=0.1 mg/kg from phase 1a escalation, but phase 1b used RP2D 2.5 mg/kg only. True issue — Issue 29/35 (dose scope).
  • 8467-8470: N=97 (parent paired sample pool) propagated to “Complete MR” and “Absent MR” child subgroups. Abstract doesn’t state per-subgroup N (in Table). True issue — Issue 26 (parent N propagation).
  • 8471: EGFR amp N=null displayed as 0 by audit. Abstract confirms patients existed but gives no N. True issue — Issue 26/8 hybrid. N should be null.
  • 8438: DoR missing for 3.0 mg/kg arm. Abstract table explicitly states DoR=5.5 months (95% CI 2.8, NE). View only has DoR for 2.4 mg/kg. True issue — one-off extraction gap.
  • 8439-8440: “High c-Met expression” and “Lower c-Met expression” flagged as missing subgroups. Abstract mentions “>30% ORR” and “10-15% ORR” in narrative Conclusions text — no defined subgroup, no specific N. False positive — audit LLM error (confidence=7).

All 3 missing_subgroup — CRC × HER2 status cross-tabulations (IHC2+ 0/3, IHC1+ 0/1, mutation/amp 0/3) present in abstract table but not extracted. View has CRC → HER2 IHC3+ but not the other HER2 levels for CRC.

  • Classification: True issue — extraction. Issue 33 (cross-tabulated subgroups). Note: audit LLM cited wrong abstract values (used overall HER2 numbers, not CRC-specific), but the finding is valid.

All 3 incorrect_value on patient_number_efficacy for hierarchical subgroups (RAS wild-type mCRC → Cohort A → IHC3+/IHC2+/ISH+/prior anti-HER2). Flat subgroups have correct N (40, 13, 16) but hierarchical copies have N=null. Audit reads null as 0.

  • Classification: True issue — view/query. Finding 1 (hierarchical N loss).

All 3 on “Overall” rollup row. dose_max=170 (should be 190 from Q3W arm), dose_frequency=Q2W (both Q2W+Q3W used), safety N=28 (Q2W-LD arm count, total is 43). Child arm rows (Q2W-LD, Q3W) have correct values.

  • Classification: True issue — query/view layer. Overall row inherits one child arm’s dose/safety metadata instead of aggregating across arms. Related to Issue 28 (arm collapsing) but distinct — here the arms aren’t collapsed, the Overall rollup just picks the wrong arm’s values.
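Aggregating across arms instead of copying one arm could look like the sketch below. Everything about the data shape is an assumption: the field names are hypothetical, 170 is assumed to be the Q2W-LD arm's dose_max, and the Q3W safety N of 15 is inferred as 43 − 28 on the assumption that these are the only two arms and that they do not share patients.

```ruby
# Hypothetical sketch: build the "Overall" row from all child arms.
def overall_row(arms)
  {
    arm: "Overall",
    # Highest dose used in any arm.
    dose_max: arms.map { |a| a[:dose_max] }.compact.max,
    # Every distinct schedule, in arm order.
    dose_frequency: arms.map { |a| a[:dose_frequency] }.compact.uniq.join("+"),
    # Arms are assumed disjoint, so safety populations add up.
    safety_n: arms.sum { |a| a[:safety_n].to_i }
  }
end

arms = [
  { arm: "Q2W-LD", dose_max: 170, dose_frequency: "Q2W", safety_n: 28 },
  { arm: "Q3W",    dose_max: 190, dose_frequency: "Q3W", safety_n: 15 } # 15 is inferred, see above
]
overall = overall_row(arms)
```

If the abstract reports its own overall population, that stated value should take precedence over any computed sum.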

All 3 on CRC → Dose ≥2.4 mg/kg subgroup. N=40 (parent total) but abstract says 34 (sum of dose levels ≥2.4: 7+4+12+4+7). Same propagation for ORR and cORR patient_count.

  • Classification: True issue — extraction. Issue 26 (parent N propagated to dose-subset child).

IHC2+/ISH- and IHC1+ cohorts: ORR=null in view but abstract explicitly states “No responses occurred in cohorts B or C” = 0% ORR.

  • Classification: True issue — post-processing. Issue 8 regression (zero-sentinel guard kills real 0% ORR).

CRC → HER2-positive dose_min=3.2, dose_max=8.0 (full escalation range). Abstract table: this subgroup is exclusively the 6.4 mg/kg RP2D cohort.

  • Classification: True issue — extraction. Issue 29/35 (dose scope captures study-level range instead of subgroup-specific RP2D).

PFS=4.8 (CRC) and PFS=4.4 (KRAS-mutant) — abstract reports TTP, not PFS. Also TTP values are for SD patients only, not full cohort.

  • Classification: True issue — extraction. Issue 32 (TTP→PFS) + SD-subpopulation attribution to parent cohort.

ORR and cORR patient_count=20 flagged as wrong — audit says should be 4 (responders). But patient_count in the query = number_of_participants = denominator (20 evaluable pts), not numerator. ORR=20% of 20 = 4 responders. Query is correct.

  • Classification: False positive — audit LLM misunderstood patient_count semantics (denominator vs numerator).

Safety N=30 and discontinuation=3% attributed to 2.0 mg/kg arm, but abstract reports these for the 2.4 mg/kg arm. No safety data stated for 2.0 mg/kg.

  • Classification: True issue — query/extraction. Safety data cross-contamination between dose arms. Related to Issue 31 but in safety domain.
  • 8473: PFS=1.6 months — view correctly has TTP, but query-layer TTP→PFS fallback remaps it. True issue — query (Issue 32). Fix deployed today.
  • 8474: HER2 retained in ctDNA N=3 — abstract says 2/3 retained. N should be 2 (subset with retained HER2), not 3 (tested). True issue — one-off extraction error (denominator vs numerator).

CRC → SD → MR positive (methylation panel) N=23, abstract says 31. The 23 is the PFS event count (23/31 events), not N. LLM picked the wrong number from the table row “5.3 (4.5, 5.9) 23/31”.

  • Classification: True issue — one-off extraction error (PFS events confused with N).
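To make the events-versus-N distinction concrete, the sketch below parses a cell in the layout quoted above, "median (CI low, CI high) events/N". The layout is inferred from this single example and the method name is hypothetical; cells with a non-numeric bound such as "NE" are deliberately not handled and return nil.

```ruby
# Hypothetical parser for a cell like "5.3 (4.5, 5.9) 23/31".
def parse_median_cell(cell)
  num = /\d+(?:\.\d+)?/
  pattern = /\A(#{num})\s*\((#{num}),\s*(#{num})\)\s*(\d+)\/(\d+)\z/
  m = pattern.match(cell.strip)
  return nil unless m

  {
    median: m[1].to_f, ci_low: m[2].to_f, ci_high: m[3].to_f,
    events: m[4].to_i, # numerator: PFS events
    n: m[5].to_i       # denominator: patients in the subgroup
  }
end

parsed = parse_median_cell("5.3 (4.5, 5.9) 23/31")
```

Here `parsed[:n]` is 31 and `parsed[:events]` is 23, which is the value that was wrongly stored as N.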

PFS=5.1 months — abstract says “median TTP = 5.1 mo”. No PFS reported.

  • Classification: True issue — extraction. Issue 32 (TTP→PFS). Query fix deployed today.

Spurious “Overall (All Arms)” row with OS=“Not Reached”. Abstract reports efficacy per dose arm (5.4 and 6.4 mg/kg) only — no combined overall population.

  • Classification: True issue — query/view. Synthetic rollup row not present in abstract. May relate to how the view constructs “Overall” rows.

dose_max=6.0 mg/kg for mCRC — abstract says mCRC doses were 1.6, 2.4, 3.0 mg/kg only. 6.0 is from the broader PK analysis across all tumor types.

  • Classification: True issue — extraction. Issue 29/35 (dose scope).

ORR=35% — abstract says 35% had “tumor reductions” (any shrinkage), but only 1/66 had RECIST PR. True ORR ≈ 1.5%. LLM conflated tumor shrinkage rate with ORR.

  • Classification: True issue — extraction. NEW pattern not in tracker: tumor shrinkage rate confused with RECIST ORR.

ORR patient_count=19, audit says should be 0 (responders). Same as pub 71927 — patient_count is denominator (N=19), not numerator. ORR=0% is correct.

  • Classification: False positive — audit LLM misunderstood patient_count semantics.

OS=“Not Reached” — abstract says “not yet mature” (no median estimable). Immature ≠ Not Reached.

  • Classification: True issue — extraction. Issue 34 (immature → Not Reached).

PFS=1.8 months — abstract says “median TTP of 1.8 months”. No PFS.

  • Classification: True issue — extraction. Issue 32 (TTP→PFS). Query fix deployed today.

hTMB/MSS subgroup N=null (audit reports 0). Abstract reports PFS and HR but doesn’t state per-subgroup N.

  • Classification: True issue — Finding 1 (hierarchical N loss) / data availability limit. N is unstated in abstract, view shows null, audit reads as 0.

HER2 3+ colorectal cancer dose_min=1.5 mg/kg — but efficacy table header says “6mg/kg and above”. Full escalation range (1.5–9) extracted instead of expansion dose (≥6).

  • Classification: True issue — extraction. Issue 29/35 (dose scope).

Safety N=29 for 2.4 mg/kg arm — abstract says A2 (2.4 mg/kg) n=31, A1 (2.8 mg/kg) n=29. Wrong arm’s N.

  • Classification: True issue — extraction/query. Safety N cross-contamination between dose arms (same pattern as pub 116843).

Low HER2 expression dose_max=8.0 mg/kg — but this is a Part 2 expansion cohort (5.4/6.4 mg/kg only). 8.0 is from Part 1 escalation.

  • Classification: True issue — extraction. Issue 29/35 (dose scope).

PFS=4.14 months (converted from 18 weeks) — abstract says “median TTP = 18 weeks”. No PFS.

  • Classification: True issue — extraction. Issue 32 (TTP→PFS). Query fix deployed today.
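For reference, the 4.14 figure is consistent with converting weeks to months at 365.25 / 12 days per month. The factor the pipeline actually uses is not stated here, so the default below is an assumption that merely reproduces the reported value.

```ruby
# Hypothetical conversion helper; days_per_month defaults to 365.25 / 12 (assumed).
def weeks_to_months(weeks, days_per_month: 365.25 / 12)
  (weeks * 7 / days_per_month).round(2)
end

months = weeks_to_months(18)
```

The unit conversion is independent of the endpoint label: the converted value is still a TTP, not a PFS.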