Drugs
Drugs are the center of the Bioloupe knowledge graph. Every other domain — targets, diseases, clinical trials, organisations, deals, news — connects back to at least one drug. The drugs table holds one row per active pharmaceutical ingredient, tracked at the INN level: the molecule itself, not a brand name or formulation.
From a single drug row, you can walk outward to its molecular targets, owning companies, regulatory approvals across four regions, branded products, and lifecycle milestones. Two materialized views pre-join the most common paths so that pipeline queries need zero JOINs.
Entities
Section titled “Entities”erDiagram
drugs ||--o{ drug_ownerships : "owned by"
drugs ||--o{ drug_target_actions : "acts on"
drugs ||--o{ drug_approvals : "approved as"
drugs ||--o{ brand_drugs : "marketed as"
drugs ||--o{ drug_milestones : "lifecycle events"
drugs }o--o| drugs : "parent (combinations)"
drugs }o--o| drugs : "similar_drug (biosimilars)"
drugs }o--o| technologies : "modality"
drug_ownerships }o--|| organisations : "owner"
drug_target_actions }o--|| targets : "target"
drug_target_actions }o--o| modes_of_action : "mechanism"
drug_approvals ||--o{ drug_approval_organisations : "granted by"
drug_approvals ||--o{ indications : "indicated for"
drug_approval_organisations }o--|| organisations : "organisation"
brand_drugs ||--o{ brand_drug_approvals : "approved in"
brand_drug_approvals }o--|| drug_approvals : "approval"
drug_milestones }o--o| diseases : "for disease"
The core entity. One row per molecule, identified by a stable bloupe_id.
| Column | Type | Purpose |
|---|---|---|
bloupe_id | text (unique) | Stable Bioloupe identifier for external references |
name | text | INN or common name (e.g., pembrolizumab, trastuzumab) |
synonyms | json | Array of alternative names, codes, and abbreviations |
drug_type | text | Molecular classification: small_molecule, biologic, antibody, adc, cell_therapy, gene_therapy, vaccine, radiopharmaceutical, oligonucleotide, peptide, combination |
drug_status | text | Lifecycle status: marketed, in_development, discontinued, withdrawn, suspended |
first_in_class | boolean | TRUE if first drug approved for its mechanism; NULL if unknown |
technology_id | FK | Drug modality/technology (e.g., monoclonal antibody, small molecule). FK to technologies |
parent_id | FK (self) | Parent drug for combinations or salt-to-base redirects |
similar_drug_id | FK (self) | Reference/originator drug for biosimilars |
atc_code | json | WHO ATC classification codes (a drug can carry multiple) |
pharmacology | jsonb | PK/PD data: half-life, bioavailability, route, time-to-peak |
cell_therapy_metadata | jsonb | Cell source, cell type, genetic modification, target antigen (cell/gene therapies only) |
discontinuation_reason | text | Why the drug was stopped: safety, efficacy, commercial, strategic, regulatory |
embedding | text | pgvector embedding for semantic search |
drug_ownerships
Section titled “drug_ownerships”Maps drugs to owning organisations. Supports multi-party ownership, licensing geography, and temporal tracking.
| Column | Type | Purpose |
|---|---|---|
drug_id | FK | The owned drug |
organisation_id | FK | The owning organisation |
is_current | boolean | TRUE if ownership is active; FALSE if historical. Enforced: is_current = true requires ended_at IS NULL |
deal_type | text | How ownership was obtained: originator, acquisition, license, co_development, collaboration, spin_off |
geography_include | json | ISO country codes or regions where ownership applies |
geography_exclude | json | ISO country codes or regions explicitly excluded |
started_at | timestamptz | When ownership began |
ended_at | timestamptz | When ownership ended (NULL if still current) |
drug_target_actions
Section titled “drug_target_actions”Mechanism of action — how a drug interacts with a molecular target. One row per drug-target-action triple, enforced by a unique constraint on (drug_id, target_id, mode_of_action_id).
| Column | Type | Purpose |
|---|---|---|
drug_id | FK | The drug acting on the target |
target_id | FK | The molecular target (FK to targets) |
mode_of_action_id | FK | How the drug acts: inhibitor, agonist, antagonist, etc. (FK to modes_of_action) |
is_primary | boolean | Whether this is the primary mechanism for the drug |
undisclosed | boolean | TRUE if the target has not been publicly disclosed |
drug_approvals
Section titled “drug_approvals”Regulatory approvals across major regions. One row per drug-region approval event. Carries regulatory designation flags and structured label data.
| Column | Type | Purpose |
|---|---|---|
drug_id | FK | The approved drug |
approval_region | text | Regulatory region: usa, eu, jp, cn |
approval_status | text | Current status: approved, tentative, withdrawn, refused |
approval_date | timestamptz | Date of regulatory approval |
brand_name | text | Approved trade name (e.g., Keytruda) |
application_number | text | NDA, BLA, or MAA number |
innovation_type | text | new_molecular_entity, new_formulation, new_indication, new_combination, new_dosage_form, biosimilar, generic |
dosage_form | text | Physical form: tablet, injection, infusion, capsule |
strength | text | Dosage strength (e.g., 100mg, 50mg/mL) |
route | text | Route of administration: oral, intravenous, subcutaneous |
breakthrough_therapy | boolean | FDA Breakthrough Therapy Designation |
priority_review | boolean | FDA Priority Review |
accelerated_approval | boolean | FDA Accelerated Approval pathway |
fast_track | boolean | FDA Fast Track designation |
orphan_drug | boolean | Orphan Drug designation |
prime_eu | boolean | EU PRIME designation |
rmat | boolean | FDA RMAT (Regenerative Medicine Advanced Therapy) designation |
real_time_review | boolean | Real-Time Oncology Review |
estimated_loss_of_exclusivity | timestamptz | Estimated LOE date for generic/biosimilar entry |
label | jsonb | Structured FDA label data (dosing, warnings, contraindications) |
rems | jsonb | Risk Evaluation and Mitigation Strategy data |
region_metadata | jsonb | Region-specific metadata: PDUFA target date, CHMP opinion, PMDA review |
drug_approval_organisations
Section titled “drug_approval_organisations”Organisations associated with a regulatory approval. One row per approval-organisation-role triple (unique constraint enforced).
| Column | Type | Purpose |
|---|---|---|
drug_approval_id | FK | The regulatory approval |
organisation_id | FK | The organisation |
role | text | Organisation’s role: applicant, manufacturer, distributor |
brand_drugs
Section titled “brand_drugs”Marketed brand-name products of a drug molecule. Multiple brands can exist per drug across different regions and formulations.
| Column | Type | Purpose |
|---|---|---|
drug_id | FK | The underlying molecule |
brand_name | text | Trade/brand name (e.g., Keytruda, Herceptin) |
active_ingredient | text | Active ingredient as listed on the label |
dosage_form | text | Pharmaceutical form (tablet, injection, capsule) |
route | text | Route of administration |
strength | text | Dosage strength |
aliases | json | Alternative brand names or regional names |
brand_drug_approvals
Section titled “brand_drug_approvals”Bridge table linking branded products to their regulatory approvals. One row per brand-approval pair (unique constraint enforced).
| Column | Type | Purpose |
|---|---|---|
brand_drug_id | FK | The branded product |
drug_approval_id | FK | The regulatory approval |
drug_milestones
Section titled “drug_milestones”Regulatory and development lifecycle events. One row per milestone, optionally scoped to a disease.
| Column | Type | Purpose |
|---|---|---|
drug_id | FK | The drug |
disease_id | FK (nullable) | Specific disease context, if applicable |
milestone_type | text | Event type: discovery, pre_clinical, ind_filing, end_of_phase2_meeting, nda_submission, bla_submission, maa_submission, pdufa_date, crl, advisory_committee, approval, label_expansion, post_marketing_requirement, withdrawal |
milestone_date | timestamptz | Date of event (can be future for PDUFA dates) |
region | text | Regulatory region: usa, eu, jp, cn, or other ISO regions |
description | text | Free-text description of the milestone |
Design decisions
Section titled “Design decisions”| Decision | Why | Alternative considered |
|---|---|---|
drug_status is a curated column, not derived from approvals | 762 marketed drugs have zero approval records (pre-existing marketed compounds, regional gaps in source data). Deriving status from approvals would misclassify them. | Computed column from drug_approvals — rejected because it cannot account for curation gaps. |
brand_drug_approvals join table instead of fixed-region FKs | Regions are not a closed set. A join table scales to any number of approval regions without schema changes. | Hardcoded usa_approval_id, eu_approval_id, jp_approval_id, cn_approval_id columns on brand_drugs — rejected for rigidity. |
parent_id self-reference for combinations | Combination products (e.g., encorafenib + binimetinib) need a parent row. Children redirect to the parent for pipeline queries. Salt forms also redirect to their base molecule. | Separate combination_components join table — rejected because the parent-child relationship is singular and hierarchical. |
similar_drug_id self-reference for biosimilars | Biosimilars reference a single originator drug. A self-FK keeps the relationship navigable without a separate table. | Separate biosimilar_relationships table — rejected as over-engineering for a 1:N relationship. |
Materialized views
Section titled “Materialized views”Two materialized views pre-join the most common query paths. They trade write-time cost for read-time speed. See Materialized views for refresh schedules and indexing strategy.
mv_development_programs
Section titled “mv_development_programs”Highest development phase per drug-disease pair. Combines approved indications (rank 100) with clinical trial phases extracted from trial arm interventions. One row per (drug_id, disease_id), keeping only the highest phase. Key columns: highest_phase, highest_phase_rank (integer for sorting), phase_source (approval or clinical_trial), and nct_ids (array of contributing trial identifiers).
mv_drug_pipeline
Section titled “mv_drug_pipeline”Pre-joined drug overview for zero-JOIN pipeline queries. One row per drug with denormalized fields: drug_name, drug_status, drug_type, technology_name, drug_class, primary_target_name, target_names (array), current_owner_names (array), highest_phase_overall, and highest_phase_rank_overall. Draws phase data from mv_development_programs.
Example queries
Section titled “Example queries”Which drugs target EGFR?
SELECT d.name, moa.name AS mechanism, dta.is_primaryFROM drug_target_actions dtaJOIN drugs d ON d.id = dta.drug_idJOIN targets tgt ON tgt.id = dta.target_idLEFT JOIN modes_of_action moa ON moa.id = dta.mode_of_action_idWHERE tgt.name = 'EGFR'ORDER BY dta.is_primary DESC, d.name;Who owns pembrolizumab?
SELECT o.name AS owner, dow.deal_type, dow.geography_include, dow.is_currentFROM drug_ownerships dowJOIN drugs d ON d.id = dow.drug_idJOIN organisations o ON o.id = dow.organisation_idWHERE d.name = 'pembrolizumab'ORDER BY dow.is_current DESC, dow.started_at DESC;Show all Phase 3+ drugs (using the materialized view)
SELECT drug_name, drug_status, primary_target_name, current_owner_names, highest_phase_overallFROM mv_drug_pipelineWHERE highest_phase_rank_overall >= 3ORDER BY highest_phase_rank_overall DESC, drug_name;