Lookups
Lookup tables hold reference data that is seeded once and rarely changes. They provide controlled vocabularies, geographic codes, and ontology mappings that other domains reference through foreign keys. The lookups domain contains six reference entities plus one audit entity that tracks changes across the entire database.
Reference tables
Section titled “Reference tables”countries
Section titled “countries”198 countries with ISO 3166-1 codes, organized into five geographic regions.
| Column | Type | Purpose |
|---|---|---|
name | text (unique) | Full country name (e.g., “United States”, “Germany”) |
alpha2 | text (unique) | ISO 3166-1 alpha-2 code (e.g., US, DE, JP) |
alpha3 | text (unique) | ISO 3166-1 alpha-3 code (e.g., USA, DEU, JPN) |
region | text | Geographic region: africa, americas, asia_pacific, europe, middle_east |
All three name/code columns carry unique constraints. Region is nullable but populated for every seeded row.
endpoints
Section titled “endpoints”39 clinical trial endpoints grouped into four categories.
| Column | Type | Purpose |
|---|---|---|
name | text (unique) | Full endpoint name (e.g., “Overall Survival”, “Progression-Free Survival”) |
abbreviation | text | Common abbreviation (e.g., OS, PFS, ORR, DOR, CR, DCR) |
description | text | Definition of what this endpoint measures |
category | text | efficacy, safety, pharmacokinetic, quality_of_life |
modes_of_action
Section titled “modes_of_action”35 drug mechanisms describing how a compound interacts with its molecular target.
| Column | Type | Purpose |
|---|---|---|
name | text (unique) | Mechanism name (e.g., “Inhibitor”, “Antagonist”, “Agonist”, “Degrader”) |
description | text | Mechanistic description of the mode of action |
Entries range from classical pharmacology (Inhibitor, Agonist, Antagonist) through targeted modalities (Antibody-Drug Conjugate, CAR-T Cell Therapy, Bispecific Engagement) to epigenetic and gene-level interventions (Histone Deacetylase Inhibition, RNA Interference, Gene Therapy). Referenced by drug_target_actions to link drugs to targets with a specific mechanism.
therapeutic_areas
Section titled “therapeutic_areas”18 broad disease categories used to classify drugs and development programs.
| Column | Type | Purpose |
|---|---|---|
name | text (unique) | Therapeutic area name (e.g., oncology, malignant_hematology, neurology) |
description | text | Description of the therapeutic area |
Full list: oncology, malignant_hematology, non_malignant_hematology, cardiology, neurology, immunology, respiratory, infectious_disease, metabolic, dermatology, ophthalmology, rare_disease, gastroenterology, rheumatology, endocrinology, nephrology, pulmonology, psychiatry.
ncit_concepts
Section titled “ncit_concepts”NCI Thesaurus concepts for standardized biomedical coding. Unlike the fixed-size lookup tables above, this table grows as new concepts are imported.
| Column | Type | Purpose |
|---|---|---|
ncit_code | text (unique) | NCI Thesaurus code (e.g., C1234). Natural key. |
name | text | Concept preferred name |
semantic_types | jsonb | Array of NCI Thesaurus semantic type labels |
description | text | Concept definition text |
concept_mappings
Section titled “concept_mappings”Maps internal Bioloupe entities to external ontology codes for interoperability. Supports cross-referencing drugs, diseases, targets, biomarkers, endpoints, and modes of action against ten external vocabularies.
| Column | Type | Purpose |
|---|---|---|
entity_type | text | Type of entity being mapped: drug, disease, target, biomarker, endpoint, mode_of_action |
entity_id | uuid | UUID of the entity in this database |
vocabulary | text | External vocabulary: omop, icd10, mesh, snomed, rxnorm, meddra, atc, unii, chembl, drugbank |
code | text | Code/identifier in the external vocabulary |
preferred_term | text | Preferred term as it appears in the external vocabulary |
A unique constraint on (entity_type, entity_id, vocabulary, code) prevents duplicate mappings. Composite indexes on (entity_type, entity_id) and (vocabulary, code) support both directions of lookup: “what external codes does this entity have?” and “which entity does this external code refer to?”
Seeding
Section titled “Seeding”Lookup tables are populated by idempotent seeders that use upsert logic. Re-running a seeder never creates duplicates — it matches on the natural key (typically name) and updates any changed fields.
| Table | Seeded rows |
|---|---|
countries | 198 |
endpoints | 39 |
modes_of_action | 35 |
therapeutic_areas | 18 |
ncit_concepts and concept_mappings are populated by ETL pipelines rather than seeders, and grow over time.
Design decisions
Section titled “Design decisions”| Decision | Why |
|---|---|
therapeutic_areas as a lookup table with FK constraints | Replaced a free-text JSONB array on drugs. A dedicated table enforces a controlled vocabulary, prevents typos, and makes the set of valid values discoverable without scanning data. |
| Idempotent upsert seeding | Seeders can run repeatedly during development, CI, and production bootstrapping without risk of duplicates or constraint violations. The natural key (usually name) is the match target. |
concept_mappings as a polymorphic bridge table | A single table handles cross-references for six entity types and ten vocabularies. Adding a new entity type or vocabulary requires no schema change. |
audit_log
Section titled “audit_log”Tracks every insert, update, and delete across the database. Populated by PostgreSQL triggers, not application code, so changes from any source (API, ETL pipeline, manual SQL) are captured.
| Column | Type | Purpose |
|---|---|---|
id | bigint (identity) | Auto-incrementing primary key |
entity_type | text | Table name that was modified (e.g., drugs, organisations) |
entity_id | text | Primary key of the modified row (text to support both UUID and bigint) |
action | text | insert, update, delete |
changed_fields | jsonb | Array of column names that changed (update only) |
old_values | jsonb | Previous column values (update and delete) |
new_values | jsonb | New column values (insert and update) |
changed_by | text | User or system identifier |
changed_at | timestamptz | Event timestamp (uses clock_timestamp() for actual wall-clock time) |
application_name | text | Source application (e.g., bioloupe-api, etl-pipeline) |
client_addr | text | Client IP address |
Indexes on (entity_type, entity_id), changed_at, and action support common audit queries: “show me the history of this row,” “what changed in the last hour,” and “how many deletes happened today.”
Example queries
Section titled “Example queries”Which therapeutic areas are available?
SELECT name, descriptionFROM therapeutic_areasORDER BY name;Audit trail for a specific drug
SELECT action, changed_fields, old_values, new_values, changed_by, changed_at, application_nameFROM audit_logWHERE entity_type = 'drugs' AND entity_id = '550e8400-e29b-41d4-a716-446655440000'ORDER BY changed_at DESC;