Get started
Get Data Gov running on your machine in about 30 minutes. You will clone the repo, restore a production database snapshot, and boot the development server.
What you will have at the end
A local Rails server at http://localhost:3000 with the ActiveAdmin dashboard. You can browse drugs, diseases, clinical trials, and all 87 admin resources. You can run Thor pipeline tasks against real data.
Prerequisites
Install these before starting.
| Dependency | Version | Purpose |
|---|---|---|
| Ruby | 3.4.2 | Application runtime (manage with rbenv) |
| PostgreSQL | 17+ | Primary database (Docker image with pgvector) |
| Redis | 7+ | Sidekiq job queue backend |
| Docker | Latest | Runs Postgres and Redis containers |
| Node.js | LTS | Shakapacker/React builds |
| libpq | Latest | PostgreSQL C client library |
| graphviz | Latest | ERD generation (optional) |
You also need credentials for external services.
| Credential | Required for | How to get it |
|---|---|---|
| AACT database | Clinical trials data | Register at aact.ctti-clinicaltrials.org |
| Google OAuth | Admin login | Create credentials in Google Cloud Console |
| OpenAI API key | LLM pipelines | Get from OpenAI dashboard |
| AWS credentials | S3, Batch, CloudWatch | Optional for local dev — ask your lead |
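
Once the tools are installed, the versions in the dependency table can be checked with a short script. The sketch below is one way to do it, assuming each tool is on your PATH and prints a conventional version banner. The `TOOLS` list and the helper names are illustrative and not part of the repo. PostgreSQL and Redis are left out because they run in Docker.

```ruby
require "open3"

# Commands to probe (an illustrative list, not taken from the repo).
# Only Ruby has an exact pin in the table; the rest are checked for presence.
TOOLS = {
  "ruby"   => %w[ruby --version],
  "docker" => %w[docker --version],
  "node"   => %w[node --version],
  "psql"   => %w[psql --version] # shipped with libpq
}.freeze

REQUIRED_RUBY = "3.4.2"

# Pull the first dotted version number out of a version banner.
def extract_version(text)
  text[/\d+(?:\.\d+)+/]
end

# Run a command and return its version string, or nil if the tool is missing.
def installed_version(command)
  output, status = Open3.capture2e(*command)
  status.success? ? extract_version(output) : nil
rescue Errno::ENOENT
  nil
end

# True when the detected Ruby version equals the pinned one.
def ruby_matches?(version)
  Gem::Version.new(version) == Gem::Version.new(REQUIRED_RUBY)
end

# Print one status line per tool.
def report
  TOOLS.each do |name, command|
    version = installed_version(command)
    status =
      if version.nil?
        "MISSING"
      elsif name == "ruby" && !ruby_matches?(version)
        "found #{version}, expected #{REQUIRED_RUBY}"
      else
        "ok (#{version})"
      end
    puts format("%-8s %s", name, status)
  end
end

report if __FILE__ == $PROGRAM_NAME
```

Save it under any name you like and run it with `ruby`. A `MISSING` line usually means the tool is not installed or not on your PATH.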
Step 1: Install system dependencies
```bash
# Install native libraries
brew install libpq graphviz
# /opt/homebrew is the Apple Silicon Homebrew prefix; Intel Macs use /usr/local
export PATH="/opt/homebrew/opt/libpq/bin:$PATH"
echo 'export PATH="/opt/homebrew/opt/libpq/bin:$PATH"' >> ~/.zshrc

# Install rbenv and Ruby
brew install rbenv
rbenv install 3.4.2
rbenv global 3.4.2

# Install Bundler
gem install bundler
```
Step 2: Clone and install
```bash
git clone git@github.com:Bioloupe-Inc/bioloupe-data-gov.git
cd bioloupe-data-gov

# Install Ruby dependencies
bundle install

# Install JavaScript dependencies (requires pnpm on your PATH)
pnpm install
```
Step 3: Configure environment variables
```bash
cp env.example .env
```
Open `.env` and set these required values.
```bash
# Primary database (Docker defaults)
DB_HOST=localhost
DB_NAME=datalake
DB_USERNAME=bioloupe
DB_PASSWORD=bioloupe

# AACT clinical trials database (read-only)
AACT_DB_HOST=aact-db.ctti-clinicaltrials.org
AACT_DB_NAME=aact
AACT_DB_USERNAME=<your-aact-username>
AACT_DB_PASSWORD=<your-aact-password>

# Google OAuth for admin login
GOOGLE_CLIENT_ID=<your-client-id>
GOOGLE_CLIENT_SECRET=<your-client-secret>

# OpenAI for LLM pipelines
OPENAI_API_KEY=<your-key>
```
The `env.example` file documents all optional variables: AWS (S3, Batch, Athena), ChEMBL, Cision, FMP, Slack, Brevo, Klaviyo, ASCO, New Relic, and Airbrake. You only need these for specific pipeline features.
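It is easy to leave a `<your-...>` placeholder behind. The sketch below scans `.env` for required variables that are missing, empty, or still placeholders. The key list comes from the required values listed in this step. The parser is a deliberately simple assumption (plain `KEY=value` lines only) and not the loader the app itself uses.

```ruby
# Required keys, taken from the required values in this step.
REQUIRED_KEYS = %w[
  DB_HOST DB_NAME DB_USERNAME DB_PASSWORD
  AACT_DB_HOST AACT_DB_NAME AACT_DB_USERNAME AACT_DB_PASSWORD
  GOOGLE_CLIENT_ID GOOGLE_CLIENT_SECRET
  OPENAI_API_KEY
].freeze

# Parse simple KEY=value lines, skipping blanks and comments.
# No multi-line values or variable interpolation are handled.
def parse_env(text)
  text.each_line.with_object({}) do |line, env|
    line = line.strip
    next if line.empty? || line.start_with?("#")

    key, value = line.split("=", 2)
    next if value.nil?

    # Drop one pair of surrounding quotes, if present.
    env[key.strip] = value.strip.gsub(/\A["']|["']\z/, "")
  end
end

# A key counts as unset when it is missing, empty, or still a <placeholder>.
def unset_keys(env, required = REQUIRED_KEYS)
  required.select do |key|
    value = env[key]
    value.nil? || value.empty? || value.match?(/\A<.*>\z/)
  end
end

if __FILE__ == $PROGRAM_NAME && File.exist?(".env")
  missing = unset_keys(parse_env(File.read(".env")))
  puts missing.empty? ? "All required variables are set." : "Still unset: #{missing.join(', ')}"
end
```

Run it from the repository root so that it finds `.env`.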
Step 4: Start local infrastructure
Docker Compose provides PostgreSQL 17 (with pgvector) and Redis 7.
```bash
docker compose up -d
```
Verify both services are healthy.
```bash
docker compose ps
```
You should see two running containers. Stop them later with `docker compose down`.
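A container can be listed as running before it accepts connections. The sketch below tries a TCP connection to each service. It assumes the Compose file publishes the default ports on localhost (5432 for PostgreSQL, 6379 for Redis); adjust `SERVICES` if your setup maps them differently.

```ruby
require "socket"

# Default ports; whether the Compose file publishes these is an assumption.
SERVICES = { "PostgreSQL" => 5432, "Redis" => 6379 }.freeze

# True if a TCP connection to host:port succeeds within the timeout.
def port_open?(host, port, timeout: 1)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue StandardError
  # Refused, timed out, or unresolvable: all count as unreachable here.
  false
end

if __FILE__ == $PROGRAM_NAME
  SERVICES.each do |name, port|
    state = port_open?("localhost", port) ? "reachable" : "NOT reachable"
    puts "#{name} on localhost:#{port}: #{state}"
  end
end
```

An open port only shows that something is listening. It does not confirm that the database inside is ready for queries.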
Step 5: Restore the database
Building the database from migrations is not supported: the schema has 206 tables with complex interdependencies. Restore from a production dump instead.
```bash
bundle exec thor db:restore
```
This command lists available S3 backups and handles `pg_restore` automatically. Because the backups live in S3, this step likely needs the AWS credentials listed under Prerequisites; ask your lead if you do not have them.
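After a restore, counting tables gives a quick sanity check against the 206 tables mentioned above. The sketch below shells out to `psql` using the `DB_*` variables from Step 3. It assumes the tables live in the `public` schema, and the helper names are illustrative. Treat the count as a rough signal, since bookkeeping tables may or may not be included in the 206 figure.

```ruby
require "open3"

# Counts ordinary tables in the public schema (the schema name is an assumption).
TABLE_COUNT_SQL = <<~SQL.strip
  SELECT count(*) FROM information_schema.tables
  WHERE table_schema = 'public' AND table_type = 'BASE TABLE'
SQL

# Build the psql invocation from a hash-like object holding the DB_* settings.
def psql_command(env)
  [
    "psql",
    "-h", env.fetch("DB_HOST", "localhost"),
    "-U", env.fetch("DB_USERNAME"),
    "-d", env.fetch("DB_NAME"),
    "-t", "-A", # tuples only, unaligned: the output is just the number
    "-c", TABLE_COUNT_SQL
  ]
end

# Run the query and return the table count as an Integer.
def table_count(env)
  password = { "PGPASSWORD" => env.fetch("DB_PASSWORD") }
  output, status = Open3.capture2e(password, *psql_command(env))
  raise "psql failed: #{output}" unless status.success?

  Integer(output.strip)
end
```

With the Step 3 variables exported in your shell, `puts table_count(ENV)` prints the count. A number far below 206 suggests the restore did not complete.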
Step 6: Create your user account
Open a Rails console.
```bash
bundle exec rails c
```
Create an admin user.
```ruby
# Skip PaperTrail versioning for this one-off record
PaperTrail.request(enabled: false) do
  # create! raises if validation fails instead of failing silently
  User.create!(
    email: 'your.name@bioloupe.com',
    name: 'Your Name',
    role: 'admin'
  )
end
```
Valid roles: admin, editor, viewer, client. Use admin for full access during development. The client role is API-only and cannot access ActiveAdmin.
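The role rules above can be written down as a small lookup. This is a plain-Ruby illustration only, not the app's authorization code. It assumes that admin, editor, and viewer can all reach the dashboard, which the text implies but does not state.

```ruby
# Roles listed in this guide.
VALID_ROLES = %w[admin editor viewer client].freeze

# client is API-only; dashboard access for the others is an assumption.
ADMIN_UI_ROLES = (VALID_ROLES - %w[client]).freeze

# True if the role is one of the four valid roles.
def valid_role?(role)
  VALID_ROLES.include?(role.to_s)
end

# True if a user with this role could open ActiveAdmin.
def admin_ui_access?(role)
  ADMIN_UI_ROLES.include?(role.to_s)
end
```

For example, `admin_ui_access?("client")` is false, which is why a development account should use `admin`.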
Step 7: Start the dev server
For backend-only work, one terminal is enough.
```bash
bundle exec rails server
# Open http://localhost:3000
```
The root path redirects to `/admin`, the ActiveAdmin dashboard.
For full-stack development with React hot reload, use two terminals.
```bash
# Terminal 1: Webpack dev server with HMR
bin/shakapacker-dev-server

# Terminal 2: Rails server on a different port
rails s -p 3010
# Open http://localhost:3010
```
Or use Foreman with `bin/dev` to run both processes from a single terminal.
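The server can take a few seconds to boot. The sketch below polls the `/up` health endpoint (the same one checked in Step 8) until it answers 200. The helper names, timeouts, and retry counts are illustrative choices. Pass the base URL that matches the port you started the server on.

```ruby
require "net/http"
require "uri"

# True if GET <base_url>/up answers with HTTP 200.
def healthy?(base_url)
  uri = URI.join(base_url, "/up")
  # Third argument nil: never route this local check through a proxy.
  http = Net::HTTP.new(uri.host, uri.port, nil)
  http.open_timeout = 2
  http.read_timeout = 2
  http.get(uri.path).code == "200"
rescue StandardError
  # Connection refused or timed out: the server is not ready yet.
  false
end

# Poll until the server is healthy or the attempts run out.
def wait_until_healthy(base_url, attempts: 30, delay: 1)
  attempts.times do
    return true if healthy?(base_url)

    sleep delay
  end
  false
end
```

For example, `wait_until_healthy("http://localhost:3000")` returns true once the backend-only server is up, or false after about 30 seconds.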
Step 8: Verify your setup
Run these checks to confirm everything works.
```bash
# Run the test suite
bundle exec rails test

# Check that a Thor task loads
bundle exec thor regulatory:fda:download_and_extract --help

# Check that the health endpoint responds (the server from Step 7 must be running)
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:3000/up
# Should print 200
```
Step 9: Run a pipeline (optional)
Try a single Thor task to see how pipelines work.
```bash
bundle exec thor regulatory:fda:download_and_extract
```
Or launch a full workflow from the Rails console.
```ruby
WorkflowRunnerJob.perform_now(
  workflow_type: "ClinicalTrialsWorkflow",
  params: {},
  reset_if_exists: true
)
```
You can also trigger workflows from the ActiveAdmin UI at `/admin/workflow_instances`.
Next steps
Now that Data Gov is running, read the docs in order.
- Data model — Understand the 206-table schema and how entities connect
- Clinical trials — Follow a trial from ClinicalTrials.gov into the knowledge graph
- Architecture — Learn the codebase patterns before adding features