Get started

Get Data Gov running on your machine in about 30 minutes. You will clone the repo, restore a production database snapshot, and boot the development server.

When you finish, you will have a local Rails server at http://localhost:3000 with the ActiveAdmin dashboard. You can browse drugs, diseases, clinical trials, and the rest of the 87 admin resources, and you can run Thor pipeline tasks against real data.

Install these before starting.

| Dependency | Version | Purpose |
| --- | --- | --- |
| Ruby | 3.4.2 | Application runtime (manage with rbenv) |
| PostgreSQL | 17+ | Primary database (Docker image with pgvector) |
| Redis | 7+ | Sidekiq job queue backend |
| Docker | Latest | Runs Postgres and Redis containers |
| Node.js | LTS | Shakapacker/React builds |
| libpq | Latest | PostgreSQL C client library |
| graphviz | Latest | ERD generation (optional) |

You also need credentials for external services.

| Credential | Required for | How to get it |
| --- | --- | --- |
| AACT database | Clinical trials data | Register at aact.ctti-clinicaltrials.org |
| Google OAuth | Admin login | Create credentials in Google Cloud Console |
| OpenAI API key | LLM pipelines | Get from OpenAI dashboard |
| AWS credentials | S3, Batch, CloudWatch | Optional for local dev; ask your lead |
Terminal window
# Install native libraries
brew install libpq graphviz
export PATH="/opt/homebrew/opt/libpq/bin:$PATH"
echo 'export PATH="/opt/homebrew/opt/libpq/bin:$PATH"' >> ~/.zshrc
# Install rbenv and Ruby
brew install rbenv
rbenv install 3.4.2
rbenv global 3.4.2
# Install Bundler
gem install bundler
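
After the installs above, a quick check can confirm that the toolchain is reachable before you clone the repo. The following is a minimal sketch, not part of the project: the executable names in `REQUIRED_TOOLS` are assumptions about how each prerequisite appears on your PATH (for example, Homebrew's libpq provides `pg_restore`), and the helper names are hypothetical.

```ruby
# Sketch: confirm the prerequisite tools are reachable on PATH.
# The tool names below are assumptions, not taken from the project.

EXPECTED_RUBY = "3.4.2"
REQUIRED_TOOLS = %w[docker node pg_restore bundle].freeze

# Return the full path of an executable found on the given PATH string, or nil.
def find_executable(name, path = ENV.fetch("PATH", ""))
  path.split(File::PATH_SEPARATOR).each do |dir|
    next if dir.empty?

    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Return the tools from the list that cannot be found on PATH.
def missing_tools(tools = REQUIRED_TOOLS, path = ENV.fetch("PATH", ""))
  tools.reject { |tool| find_executable(tool, path) }
end

if __FILE__ == $PROGRAM_NAME
  puts "Ruby #{RUBY_VERSION} (expected #{EXPECTED_RUBY})"
  missing = missing_tools
  puts missing.empty? ? "All tools found" : "Missing: #{missing.join(', ')}"
end
```

Save it under any name and run it with `ruby`. If a tool is reported missing, recheck the matching install step and your PATH export.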
Terminal window
git clone git@github.com:Bioloupe-Inc/bioloupe-data-gov.git
cd bioloupe-data-gov
# Install Ruby dependencies
bundle install
# Install JavaScript dependencies (requires Node.js and pnpm on your PATH)
pnpm install
Terminal window
cp env.example .env

Open .env and set these required values.

Terminal window
# Primary database (Docker defaults)
DB_HOST=localhost
DB_NAME=datalake
DB_USERNAME=bioloupe
DB_PASSWORD=bioloupe
# AACT clinical trials database (read-only)
AACT_DB_HOST=aact-db.ctti-clinicaltrials.org
AACT_DB_NAME=aact
AACT_DB_USERNAME=<your-aact-username>
AACT_DB_PASSWORD=<your-aact-password>
# Google OAuth for admin login
GOOGLE_CLIENT_ID=<your-client-id>
GOOGLE_CLIENT_SECRET=<your-client-secret>
# OpenAI for LLM pipelines
OPENAI_API_KEY=<your-key>

The env.example file documents all optional variables: AWS (S3, Batch, Athena), ChEMBL, Cision, FMP, Slack, Brevo, Klaviyo, ASCO, New Relic, and Airbrake. You only need these for specific pipeline features.
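
Before moving on, it can help to confirm that none of the required values are still placeholders. The sketch below is an illustration only: `REQUIRED_KEYS` mirrors the variables listed above, the helper names are hypothetical, and the parser is a simplified KEY=VALUE reader, not a full dotenv implementation.

```ruby
# Sketch: flag required .env keys that are missing, empty, or still placeholders.

REQUIRED_KEYS = %w[
  DB_HOST DB_NAME DB_USERNAME DB_PASSWORD
  AACT_DB_HOST AACT_DB_NAME AACT_DB_USERNAME AACT_DB_PASSWORD
  GOOGLE_CLIENT_ID GOOGLE_CLIENT_SECRET
  OPENAI_API_KEY
].freeze

# Parse KEY=VALUE lines into a hash, skipping blank lines and comments.
def parse_env(text)
  text.each_line.with_object({}) do |line, env|
    line = line.strip
    next if line.empty? || line.start_with?("#")

    key, value = line.split("=", 2)
    next if value.nil?

    # Drop one pair of surrounding quotes, if present.
    env[key.strip] = value.strip.gsub(/\A["']|["']\z/, "")
  end
end

# Return required keys whose value is absent, empty, or looks like <placeholder>.
def unset_keys(env, required = REQUIRED_KEYS)
  required.select do |key|
    value = env[key]
    value.nil? || value.empty? || value.match?(/\A<.*>\z/)
  end
end

if __FILE__ == $PROGRAM_NAME && File.exist?(".env")
  unset = unset_keys(parse_env(File.read(".env")))
  puts unset.empty? ? ".env looks complete" : "Still unset: #{unset.join(', ')}"
end
```

Run it from the repository root so that it finds `.env`. It only inspects the file and never prints secret values.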

Docker Compose provides PostgreSQL 17 (with pgvector) and Redis 7.

Terminal window
docker compose up -d

Verify both services are healthy.

Terminal window
docker compose ps

You should see two running containers. Stop them later with docker compose down.
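
If `docker compose ps` looks fine but the app still cannot connect, a direct TCP probe can narrow things down. This is a rough sketch under one assumption: that the Compose file publishes the services on their default ports, 5432 for PostgreSQL and 6379 for Redis. Adjust `SERVICES` if your setup differs; the helper name is hypothetical.

```ruby
require "socket"

# Sketch: confirm the Postgres and Redis containers accept TCP connections.
# The ports are the services' defaults and an assumption about the Compose file.
SERVICES = { "PostgreSQL" => 5432, "Redis" => 6379 }.freeze

# Return true if a TCP connection to host:port succeeds within the timeout.
def port_open?(host, port, timeout: 2)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  false
end

if __FILE__ == $PROGRAM_NAME
  SERVICES.each do |name, port|
    status = port_open?("localhost", port) ? "reachable" : "NOT reachable"
    puts "#{name} on port #{port}: #{status}"
  end
end
```

An open port only shows that something is listening. It does not prove the database has been restored or that the credentials are correct.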

Fresh migrations are not supported. The schema has 206 tables and complex interdependencies. Restore from a production dump instead.

Terminal window
bundle exec thor db:restore

This command lists available S3 backups and handles pg_restore automatically.

Open a Rails console.

Terminal window
bundle exec rails c

Create an admin user.

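# Disable PaperTrail versioning while creating the user
PaperTrail.request(enabled: false) do
  # create! raises if validation fails, so errors are not silently ignored
  User.create!(
    email: 'your.name@bioloupe.com',
    name: 'Your Name',
    role: 'admin'
  )
end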

Valid roles: admin, editor, viewer, client. Use admin for full access during development. The client role is API-only and cannot access ActiveAdmin.
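
The role rules above can be summarized in a few lines of Ruby. This is a hypothetical illustration of the rules as stated, not the application's actual User model or authorization code; the constant and method names are invented for the example.

```ruby
# Sketch of the role rules described above (hypothetical names throughout).

VALID_ROLES = %w[admin editor viewer client].freeze
API_ONLY_ROLES = %w[client].freeze

# True if the role is one of the four documented roles.
def valid_role?(role)
  VALID_ROLES.include?(role.to_s)
end

# True if the role is valid and not restricted to API-only access.
def admin_ui_access?(role)
  valid_role?(role) && !API_ONLY_ROLES.include?(role.to_s)
end

if __FILE__ == $PROGRAM_NAME
  VALID_ROLES.each do |role|
    puts "#{role}: ActiveAdmin access = #{admin_ui_access?(role)}"
  end
end
```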

For backend-only work, one terminal is enough.

Terminal window
bundle exec rails server
# Open http://localhost:3000

The root path redirects to /admin, the ActiveAdmin dashboard.

For full-stack development with React hot reload, use two terminals.

Terminal window
# Terminal 1: Webpack dev server with HMR
bin/shakapacker-dev-server
# Terminal 2: Rails server on a different port
bundle exec rails s -p 3010
# Open http://localhost:3010

Or use Foreman with bin/dev to run both processes from a single terminal.

Run these checks to confirm everything works.

Terminal window
# Run the test suite
bundle exec rails test
# Check a Thor task works
bundle exec thor regulatory:fda:download_and_extract --help
# Check that the health endpoint responds (the Rails server must be running)
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:3000/up
# Should print 200
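
The health check can also be scripted, for example in a setup script that waits for the server. The sketch below uses only Ruby's standard library. The `health_status` helper is hypothetical, and the URL assumes the default port from the backend-only setup above.

```ruby
require "net/http"
require "uri"

# Sketch: return the HTTP status code of a health endpoint, or nil if the
# server cannot be reached within the timeout.
def health_status(url, timeout: 5)
  uri = URI(url)
  path = uri.path.empty? ? "/" : uri.path
  Net::HTTP.start(uri.host, uri.port, open_timeout: timeout, read_timeout: timeout) do |http|
    http.get(path).code.to_i
  end
rescue SystemCallError, SocketError, IOError, Net::OpenTimeout, Net::ReadTimeout
  nil
end

if __FILE__ == $PROGRAM_NAME
  status = health_status("http://localhost:3000/up")
  puts status == 200 ? "Server is up" : "Health check failed (status: #{status.inspect})"
end
```

A `nil` result usually means the server is not running or is listening on another port, such as 3010 in the two-terminal setup.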

Try a single Thor task to see how pipelines work.

Terminal window
bundle exec thor regulatory:fda:download_and_extract

Or launch a full workflow from the Rails console.

WorkflowRunnerJob.perform_now(
  workflow_type: "ClinicalTrialsWorkflow",
  params: {},
  reset_if_exists: true
)

You can also trigger workflows from the ActiveAdmin UI at /admin/workflow_instances.

Now that Data Gov is running, read the docs in order.

  • Data model — Understand the 206-table schema and how entities connect
  • Clinical trials — Follow a trial from ClinicalTrials.gov into the knowledge graph
  • Architecture — Learn the codebase patterns before adding features