Getting Started with SRAKE - SRA Knowledge Engine
SRAKE is pronounced like Japanese sake (酒): "srah-keh".
Installation
SRAKE (SRA Knowledge Engine) can be installed with go install, as a Docker container, from a pre-built binary, or from source.
Install with Go (requires Go 1.19 or later):
go install github.com/nishad/srake/cmd/srake@latest
Verify the SRAKE installation:
srake --version # SRAKE - SRA Knowledge Engine
Run in a container:
# Pull the image
docker pull ghcr.io/nishad/srake:latest
# Run with volume mounts for persistence
docker run -v ~/.local/share/srake:/data \
-v ~/.cache/srake:/cache \
ghcr.io/nishad/srake:latest \
ingest --auto
Download pre-built binaries:
# Linux/macOS
wget https://github.com/nishad/srake/releases/latest/download/srake-$(uname -s)-$(uname -m).tar.gz
tar -xzf srake-*.tar.gz
sudo mv srake /usr/local/bin/
Verify installation:
srake --version
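The wget command above builds the asset name from `uname`. As a minimal sketch, the same URL can be composed in a small function so it can be inspected or checked before downloading. The function name `srake_release_url` is made up for this example, and it assumes the release assets are named after the raw `uname` output, as the wget line implies.

```sh
# Compose the release asset URL from the local OS and CPU architecture.
# Assumption: asset names use the unmodified `uname` output.
srake_release_url() {
  os="$(uname -s)"     # e.g. Linux, Darwin
  arch="$(uname -m)"   # e.g. x86_64, arm64, aarch64
  printf 'https://github.com/nishad/srake/releases/latest/download/srake-%s-%s.tar.gz\n' \
    "$os" "$arch"
}

url="$(srake_release_url)"
echo "Release asset URL: $url"

# Optional: confirm the asset exists before downloading (HEAD request, follows redirects).
#   curl -fsIL "$url" >/dev/null && wget "$url"
```

If the HEAD request fails, the asset name for your platform probably differs from the `uname` output, and the releases page is the place to look.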
Build from source:
git clone https://github.com/nishad/srake.git
cd srake
go build -o srake ./cmd/srake
./srake --help
Quick Start
The --auto flag intelligently selects between daily updates or full datasets based on your database state.
Data is stored in ~/.local/share/srake/srake.db by default. Use SRAKE_DB_PATH to override.
# Latest daily update
srake ingest --daily
# Full monthly archive
srake ingest --monthly
# Local archive file
srake ingest --file /path/to/archive.tar.gz
# Direct from NCBI
srake ingest --file https://ftp.ncbi.nlm.nih.gov/sra/reports/Metadata/archive.tar.gz
API Features:
- RESTful API: /api/v1/search, /api/v1/stats, /api/v1/export
- MCP for AI assistants: /mcp, /mcp/capabilities
- Quality control: similarity thresholds, confidence scoring
- Multiple formats: JSON, CSV, TSV, XML
Example: curl "http://localhost:8082/api/v1/search?query=cancer&similarity_threshold=0.7"
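For queries that contain spaces or other special characters, curl can do the URL encoding itself. The sketch below wraps the search endpoint from the list above. `API_BASE`, `api_url`, and `srake_search` are hypothetical names for this example, not part of srake. The host and port are taken from the curl example and assume a server is already running there.

```sh
# API_BASE is a hypothetical variable for this sketch, not an srake setting.
api_url() {
  printf '%s%s\n' "${API_BASE:-http://localhost:8082}" "$1"
}

# Search with URL-encoded parameters and print the raw response body.
# -G sends the --data-urlencode pairs as a query string on a GET request.
srake_search() {
  curl -fsS -G "$(api_url /api/v1/search)" \
    --data-urlencode "query=$1" \
    --data-urlencode "similarity_threshold=${2:-0.7}"
}

# Usage (requires a running server):
#   srake_search "breast cancer" 0.8
#   curl -fsS "$(api_url /api/v1/stats)"
```

The `-f` flag makes curl return a non-zero exit code on HTTP errors, which keeps failures visible in scripts.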
Use environment variables for custom paths:
- SRAKE_DB_PATH=/fast/ssd/srake.db: use fast storage for the database
- SRAKE_CACHE_HOME=/tmp/srake: use temporary storage for the cache
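A script that needs to know where the database lives can mirror the documented rule: SRAKE_DB_PATH if set, otherwise the default path. This is a minimal sketch of that lookup in shell, not srake's own implementation. `resolve_db_path` is a hypothetical helper name.

```sh
# Resolve the database path as the documentation describes it:
# SRAKE_DB_PATH wins, otherwise fall back to the documented default.
resolve_db_path() {
  printf '%s\n' "${SRAKE_DB_PATH:-$HOME/.local/share/srake/srake.db}"
}

db_path="$(resolve_db_path)"
if [ -f "$db_path" ]; then
  echo "Database found at $db_path"
else
  echo "No database yet at $db_path"
fi
```

Only the database path is handled here, because the default cache location is not stated in this guide.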
Automation Features
srake follows the clig.dev guidelines for CLI design, which makes it well suited to automation:
Non-Interactive Mode
# Use --yes flag to skip all prompts
srake ingest --auto --yes
srake download SRP123456 --yes
Pipeline Composition
# Commands accept stdin for easy chaining
echo "SRP123456" | srake convert --to GSE
cat accessions.txt | srake download --parallel 4
srake search "RNA-Seq" | srake download --type fastq
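Accession lists kept by hand often contain comments, blank lines, and duplicates. A small filter in front of the pipe keeps those out of the download step. This is a sketch using only standard tools. `clean_accessions` is a hypothetical helper, and the accessions in the comments are placeholders.

```sh
# Read accessions on stdin, one per line, and print a cleaned list:
#   - strip '#' comments and surrounding whitespace
#   - skip empty lines
#   - drop duplicates while keeping first-seen order
clean_accessions() {
  sed -e 's/#.*$//' -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' |
    awk 'NF && !seen[$0]++'
}

# Usage, combined with the chaining shown above:
#   clean_accessions < accessions.txt | srake download --parallel 4
```

Keeping first-seen order, rather than sorting, means downloads start in the order the list was written.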
Dry Run & Debug
# Preview actions without executing
srake download SRP123456 --dry-run
# Enable debug output for troubleshooting
srake convert SRP123456 --to GSE --debug
Structured Output
# Export in various formats for processing
srake search "human" --format json | jq '.results[].accession'
srake convert SRP123456 --to GSE --format csv > results.csv
See the Automation Guide for more advanced scripting examples.
Filtering Options
Filtering helps reduce database size by processing only the data you need.
Filter by Taxonomy
# Human data only (taxonomy ID 9606)
srake ingest --file archive.tar.gz \
--taxon-ids 9606
# Human, mouse, and zebrafish
srake ingest --file archive.tar.gz \
--taxon-ids 9606,10090,7955
# Exclude synthetic constructs (32630) and SARS-CoV-2 (2697049)
srake ingest --file archive.tar.gz \
--exclude-taxon-ids 32630,2697049
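When the list of organisms is long, it may be easier to keep the taxonomy IDs in a file, one per line, and build the comma-separated value from it. This is a sketch of that step. `join_taxon_ids` and the file name `taxa.txt` are hypothetical.

```sh
# Read taxonomy IDs on stdin (one per line) and print them comma-separated.
# Lines that are not purely numeric (headers, notes, blanks) are ignored.
join_taxon_ids() {
  grep -E '^[0-9]+$' | paste -sd, -
}

# Usage:
#   srake ingest --file archive.tar.gz \
#     --taxon-ids "$(join_taxon_ids < taxa.txt)"
```

With the three IDs from the example above as input, the function prints `9606,10090,7955`.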
Filter by Date Range
# Data from 2024 only
srake ingest --file archive.tar.gz \
--date-from 2024-01-01 \
--date-to 2024-12-31
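To ingest several calendar years in separate runs, the two date flags can be generated from the year. This is a small sketch. `year_range` is a hypothetical helper, and it uses only the --date-from and --date-to flags shown above.

```sh
# Print the date flags that cover one full calendar year.
year_range() {
  printf '%s %s-01-01 %s %s-12-31\n' --date-from "$1" --date-to "$1"
}

# Usage: the substitution is left unquoted on purpose so the shell
# splits it into four separate arguments.
#   for y in 2022 2023 2024; do
#     srake ingest --file archive.tar.gz $(year_range "$y")
#   done
```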
Filter by Platform & Strategy
Illumina RNA-Seq:
srake ingest --file archive.tar.gz \
--platforms ILLUMINA \
--strategies RNA-Seq
Oxford Nanopore WGS:
srake ingest --file archive.tar.gz \
--platforms OXFORD_NANOPORE \
--strategies WGS
Quality Filtering
# High-quality data only
srake ingest --file archive.tar.gz \
--min-reads 10000000 \
--min-bases 1000000000
Use --stats-only to see what would be imported without actually inserting data.
Resume Capability
srake automatically tracks progress and can resume after an interruption.
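Because an interrupted ingest can be resumed, an unattended job can re-run the same command after a failure. The wrapper below is a generic retry loop, not an srake feature. It assumes that invoking the same ingest command again continues from the tracked progress. `retry` and `RETRY_DELAY` are names chosen for this sketch.

```sh
# retry MAX_ATTEMPTS COMMAND [ARGS...]
# Run COMMAND until it succeeds or MAX_ATTEMPTS is reached.
# RETRY_DELAY (seconds between attempts) defaults to 30.
retry() {
  max_attempts="$1"
  shift
  attempt=1
  while :; do
    "$@" && return 0
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${RETRY_DELAY:-30}s" >&2
    sleep "${RETRY_DELAY:-30}"
    attempt=$((attempt + 1))
  done
}

# Usage:
#   retry 5 srake ingest --auto --yes
```

The --yes flag matters here: without it, a prompt on the second attempt would block the loop.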
Database Management
Database Info:
srake db info
# Shows:
# • Database size
# • Table counts
# • Index status
Custom Location:
srake ingest \
--file archive.tar.gz \
--db /custom/path/db.sqlite
Verbose Mode:
srake ingest \
--file archive.tar.gz \
--verbose
Configuration Options
Performance Tuning
# Adjust checkpoint frequency
srake ingest --file archive.tar.gz \
--checkpoint 5000
# Disable progress bar
srake ingest --file archive.tar.gz \
--no-progress
# Set worker count
srake ingest --file archive.tar.gz \
--workers 8
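Instead of hard-coding the worker count, a script can derive it from the number of CPUs on the machine. This sketch leaves one core free for the rest of the system, which is a design choice for the example and not a documented srake recommendation. `detect_workers` is a hypothetical helper.

```sh
# Print a worker count based on the number of online CPUs.
# Falls back to 1 if the CPU count cannot be determined.
detect_workers() {
  n="$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"
  case "$n" in
    '' | *[!0-9]*) n=1 ;;   # not a plain number: use the fallback
  esac
  if [ "$n" -gt 1 ]; then
    n=$((n - 1))            # leave one core for other processes
  fi
  echo "$n"
}

# Usage:
#   srake ingest --file archive.tar.gz --workers "$(detect_workers)"
```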