Getting Started with SRAKE - SRA Knowledge Engine
SRAKE pronunciation: Like Japanese sake (酒) — “srah-keh”
Installation
SRAKE (SRA Knowledge Engine) provides multiple installation methods to suit your needs:
Requirements: Go 1.19 or later
go install github.com/nishad/srake/cmd/srake@latest
Verify the SRAKE installation:
srake --version # SRAKE - SRA Knowledge Engine
Run in container:
# Pull the image
docker pull ghcr.io/nishad/srake:latest
# Run with volume mounts for persistence
docker run -v ~/.local/share/srake:/data \
  -v ~/.cache/srake:/cache \
  ghcr.io/nishad/srake:latest \
  ingest --auto
Download pre-built binaries:
# Linux/macOS
wget https://github.com/nishad/srake/releases/latest/download/srake-$(uname -s)-$(uname -m).tar.gz
tar -xzf srake-*.tar.gz
sudo mv srake /usr/local/bin/
Verify installation:
srake --version
Build from source:
git clone https://github.com/nishad/srake.git
cd srake
go build -o srake ./cmd/srake
./srake --help
Quick Start
The --auto flag selects between daily updates and the full dataset based on the state of your database.
Data is stored in ~/.local/share/srake/srake.db by default. Use SRAKE_DB_PATH to override.
# Latest daily update
srake ingest --daily
# Full monthly archive
srake ingest --monthly
# Local archive file
srake ingest --file /path/to/archive.tar.gz
# Direct from NCBI
srake ingest --file https://ftp.ncbi.nlm.nih.gov/sra/reports/Metadata/archive.tar.gz
API Features:
- RESTful API: /api/v1/search, /api/v1/stats, /api/v1/export
- MCP for AI assistants: /mcp, /mcp/capabilities
- Quality control: similarity thresholds, confidence scoring
- Multiple formats: JSON, CSV, TSV, XML
Example: curl "http://localhost:8082/api/v1/search?query=cancer&similarity_threshold=0.7"
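Search terms that contain spaces or other reserved characters need URL encoding before they go into the query string. A minimal sketch, assuming the server from the example above is listening on localhost:8082 and accepts the same query and similarity_threshold parameters; the function name srake_search and its argument order are illustrative and not part of srake:

```shell
# srake_search QUERY [THRESHOLD] [BASE_URL]
# Hypothetical helper: sends a GET request to the search endpoint and
# lets curl percent-encode each parameter.
srake_search() {
  query=$1
  threshold=${2:-0.7}                 # default taken from the example above
  base=${3:-http://localhost:8082}    # assumed server address
  # -G turns the --data-urlencode pairs into a query string on a GET request
  curl -sS -G "$base/api/v1/search" \
    --data-urlencode "query=$query" \
    --data-urlencode "similarity_threshold=$threshold"
}
```

Calling srake_search "breast cancer" 0.8 issues a GET to /api/v1/search with the query percent-encoded by curl.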
Use environment variables for custom paths:
- SRAKE_DB_PATH=/fast/ssd/srake.db: use fast storage for the database
- SRAKE_CACHE_HOME=/tmp/srake: use temporary storage for the cache
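A wrapper script can mirror this lookup to locate the database file before calling srake. A minimal sketch, assuming only the default path and the SRAKE_DB_PATH override mentioned in this guide (srake itself may consult other settings); srake_db_path is a hypothetical helper name:

```shell
# srake_db_path: print the database path a script should expect.
# Assumption: SRAKE_DB_PATH wins when set and non-empty, otherwise the
# documented default under ~/.local/share/srake is used.
srake_db_path() {
  printf '%s\n' "${SRAKE_DB_PATH:-$HOME/.local/share/srake/srake.db}"
}
```

A script can then check the file with a command such as ls -lh "$(srake_db_path)" before starting a long ingest.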
Automation Features
srake follows the clig.dev guidelines for CLI design, which makes it well suited to automation:
Non-Interactive Mode
# Use --yes flag to skip all prompts
srake ingest --auto --yes
srake download SRP123456 --yes
Pipeline Composition
# Commands accept stdin for easy chaining
echo "SRP123456" | srake convert --to GSE
cat accessions.txt | srake download --parallel 4
srake search "RNA-Seq" | srake download --type fastq
Dry Run & Debug
# Preview actions without executing
srake download SRP123456 --dry-run
# Enable debug output for troubleshooting
srake convert SRP123456 --to GSE --debug
Structured Output
# Export in various formats for processing
srake search "human" --format json | jq '.results[].accession'
srake convert SRP123456 --to GSE --format csv > results.csv
See the Automation Guide for more advanced scripting examples.
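Hand-maintained accession lists often carry comments, blank lines, and duplicates. A small filter can tidy such a file before it is piped into srake. This is only a sketch: clean_accessions is a hypothetical helper, and it assumes one accession per line with # starting a comment:

```shell
# clean_accessions: read accessions on stdin, write a tidy list on stdout.
clean_accessions() {
  # 1. drop everything from '#' to end of line
  # 2. trim leading and trailing whitespace
  # 3. skip empty lines and keep only the first occurrence of each accession
  sed -e 's/#.*$//' \
      -e 's/^[[:space:]]*//' \
      -e 's/[[:space:]]*$//' |
    awk 'NF && !seen[$0]++'
}
```

Combined with the stdin support shown above, this could be used as: clean_accessions < accessions.txt | srake download --parallel 4 --yes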
Filtering Options
Filtering helps reduce database size by processing only the data you need.
Filter by Taxonomy
# Human data only (taxonomy ID 9606)
srake ingest --file archive.tar.gz \
  --taxon-ids 9606
# Human, mouse, and zebrafish
srake ingest --file archive.tar.gz \
  --taxon-ids 9606,10090,7955
# Exclude specific taxa (32630: synthetic construct; 2697049: SARS-CoV-2)
srake ingest --file archive.tar.gz \
  --exclude-taxon-ids 32630,2697049
Filter by Date Range
# Data from 2024 only
srake ingest --file archive.tar.gz \
  --date-from 2024-01-01 \
  --date-to 2024-12-31
Filter by Platform & Strategy
Illumina RNA-Seq:
srake ingest --file archive.tar.gz \
  --platforms ILLUMINA \
  --strategies RNA-Seq
Oxford Nanopore WGS:
srake ingest --file archive.tar.gz \
  --platforms OXFORD_NANOPORE \
  --strategies WGS
Quality Filtering
# High-quality data only
srake ingest --file archive.tar.gz \
  --min-reads 10000000 \
  --min-bases 1000000000
Use --stats-only to see what would be imported without actually inserting data.
Resume Capability
srake automatically tracks progress and can resume after an interruption.
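For unattended runs, one way to take advantage of this is to re-run the same command when it fails. The sketch below is a generic retry wrapper, not a srake feature. It assumes that repeating an interrupted ingest command continues from the last recorded progress, which this guide implies but does not demonstrate. The names retry and RETRY_DELAY are hypothetical:

```shell
# retry MAX_ATTEMPTS COMMAND [ARGS...]
# Runs COMMAND until it exits 0 or MAX_ATTEMPTS runs have failed.
# RETRY_DELAY (seconds between attempts) defaults to 30.
retry() {
  _max=$1
  shift
  _n=1
  while :; do
    "$@" && return 0                    # success: stop retrying
    [ "$_n" -ge "$_max" ] && return 1   # attempts used up: give up
    _n=$((_n + 1))
    sleep "${RETRY_DELAY:-30}"
  done
}
```

A possible invocation, using flags shown earlier in this guide: retry 3 srake ingest --auto --yes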
Database Management
Database Info:
srake db info
# Shows:
# • Database size
# • Table counts
# • Index status
Custom Location:
srake ingest \
  --file archive.tar.gz \
  --db /custom/path/db.sqlite
Verbose Mode:
srake ingest \
  --file archive.tar.gz \
  --verbose
Configuration Options
Performance Tuning
# Adjust checkpoint frequency
srake ingest --file archive.tar.gz \
  --checkpoint 5000
# Disable progress bar
srake ingest --file archive.tar.gz \
  --no-progress
# Set worker count
srake ingest --file archive.tar.gz \
  --workers 8
Next Steps
Getting Help
Need assistance? Check these resources: