SRAKE CLI Reference
Complete reference for all SRAKE (SRA Knowledge Engine) commands and options.
Global Flags
These flags are available for all commands:
- --no-color - Disable colored output
- -v, --verbose - Enable verbose output
- -q, --quiet - Suppress non-error output
- -y, --yes - Assume yes to all prompts (non-interactive mode)
- --debug - Enable debug output for troubleshooting
- --help - Show help for any command
Commands
srake ingest
Ingest SRA metadata from NCBI or local archives.
srake ingest [flags]
Flags
- --auto - Auto-select the best file from NCBI
- --daily - Ingest the latest daily update
- --monthly - Ingest the latest monthly dataset
- --file <path> - Ingest specific file (local or NCBI)
- --list - List available files without ingesting
- --db <path> - Database path (default: "~/.local/share/srake/srake.db")
- --force - Force ingestion even if data exists
- --no-progress - Disable progress bar
Filtering Flags
- --taxon-ids <ids> - Filter by taxonomy IDs (comma-separated)
- --exclude-taxon-ids <ids> - Exclude taxonomy IDs
- --date-from <YYYY-MM-DD> - Start date for filtering
- --date-to <YYYY-MM-DD> - End date for filtering
- --organisms <names> - Filter by organism names
- --platforms <names> - Filter by platforms (ILLUMINA, OXFORD_NANOPORE, etc.)
- --strategies <names> - Filter by library strategies (RNA-Seq, WGS, etc.)
- --min-reads <n> - Minimum read count filter
- --max-reads <n> - Maximum read count filter
- --stats-only - Only show statistics without inserting data
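Because --date-from and --date-to expect YYYY-MM-DD values, a script can check the range before starting an ingest. The following is a minimal sketch: the helper names is_iso_date and ingest_date_range are hypothetical, and how srake itself reacts to malformed dates is not described in this reference.

```shell
# Shape check only: four digits, dash, two digits, dash, two digits.
# It does not reject impossible dates such as 2024-02-31.
is_iso_date() {
  case "$1" in
    [0-9][0-9][0-9][0-9]-[0-1][0-9]-[0-3][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Validate a date range, then forward it (plus any extra flags) to srake.
# ingest_date_range is a hypothetical wrapper, not part of srake.
ingest_date_range() {
  local from="$1" to="$2"
  shift 2
  if ! is_iso_date "$from" || ! is_iso_date "$to"; then
    echo "dates must be YYYY-MM-DD" >&2
    return 2
  fi
  # With the dashes removed, the two dates compare as plain integers.
  if [ "$(printf '%s' "$from" | sed 's/-//g')" -gt "$(printf '%s' "$to" | sed 's/-//g')" ]; then
    echo "--date-from is later than --date-to" >&2
    return 2
  fi
  srake ingest --auto --date-from "$from" --date-to "$to" "$@"
}
```

For example, `ingest_date_range 2024-01-01 2024-06-30 --taxon-ids 9606` forwards the taxonomy filter unchanged after the two date flags.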
Examples
# Auto-ingest best file
srake ingest --auto
# Non-interactive ingest (no prompts)
srake ingest --auto --yes
# Ingest with filters
srake ingest --auto --taxon-ids 9606 --platforms ILLUMINA --strategies RNA-Seq
# List available files
srake ingest --list
# Debug mode to see detailed processing
srake ingest --auto --debug
srake search
Search SRA metadata with quality control and multiple search modes.
srake search <query> [flags]
Search Flags
- -o, --organism <name> - Filter by organism
- --platform <name> - Filter by platform
- --library-strategy <name> - Filter by library strategy
- -l, --limit <n> - Maximum results (default: 100)
- --offset <n> - Pagination offset
- --search-mode <mode> - Search mode: database|fts|hybrid|vector (default: hybrid)
Quality Control Flags
- --similarity-threshold <float> - Minimum similarity score (0-1)
- --min-score <float> - Minimum absolute score
- --top-percentile <int> - Return only top N% of results
- --show-confidence - Include confidence level in results
Output Flags
- -f, --format <type> - Output format (table|json|csv|tsv|xml)
- --output <file> - Save results to file
- --no-header - Omit header in output
- --fields <list> - Comma-separated list of fields to include
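The --limit and --offset flags can be combined to page through a large result set. The sketch below assumes, without confirmation from this reference, that TSV output with --no-header prints exactly one result per line; search_all_pages is a hypothetical helper name.

```shell
# Fetch every page of a search by stepping --offset in units of --limit.
# Assumption: with --format tsv --no-header, each output line is one result.
search_all_pages() {
  local query="$1" page_size="${2:-100}" offset=0 page count
  while :; do
    page=$(srake search "$query" --format tsv --no-header \
      --limit "$page_size" --offset "$offset") || return $?
    # An empty page means the previous page was the last one.
    [ -n "$page" ] || break
    printf '%s\n' "$page"
    count=$(($(printf '%s\n' "$page" | wc -l)))
    # A short page means there is nothing left to fetch.
    if [ "$count" -lt "$page_size" ]; then
      break
    fi
    offset=$((offset + page_size))
  done
}
```

A call such as `search_all_pages "breast cancer" 500 > all_results.tsv` would then collect all pages into one file.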
Examples
# Basic search
srake search "breast cancer"
# Search with quality control
srake search "RNA-Seq" --similarity-threshold 0.7 --show-confidence
# Vector semantic search
srake search "tumor gene expression" --search-mode vector
# Advanced filtering
srake search "transcriptome" \
--organism "homo sapiens" \
--library-strategy RNA-Seq \
--platform ILLUMINA \
--top-percentile 10
# Export filtered results
srake search "cancer" --format csv --output results.csv
srake convert
Convert between different accession types (SRA, GEO, BioProject, BioSample).
srake convert [<accession> ...] [flags]
Flags
- --to <type> - Target accession type (required)
  - Options: GSE, SRP, SRX, GSM, SRR, SRS, PRJNA, BIOSAMPLE
- -f, --format <type> - Output format (table|json|yaml|csv|tsv)
- -o, --output <file> - Save results to file
- --batch <file> - Read accessions from file
- --dry-run - Preview conversions without executing
Examples
# Convert SRA Project to GEO Series
srake convert SRP123456 --to GSE
# Convert multiple accessions
srake convert SRP001 SRP002 SRP003 --to GSE
# Batch conversion from file
srake convert --batch accessions.txt --to SRX --output results.json
# Convert from stdin (pipe-friendly)
echo "SRP123456" | srake convert --to GSE
cat accession_list.txt | srake convert --to GSM --format json
# Preview conversion without executing
srake convert SRP123456 --to GSE --dry-run
# Debug mode to see conversion details
srake convert SRP123456 --to GSE --debug
Supported Conversions
| From | To | Description |
|---|---|---|
| SRP | GSE, SRX, SRR, SRS, PRJNA | Study to related accessions |
| SRX | GSM, SRP, SRR, SRS | Experiment to related accessions |
| SRR | SRX, SRP, GSM | Run to parent accessions |
| SRS | SRX, GSM, BIOSAMPLE | Sample to related accessions |
| GSE | SRP, GSM | GEO Series to SRA/samples |
| GSM | SRX, SRR, GSE | GEO Sample to SRA/series |
| PRJNA | SRP | BioProject to SRA Project |
| SAMN | SRS | BioSample to SRA Sample |
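A script that converts many accessions can consult the table above before calling srake. This sketch simply encodes the table rows; the helper names supported_targets and can_convert are hypothetical, and whether srake rejects other pairs is not stated in this reference.

```shell
# Print the target types the table lists for an accession, or fail if the
# accession prefix does not appear in the "From" column.
supported_targets() {
  local prefix
  # Strip everything from the first digit onward: SRP123456 -> SRP
  prefix="${1%%[0-9]*}"
  case "$prefix" in
    SRP)   echo "GSE SRX SRR SRS PRJNA" ;;
    SRX)   echo "GSM SRP SRR SRS" ;;
    SRR)   echo "SRX SRP GSM" ;;
    SRS)   echo "SRX GSM BIOSAMPLE" ;;
    GSE)   echo "SRP GSM" ;;
    GSM)   echo "SRX SRR GSE" ;;
    PRJNA) echo "SRP" ;;
    SAMN)  echo "SRS" ;;
    *)     return 1 ;;
  esac
}

# Succeed only if converting accession $1 to type $2 appears in the table.
can_convert() {
  local targets
  targets=$(supported_targets "$1") || return 1
  case " $targets " in
    *" $2 "*) return 0 ;;
    *)        return 1 ;;
  esac
}
```

For example, `can_convert SRP123456 GSE && srake convert SRP123456 --to GSE` skips pairs the table does not list.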
srake runs
Get all runs for a study, experiment, or sample.
srake runs <accession> [flags]
Flags
- -d, --detailed - Include detailed information
- -f, --format <type> - Output format (table|json|yaml|csv|tsv)
- -o, --output <file> - Save results to file
- -l, --limit <n> - Limit number of results
- --fields <list> - Comma-separated list of fields
Examples
# Get runs for a study
srake runs SRP123456
# Get detailed run information
srake runs SRX123456 --detailed
# Export as JSON
srake runs SRP123456 --format json --output runs.json
srake samples
Get all samples for a study or experiment.
srake samples <accession> [flags]
Flags
- -d, --detailed - Include organism and taxonomy information
- -f, --format <type> - Output format (table|json|yaml|csv|tsv)
- -o, --output <file> - Save results to file
- -l, --limit <n> - Limit number of results
Examples
# Get samples for a study
srake samples SRP123456
# Get detailed sample information
srake samples SRP123456 --detailed
# Export as CSV
srake samples SRX123456 --format csv --output samples.csv
srake experiments
Get all experiments for a study or sample.
srake experiments <accession> [flags]
Flags
- -d, --detailed - Include platform and library information
- -f, --format <type> - Output format (table|json|yaml|csv|tsv)
- -o, --output <file> - Save results to file
- -l, --limit <n> - Limit number of results
Examples
# Get experiments for a study
srake experiments SRP123456
# Get experiments for a sample
srake experiments SRS123456 --detailed
srake studies
Get study information for any SRA accession.
srake studies <accession> [flags]
Flags
- -d, --detailed - Include abstract and full metadata
- -f, --format <type> - Output format (table|json|yaml|csv|tsv)
- -o, --output <file> - Save results to file
Examples
# Get study from an experiment
srake studies SRX123456
# Get study from a run with details
srake studies SRR123456 --detailed
srake download
Download SRA data files from multiple sources.
srake download [<accession> ...] [flags]
Flags
- -s, --source <type> - Download source (auto|ftp|aws|gcp|ncbi)
- -t, --type <type> - File type (sra|fastq|fasta)
- -o, --output <dir> - Output directory (default: "./")
- --threads <n> - Download threads per file (default: 1)
- -p, --parallel <n> - Parallel downloads (default: 1)
- --aspera - Use Aspera for high-speed transfer
- -l, --list <file> - File containing accessions
- --retry <n> - Number of retry attempts (default: 3)
- --validate - Validate downloaded files (default: true)
- --dry-run - Show what would be downloaded
Examples
# Basic download
srake download SRR123456
# Download from AWS using 4 threads per file
srake download SRR123456 --source aws --threads 4
# Download all runs for a study
srake download SRP123456 --type fastq --output ./data/
# Batch download from file
srake download --list runs.txt --parallel 4
# Download from stdin (pipe-friendly)
echo "SRR123456" | srake download --type fastq
srake runs SRP123456 | srake download --parallel 4
# High-speed Aspera transfer
srake download SRR123456 --aspera
# Dry run to preview downloads
srake download SRP123456 --dry-run
# Non-interactive download (no prompts)
srake download SRP123456 --yes
# Debug mode for troubleshooting
srake download SRR123456 --debug
Automatic Expansion
The download command automatically expands:
- SRP → all runs in the study
- SRX → all runs in the experiment
- SRS → all runs for the sample
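Since a single SRP, SRX, or SRS accession can expand to many runs, it can be useful to preview the expansion with --dry-run before transferring anything. A minimal sketch follows; download_with_preview is a hypothetical wrapper, and the choice of fastq as the file type is only for illustration.

```shell
# Preview accessions that expand to multiple runs, then download.
# Run accessions (SRR...) are downloaded directly without a preview.
download_with_preview() {
  local accession="$1" outdir="${2:-./}"
  case "$accession" in
    SRP*|SRX*|SRS*)
      # These expand to all of their runs, so show the plan first
      # and stop if the dry run itself fails.
      srake download "$accession" --dry-run || return $?
      ;;
  esac
  srake download "$accession" --type fastq --output "$outdir"
}
```

For example, `download_with_preview SRP123456 ./data/` prints the dry-run listing and then starts the real download into ./data/.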
srake metadata
Get detailed metadata for specific accessions.
srake metadata <accession> [accessions...] [flags]
Flags
- -f, --format <type> - Output format (table|json|yaml)
- --fields <list> - Comma-separated list of fields
- --expand - Expand nested structures
Examples
# Get metadata for an experiment
srake metadata SRX123456
# Get multiple accessions as JSON
srake metadata SRX123456 SRX123457 --format json
# Select specific fields
srake metadata SRR999999 --fields title,platform,strategy
srake index
Manage search index for fast full-text and vector search.
srake index [flags]
Index Operations
- --build - Build search index from database
- --rebuild - Rebuild index from scratch (removes existing)
- --verify - Verify index integrity
- --stats - Show index statistics
- --resume - Resume interrupted index building
Index Options
- --batch-size <n> - Documents per batch (default: 1000)
- --workers <n> - Number of parallel workers
- --path <dir> - Index directory path
- --with-embeddings - Build vector embeddings for semantic search
- --embedding-model <name> - Model for embeddings (default: SapBERT)
- --progress - Show progress bar
- --progress-file <file> - Save progress to file
- --checkpoint-dir <dir> - Directory for checkpoints
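For long index builds, --build and --resume can be chained so that an interrupted build is picked up again automatically. The sketch below is an assumption-laden illustration: build_index_with_resume and its retry limit are hypothetical, it treats any non-zero exit as resumable, and it calls --resume on its own because this reference does not say which flags may be combined with it.

```shell
# Build the index; if the build exits non-zero, retry with --resume
# up to max_retries times (default 3).
build_index_with_resume() {
  local max_retries="${1:-3}" attempt=0
  srake index --build --progress && return 0
  while [ "$attempt" -lt "$max_retries" ]; do
    attempt=$((attempt + 1))
    echo "index build did not finish, resuming (attempt $attempt)" >&2
    srake index --resume && return 0
  done
  return 1
}
```

After a successful run, `srake index --verify` can confirm the integrity of the result.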
Examples
# Build search index with progress
srake index --build --progress
# Build with vector embeddings for semantic search
srake index --build --with-embeddings
# Build with custom batch size and path
srake index --build --batch-size 5000 --path /custom/index
# Build embeddings with quantized model (faster, less memory)
SRAKE_MODEL_VARIANT=quantized srake index --build --with-embeddings
# Resume interrupted build
srake index --resume
# Rebuild from scratch
srake index --rebuild
# Verify index integrity
srake index --verify
# Show index statistics
srake index --stats
srake server
Start the API server for programmatic access and AI integration.
srake server [flags]
Flags
- -p, --port <n> - Port to listen on (default: 8080)
- --host <addr> - Host to bind to (default: localhost)
- --enable-cors - Enable CORS for web access
- --enable-mcp - Enable Model Context Protocol for AI assistants
- --db <path> - Database path
- --index-path <path> - Search index path
- --log-level <level> - Log level (debug|info|warn|error)
Examples
# Start server with all features
srake server --port 8082 --enable-cors --enable-mcp
# Custom database and index
srake server --db /path/to/db --index-path /path/to/index
# Production deployment
srake server --host 0.0.0.0 --port 80 --enable-cors
# With environment variables
SRAKE_DB_PATH=test.db SRAKE_INDEX_PATH=/tmp/index srake server
API Endpoints
- /api/v1/search - Search with quality control
- /api/v1/stats - Database statistics
- /api/v1/studies/{id} - Study metadata
- /api/v1/export - Export search results
- /api/v1/health - Service health check
- /mcp - MCP JSON-RPC endpoint
- /mcp/capabilities - MCP server capabilities
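When the server is started from a script, the /api/v1/health endpoint can serve as a readiness probe. The following sketch uses curl and the documented default host and port; the helper name wait_for_srake, the attempt count, and the assumption that a healthy server answers a plain GET with a 2xx status are illustrative and not confirmed by this reference.

```shell
# Poll the health endpoint until the server answers or attempts run out.
wait_for_srake() {
  local host="${1:-localhost}" port="${2:-8080}" attempts="${3:-10}"
  local url="http://${host}:${port}/api/v1/health"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # -f: treat HTTP errors as failure, -s: silent, --max-time: cap each probe
    if curl -fs --max-time 2 "$url" >/dev/null; then
      return 0
    fi
    sleep 1
    i=$((i + 1))
  done
  echo "srake server not reachable at $url" >&2
  return 1
}
```

A typical use would be to start `srake server --port 8082 &` and then call `wait_for_srake localhost 8082` before sending requests.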
srake db
Database management commands.
srake db <subcommand> [flags]
Subcommands
- info - Show database statistics and information
- export - Export database to SRAmetadb format
Examples
# Show database statistics
srake db info
# Export to SRAmetadb format
srake db export -o SRAmetadb.sqlite
srake db export
Export the srake database to SRAmetadb.sqlite format for compatibility with tools expecting the original SRAmetadb schema.
srake db export [flags]
Flags
- -o, --output <file> - Output database file path (default: "SRAmetadb.sqlite")
- --db <path> - Source database path (default: ~/.local/share/srake/srake.db)
- --fts-version <n> - FTS version: 3 for compatibility, 5 for modern (default: 5)
- --batch-size <n> - Batch size for data transfer (default: 10000)
- --progress - Show progress bar (default: true)
- --compress - Compress output with gzip
- -f, --force - Overwrite existing output file
Examples
# Basic export with FTS5 (recommended)
srake db export -o SRAmetadb.sqlite
# Export with FTS3 for 100% compatibility
srake db export -o SRAmetadb.sqlite --fts-version 3
# Export from specific database
srake db export --db /path/to/srake.db -o SRAmetadb.sqlite
# Export with compression
srake db export -o SRAmetadb.sqlite.gz --compress
# Large dataset with custom batch size
srake db export -o SRAmetadb.sqlite --batch-size 50000
Output Schema
The exported database contains:
- Standard tables: study, experiment, sample, run, submission
- Denormalized table: sra (joins all tables for easy querying)
- Full-text search: sra_ft (FTS3 or FTS5 virtual table)
- Metadata: metaInfo (version and creation info)
- Column descriptions: col_desc (field documentation)
Compatibility Notes
- FTS5 (default): Modern, faster, smaller index size, better Unicode support
- FTS3: Use for compatibility with older tools that require FTS3
- The export maps srake’s modern schema to the classic SRAmetadb format
- JSON fields are converted to pipe-delimited strings
- Missing legacy fields are populated with appropriate defaults
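Once exported, the sra_ft table can be queried with the sqlite3 command-line shell. The sketch below counts matching rows only, so it does not depend on any column names; the `sra_ft MATCH ...` form is accepted by both FTS3 and FTS5 in SQLite. The helper name count_matches is hypothetical, and the sqlite3 binary is assumed to be installed.

```shell
# Count rows in the exported database that match a full-text term.
count_matches() {
  local db="$1" term="$2" escaped
  # Double any single quotes so the term is a safe SQL string literal.
  # The term must still be valid FTS query syntax for the chosen version.
  escaped=$(printf '%s' "$term" | sed "s/'/''/g")
  sqlite3 "$db" "SELECT count(*) FROM sra_ft WHERE sra_ft MATCH '$escaped';"
}
```

For example, `count_matches SRAmetadb.sqlite cancer` prints the number of sra_ft rows matching the term.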
srake config
Configuration and path management commands.
srake config <subcommand> [flags]
Subcommands
- paths - Show all active paths and environment variables
- show - Display current configuration
- init - Initialize default configuration file
- edit - Open configuration in editor
Flags (init)
--force- Overwrite existing configuration
Examples
# View all paths
srake config paths
# Initialize configuration
srake config init
# Edit configuration
srake config edit
# Show current config
srake config show
srake cache
Cache management commands for controlling disk usage.
srake cache <subcommand> [flags]
Subcommands
- info - Show cache information and sizes
- clean - Remove cache files
Flags (clean)
- --all - Remove all cache including indices
- --older <duration> - Remove files older than duration (e.g., 30d, 24h)
- --search - Remove search result cache
- --downloads - Remove downloaded files
- --index - Remove search index (requires rebuild)
Examples
# View cache usage
srake cache info
# Clean downloads older than 30 days
srake cache clean --older 30d
# Remove all downloads
srake cache clean --downloads
# Clean everything (with confirmation)
srake cache clean --all
Output Formats
Most commands support multiple output formats:
- table (default) - Human-readable table with colors
- json - JSON format for programmatic use
- yaml - YAML format
- csv - Comma-separated values
- tsv - Tab-separated values
- xml - XML format (search command only)
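When a script writes results to a file, keeping the --format value in step with the file extension avoids mismatched output. This sketch maps extensions to the format names listed above; format_for_file is a hypothetical helper, and this reference does not say whether srake infers the format from the file name, so the format is passed explicitly.

```shell
# Choose a --format value from an output file's extension.
# Unknown extensions fall back to the default "table" format.
# Not every command accepts every format; check the command's flag list.
format_for_file() {
  case "$1" in
    *.json)       echo json ;;
    *.yaml|*.yml) echo yaml ;;
    *.csv)        echo csv ;;
    *.tsv)        echo tsv ;;
    *.xml)        echo xml ;;
    *)            echo table ;;
  esac
}
```

For example: `out=runs.json; srake runs SRP123456 --format "$(format_for_file "$out")" --output "$out"`.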
Environment Variables
Path Configuration
- SRAKE_CONFIG_HOME - Override config directory (default: ~/.config/srake)
- SRAKE_DATA_HOME - Override data directory (default: ~/.local/share/srake)
- SRAKE_CACHE_HOME - Override cache directory (default: ~/.cache/srake)
- SRAKE_STATE_HOME - Override state directory (default: ~/.local/state/srake)
- SRAKE_DB_PATH - Override database path (default: ~/.local/share/srake/srake.db)
- SRAKE_INDEX_PATH - Override search index path (default: ~/.cache/srake/index)
- SRAKE_MODELS_PATH - Override models directory for embeddings
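The defaults above suggest how the database path is derived from these variables. The sketch below spells out one plausible precedence (SRAKE_DB_PATH, then SRAKE_DATA_HOME/srake.db, then the documented default); that ordering is an assumption, not something this reference states, and resolve_db_path is a hypothetical helper rather than a srake command.

```shell
# Resolve the database path under the assumed precedence described above.
resolve_db_path() {
  if [ -n "${SRAKE_DB_PATH:-}" ]; then
    # An explicit database path wins.
    printf '%s\n' "$SRAKE_DB_PATH"
  elif [ -n "${SRAKE_DATA_HOME:-}" ]; then
    # Otherwise look inside the overridden data directory.
    printf '%s\n' "$SRAKE_DATA_HOME/srake.db"
  else
    # Fall back to the documented default location.
    printf '%s\n' "$HOME/.local/share/srake/srake.db"
  fi
}
```

The `srake config paths` command reports the paths srake actually uses, so it remains the authoritative check.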
Search Configuration
- SRAKE_MODEL_VARIANT - Model variant for embeddings: full|quantized (default: full)
- SRAKE_DEFAULT_LIMIT - Default search result limit
- SRAKE_SEARCH_MODE - Default search mode: database|fts|hybrid|vector
Output Control
- NO_COLOR - Disable colored output globally
- SRAKE_NO_COLOR - Disable colored output for srake
- SRAKE_DEBUG - Enable debug output
- SRAKE_VERBOSE - Enable verbose output
Cloud Configuration
- AWS_REGION - Affects download source auto-selection
- GCP_PROJECT - Affects download source auto-selection
Exit Codes
- 0 - Success
- 1 - General error
- 2 - Command line usage error
- 130 - Interrupted (Ctrl+C)
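Scripts that wrap srake can translate these exit codes into readable messages. A minimal sketch, in which describe_srake_exit and run_srake are hypothetical helper names and only the four documented codes are handled explicitly:

```shell
# Map a srake exit status to a short label (codes from the list above).
describe_srake_exit() {
  case "$1" in
    0)   echo "success" ;;
    1)   echo "general error" ;;
    2)   echo "usage error" ;;
    130) echo "interrupted" ;;
    *)   echo "unknown ($1)" ;;
  esac
}

# Run any srake command, report a non-zero status on stderr,
# and return that status to the caller unchanged.
run_srake() {
  local rc=0
  srake "$@" || rc=$?
  if [ "$rc" -ne 0 ]; then
    echo "srake $1: $(describe_srake_exit "$rc")" >&2
  fi
  return "$rc"
}
```

For example, `run_srake ingest --auto --yes || exit $?` stops a pipeline with the original exit code after printing the label.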