@bguo068
Created August 11, 2023 17:27
Obtain SRA run IDs from BioSample IDs
  1. Install the Entrez Direct command-line tools. See the instructions at https://www.ncbi.nlm.nih.gov/books/NBK179288/
  2. Make a list of BioSample IDs, one ID per line, in a file named biosample_ids.txt
  3. Run the elink, efetch and xtract tools to fetch the SRA run IDs for each sample
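# Read one BioSample ID per line; -r keeps backslashes literal
while read -r SAMPLE; do
  # Link the BioSample record to SRA, then pull run accessions from the docsum.
  # </dev/null is a precaution so elink cannot consume the rest of the ID list on stdin.
  SRR=$(elink -db biosample -id "$SAMPLE" -target sra < /dev/null \
    | efetch -format docsum \
    | xtract -pattern Runs -sep "," -element Run@acc \
    | paste -sd, -)   # join all runs with commas, no trailing comma
  echo "$SAMPLE $SRR"
  sleep 1   # pause between samples to limit the request rate to NCBI
done < biosample_ids.txt > sample2srr_map.txt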
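
The map file holds one line per sample, with the run IDs packed into a comma-separated second field. For downstream tools it can be handy to have one row per (sample, run) pair and a list of samples that returned no runs. The following is a minimal sketch of that post-processing, not part of the original recipe. It assumes the `SAMPLE RUN1,RUN2` layout written by the loop above and tolerates a trailing comma. The file names `example_map.txt`, `sample_run_pairs.tsv` and `samples_without_runs.txt` and the accessions are made up for illustration; point the `awk` commands at `sample2srr_map.txt` for real data.

```shell
# Hypothetical example input in the "SAMPLE RUN1,RUN2" layout (made-up accessions).
# The second sample has no runs; the first keeps a trailing comma on purpose.
printf '%s\n' \
  'SAMN00000001 SRR0000001,SRR0000002,' \
  'SAMN00000002 ' \
  'SAMN00000003 SRR0000003' > example_map.txt

# One tab-separated row per (sample, run) pair.
# Empty pieces left by a trailing comma are skipped.
awk '{
  n = split($2, runs, ",")
  for (i = 1; i <= n; i++)
    if (runs[i] != "") print $1 "\t" runs[i]
}' example_map.txt > sample_run_pairs.tsv

# Samples for which no run accession was returned (only one field on the line).
awk 'NF < 2 { print $1 }' example_map.txt > samples_without_runs.txt
```

Samples listed in `samples_without_runs.txt` are worth checking by hand, since an empty result can mean either that no SRA run is linked to the BioSample or that the request failed.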