How to download PubMed baseline data using rsync

PubMed raw data is hosted on the NCBI servers, which provide convenient access via rsync.

Download the PubMed baseline data using an rsync command like this:

example.sh
rsync -Pav ftp.ncbi.nlm.nih.gov::pubmed/baseline/\*.xml.gz Pubmed/

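This example command downloads all baseline data files (*.xml.gz) into the Pubmed/ folder. The -P flag shows progress and keeps partially transferred files so that an interrupted download can be resumed, -a enables archive mode and -v enables verbose output.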
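Since the baseline consists of many large files, it can be worth checking that each archive arrived intact before processing it. The following is a minimal sketch, not part of the original command: the function name check_archives is hypothetical, and it assumes the files were downloaded to the Pubmed/ folder as in the example. It only runs gzip's built-in integrity test on each file.

```shell
#!/usr/bin/env bash
# check_archives is a hypothetical helper name used for illustration only.
# It runs `gzip -t` on every .xml.gz file in a directory, prints the name
# of each file that fails the check and returns non-zero if any file failed.
check_archives() {
    local dir="${1:-Pubmed}"   # default folder assumed from the rsync example
    local failed=0
    local f
    for f in "$dir"/*.xml.gz; do
        [ -e "$f" ] || continue   # skip the unexpanded pattern if nothing matches
        if ! gzip -t "$f" 2>/dev/null; then
            echo "corrupt: $f"
            failed=1
        fi
    done
    return "$failed"
}
```

Run check_archives Pubmed after the download has finished. Any file reported as corrupt can be deleted and fetched again by re-running the rsync command, which only transfers files that are missing or differ from the server copy.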

You can explore the PubMed directory structure by opening the NCBI FTP server in your browser.

