Downloading PubChem raw data using rsync

PubChem raw data is hosted on the NCBI servers, which provide convenient access via rsync.

Download PubChem data using an rsync command like this:

example.sh
rsync -Pav ftp.ncbi.nlm.nih.gov::pubchem/Compound/CURRENT-Full/SDF/\*.gz pubchem/

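This example command downloads all current compounds as gzip-compressed SDF files into the local pubchem/ directory. The -P flag shows progress and keeps partially transferred files so an interrupted download can be resumed, -a enables archive mode, and -v prints each file name as it is transferred.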

You can explore the PubChem directory structure by accessing the NCBI FTP server using your browser.
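
If you prefer to stay on the command line, the download and the directory exploration can be wrapped in a few small shell functions. This is only a minimal sketch: the function names (pubchem_source, pubchem_list, pubchem_fetch) are made up for this example, and it assumes the rsync module layout matches the path used in the command above. It relies on rsync's standard --list-only, --include and --exclude options.

```shell
#!/usr/bin/env bash
# Sketch only: the helper names below are hypothetical, not part of any PubChem tooling.
# Host and module are taken from the rsync command shown in the post.

PUBCHEM_HOST="ftp.ncbi.nlm.nih.gov"
PUBCHEM_MODULE="pubchem"

# Build the rsync daemon source string for a directory below the pubchem module.
# A single trailing slash on the argument is dropped so the result always ends in one "/".
pubchem_source() {
    local subdir="${1%/}"
    printf '%s::%s/%s/\n' "$PUBCHEM_HOST" "$PUBCHEM_MODULE" "$subdir"
}

# List the contents of a remote directory without downloading anything.
pubchem_list() {
    rsync --list-only "$(pubchem_source "$1")"
}

# Download only the .gz files from a remote directory into a local directory
# (defaults to ./pubchem). Subdirectories are skipped by the --exclude rule.
pubchem_fetch() {
    local dest="${2:-pubchem}"
    mkdir -p "$dest"
    rsync -Pav --include='*.gz' --exclude='*' "$(pubchem_source "$1")" "$dest/"
}
```

After sourcing the file, `pubchem_list Compound/CURRENT-Full/SDF` shows what is available, and `pubchem_fetch Compound/CURRENT-Full/SDF pubchem` performs roughly the same download as the command above. Quoting the include pattern keeps the local shell from expanding the wildcard.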
