How to download & sync PubMed baseline + updates
In our previous post, How to download PubMed baseline data using rsync, we showed how you can download PubMed’s baseline data. That dataset is only updated once a year. However, you can also download the updatefiles, which are typically published once per day.
The commands to download & sync both sets of files into the PubMed directory are:
rsync -Pav --delete\*.xml.gz PubMed/
rsync -Pav --delete\*.xml.gz PubMed/
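The --delete option ensures that files which have been deleted on the server are also deleted locally. For example, when a new baseline dataset is published, the previous year’s files need to be removed, otherwise you would end up processing duplicate data. Note that, according to the rsync man page, --delete only takes effect for directories that are being synchronized, so with a wildcard source such as \*.xml.gz it is worth checking that outdated files are actually removed from your local directory.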