Monday, October 05, 2015

A parallel and fast way to download multiple files

We can write a short script to download multiple files easily in command, e..g

for i in X Y Z; do wget http://www.site.com/folder/$i.url; done

If we want them to run in background (so that in a pseudo-parallel way), we can use -b option for wget.

But this is still not fast enough, and the parallel with wget -b won't give me any notice once it's done.

Here is my solution: axel + parallel

parallel -a urls.file axel

Let's say I want to download all brain sample bigwig files of H3K4me1 marks from the Roadmap Epigenomics project. Here is the code:

mark=H3K4me1
> url.$mark # to generate an empty file
for i in E071 E074 E068 E069 E072 E067 E073 E070 E082 E081;
do 
  echo http://egg2.wustl.edu/roadmap/data/byFileType/signal/consolidated/macs2signal/pval/$i-$mark.pval.signal.bigwig >> url.$mark
done
parallel -a url.$mark axel -n 5 

Regarding what's axel and how fast it is comparing to wget, please refer to this link: http://www.cyberciti.biz/tips/download-accelerator-for-linux-command-line-tools.html

3 comments:

  1. If you have searched for one kind of driver or the other online, you may noticed that there are issues you will have to contend with. Get Into Pc

    ReplyDelete
  2. Great write-up, I am a big believer in commenting on blogs to inform the blog writers know that they’ve added something worthwhile to the world wide web!. http://www.getinntopc.com

    ReplyDelete