What is curl?
curl:
- is short for Client for URLs
- is a Unix command line tool
- transfers data to and from a server
- is used to download data from HTTP(S) sites and FTP servers
Checking curl installation
Check curl installation:
man curl
If curl has not been installed, you will see: curl command not found.
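A scripted version of this check is sketched below. It assumes a POSIX shell; have_tool is a hypothetical helper name, not something curl provides.

```shell
# have_tool is a hypothetical helper: it succeeds (exit status 0)
# when the shell can find the named command.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool curl; then
    echo "curl is installed"
else
    echo "curl is not installed"
fi
```

The same helper works for any other command name, for example have_tool wget.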
Learning curl Syntax
Basic curl syntax:
curl [option flags] [URL]
URL is required.
curl also supports HTTP, HTTPS, FTP, and SFTP. For a full list of the options available:
curl --help
Downloading a Single File
Example: A single file is stored at:
https://websitename.com/datafilename.txt
Use the optional flag -O (capital letter O, not zero) to save the file with its original name:
curl -O https://websitename.com/datafilename.txt
To rename the file, use the lowercase -o followed by the new filename:
curl -o renamedatafilename.txt https://websitename.com/datafilename.txt
Downloading Multiple Files using Wildcards
Oftentimes, a server will host multiple data files, with similar filenames:
https://websitename.com/datafilename001.txt
https://websitename.com/datafilename002.txt
...
https://websitename.com/datafilename100.txt
Using Wildcards (*)
Download every file hosted on https://websitename.com/ that starts with datafilename and ends in .txt:
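curl -O https://websitename.com/datafilename*.txt
Note that curl's documented URL globbing covers only {a,b,c} sets and [start-end] ranges. curl does not expand * itself, so this command only works if the server understands the pattern. The range syntax in the next section does not depend on the server.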
Downloading Multiple Files using Globbing Parser
Continuing with the previous example:
https://websitename.com/datafilename001.txt
https://websitename.com/datafilename002.txt
...
https://websitename.com/datafilename100.txt
Using Globbing Parser
The following will download every file sequentially, starting with datafilename001.txt and ending with datafilename100.txt:
curl -O https://websitename.com/datafilename[001-100].txt
Increment through the files and download every Nth file (e.g. datafilename001.txt, datafilename011.txt, …, datafilename091.txt):
curl -O https://websitename.com/datafilename[001-100:10].txt
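To preview which files a range with a step would request, without contacting a server, the expansion can be imitated in plain shell. This is only a sketch of the documented behaviour (start at the first number, then add the step until the last number is passed). expand_range is a hypothetical helper, and the three-digit padding is hard-coded to match the example filenames.

```shell
# expand_range FIRST LAST STEP
# Hypothetical helper: prints the URLs a [FIRST-LAST:STEP] range would cover.
# The counter is kept as a plain integer and padded only when printed.
expand_range() {
    i=$1
    while [ "$i" -le "$2" ]; do
        printf 'https://websitename.com/datafilename%03d.txt\n' "$i"
        i=$((i + $3))
    done
}

expand_range 1 100 10
```

With a step of 10 this prints ten URLs, from datafilename001.txt to datafilename091.txt. A step of 1 prints all one hundred.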
Preemptive Troubleshooting
curl has two particularly useful option flags in case of timeouts during download:
- -L: Follows the redirect if the server responds with a 3xx status code.
- -C: Resumes a previous file transfer if it times out before completion. It takes an offset argument; use -C - to let curl work out where to resume.
Putting everything together:
curl -L -O -C - https://websitename.com/datafilename[001-100].txt
- All option flags come before the URL
- Order of the flags does not matter, as long as -C keeps its offset argument (e.g. -L -C - -O is fine)
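The combined command can be wrapped in a small function so that it can be inspected before anything is downloaded. This is a sketch: fetch_series and the DRY_RUN variable are hypothetical names, and the URL is the placeholder used throughout this section. The URL is quoted so that the shell passes the square brackets to curl untouched.

```shell
# fetch_series URL
# Hypothetical wrapper around the combined command.
fetch_series() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Show the command instead of running it.
        printf 'curl -L -O -C - %s\n' "$1"
    else
        # -L follows redirects, -O keeps the remote filename,
        # -C - resumes from wherever a partial file left off.
        curl -L -O -C - "$1"
    fi
}

DRY_RUN=1
fetch_series 'https://websitename.com/datafilename[001-100].txt'
```

Setting DRY_RUN to 0 (or leaving it unset) runs the real download.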
Downloading data using Wget
What is Wget?
Wget:
- derives its name from World Wide Web and get
- is native to Linux but compatible with all operating systems
- is used to download data from HTTP(S) and FTP
- is better than curl at downloading multiple files recursively
Learning Wget Syntax
Basic Wget syntax:
wget [option flags] [URL]
URL is required.
Wget also supports HTTP, HTTPS, and FTP. For a full list of the option flags available, see:
wget --help
Downloading a Single File
Option flags unique to Wget:
- -b: Go to background immediately after startup
- -q: Turn off the Wget output
- -c: Resume broken download (i.e. continue getting a partially-downloaded file)
wget -bqc https://websitename.com/datafilename.txt
Advanced downloading using Wget
Multiple file downloading with Wget
Save a list of file locations in a text file.
cat url_list.txt
Returns:
https://websitename.com/datafilename001.txt
https://websitename.com/datafilename002.txt
...
Download from the URL locations stored within the file url_list.txt using -i.
wget -i url_list.txt
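If the files follow a numbered pattern, the list itself can be generated rather than typed. The loop below is a sketch that assumes the same placeholder site and three-digit numbering as the curl examples; make_url_list is a hypothetical helper name.

```shell
# make_url_list FIRST LAST
# Hypothetical helper: prints one URL per line, zero-padded to 3 digits.
make_url_list() {
    n=$1
    while [ "$n" -le "$2" ]; do
        printf 'https://websitename.com/datafilename%03d.txt\n' "$n"
        n=$((n + 1))
    done
}

# Write the 100 URLs to the file that wget -i will read.
make_url_list 1 100 > url_list.txt
```

The resulting url_list.txt can then be passed to wget -i as shown above.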
Setting download constraints for large files
Set an upper download bandwidth limit (by default in bytes per second) with --limit-rate. Syntax:
wget --limit-rate={rate}k {file_location}
Example:
wget --limit-rate=200k -i url_list.txt
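A rate cap also sets a floor on how long a download takes. The arithmetic below is a rough sketch: min_seconds is a hypothetical helper, the 100 MiB file size is an invented example, and it assumes the k suffix means 1024 bytes.

```shell
# min_seconds SIZE_BYTES RATE_K
# Hypothetical helper: shortest possible download time in whole seconds
# for a file of SIZE_BYTES when the rate is capped at RATE_K kilobytes/s.
# Assumption: 1k = 1024 bytes.
min_seconds() {
    bytes_per_sec=$(($2 * 1024))
    # Round up so a partial final second still counts.
    echo $(( ($1 + bytes_per_sec - 1) / bytes_per_sec ))
}

min_seconds 104857600 200   # 100 MiB at --limit-rate=200k, prints 512
```

At 200k, a 100 MiB file therefore needs at least 512 seconds, or about eight and a half minutes.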
Curl versus Wget
curl advantages:
- Can be used for downloading and uploading files from 20+ protocols.
- Easier to install across all operating systems.
Wget advantages:
- Has many built-in functionalities for handling multiple file downloads.
- Can handle various file formats for download (e.g. file directory, HTML page)
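One practical difference is how each tool spells the same request. The sketch below prints a resume-capable download command for either tool; resume_cmd is a hypothetical helper, and it only prints the command rather than running it. curl needs -O to keep the remote filename and -L to follow redirects, whereas Wget does both by default.

```shell
# resume_cmd TOOL URL
# Hypothetical helper: prints the command that downloads URL with TOOL,
# resuming a partial file if one exists.
resume_cmd() {
    case "$1" in
        curl) printf 'curl -L -O -C - %s\n' "$2" ;;
        wget) printf 'wget -c %s\n' "$2" ;;
        *)    echo "unknown tool: $1" >&2; return 1 ;;
    esac
}

resume_cmd curl https://websitename.com/datafilename.txt
resume_cmd wget https://websitename.com/datafilename.txt
```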