The Ballad of cURL and wget


Categories: Tools Tags: curl wget terminal tools

Whether you’re a web developer, a backend API virtuoso or even a terminal dweller, you’ll eventually end up building software against some server over a network, or you’ll need to check whether some endpoint is working correctly. Fear not (as if you were :p), this post covers lots of options.

First of all, why would you even care about such tools? Well, what would be the simplest way to check whether a server returns HTTP/1.1 200 OK or HTTP/1.1 301 Moved Permanently? If these codes sound strange, I suggest reading up on HTTP response status codes, for example in the Mozilla (MDN) docs.
How do you know whether your server sends correct data to the client? Or whether the POST payload you send to a server is valid?

Let’s go over the tools one by one and apply them to some useful everyday tasks.

cURL: The long name is "client URL". So simple you might mistake it for an incompetent tool or some mid-90s shareware. But it’s far, far from it. This is my go-to means to get over and be done with specific tasks. Now that I think about it, this part might sound subjective and highly biased - which it is, but I’ll give some spotlight to other players too :)

Ok, how do you get started? Well, most GNU/Linux distros come with cURL pre-installed. You’ll probably need to install it on other operating systems.

Once ready, run the following command to check if curl is in place (in the terminal):

$ curl

Note that on Windows you’ll need to run something like .\curl.exe in your PowerShell window.

The usual output you’ll get is:

 curl: try 'curl --help' or 'curl --manual' for more information
meaning curl is configured and ready for some action.

So, let’s check what we get when “curling” this very site (don’t forget to input the whole URL, with either http:// or https://):

$ curl <url>

<!doctype html><html lang=en-us><head><meta name=generator content="Hugo 0.55.6"><meta charset=utf-8><meta name=viewport content="width=device-width,initial-scale=1"><meta name=description content="Personal #tech oriented blog. Programming and system administration topics."><link rel=apple-touch-icon sizes=180x180 href=/apple-touch-icon.png><link rel=icon type=image/png sizes=32x32 href=/favicon-32x32.png><link rel=icon type=image/png sizes=16x16 href=/favicon-16x16.png><link rel=manifest href=/site.webmanifest><title> - AFK reflection. #tech</title><link rel=stylesheet href=/css/style.css><link rel=stylesheet href=/css/fonts.css><link rel=stylesheet href=/css/theme-override.css> ...

You are now seeing the site’s response HTML “dumped” inline in the terminal. Not much value here, but once you add additional parameters, you’ll quickly discover the possibilities.

If we only want the response headers, we need to pass an extra parameter to the command.

$ curl -I <url>

HTTP/2 200
cache-control: public, max-age=0, must-revalidate
content-type: text/html; charset=UTF-8
date: Thu, 08 Aug 2019 22:57:05 GMT
etag: "34eee92316069568d85fe87fa80e08fd-ssl"
strict-transport-security: max-age=31536000
age: 53692
content-length: 4195
server: Netlify
x-nf-request-id: c814fd00-2e76-4fe3-b8a0-f524daca45e7-1365349

You now see the (sometimes cached) response headers from the server the site is running on, and you can see that no errors are returned by the server either. So by passing different parameters, cURL will fetch and present exactly the information you need. cURL supports lots of protocols for your pleasure (FTP, HTTP, HTTPS, IMAP, POP3, LDAP…) and, over HTTP, all the request methods (GET, POST, PUT, PATCH, DELETE…). After you get through the basics here, don’t hesitate to run “$ man curl” and search for that specific function that you might need.

Here is a list of some switches that I often catch myself using; they might help you get started too:

-i            print response headers and response body
-I            print response headers only
-L            follow redirects
-v            verbose output
-k            allow insecure requests (skip certificate verification)
-o <file>     write output to <file> (using the shell’s >> redirection instead appends to that file)
-O            write output to a file named after the remote file, handy for downloading single files
-X "DELETE"   (or POST, PUT…) set the request method, e.g. against a REST endpoint
-C <offset>   continue a download at the given offset ("-C -" detects the offset automatically)
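Putting a few of these switches together, here is a small sketch (the example.com URLs are placeholders, swap in your own endpoints):

```shell
# Print only the HTTP status code: -s silences the progress meter,
# -o /dev/null discards the body, -w prints the chosen variable.
curl -s -o /dev/null -w "%{http_code}\n" https://example.com/

# Send a JSON payload to a (hypothetical) REST endpoint with -X:
curl -s -X POST \
     -H "Content-Type: application/json" \
     -d '{"name":"test"}' \
     https://example.com/api/items
```

The -w trick is a quick way to answer the “200 or 301?” question from the start of the post without scrolling through headers.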

And now for some examples:
Tip no. 1 - Using curl, you can fetch an online script and “pipe” it to some other program (but be careful with those: you never know what you might end up running!).

$ curl <script-url> | sh

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    43  100    43    0     0    105      0 --:--:-- --:--:-- --:--:--   105
Hello World!
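If piping straight into sh makes you nervous (it should), a slightly safer pattern is to download the script first, read it, and only then run it. The URL below is a placeholder:

```shell
# -f fails on HTTP errors instead of saving an error page,
# -sS stays quiet but still reports failures, -L follows redirects.
curl -fsSL https://example.com/install.sh -o install.sh
less install.sh   # read it before you trust it
sh install.sh
```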

Tip no. 2 - cURL is also capable of resuming broken downloads. Just add an additional switch to the terminal command.

$ curl -L -O <file-url>

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0 4376M    0 1189k    0     0   799k      0  1:33:24  0:00:01  1:33:23  799k

Let’s simulate the network going down and the download session breaking. You are now left with a partial file containing the bytes downloaded before the network issue occurred, and in order to resume, you need to run the command from the directory where that partial file sits.

$ curl -L -O -C - <file-url>

** Resuming transfer from byte position 1746432
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0 4374M    0  333k    0     0   643k      0  1:55:56 --:--:--  1:55:56  643k

A few things to note: I used the same command with an additional "-C -". The terminal shows a message that the transfer is resuming from the byte it last downloaded.
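If you want to try this yourself without waiting for a real outage, you can fake the “network went down” part: cap the speed with --limit-rate and cut the transfer off with --max-time, leaving a partial file behind, then resume it. The URL is a placeholder:

```shell
# Deliberately slow the download and abort it after 5 seconds,
# which leaves a partial file in the current directory:
curl -L -O --limit-rate 50k --max-time 5 https://example.com/big.iso

# Resume from the size of the partial file:
curl -L -O -C - https://example.com/big.iso
```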

On the other side, let’s check GNU wget, which is the older of the two tools. Although it does not support nearly as many protocols as cURL, it is still quite a capable asset to have around. The name is derived from World Wide Web and get, according to documented sources. It supports the HTTP, HTTPS, FTP and FTPS protocols.

Similar to cURL, it is present on almost all GNU/Linux distributions. It is “possible” to send files with wget, but its main purpose is downloading. It was originally created to cope with slow download speeds and to resume broken downloads. cURL already handles all of these features just fine, yet I find it good to know that there are alternatives (in case cURL is not available) and how to get around with wget.

For a quick test, you can type in $ wget full-url-to-file. This will download the selected file to the directory you called wget from. Simple as that! If you want to skip the certificate check, just pass in --no-check-certificate and you’re done. Or, to mirror the curl header check from earlier, try $ wget --server-response url.
To download something large in the background, simply use $ wget -bcq full-url-to-file. This runs wget as a separate background process (you get its pid), automatically resumes on broken connections and quits when it finishes.
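For comparison, the wget equivalents of the earlier curl examples might look like this (placeholder URLs again); note that -c resumes a partial download much like curl’s "-C -":

```shell
# Print a page to stdout, like plain curl (-O - means "output to stdout"):
wget -q -O - https://example.com/

# Resume a broken download of a large file:
wget -c https://example.com/big.iso

# Background + resume + quiet, as described above (output goes to wget-log):
wget -bcq https://example.com/big.iso
```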

I would like to finish this post by mentioning Postman too: a GUI-based application as capable as the aforementioned tools, available on all major platforms.
Keep in mind there is no one-tool-fixes-all! The key is choosing the right means to help you overcome your task.

As a reflection on the questions we started with: use wget when you want to download a single file or a whole website; use curl for the fancier stuff.