Jul 11, 2013

How Do You Display HTTP Host or Request Domain or Request URL in Web Log?

How do I include the request URL or HTTP host in my web server's access and error log?

Here's the question. I am running several hosts on one web server, and they all share the same web server log. The default log entry looks like the following.

- - [11/Jul/2013:13:44:48 +0800] "GET /fashion-guide-on-wearing-sweaters.html HTTP/1.1" 200 7201 "-" "facebookexternalhit/1.1 (+"
Nowhere does it tell me which HTTP host the request hit. It could be any of the hosts on this server. How do I tell?

Obviously another solution is to split the web log into a separate log file for each host. For the purpose of this discussion let's assume I put all the logs into one single file.
I cannot believe nobody on Google knows. Fortunately I know the answer and I'd like to share it with you.

How Log Formatting Works
Nginx uses the log_format directive to define the format of a log entry (Apache has an equivalent directive called LogFormat). First locate your web server's configuration file. In Nginx, for example, the default log format is called combined and it looks like the following.

$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent"
An example log entry following this format looks like the following.

- - [11/Jul/2013:13:44:48 +0800] "GET /fashion-guide-on-wearing-sweaters.html HTTP/1.1" 200 7201 "-" "facebookexternalhit/1.1 (+"
Any empty or missing value is shown as a hyphen.

If you use Nginx, the init script /etc/init.d/nginx tells you which configuration file the server loads. In my case the log-related settings are in /etc/nginx/nginx.conf.

The default log format, combined, is predefined and cannot be altered, so we have to define a new one.
How to Add a Custom Log Format with Request Domain or HTTP Host
The variables used in log_format include $http_HEADER, where HEADER is the name of a header in the HTTP request. The header name is converted to lowercase and any dashes are replaced with underscores, as in $http_user_agent for the User-Agent header.
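To make the naming rule concrete, here is a minimal sketch of it as a shell function. The function name header_to_nginx_var is my own invention for illustration and is not part of Nginx; it only mimics the lowercase-and-underscore rule described above.

```shell
# Hypothetical helper (not part of Nginx): map an HTTP request header name
# to the name of the Nginx variable that holds its value.
header_to_nginx_var() {
    # Lowercase the header name and turn dashes into underscores,
    # then prefix the result with a literal "$http_".
    printf '$http_%s\n' "$(printf '%s' "$1" | tr 'A-Z-' 'a-z_')"
}

header_to_nginx_var "User-Agent"   # prints $http_user_agent
header_to_nginx_var "Host"         # prints $http_host
```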

A modern browser always sends the Host HTTP header, whose value is the subdomain and domain of the request URL. would be an example value of the Host HTTP header.

Let's add a new log format called logFormatWithHttpHost to our web server's configuration file, inside the http block and just above the access_log directive.
log_format logFormatWithHttpHost '$remote_addr - $remote_user [$time_local] '
                    '* $http_host * '
                    '"$request" $status $body_bytes_sent '
                    '"$http_referer" "$http_user_agent"';
Now you can use the new log format logFormatWithHttpHost in access_log by adding the following line to your web server configuration file. Note that Nginx's error_log directive does not accept a log format, so this only applies to the access log.

access_log /home/ubuntu/web-server-log/access.log logFormatWithHttpHost;
After you save your changes, do a configuration test to validate your configuration file's syntax. On Ubuntu 10.04 the command is this.

sudo /etc/init.d/nginx configtest
Reload (sudo /etc/init.d/nginx reload) or restart (sudo /etc/init.d/nginx restart) your web server and monitor the web log to see the newly formatted log entries. Below is an example of such entries.

- - [11/Jul/2013:13:44:48 +0800] * * "GET /fashion-guide-on-wearing-sweaters.html HTTP/1.1" 200 7201 "-" "facebookexternalhit/1.1 (+"
Now you have both the domain and the path, so you know the exact request URL!
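Once the host is in every entry, you can also break the shared log down by host. Below is a rough sketch, assuming the log was written with the logFormatWithHttpHost format above, so the host sits between two " * " markers. The function name count_requests_per_host is hypothetical.

```shell
# Hypothetical helper: count requests per host in a log written with
# logFormatWithHttpHost. Takes the path to the log file as its argument.
count_requests_per_host() {
    # Split each line on the " * " markers around $http_host; field 2 is the host.
    # Lines without both markers (for example, old-format entries) are skipped.
    awk -F ' [*] ' 'NF >= 3 { n[$2]++ } END { for (h in n) print n[h], h }' "$1" |
        sort -k1,1nr -k2,2   # busiest host first, ties broken by host name
}
```

For example, count_requests_per_host /home/ubuntu/web-server-log/access.log prints one line per host with the request count in front of it.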

Obviously this solution fails if the client does not include the Host header in its HTTP headers. HTTP/1.1 requires that header, and as mentioned every modern browser sends it, including Chrome, Firefox, Internet Explorer and Safari.
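If you want to see whether any requests arrived without a host, you can look for entries where the host position holds a hyphen, since empty values are logged that way. This is only a sketch under the same assumption as before (a log written with logFormatWithHttpHost), and requests_without_host is a made-up name.

```shell
# Hypothetical helper: print the log entries whose host value is empty,
# which shows up as "-" between the two " * " markers.
requests_without_host() {
    # Field 2 is the value logged for $http_host; keep only the lines where it is "-".
    awk -F ' [*] ' 'NF >= 3 && $2 == "-"' "$1"
}
```

Run it as requests_without_host /home/ubuntu/web-server-log/access.log. If it prints nothing, every logged request carried a host.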
If you have any questions let me know and I will do my best to help you!
One Minute Information - by Michael Wen