Archive for June 2009

Using Apache as a Proxy for Server 2003

Putting the “geek” in Dawn of the Geeks are two servers I have set up. The first server runs Windows XP with Apache, MySQL, PHP and Subversion. Recently I began developing .NET web sites in C#, namely the bank site, which I use for managing and tracking personal finances. Windows XP only ships with IIS 5.1, which allows only a single web site to run. Since I have a copy of Windows Server 2003, I decided to put it to use and installed it on a second server. That server just has Windows Server 2003 with IIS 6.

Now the issue is that all port 80 requests go through Apache. So the first step was to install mod_proxy and mod_proxy_html. Now I can point various hosts to the second server. However, by default mod_proxy doesn’t pass along the original Host header; it was sending the internal IP instead. IIS 6 needs to know which host it’s serving if you want to run multiple sites.

The trick is to modify your hosts file on the Apache server so that the host name maps to the internal server.
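A minimal sketch of the two pieces of configuration this implies. The internal address 192.168.1.20 is made up, and the virtual host layout is an assumption, since the actual config isn't shown here; treat it as one plausible arrangement rather than the real setup.

```apache
# hosts file on the Apache machine
# (on Windows XP: C:\WINDOWS\system32\drivers\etc\hosts)
# 192.168.1.20 is a hypothetical internal address for the IIS 6 server.
#
#   192.168.1.20    bank.dawnofthegeeks.com

# httpd.conf (mod_proxy and mod_proxy_http must be loaded)
<VirtualHost *:80>
    ServerName bank.dawnofthegeeks.com

    # Reverse proxy only; never act as an open forward proxy.
    ProxyRequests Off

    # The target uses the public host name. The hosts file resolves it to
    # the internal IIS server, so IIS receives the expected Host header.
    ProxyPass        / http://bank.dawnofthegeeks.com/
    ProxyPassReverse / http://bank.dawnofthegeeks.com/
</VirtualHost>
```

mod_proxy also has a ProxyPreserveHost directive that forwards the incoming Host header, which would be a different route to a similar result. The hosts file approach needs no extra directive.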

The flow is:

bank.dawnofthegeeks.com hits the DNS server and resolves to the public IP of my router. The router forwards the port 80 request to the Apache server. The Apache server sees that it’s dealing with bank.dawnofthegeeks.com and reverse proxies it to bank.dawnofthegeeks.com. However, when Windows XP tries to resolve that host name, it gets the internal IP from the hosts file rather than the router’s public IP from the DNS servers. So the request gets sent to the IIS server, which sees the host name bank.dawnofthegeeks.com and serves the correct web site back through the proxy.

Through the magic of the hosts file, Apache reverse proxies a host to the same host.
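The flow can be modelled in a few lines of JavaScript. This is only a toy simulation of name resolution, not anything Apache or Windows actually runs, and both IP addresses are invented for the example.

```javascript
// Hypothetical data: both addresses are made up for illustration.
const publicDns = { 'bank.dawnofthegeeks.com': '203.0.113.10' };   // router's public IP
const apacheHosts = { 'bank.dawnofthegeeks.com': '192.168.1.20' }; // internal IIS box

// Resolve a name the way the OS does: hosts file first, then DNS.
function resolve(hostname, hostsFile, dns) {
  if (Object.prototype.hasOwnProperty.call(hostsFile, hostname)) {
    return hostsFile[hostname];
  }
  return dns[hostname] || null;
}

// Describe the request Apache sends upstream when it proxies to the same host name.
function proxyRequest(hostname, hostsFile, dns) {
  return {
    connectTo: resolve(hostname, hostsFile, dns), // where the connection goes
    hostHeader: hostname                          // what IIS sees in the Host header
  };
}

// A visitor has no hosts entry, so the name resolves to the router.
const fromBrowser = resolve('bank.dawnofthegeeks.com', {}, publicDns);
// Apache has the hosts entry, so the same name resolves to the IIS server.
const fromApache = proxyRequest('bank.dawnofthegeeks.com', apacheHosts, publicDns);

console.log(fromBrowser);
console.log(fromApache);
```

The same host name ends up at two different machines depending on who resolves it, while the Host header stays unchanged the whole way.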

Using JavaScript to Track Visitors

This is log.js, which goes in a central location:

log.js

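// Registry of script URLs already requested, so each one loads only once.
if (typeof km_scripts == 'undefined') 
	var km_scripts = {};
 
// Report this page view. Both URLs are encoded so their own "?" and "&" survive the trip.
km_myclass_import('http://js.dawnofthegeeks.com/hitlog.php?REQUEST_URI=' + encodeURIComponent(document.location.href) + '&HTTP_REFERER=' + encodeURIComponent(document.referrer));
 
// Load a script by appending a <script> element to <head>.
function km_myclass_import(jsFile) 
{
	if (km_scripts[jsFile] != null) 
		return;
	var scriptElt = document.createElement('script');
	scriptElt.type = 'text/javascript';
	scriptElt.src = jsFile;
	document.getElementsByTagName('head')[0].appendChild(scriptElt);
	km_scripts[jsFile] = jsFile;
}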
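The tracking call passes two complete URLs as query-string values, so any "?", "&" or "#" inside them has to be percent-encoded to arrive intact. Here is a small sketch of that step in isolation; buildHitUrl is a hypothetical helper name and the example.com and example.org URLs are placeholders.

```javascript
// Build the hitlog.php URL with both values percent-encoded.
// buildHitUrl is a made-up helper; it is not part of log.js.
function buildHitUrl(base, pageUrl, referrer) {
  return base +
    '?REQUEST_URI=' + encodeURIComponent(pageUrl) +
    '&HTTP_REFERER=' + encodeURIComponent(referrer);
}

const hitUrl = buildHitUrl(
  'http://js.dawnofthegeeks.com/hitlog.php',
  'http://example.com/page?id=1&sort=asc', // placeholder page URL
  'http://example.org/search?q=geeks'      // placeholder referrer
);

console.log(hitUrl);
```

Without the encoding, the "&sort=asc" part of the page URL would be read by the server as a separate parameter and the logged path would be cut short.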

This line of HTML goes at the bottom of every page that I want to track

<script type="text/javascript" src="http://js.dawnofthegeeks.com/log.js"></script>

hitlog.php does the hard work of logging the various information to a MySQL database. I can then go to my stats page and see the unique visitors and page view count for every domain that line of HTML appears on. I could break it down by pages as well but for now I’m not that interested in that level of detail in my reports.
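The stats page itself isn't shown, so the following is only a guess at the kind of roll-up it performs. It assumes a unique visitor means a distinct IP address per host, and the sample rows are invented.

```javascript
// Count page views and unique visitors per host from logged rows.
// Assumption: a "unique visitor" is a distinct IP address for that host.
function summarize(rows) {
  const perHost = new Map();
  for (const row of rows) {
    if (!perHost.has(row.host)) {
      perHost.set(row.host, { pageViews: 0, ips: new Set() });
    }
    const entry = perHost.get(row.host);
    entry.pageViews += 1;
    entry.ips.add(row.ip);
  }
  const result = {};
  for (const [host, entry] of perHost) {
    result[host] = { pageViews: entry.pageViews, uniqueVisitors: entry.ips.size };
  }
  return result;
}

// Hypothetical log rows using documentation-only addresses.
const sampleRows = [
  { host: 'a.example', ip: '198.51.100.1' },
  { host: 'a.example', ip: '198.51.100.1' },
  { host: 'a.example', ip: '198.51.100.2' },
  { host: 'b.example', ip: '198.51.100.1' }
];

const summary = summarize(sampleRows);
console.log(summary);
```

In the real setup the same grouping would more likely be done in SQL against the monthly tables; the JavaScript version only illustrates the counting.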

In order to avoid having massive unmanageable tables (GoDaddy limits you to 200MB per database), each month has its own table. If I’m running out of space in the database, I can export the older tables and delete them from the GoDaddy servers.
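A sketch of the monthly naming scheme and of picking out tables that are old enough to export. The logs_YYYY_MM pattern matches what hitlog.php generates; the tablesToArchive helper and the three-month retention window are assumptions made for the example.

```javascript
// Name of the log table for a given date, in the same logs_YYYY_MM form.
function tableFor(date) {
  const month = String(date.getMonth() + 1).padStart(2, '0');
  return 'logs_' + date.getFullYear() + '_' + month;
}

// Hypothetical helper: return tables from before the month that lies
// keepMonths months back from `now`. Zero-padded names sort chronologically,
// so a plain string comparison is enough.
function tablesToArchive(tableNames, now, keepMonths) {
  const cutoff = tableFor(new Date(now.getFullYear(), now.getMonth() - keepMonths, 1));
  return tableNames.filter(function (name) {
    return /^logs_\d{4}_\d{2}$/.test(name) && name < cutoff;
  });
}

const june2009 = new Date(2009, 5, 15); // months are zero-based in JavaScript
const existing = ['logs_2008_12', 'logs_2009_01', 'logs_2009_02', 'logs_2009_03',
                  'logs_2009_04', 'logs_2009_05', 'logs_2009_06', 'ip_to_country'];

console.log(tableFor(june2009)); // logs_2009_06
console.log(tablesToArchive(existing, june2009, 3));
```

With a three-month window in June 2009, the tables from March onward stay and everything earlier is returned as a candidate for export.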

hitlog.php

<?php
// Define $dbhost, $dbuser, $dbpass and $dbbase here before anything else runs.
 
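// Connect to MySQL and select the logging database; returns 0 on failure.
function ConnectToJS()
{
	global $dbhost, $dbuser, $dbpass, $dbbase;
 
	$db = mysql_connect($dbhost, $dbuser, $dbpass);
	if($db && mysql_select_db($dbbase,$db))
		return $db;
	return 0;
}
 
$db = ConnectToJS();
if(!$db)
	exit; // nothing can be logged without a database connection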
 
$table = "logs_" . date("Y_m",time());
 
$sql = "
CREATE TABLE IF NOT EXISTS `$table` (
  `id` bigint(20) NOT NULL auto_increment,
  `hash` varchar(255) NOT NULL,
  `ip` varchar(255) NOT NULL,
  `ip_int` bigint(20) NOT NULL,
  `ip_host` varchar(255) NOT NULL,
  `ip_country_id` int(10) NOT NULL,
  `created_at` datetime NOT NULL,
  `path` varchar(255) NOT NULL,
  `referer` varchar(255) NOT NULL,
  `referer_host` varchar(255) NOT NULL,
  `query_str` varchar(255) NOT NULL,
  `agent` varchar(255) NOT NULL,
  `host` varchar(255) NOT NULL,
  PRIMARY KEY  (`id`),
  UNIQUE KEY `hash` (`hash`),
  KEY `host` (`host`),
  KEY `ip_country_id` (`ip_country_id`),
  KEY `referer` (`referer`),
  KEY `agent` (`agent`),
  KEY `created_at` (`created_at`),
  KEY `referer_host` (`referer_host`)
) ENGINE=MyISAM 
";
 
mysql_query($sql,$db);
 
 
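$ip = $_SERVER['REMOTE_ADDR'];
$ip_host = gethostbyaddr($ip);
// The page URL is sent by log.js; fall back to an empty string if it is missing.
$path = isset($_REQUEST['REQUEST_URI']) ? $_REQUEST['REQUEST_URI'] : '';
$parts = parse_url($path);
 
// Convert the dotted IPv4 address to a single integer for the range lookup below.
$ip_int = explode(".",$ip);
$ip_int = $ip_int[0]*256*256*256 + $ip_int[1]*256*256 + $ip_int[2]*256 + $ip_int[3];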
 
$sql = "
	SELECT
		id
	FROM
		ip_to_country
	WHERE
		$ip_int >= ip_start AND $ip_int <= ip_end
";
$res = mysql_query($sql,$db);
$r = mysql_fetch_assoc($res);
$ip_country_id = isset($r['id']) ? $r['id'] : 0;
 
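$created_at = time();
// The referrer and its parts are optional, so default each one to an empty string.
$referer = isset($_REQUEST['HTTP_REFERER']) ? $_REQUEST['HTTP_REFERER'] : '';
$ref_parts = parse_url($referer);
$referer_host = isset($ref_parts['host']) ? $ref_parts['host'] : '';
$query_str = isset($ref_parts['query']) ? $ref_parts['query'] : '';
 
$agent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
$host = isset($parts['host']) ? $parts['host'] : '';
// Unique per visitor, second, page, referrer and browser; duplicates are rejected by the UNIQUE key.
$hash = md5($ip . $created_at . $path . $referer . $agent . $host);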
 
 
 
$sql = "
INSERT INTO 
	$table
SET
	hash = \"" . mysql_real_escape_string($hash) . "\",
	ip = \"" . mysql_real_escape_string($ip) . "\",
	ip_host = \"" . mysql_real_escape_string($ip_host) . "\",
	ip_int = \"" . mysql_real_escape_string($ip_int) . "\",
	ip_country_id = \"" . mysql_real_escape_string($ip_country_id) . "\",
	created_at = NOW(),
	path = \"" . mysql_real_escape_string($path) . "\",
	referer = \"" . mysql_real_escape_string($referer) . "\",
	referer_host = \"" . mysql_real_escape_string($referer_host) . "\",
	query_str = \"" . mysql_real_escape_string($query_str) . "\",
	agent = \"" . mysql_real_escape_string($agent) . "\",
	host = \"" . mysql_real_escape_string($host) . "\"
";
 
mysql_query($sql,$db);
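The IP-to-integer arithmetic and the country range lookup in hitlog.php can be tried out on their own. The sketch below mirrors that logic in JavaScript; the ranges and ids are invented stand-ins for the ip_to_country table, whose real contents aren't shown.

```javascript
// Convert a dotted IPv4 address to its unsigned 32-bit integer value.
// Returns null for anything that is not plain IPv4 (for example IPv6).
function ipToInt(ip) {
  const parts = ip.split('.');
  const valid = parts.length === 4 && parts.every(function (p) {
    return /^\d{1,3}$/.test(p) && Number(p) <= 255;
  });
  if (!valid) return null;
  const o = parts.map(Number);
  return o[0] * 16777216 + o[1] * 65536 + o[2] * 256 + o[3];
}

// Hypothetical rows shaped like the ip_to_country table (id, ip_start, ip_end).
const ipToCountry = [
  { id: 1, ip_start: ipToInt('192.0.2.0'), ip_end: ipToInt('192.0.2.255') },
  { id: 2, ip_start: ipToInt('198.51.100.0'), ip_end: ipToInt('198.51.100.255') }
];

// Same fallback as hitlog.php: 0 when no range matches.
function countryIdFor(ip, ranges) {
  const n = ipToInt(ip);
  if (n === null) return 0;
  const hit = ranges.find(function (r) {
    return n >= r.ip_start && n <= r.ip_end;
  });
  return hit ? hit.id : 0;
}

console.log(ipToInt('192.168.1.1')); // 3232235777
console.log(countryIdFor('192.0.2.77', ipToCountry));
console.log(countryIdFor('10.0.0.1', ipToCountry));
```

Storing the address as one integer lets the lookup be a simple pair of comparisons, which is why the table keeps ip_int alongside the dotted form.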