Tom Murphy (the seventh, apparently) wrote a program that brute-forces NES games and tries to find out how to win them by increasing lexicographic orderings. I love the results starting around 7:35.
You can do the weirdest things with mod_rewrite: crazy regexes, load balancing in all flavors, dynamic content generation and chain all sorts of complex rules to your heart's content.
But as it turns out, it is horrible when you just want to do one simple thing. In my case, that meant indeed rewriting my URLs. But let's start at the beginning...
I'm running my homepage on uberspace.de, a nice little provider with an Apache webserver and FastCGI. For FastCGI you store your own programs in their own directory, e.g. /fcgi-bin/myprogram.cgi, and Apache delegates the work to this program. Apache needs to know which program it should execute for which URL, and it makes a sensible default assumption: the URL contains the program name. Therefore, if I wanted to access my blog, I would go to http://fmutzel.de/fcgi-bin/myprogram.cgi/blog. If I wanted to access my main page, I'd have to go to http://fmutzel.de/fcgi-bin/myprogram.cgi/. That looks a bit ugly, though.
That's where mod_rewrite comes into play. It is designed to rewrite URLs to make them look nicer. Unfortunately, it can do a ton of things, and that makes it horrendously complicated. Essentially, all I wanted to do was map all requests to my custom-made script. The internet suggests putting the following in a .htaccess file to configure Apache's mod_rewrite:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ /fcgi-bin/myprogram.fcgi/$1 [QSA,L]
This little snippet should check whether the requested filename is an existing regular file, and if not, rewrite the entered URL x to /fcgi-bin/myprogram.fcgi/x. The things in the square brackets are options: QSA is "Query String Append", which translates into "also copy everything after a question mark", and L is "last", which means "stop processing further rules after this one".
This should do the trick, and it did until I had a question mark in a URL. Question marks are a weird thing in URLs. They stem from the time when URLs were paths to scripts plus options, and the question mark was used to signify "okay, up to here was a path to a file and the following are options", e.g. domain.com/users/whoever/blog?page=5 (nowadays, this is a bit obsolete since the folder path in the URL usually has nothing to do with the folders in the file system of the server). So, if you don't want your question mark to be interpreted in a special way, you escape it by writing %3F instead, the same way %20 stands for a blank space. There's a whole mechanism for encoding any character in URLs like this.
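To make the escaping concrete, here is a small Python sketch using the standard library's urllib.parse. It is only an illustration of percent-encoding in general, not part of the Apache setup, and the example title "what is this?" is made up.

```python
from urllib.parse import quote, unquote, urlsplit

# A hypothetical path segment that contains a literal question mark.
segment = "what is this?"

# safe="" makes quote() escape every reserved character, including '/'.
encoded = quote(segment, safe="")
print(encoded)           # what%20is%20this%3F
print(unquote(encoded))  # what is this?

# Left unescaped, the question mark starts the query string instead:
parts = urlsplit("http://domain.com/users/whoever/blog?page=5")
print(parts.path)   # /users/whoever/blog
print(parts.query)  # page=5
```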
Unfortunately, that's exactly what I was doing anyway, but it still didn't work. The URL with the question mark gave me a 404 page. I checked my program and found out that the question mark simply didn't arrive at all. I wasn't sure who was responsible for eating it up - the code that I used? Apache? I googled around for a while and found nothing.
That's when I got the idea that mod_rewrite might be the culprit. At first I didn't think that could be the case: a module designed to rewrite URLs that can't rewrite URLs properly?
Turns out that is actually the case. As weird as it sounds, mod_rewrite decodes the escaped question mark, then splits the URL at that question mark and (when using QSA) re-appends a regular, un-encoded question mark when putting everything back together. There is a horrendously long bug report from 2005 in their bug tracker, which was closed in 2011 essentially because the bug report got too complicated. I'm not kidding.
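The effect can be mimicked in a few lines of Python. This is a toy model of the behaviour described above, assuming "decode first, split afterwards"; it is not Apache's actual code, and naive_rewrite and the path /blog/why%3Fnot are made up for the example.

```python
from urllib.parse import unquote

def naive_rewrite(url_path: str, prefix: str = "/fcgi-bin/myprogram.fcgi") -> str:
    """Toy model of decode-then-split; not how Apache is really implemented."""
    decoded = unquote(url_path)                 # %3F becomes a literal '?'
    path, sep, query = decoded.partition("?")   # split at the first '?'
    rewritten = prefix + path
    # As with QSA, the query part is re-attached behind a plain '?'.
    return rewritten + ("?" + query if sep else "")

print(naive_rewrite("/blog/why%3Fnot"))
# /fcgi-bin/myprogram.fcgi/blog/why?not
```

The program behind the rewrite then sees the path /blog/why and a query string "not", which is why the escaped question mark never arrives.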
It took me a while to find a solution, and I'm not very happy with it. What I've got now is this:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{THE_REQUEST} "\ (/.*)\ "
RewriteRule .* /fcgi-bin/myprogram.fcgi/%1 [QSA,L]
Now this thing goes back to a really low level. It checks the HTTP request line, the raw bytes that the browser sends to the server when requesting a page, which usually looks something like "GET /some/path/here?a=b HTTP/1.1". The new RewriteCond extracts the request URL on its own. It does that by searching for a space, followed by anything that starts with a slash, followed by a space.
Also note that $1 changed to %1. This means it references the match in parentheses in the RewriteCond and not the match in the RewriteRule (which would only have everything up to the question mark).
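To see what the pattern captures, here is the same regex tried out in Python. This is just a sketch for experimenting with the expression outside of Apache; extract_url is a made-up helper name, and the second request line is a hypothetical example.

```python
import re

# The pattern from the RewriteCond, written as a Python regex:
# a space, then a group starting with a slash, then a space.
REQUEST_URL = re.compile(r" (/.*) ")

def extract_url(request_line: str):
    """Return the raw, still-encoded URL from an HTTP request line."""
    match = REQUEST_URL.search(request_line)
    return match.group(1) if match else None

print(extract_url("GET /some/path/here?a=b HTTP/1.1"))  # /some/path/here?a=b
print(extract_url("GET /blog/why%3Fnot HTTP/1.1"))      # /blog/why%3Fnot
```

Because the request line has not been decoded yet, the %3F survives as %3F in the captured group.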
Apache's mod_rewrite is weird. The name suggests it was invented to rewrite URLs, but it doesn't do that so easily. But then again, you can hack it to do whatever you want with the browser's HTTP request...
Andre Torrez hacks the PS1 variable in bash to display a Unicode burger. Neat idea. Unfortunately, my Linux font doesn't have a glyph for the burger's Unicode code point :( - luckily people are working on that problem.
export PS1="\w 🍔 "
Andy Whitlock from nowincolour.com has an interesting calculation: If it takes 10,000 hours to master a skill, how many more can you master?
In my case, the expected value is about 6.