Learning NES games with lexicographic orderings

Apr 14 2013

Tom Murphy (the seventh, apparently) wrote a program that brute-forces NES games and tries to find out how to win them by increasing lexicographic orderings. I love the results starting around 7:35.

Apache's mod_rewrite is a beast.

Apr 12 2013

You can do the weirdest things with mod_rewrite: crazy regexes, load balancing in all flavors, dynamic content generation, and chaining all sorts of complex rules to your heart's content.

But as it turns out, it is horrible when you just want to do one simple thing. In my case, that simple thing was actually rewriting my URLs. But let's start at the beginning...

FastCGI

I'm running my homepage on uberspace.de, a nice little provider with an Apache webserver and FastCGI. With FastCGI you store your programs in their own directory, e.g. /fcgi-bin/myprogram.cgi, and Apache delegates the work to such a program. Apache needs to know which program it should execute for which URL, and it makes a sensible default assumption: the URL contains the program name. Therefore, if I wanted to access my blog, I would go to http://fmutzel.de/fcgi-bin/myprogram.cgi/blog. If I wanted to access my main page, I'd have to go to http://fmutzel.de/fcgi-bin/myprogram.cgi/. That looks a bit ugly, though.

URL rewriting

That's where mod_rewrite comes into play. It is designed to rewrite URLs to make them look nicer. Unfortunately, it can do a ton of things, and that makes it horrendously complicated. Essentially, all I wanted to do was map all requests to my custom-made script. The internet suggests putting the following in a .htaccess file to configure Apache's mod_rewrite:

RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ /fcgi-bin/myprogram.fcgi/$1 [QSA,L]

This little snippet should check whether the requested filename is an existing regular file and, if not, rewrite the entered URL x to /fcgi-bin/myprogram.fcgi/x. The things in the square brackets are flags: QSA is "Query String Append", which translates into "also copy everything after a question mark", and L ("last") stops processing any further rules.
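To make the intended behavior concrete, here is a rough Python sketch of what the snippet is supposed to do. It is only a model of the intent, not mod_rewrite's actual implementation; the function `rewrite` and its `file_exists` flag are made up for illustration.

```python
import re

def rewrite(path, query="", file_exists=False):
    """Model the intended effect of the .htaccess snippet (hypothetical helper)."""
    # RewriteCond %{REQUEST_FILENAME} !-f: existing regular files are left alone.
    if file_exists:
        return path + ("?" + query if query else "")
    # In a .htaccess file the RewriteRule pattern sees the path without
    # its leading slash, so strip one slash before matching.
    local_path = path[1:] if path.startswith("/") else path
    match = re.match(r"^(.*)$", local_path)
    # RewriteRule ^(.*)$ /fcgi-bin/myprogram.fcgi/$1
    target = "/fcgi-bin/myprogram.fcgi/" + match.group(1)
    # The query string is carried over to the rewritten URL.
    if query:
        target += "?" + query
    return target

blog_url = rewrite("/blog", "page=5")
static_url = rewrite("/style.css", file_exists=True)
root_url = rewrite("/")
```

Here `blog_url` ends up as /fcgi-bin/myprogram.fcgi/blog?page=5, while the existing file /style.css is served untouched.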

%3F

This should do the trick, and it did until I had a question mark in a URL. Question marks are a weird thing in URLs. They stem from the time when URLs were paths to scripts plus options, and the question mark signified "okay, up to here was a path to a file and the following are options", e.g. domain.com/users/whoever/blog?page=5. (Nowadays this is a bit obsolete, since the folder path in the URL usually has nothing to do with the folders in the server's file system.) So if you don't want your question mark to be interpreted in a special way, you escape it by writing %3F instead, the same way %20 stands for a blank space. This is percent-encoding, a general mechanism for encoding any character in a URL.
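For illustration, Python's standard library shows the same escaping rules. This is just a small demonstration with urllib.parse; the variable names and the escaped example URL are mine, not part of the setup described here.

```python
from urllib.parse import quote, unquote, urlsplit

# Escaping: "?" becomes %3F and a blank space becomes %20.
escaped_question = quote("what?")
escaped_space = quote("a b")

# Unescaping reverses it.
decoded = unquote("what%3F%20now")

# A literal "?" separates the path from the options (the query string) ...
plain = urlsplit("http://domain.com/users/whoever/blog?page=5")

# ... while an escaped %3F stays part of the path.
with_escape = urlsplit("http://domain.com/blog/what%3F")

print(escaped_question, escaped_space, decoded)
print(plain.path, plain.query)
print(with_escape.path, repr(with_escape.query))
```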

Bug Hunt

Unfortunately, that's exactly what I was doing anyway, but it still didn't work. The URL with the question mark gave me a 404 page. I checked my program and found out that the question mark simply didn't arrive at all. I wasn't sure who was responsible for eating it up - the code that I used? Apache? I googled around for a while and found nothing.

That's when I got the idea that mod_rewrite might be the culprit. At first I didn't think that could be the case. A module designed to rewrite URLs that can't rewrite URLs properly?

Turns out that is actually the case. As weird as it sounds, mod_rewrite decodes the escaped question mark, then splits the URL at the question mark and (when using QSA) re-appends a regular, un-encoded question mark when putting everything back together. There is a horrendously long bug report from 2005 in Apache's bug tracker, which was closed in 2011 essentially because the bug report got too complicated. I'm not kidding.
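The order of operations is the whole problem, and a toy model in Python makes that visible. This is only a sketch of the behavior as described in this post, not Apache's code; both function names are hypothetical.

```python
from urllib.parse import unquote

def buggy_split(raw_path):
    """Decode first, then split at '?' (the order described above)."""
    decoded = unquote(raw_path)
    path, _, query = decoded.partition("?")
    return path, query

def expected_split(raw_path):
    """Split at a literal '?' first, then decode only the path."""
    path, _, query = raw_path.partition("?")
    return unquote(path), query

# The escaped question mark is eaten in the buggy order ...
buggy = buggy_split("/blog/what%3F")
# ... and survives as part of the path in the expected order.
expected = expected_split("/blog/what%3F")

print(buggy)
print(expected)
```

In the buggy order, anything after the escaped question mark would even turn into a query string: `buggy_split("/blog/why%3Fnot")` yields the path /blog/why and the query not.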

Workaround

It took me a while to find a solution, and I'm not very happy with it. What I've got now is this:

RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{THE_REQUEST} "\ (/.*)\ "
RewriteRule .* /fcgi-bin/myprogram.fcgi/%1 [QSA,L]

Now this thing goes back to a really low level. It checks the HTTP request line, the raw bytes that the browser sends to the server when requesting a page, which usually looks something like "GET /some/path/here?a=b HTTP/1.1". The new RewriteCond extracts the request URL from that line on its own, before mod_rewrite has decoded anything. It does that by searching for a space, followed by anything that starts with a slash, followed by a space.

Also note that $1 changed to %1. This means it references the match in parentheses in the RewriteCond and not the match in the RewriteRule (which would only have everything up to the question mark).
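The matching described above can be tried out in Python with an equivalent pattern. This is a sketch of the regex only, not of how Apache evaluates it; `request_target` is a hypothetical helper name.

```python
import re

# Same idea as "\ (/.*)\ " in the RewriteCond: a space, then a group
# starting with a slash, then a space. The greedy .* runs up to the
# last space, i.e. the one in front of "HTTP/1.1".
REQUEST_TARGET = re.compile(r" (/.*) ")

def request_target(request_line):
    """Return the raw, still-encoded URL from an HTTP request line."""
    match = REQUEST_TARGET.search(request_line)
    return match.group(1) if match else None

with_query = request_target("GET /some/path/here?a=b HTTP/1.1")
with_escape = request_target("GET /blog/what%3F HTTP/1.1")

print(with_query)
print(with_escape)
```

The captured group plays the role of %1: it still contains the %3F exactly as the browser sent it.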

Conclusion

Apache's mod_rewrite is weird. The name suggests it was invented to rewrite URLs, but it doesn't do that so easily. But then again, you can hack it to do whatever you want with the browser's HTTP request...

Sonder

Apr 12 2013

In other news, is this tumblr now or what?

Put a Burger In Your Shell

Apr 6 2013

Andre Torrez hacks the PS1 variable in bash to display a Unicode burger. Neat idea. Unfortunately, my Linux font doesn't have a glyph for that burger's Unicode code point :( - luckily people are working on that problem.

export PS1="\w 🍔 "

How many more skills can you master?

Apr 5 2013

Andy Whitlock from nowincolour.com has an interesting calculation: If it takes 10,000 hours to master a skill, how many more can you master?

In my case, the expected value is at about 6.
