Offline Wikipedia revisited – fast offline Wikipedia reader



I’ve mentioned offline Wikipedia access before, mainly because for all its flaws, Wikipedia is probably the single largest, most comprehensive and best information resource out there. There may be other encyclopedias that are more accurate, but they require subscription access… anyway, for all the warts it’s a great resource.

To many people, though, it comes as a great shock that we’re not plugged into the internet all the time. So many people say, “Why offline? What makes Wikipedia so good is that it’s current, and if it’s not current it’s worthless.” When I was growing up we had an encyclopedia set from 1965. I grew up through the 70’s and 80’s and it was still VERY useful. It may not have been “up to date” in many areas, but it was still informative and was right on a GREAT many things. I think if I manage to download Wikipedia once a year I’ll get by on the “currency” of the information.

Anyway… the main point is that many times internet access is 1) not reliable, 2) not practical, or 3) not there. For instance, I do have wireless for the laptop, but I don’t always hook up to wireless networks. There are “bubbles” of access here and there around town, but in many of the places I go there just isn’t wireless internet available. Now I guess if I wanted to pay Verizon another $60 a month I would have MORE pervasive access, but frankly…..


… that seems like a ginormous waste of money to me for the way I work. Not to mention those times that we lose power from a storm or other event. I have battery backups and modest solar capabilities, but my cable provider DOESN’T persist through such outages. The last time we had a major power outage I hooked up the batteries to watch the local news and only made it 15 minutes in before the cable went dead. (They have repeaters up and down the road to get the signal up here. Those repeaters require power and have backup batteries, and once their batteries drain we’re dead in the water…)

So… as we saw before, there are alternatives. There are versions of Wikipedia in TomeRaider format for PDAs, which is nice.

There’s also the static HTML version that can be downloaded.

You can also set up your own wiki on a local machine and import the Wikipedia database, but that seems like overkill.

This most recent approach is fairly nice, doesn’t take long to set up and is probably the best approach I’ve seen yet (and you can update the dump yourself without waiting for the next static export or TomeRaider export). The idea is that the database dump is split into manageable chunks and searched within the compressed (bz2) chunks. A simple web server/web interface glues it to the web browser and you have a simple, fast offline Wikipedia reader. I’ve set this up on an old 500 MHz P3 Sony Vaio with 256 MB of memory, running the latest Ubuntu (7.04), and am able to search/browse fairly well. (I chose the Spanish language Wikipedia dump.) Choosing a dump other than the default is as simple as modifying the filename the script looks for.
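To give a feel for the “split it up and search inside the compressed chunks” idea, here is a rough Python sketch. It is NOT the code from the reader described above; the function names, the chunk file naming and the line-based splitting are all my own assumptions for illustration. It only assumes the dump is a bz2-compressed XML file with one `<title>…</title>` element per article.

```python
import bz2
import os


def split_dump(dump_path, out_dir, lines_per_chunk=100000):
    """Split one big bz2 file into small, independently compressed bz2 chunks.

    Hypothetical helper: splitting by line count is an assumption made to
    keep the sketch short, so an article may straddle two chunks.
    """
    os.makedirs(out_dir, exist_ok=True)
    chunk_paths = []
    buf = []

    def flush():
        # Write the buffered lines out as the next numbered chunk.
        if not buf:
            return
        path = os.path.join(out_dir, "chunk-%06d.bz2" % len(chunk_paths))
        with bz2.open(path, "wt", encoding="utf-8") as out:
            out.writelines(buf)
        chunk_paths.append(path)
        buf.clear()

    # Stream the dump so the whole thing never has to fit in memory.
    with bz2.open(dump_path, "rt", encoding="utf-8") as src:
        for line in src:
            buf.append(line)
            if len(buf) >= lines_per_chunk:
                flush()
    flush()
    return chunk_paths


def find_title(chunk_paths, title):
    """Return the path of the first chunk containing the article title, or None.

    Only one chunk is decompressed at a time, which is what keeps memory
    use low on a small machine.
    """
    needle = "<title>%s</title>" % title
    for path in chunk_paths:
        with bz2.open(path, "rt", encoding="utf-8") as chunk:
            for line in chunk:
                if needle in line:
                    return path
    return None
```

A real reader would want an index of titles to chunk numbers built once up front, so that a lookup only has to decompress a single chunk instead of scanning them all the way this sketch does.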

All in all it may have taken a half hour to download and an hour or so to split up the bz2 dump on that machine. (I don’t even want to start thinking about a MySQL import of Wikipedia on that machine… OUCH! Much less running Apache and MySQL on there…)
