Offline Wikipedia revisited – fast offline Wikipedia reader



I've mentioned offline Wikipedia access before, mainly because for all its flaws, Wikipedia is probably the single largest, most comprehensive and best information resource out there. There may be other encyclopedias that are more accurate, but they require subscription access. Anyway, for all the warts it's a great resource.

To many people, though, it comes as a great shock that we're not plugged into the internet all the time. So many people say, "Why offline? What makes Wikipedia so good is that it's current, and if it's not current it's worthless." When I was growing up we had an encyclopedia set from 1965. I grew up through the '70s and '80s and it was still VERY useful. It may not have been "up to date" in many areas, but it was still informative and was right on a GREAT many things. I think if I manage to download Wikipedia once a year I'll get by on the "currency" of the information.

Anyway, the main point is that many times internet access is 1) not reliable, 2) not practical, or 3) not there. For instance, I do have wireless for the laptop, but I don't always hook up to wireless networks. There are "bubbles" of access here and there around town, but many of the places I go just don't have wireless internet available. Now I guess if I wanted to pay Verizon another $60 a month I would have MORE pervasive access, but frankly that seems like a ginormous waste of money to me for the way I work.

Not to mention those times that we lose power from a storm or other event. I have battery backups and modest solar capabilities, but my cable provider DOESN'T persist through such outages. The last time we had a major power outage I hooked up the batteries to watch the local news and only made it 15 minutes in before the cable went dead. (They have repeaters up and down the road to get the signal up here, and those repeaters require power and have backup batteries. Once their batteries drain, we're dead in the water.)

So, as we saw before, there are alternatives. There are versions of Wikipedia in TomeRaider format for PDAs, which is nice.

There's also the static HTML version that can be downloaded.

You can also set up your own wiki on a local machine and import the Wikipedia database, but that seems like overkill.

This most recent approach is fairly nice, doesn't take long to set up and is probably the best approach I've seen yet (and you can update the dump yourself without waiting for the next static export or TomeRaider export). The idea is that the database dump is split into manageable chunks, and searches run against the compressed (bz2) chunks. A simple web server/web interface glues it to the web browser and you have a simple, fast offline Wikipedia reader. I've set this up on an old P3 500 MHz Sony Vaio running the latest Ubuntu (7.04), with 256 MB of memory, and am able to search and browse fairly well. (I chose the Spanish-language Wikipedia dump.) Choosing a dump other than the default is as simple as modifying the filename the script looks for.
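To make the chunk idea a bit more concrete, here is a minimal Python sketch of the general technique. It is my own illustration, not the actual script: the function names (`write_chunks`, `lookup`), the tab-separated line format and the chunk file naming are all assumptions, and the real tool presumably splits the existing dump and uses a proper search index rather than a plain dictionary. What it shows is why this is light on an old machine: each chunk is a small, independently compressed file, so reading one article means decompressing one chunk rather than the whole dump.

```python
import bz2
import os


def write_chunks(articles, chunk_dir, per_chunk=2):
    """Write (title, text) pairs into small, independent .bz2 files.

    Returns a dict mapping each title to the chunk file that holds it.
    Hypothetical format: one "title<TAB>text" line per article, so titles
    and texts must not contain tabs or newlines in this sketch.
    """
    index = {}
    items = list(articles)
    for start in range(0, len(items), per_chunk):
        name = "chunk%04d.bz2" % (start // per_chunk)
        path = os.path.join(chunk_dir, name)
        with bz2.open(path, "wt", encoding="utf-8") as out:
            for title, text in items[start:start + per_chunk]:
                out.write(title + "\t" + text + "\n")
                index[title] = path
    return index


def lookup(index, title):
    """Return the article text for a title, or None if it is not indexed.

    Only the single chunk named in the index is opened and decompressed.
    """
    path = index.get(title)
    if path is None:
        return None
    with bz2.open(path, "rt", encoding="utf-8") as chunk:
        for line in chunk:
            found, _, text = line.rstrip("\n").partition("\t")
            if found == title:
                return text
    return None
```

In use, you would call `write_chunks` once after downloading a dump and then call `lookup(index, "Some title")` from whatever serves pages to the browser. A smaller `per_chunk` means less to decompress per lookup at the cost of more files and slightly worse compression.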

All in all it may have taken a half hour to download and an hour or so to split up the bz2 dump on that machine. (I don't even want to start thinking about a MySQL import of Wikipedia on that machine... OUCH! Much less running Apache and MySQL on there.)
