How to Browse Wikipedia Offline/Locally?

Posted on June 6, 2007
Filed Under Misc | 6 Comments


Update (8/14/2007): This article is outdated because WikiFilter is incompatible with the latest Wikipedia XML dump file. Check the comments for a workaround.

Although Wikipedia's credibility is not well respected in academic communities, that hasn't stopped it from becoming the most popular encyclopedia on the web. According to up-to-date statistics, the English Wikipedia contains almost 2 million articles. Don't scream if I tell you that you can store every article in the English Wikipedia on your computer and view them locally, because you can.

What you will need:

  • Wikipedia dump file
  • Web Server
  • WikiFilter
  • Microsoft Windows (to run WikiFilter)
  • 10GB of free hard disk space
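
Before downloading anything, it may help to confirm that the target drive really has the 10GB listed above. The following is a minimal Python sketch, not part of WikiFilter; the function name and the default path are made up for illustration.

```python
import shutil

# Figure taken from the requirements list above.
REQUIRED_GB = 10


def has_enough_space(path, required_gb=REQUIRED_GB):
    """Return True if the drive holding `path` has at least `required_gb` GB free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024 ** 3
```

Calling has_enough_space("C:\\") (or any folder on the drive you plan to use) returns True or False.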

Grabbing Dump Files

You can obtain dump files from the Wikipedia download page. Follow this link and download the dump file titled "Articles, templates, image descriptions, and primary meta-pages." You can download other dump files, such as the one that includes images, but beware that those files can be enormous after decompression.

Decompress the downloaded dump file using a decompression utility such as WinRAR. The result is an XML file of approximately 8GB.
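
If you would rather not use a GUI tool, the decompression can also be scripted. This is only a sketch and it assumes the dump was downloaded as a .bz2 archive; the function name and file names are placeholders, not anything WikiFilter requires. It streams the data in chunks so the multi-gigabyte file never has to fit in memory.

```python
import bz2
import shutil


def decompress_bz2(src_path, dst_path, chunk_size=1024 * 1024):
    """Stream-decompress a .bz2 file to dst_path, one chunk at a time."""
    with bz2.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst, chunk_size)
```

For example, decompress_bz2("dump.xml.bz2", "dump.xml") writes the XML file next to the archive (both names are hypothetical).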

Dump File Indexing

Download WikiFilter from SourceForge and extract the zip file.

Run WikiIndex.exe to start WikiFilter. The WikiFilter interface is straightforward: enter the full path to the XML file you extracted in the previous step, press the "load" button, then press the "Start" button to start indexing.
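
Indexing takes a while, so it can be worth checking first that the XML file is complete and well-formed. The sketch below is my own sanity check and has nothing to do with how WikiFilter reads the file: it streams through the dump with Python's standard XML parser and counts the page elements. A truncated or corrupted file would raise a ParseError instead of returning a count. The function name is made up for illustration.

```python
import xml.etree.ElementTree as ET


def count_pages(xml_path):
    """Count <page> elements in a MediaWiki XML dump without loading it all."""
    context = ET.iterparse(xml_path, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    count = 0
    for event, elem in context:
        # Tags carry the export namespace, e.g. "{http://...}page",
        # so compare only the part after the closing brace.
        if event == "end" and elem.tag.rsplit("}", 1)[-1] == "page":
            count += 1
            root.clear()  # drop finished pages so memory use stays flat
    return count
```

On a full English Wikipedia dump this will still take a long time, since every byte of the file has to be parsed once.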

Installing a Web Server

If you don't already have a web server running on your computer, you can read this article to find out how to install Apache HTTP Server.

Important: If you want to use Apache, you HAVE TO use Apache HTTP Server version 2.0.x, since 2.2.x is not compatible with the WikiFilter extension.

After the server is installed, you need to enable the WikiFilter extension for your server. Be sure to restart the server after making the modification.

The official WikiFilter installation guide has detailed information on how to do this for Apache and IIS.

Troubleshooting

During Apache startup, if you receive "The request operation has failed!" after enabling the WikiFilter extension, you may be using an incompatible version of Apache HTTP Server. Please make sure you are using version 2.0.x.
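
To double-check the version, you can run the Apache binary with the -v option and look at the "Server version" line it prints. The Python sketch below only parses such a line; the function names are made up, and the sample banners in the usage note are illustrative rather than copied from a real installation.

```python
import re


def apache_version(banner):
    """Extract (major, minor, patch) from a banner like 'Apache/2.0.63 (Win32)'."""
    match = re.search(r"Apache/(\d+)\.(\d+)\.(\d+)", banner)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def is_wikifilter_compatible(banner):
    """Return True only for the 2.0.x series this article says WikiFilter needs."""
    version = apache_version(banner)
    return version is not None and version[:2] == (2, 0)
```

With a banner such as "Server version: Apache/2.0.63 (Win32)" the check passes; with an Apache/2.2.x banner it does not.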

Start Browsing

Type http://localhost/wiki into your browser's address bar to start browsing.
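
If the page does not load, a quick script can tell you whether the web server is answering at all. This is a rough sketch with a made-up function name: it requests the URL above and returns the HTTP status code, or None when nothing is listening. It deliberately ignores any system proxy because the target is the local machine.

```python
import urllib.error
import urllib.request


def http_status(url="http://localhost/wiki", timeout=5):
    """Return the HTTP status code for `url`, or None if the server is unreachable."""
    # An empty ProxyHandler bypasses any configured proxy for this local request.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # connection refused, timed out, etc.
```

A result of 200 means a page was served. None suggests the web server is not running, while an error code such as 403 or 404 suggests the server is up but the request is not being handled the way you expect.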

Wikipedia Home

Links

WikiFilter Official Installation Guide
WikiFilter Download

Comments
6 Comments so far
  1. Alejandro Tejada June 29, 2007 4:37 pm

    Hi,

    Wikifilter works great with small xml
    databases dumps, but produces an error
    while indexing big xml dumps like the
    most recent english wikipedia xml dump
    that have almost 11 GB.

    Look at this screenshot from the folder
    where i put the xml dump and the Wikifilter
    program:
    http://www.geocities.com/capellan2000/WikiFilter_Indexes.PNG
    It seems that the index file is left
    unfinished and after this, no page from
    english Wikipedia loads fine.
    This is commented in the SourceForge
    messages in his website.

    By the way, im using Microsoft IIS server.

  2. Cuong August 14, 2007 9:40 pm

    Confirmed. Wikifilter doesn't work with the latest dump file (tested with the dump file on 8/2/2007). There's no sign of patch or update from SourceForge.

    However, there is alternative, which is install the static version.
    You can get the static version at http://static.wikipedia.org/downloads/.

    This method is less elegant because it will produce large number of folders and files on your hard drive. Also the download size is relatively large due to duplicated HTML code in each file.

  3. Adam October 27, 2007 4:32 pm

    Wikifilter can still do the other dumps (for example wikiquote, wiktionary) however it has some problem with english wikipedia, when writing the categories.

    Wikifilter has not been updated for a long time, hopefully someone will fix it.

  4. Redhuan D. Oon January 18, 2009 4:39 pm

    Hi,
    I wonder if this is useful in the case of just wanting to archive my work in wikiversity, say i want to store just my topics for my students in offline mode.

    Thanks for this resourceful site.

    Redhuan

  5. saadjaved February 22, 2009 5:45 am

    As mentioned by Alejandro Tejada
    i tried converting 18.1GB enwiki-20081008-pages-articles.xml
    and wikifilter2.3.exe is unable to convert it into art format properly at all... it gives output of 2 or 3 types of unfinished art files...
    what's the use of downloading such a huge file when wikiindex cannot deal with it.. please fix this issue..
    is there any success story?

  6. ulakasp December 20, 2009 8:59 pm

    I set up IIS 7.0 with the ISAPI filter and ran wikifiler.exe to index the enwiki-20091218-pages-articels.xml file, but I keep getting a message that there is no default document like index.html etc.
    This is what I get:
    A default document is not configured for the requested URL, and directory browsing is not enabled on the server.
    Has anyone had this issue?
