How to Browse Wikipedia Offline/Locally?
Update (8/14/2007): This article is outdated because WikiFilter is incompatible with the latest Wikipedia XML dump file. Check the comments for a workaround.
Although Wikipedia's credibility is not well respected in academic communities, that hasn't stopped it from becoming the most popular encyclopedia on the web. According to the up-to-date statistics, the English Wikipedia contains almost 2 million articles. Don't scream if I tell you that you can store every English Wikipedia article on your computer and view them locally, because you can.
What you will need:
- Wikipedia dump file
- Web Server
- WikiFilter
- Microsoft Windows (to run WikiFilter)
- 10GB of free hard disk space
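Before downloading anything, it may be worth confirming that the target drive really has the space. The following is a minimal sketch in Python (not part of WikiFilter); the function name `has_enough_space` and the default path are assumptions made for illustration, and the 10GB threshold is simply the figure from the list above.

```python
import shutil

# The 10GB figure from the requirements list, expressed in bytes.
REQUIRED_BYTES = 10 * 1024**3


def has_enough_space(path, required=REQUIRED_BYTES):
    """Return True if the drive holding `path` has at least `required` bytes free."""
    free = shutil.disk_usage(path).free
    return free >= required


if __name__ == "__main__":
    # Assumption: the dump will be stored in the current directory.
    target = "."
    if has_enough_space(target):
        print("Enough free space for the dump.")
    else:
        print("Less than 10GB free; pick another drive or clean up first.")
```

Newer dumps are larger than the one described here, so you may want to pass a bigger `required` value.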
Grabbing Dump Files
You can obtain dump files from the Wikipedia download page. Follow this link and download the dump file with the title "Articles, templates, image descriptions, and primary meta-pages." You can download other dump files, such as the one that includes images, but beware that those files can be enormous after decompression.
Decompress the downloaded dump file using a decompression utility such as WinRAR. The resulting file is an XML file of approximately 8GB.
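If you prefer a script to a GUI tool, the decompression step can be sketched in Python. This assumes the downloaded file is a bzip2 archive (a name ending in .xml.bz2); the function name and the file names in the usage line are hypothetical. The data is streamed in chunks, so the multi-gigabyte result never has to fit in memory.

```python
import bz2
import os
import shutil


def decompress_dump(src_path, dst_path, chunk_size=1024 * 1024):
    """Stream-decompress a .bz2 file to dst_path and return the output size in bytes."""
    # bz2.open yields decompressed bytes; copyfileobj moves them chunk by chunk.
    with bz2.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst, chunk_size)
    return os.path.getsize(dst_path)


if __name__ == "__main__":
    # Hypothetical file names; substitute the dump you actually downloaded.
    size = decompress_dump("enwiki-pages-articles.xml.bz2", "enwiki-pages-articles.xml")
    print(f"Wrote {size} bytes")
```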
Dump File Indexing
Download WikiFilter from SourceForge and extract the zip file.
Run WikiIndex.exe to start WikiFilter. The WikiFilter interface is fairly self-explanatory: enter the full path to the XML file you extracted in the previous step, press the "load" button, and then press the "Start" button to begin indexing.
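Indexing a file this large takes a while, so it can help to check first that the XML file was extracted completely. The sketch below is independent of WikiFilter and only uses Python's standard library; the function names are made up for this example. It streams through the dump, counts `<page>` elements, and reports whether the file parsed to the end (a truncated file raises a parse error).

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix such as '{http://...}page' down to 'page'."""
    return tag.rsplit("}", 1)[-1]


def count_pages(xml_path):
    """Return (page_count, complete); complete is False if the XML is cut short."""
    count = 0
    try:
        context = ET.iterparse(xml_path, events=("start", "end"))
        _, root = next(context)  # the first event is the start of the root element
        for event, elem in context:
            if event == "end" and local_name(elem.tag) == "page":
                count += 1
                root.clear()  # drop finished pages so memory use stays flat
    except ET.ParseError:
        return count, False
    return count, True


if __name__ == "__main__":
    # Hypothetical file name; use the path you would give to WikiIndex.exe.
    pages, complete = count_pages("enwiki-pages-articles.xml")
    print(pages, "pages;", "file is complete" if complete else "file is truncated")
```

This only confirms that the dump is well-formed XML; it says nothing about whether WikiFilter itself can index a dump of that size.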
Installing a Web Server
If you don't already have a web server running on your computer, you can read this article to find out how to install the Apache HTTP server.
Important: If you want to use Apache, you HAVE TO use Apache HTTP server version 2.0.x, since 2.2.x is not compatible with the WikiFilter extension.
After the server is installed, you need to enable the WikiFilter extension for your server. Be sure to restart the server after making the change.
The official WikiFilter installation guide has detailed information on how to do this for Apache and IIS.
Troubleshooting
Start Browsing
Type http://localhost/wiki to start browsing.
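If the page does not load in the browser, a small script can tell you whether the server is answering at all. This is a minimal sketch using only Python's standard library; the function name `check_wiki` is an assumption for illustration, and the default URL is the one given above.

```python
import urllib.error
import urllib.request


def check_wiki(url="http://localhost/wiki", timeout=5):
    """Return (ok, detail) describing whether the URL answered successfully."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return True, f"HTTP {resp.status}"
    except urllib.error.HTTPError as err:
        # The server is running but returned an error status (e.g. 403 or 404).
        return False, f"HTTP {err.code}"
    except (urllib.error.URLError, OSError) as err:
        # Nothing is listening, or the connection failed or timed out.
        return False, f"no response: {err}"


if __name__ == "__main__":
    ok, detail = check_wiki()
    print("OK" if ok else "Problem", "-", detail)
```

An HTTP error status suggests the web server is up but the WikiFilter extension is not handling the request; no response at all suggests the server itself is not running.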
Links
WikiFilter Official Installation Guide
WikiFilter Download
Leave a Comment
Hi,
WikiFilter works great with small XML database dumps, but produces an error while indexing big XML dumps like the most recent English Wikipedia XML dump, which is almost 11 GB.
Look at this screenshot of the folder where I put the XML dump and the WikiFilter program:
http://www.geocities.com/capellan2000/WikiFilter_Indexes.PNG
It seems that the index file is left unfinished, and after this no page from the English Wikipedia loads properly.
This is also mentioned in the SourceForge messages on the project's website.
By the way, I'm using the Microsoft IIS server.
Confirmed. WikiFilter doesn't work with the latest dump file (tested with the dump file on 8/2/2007). There's no sign of a patch or update on SourceForge.
However, there is an alternative, which is to install the static version.
You can get the static version at http://static.wikipedia.org/downloads/.
This method is less elegant because it produces a large number of folders and files on your hard drive. The download is also relatively large because the same HTML code is duplicated in every file.
WikiFilter can still handle the other dumps (for example Wikiquote and Wiktionary); however, it has a problem with the English Wikipedia when writing the categories.
WikiFilter has not been updated for a long time. Hopefully someone will fix it.
Hi,
I wonder if this would be useful if I just want to archive my work on Wikiversity. Say I want to store only my own topics so my students can read them offline.
Thanks for this resourceful site.
Redhuan
As mentioned by Alejandro Tejada, I tried converting the 18.1GB enwiki-20081008-pages-articles.xml, and wikifilter2.3.exe is unable to convert it into the art format properly at all. It outputs 2 or 3 types of unfinished art files.
What's the use of downloading such a huge file when WikiIndex cannot deal with it? Please fix this issue.
Is there any success story?
I set up IIS 7.0 with the ISAPI filter and ran wikifilter.exe to index the enwiki-20091218-pages-articles.xml file, but I keep getting a message that there is no default document such as index.html.
This is what I get:
A default document is not configured for the requested URL, and directory browsing is not enabled on the server.
Has anyone else had this issue?