GetWiki:Overview

GetWiki, wiki and xml

GetWiki was first developed early in 2004 to "get wiki" content from site to site. At the time, MediaWiki appeared to be a more intuitive and user-friendly application than other Wiki-engines, full of features, actively tested and improved by dedicated developers from around the world. However, this does not mean improvements could not have been made to MediaWiki, or cannot still be made - no application is perfect - and having a "team" of developers does not guarantee the quality of software will be high. Since it was used as the Codebase for GetWiki in 2004, MediaWiki has become overly complicated "bloatware", while GetWiki has remained very simple, in keeping with the spirit of "Wiki Wiki".

Through its latest releases, MediaWiki remains a Wiki engine created primarily for Wikipedia, and when used elsewhere its shortcomings and bugs are laid bare. Time spent working with MediaWiki from an outsider's perspective made bug-fixing mandatory, but potential improvements to the interface, security and standards were obvious as well. Thus, GetWiki first emerged as a series of bug-fixes simply to get MediaWiki to actually work as advertised, then development continued, diverging as a "Fork". GetWiki is now an optimized wiki engine bearing no resemblance to its predecessor, and works on a variety of servers and hosting environments. GetWiki 2.0 strongly continues this trend of independence.

That independence was a real problem for the Pseudopedians. Overreaction to the legal XML importing and licensing changes was relentless, and even Jimbo Wales himself threatened to sue (Jul/Aug, 2005), emailing, "I will take legal action if necessary, do you understand that? I don't know why you choose to be so difficult". Being "difficult" meant that GetWiki 1.0 licensing was going to stay as originally released, regardless of their unfounded complaints to the FSF and other threats. The reaction GetWiki set off for the Pseudopedia-faithful was quite the firestorm at the time. The new application is GPL-free, Pedia-free, and will be released under the Creative Commons License.

Features in GetWiki 2.0

The fun in developing software independently is that sensible choices can (theoretically) be made more easily, and in due time, rather than when trying to please a group or committee. Watching how the WikiSphere has evolved (even, devolved) over the years, both toward a staggering juggernaut of misinformation in one Wikipedia, and toward a proliferation of small, free-thinking websites challenging the "Borg" mentality, it is clear that the concept of "wiki" just ain't as cool as it once was (circa 2003-2005). These more independent sites, which at one time might have been wikis, are far more likely to be forum-based discussion sites and blogs. GetWiki is free to stray outside the lines prescribed by WikiMinders, MeatBall and Wikipedia, to become a different form of website interaction.

"Plug" Simplicity

GetWiki 1.0 was adapted from MediaWiki, a jumbled mass of disorganized, undocumented PHP scripts, some 80+ files, with code-block functions calling functions calling functions calling functions across those files. A change in one place often involved chasing down changes to be made in many other places in those files, and the data saved in the database was often duplicated in different places, resulting in a lack of what is called "normalization". Clearly, experience shows that an application developed by committee does not lead to better software. That GetWiki 1.0 (and 1.x) even worked at all on non-Wikimedia servers was surprising, given those confused origins.

Starting over from scratch was necessary, but where to start? The "Plug" application was developed to run in late 2007 and early 2008, and through that experience, it became clear how to "reimagine" GetWiki, and thus, reimagine Wiki. The features developed independently for GetWiki 1.0 and 1.x (see below) were adapted to the new 2.0 codebase, all written from scratch. From the original "GetWiki" feature (XML importing), to the new classifications and the difference engine, to the social networking and discussion forums, even the basic parsing functions and many other new features, it all was replaced with five (5) PHP files, along with JavaScript, CSS and SQL files in support, all with clear comments about what is happening. The files are tightly organized, too, and the new database schema is normalized, so that content information is stored in only one place throughout the database. An upgrade procedure is built in, as well.

Categorical Facets

In GetWiki 1.0, there was no way to formally classify articles, as the feature was not part of the original codebase used, and more importantly, the feature is misused and overused on Wikipedia and the other Pseudopedias. "Categories" were avoided in GetWiki as the problems associated with them became apparent: The proliferation of categories and schemes, to the point of having one category for every page (which seemed the goal of Pseudopedians); the reliance on putting the category code into the article itself, which could cause problems and confusions in editing and page histories; and the need to create categories of categories, and so on, in order to offer some manner of navigation.

In GetWiki 2.0, categories are handled in the proper manner. First, they are separate designations from the actual source text of the page, so we can change one without affecting the other, either on purpose or accident. This also allows the possible restriction of changing categories to site administrators, or "sysops", but otherwise solves several technical problems. The categories are also displayed on each article, as a form of "breadcrumb" navigation (up to three levels), all linking to a central index function which works as a database should. This encourages categories to be broad and easily understood, instead of becoming so numerous they would need their own classification.
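The "breadcrumb" display can be pictured with a small sketch. It is only an illustration, not GetWiki's actual code: the parent/child category table and the names `PARENTS` and `breadcrumb` are assumptions made for the example, and only the three-level limit comes from the description above.

```python
# Hypothetical category table: each category names its parent (None = top level).
# The category names other than "philosophical studies" are invented for the example.
PARENTS = {
    "philosophical studies": "humanities",
    "humanities": "knowledge",
    "knowledge": "everything",
    "everything": None,
}

def breadcrumb(category, parents, max_levels=3):
    """Walk upward from a category and return at most max_levels entries,
    ordered from the broadest category shown down to the page's own."""
    trail = []
    while category is not None and len(trail) < max_levels:
        trail.append(category)
        category = parents.get(category)
    return list(reversed(trail))

print(" > ".join(breadcrumb("philosophical studies", PARENTS)))
```

Because the category lives in its own table rather than in the page text, the trail can change without creating a new revision of the article.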

Furthermore, in GetWiki, Faceted Classification is also used, which is superior in some ways to Categorical, or Hierarchical Classification. "Facets" are added fields to the data surrounding the pages, but not the actual text source of the page. They create "facets" of index information about the subject, type, origin and author of an article. These facets are used, along with the categories, to create a multi-dimensional index, which can be used intuitively to narrow the number of articles matching index criteria. For example, one can find out the number of Philosophy articles last edited by Proteus. One could also find all the articles forked in Philosophy, or all the original articles on Systems Theory.

So, for a particular page, we have this multi-dimensional schema:
  • Title (Dynamism)
  • Category ("philosophical studies")
  • Subject ("complexity")
  • Type ("wiki")
  • Origin ("forked")
  • Author ("proteus", or the last editor)
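As a rough illustration of how such a faceted index narrows results, here is a minimal sketch. The `Page` record and the `narrow` function are hypothetical names for the example, and the sample pages other than "Dynamism" are invented; the real index is a database function, not an in-memory list.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Page:
    # One field per facet in the schema listed above.
    title: str
    category: str
    subject: str
    type: str
    origin: str
    author: str

def narrow(pages, **criteria):
    """Return the pages whose facets match every given criterion."""
    return [
        page for page in pages
        if all(getattr(page, facet) == value for facet, value in criteria.items())
    ]

# Sample data: only the "Dynamism" row comes from the text; the rest is made up.
PAGES = [
    Page("Dynamism", "philosophical studies", "complexity", "wiki", "forked", "proteus"),
    Page("Systems Theory", "philosophical studies", "complexity", "wiki", "original", "proteus"),
    Page("Site News", "site", "news", "feed", "original", "admin"),
]

forked_in_complexity = narrow(PAGES, subject="complexity", origin="forked")
edited_by_proteus = narrow(PAGES, author="proteus")
```

Each added criterion cuts the result along one more dimension, which is the point of facets: no category-of-categories is needed to ask a combined question.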

Social Networking your Blog

A major change in GetWiki 2.0 is the jettisoning of nearly all the assumptions about wikis based on the "wiki way". The basic idea that a webpage could be edited by anyone in practice means that every "user" must be policed under suspicion, which is ironically in direct conflict with the stated goals of "wiki". On a large site, this Wiki Way directly produces Groupthink and troubling invasions of privacy, as seen on The Pseudopedia. On small sites, following such practices generates at worst, a Police State, and at best, a lot of work for what might be one person running the site. Wiki was supposed to be about the content. In the new world of websites, we've learned it's more important to build trust between members from a positive place, rather than aspersions.

GetWiki introduces social networking to the WikiSphere. An article need not be a publicly editable document in order to be accurate and factual. In fact, we've seen ample proof that public editing does not lead to better articles, and certainly not just because a wiki claims to be an "encyclopedia". On a Blog, by comparison, we do not assume that we should be able to randomly edit the posts made by other bloggers. We read the content and make comments at the bottom. The blogger can correct inaccuracies, if any, and the discussion tends to stay on the topic. Back in WakiWorld, "pedians" assume that they should be able to edit anything, even another member's bio page, and they make pseudo-comments on a separate page, a page which has to be saved separately with every change, and always, they cast aspersions on the authors rather than focus on the given topic.

GetWiki now combines the best of the two approaches, adding the networking component to build trust. On GetWiki, the member who starts an article is like a "blogger", and others can make threaded comments at the bottom of the blog page. The blogger and their immediate "GetWiker" friends, and of course, the site's editors/admins, can edit the page. The versions of the page are saved, so that transparency is preserved, but if a member is not in that "network", they cannot edit or classify (or indeed, vandalize) that page. This fits with normal expectations of how websites operate.
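The editing rule described here can be sketched in a few lines. This is an assumed model, not GetWiki's implementation: the function name `can_edit`, the `friends` mapping and the sample member names are all hypothetical.

```python
def can_edit(member, page_author, friends, editors):
    """Decide whether a member may edit (or classify) a page.

    friends maps each member to the set of their immediate "GetWiker" friends;
    editors is the set of site editors/admins. Both structures are assumptions.
    """
    if member == page_author:
        return True          # the "blogger" who started the article
    if member in editors:
        return True          # site editors/admins can always edit
    return member in friends.get(page_author, set())

# Invented sample network.
FRIENDS = {"proteus": {"athena"}}
EDITORS = {"sysop"}

print(can_edit("athena", "proteus", FRIENDS, EDITORS))
print(can_edit("stranger", "proteus", FRIENDS, EDITORS))
```

Anyone outside the network can still read the page and comment on it; only editing and classifying are limited.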

Discussion is better than Talk

In the WakiWorld, we have to go to separate pages, learn about mysterious namespaces, to discuss an article, and it's called "talk", which makes it sound superfluous. Once there, the comments are usually about the people, rather than the topic. We have to format the page directly in order to show "threads", and more often than not, such formatting isn't followed by everyone in the same way, and so it becomes difficult to figure out who is talking to who. From the world of forum websites, we can only laugh at such practices, which get in the way of true discussion.

GetWiki 2.0 adds forum-style discussion boards to the mix. The boards have topics with comments, and are displayed in a threaded manner, a "tree" of comments, which makes it visually obvious who is talking to who. Only one version of the post or comment is saved, and only a moderator can edit the content. The same system is used for the comments at the bottom of articles. A similar approach has been taken to provide true private messaging between members. So, with GetWiki, members can comment on articles, get into discussions in the forums, and send messages to each other, even if they could not otherwise create or edit wiki articles, and while the display of the message has all the features of wiki-style markup, from links to tables and lists, the discussion carries none of the problems.
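To show why a stored "tree" of comments makes threading automatic, here is a minimal sketch. The tuple layout `(id, parent_id, author, text)` and the function `render_thread` are assumptions for the example, not GetWiki's schema.

```python
def render_thread(comments):
    """Render comments as an indented tree.

    comments is a list of (id, parent_id, author, text) tuples, where
    parent_id is None for a top-level comment. Returns a list of lines.
    """
    children = {}
    for comment_id, parent_id, author, text in comments:
        children.setdefault(parent_id, []).append((comment_id, author, text))

    lines = []

    def walk(parent_id, depth):
        # Replies are indented one step deeper than the comment they answer.
        for comment_id, author, text in children.get(parent_id, []):
            lines.append("  " * depth + f"{author}: {text}")
            walk(comment_id, depth + 1)

    walk(None, 0)
    return lines

# Invented sample discussion.
COMMENTS = [
    (1, None, "ann", "First point"),
    (2, 1, "bob", "Reply to ann"),
    (3, None, "cy", "Second point"),
    (4, 2, "ann", "Reply to bob"),
]

print("\n".join(render_thread(COMMENTS)))
```

Since each comment records what it replies to, nobody has to format indentation by hand, and the display stays consistent no matter who posts.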

Features in GetWiki 1.0

Like most software, the development of GetWiki came about as much by accident and inspiration, along with a little desperation, as by any plan. After seeing the massive job it was to manually copy and save selected articles from Pseudopedia into another Wiki, the initial impetus was clear enough: create a simple method by which we could save only the desired articles from the remote Wiki, instead of loading an entire database "snapshot" or continuing to copy manually.

XML Importing

This has been perhaps the most important feature of GetWiki, one which is incredibly simple, and controversial. The basic idea to use XML came about as a result of ongoing research on the technology after working it into projects for rimric interactive, but XML export had only recently been added to MediaWiki at that point. After this, it was easy enough to develop an extension to MediaWiki, "GetWiki", in order to import the XML content from another MediaWiki-based Wiki.

It was truly amazing how much knee-jerk criticism this feature received from Wikipedians, when the entire content of the Pseudopedia has always been freely available under the GFDL as well as provided in the form of regular database "dumps" for anyone to download, use, mirror, edit or fork. Importing article-by-article is no different from this in principle, and has the added benefits of:
  • selecting articles with merits important to a particular Wiki
  • distributing server load, both electively and selectively
  • dividing single data snapshots into dynamic individual article histories
  • accentuating user interaction in creating a Wiki
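The article-by-article import can be pictured with a short sketch. It is not the GetWiki importer: the sample document below is a simplified, namespace-free imitation of a wiki XML export, and `import_pages` and the sample values are assumptions for the example.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for an exported page; a real export carries more fields.
SAMPLE_EXPORT = """<mediawiki>
  <page>
    <title>Dynamism</title>
    <revision>
      <timestamp>2004-03-01T12:00:00Z</timestamp>
      <contributor><username>proteus</username></contributor>
      <text>First draft.</text>
    </revision>
    <revision>
      <timestamp>2004-03-02T09:30:00Z</timestamp>
      <contributor><username>proteus</username></contributor>
      <text>Second draft.</text>
    </revision>
  </page>
</mediawiki>"""

def import_pages(xml_text):
    """Parse an export document into {title: [revision, ...]}, keeping
    each article's individual history rather than one flat snapshot."""
    root = ET.fromstring(xml_text)
    pages = {}
    for page in root.iter("page"):
        title = page.findtext("title")
        pages[title] = [
            {
                "timestamp": rev.findtext("timestamp"),
                "author": rev.findtext("contributor/username"),
                "text": rev.findtext("text"),
            }
            for rev in page.findall("revision")
        ]
    return pages

history = import_pages(SAMPLE_EXPORT)
```

Only the requested article crosses the wire, and its revisions arrive as a history that the importing wiki can continue.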

XHTML/CSS Standards

After work on the XML import utility, it was clear the HTML output of MediaWiki used more than one standard (convention). The output, written by many developers, used several now deprecated standards to output HTML code to the browser, and given the long preferred focus on XML standards and XHTML/CSS design, this would not do. Somewhere, in this process of laboriously going through each PHP file to update all of the various HTMLs to clean XHTML (per the W3C markup standards), the MediaWiki 1.1.0 being used as a code base was forked, because at this point, the changes could not easily be merged back into a later MediaWiki release. These changes seem to have caused a flurry of activity by the MediaWiki developers in trying to update all those mixed "td"s and "TD"s, unquoted attributes, unclosed elements, and other remnants. The latest versions of MediaWiki have, as a result, come a long way toward XHTML standards.

Yet, realistically, given the nature of how a Wiki works, it is not likely a WikiWeb will be 100% valid XHTML for long. Users are free to enter valid XHTML, invalid HTML, or any combination of code, some of which cannot be corrected at output time, because the authors' intentions cannot be anticipated in all cases - especially on a large Wiki. Also, as seen more recently in building code to translate TeX formulas, sometimes elements will be nested improperly, either breaking validation or affecting the display of the article. This by no means implies the goal was not a good one, though.

GUI Enhancements

In the course of using MediaWiki and GetWiki, and moving between them, user interface issues began to emerge. Right off, the colours used in MediaWiki began to strain the eyes, and so, why not make a small change here and there? Eventually, enough changes were made so that whole pages, such as the user preferences page and others, were completely reorganized and prettied up. This was not done merely out of creative angst, but from a desire to make the interfaces of GetWiki as easy to understand and intuitive to use as possible, especially for new users.

All Wikis strive, or should strive, for simplicity, but few seem clear for new users. The visual design of important pages, such as a "Main Page", is usually not given enough care in order to effectively introduce just what a wiki is to a newcomer or prospective contributor. Changes in GetWiki toward this have been at most subtle, because the basic MediaWiki design is a good one, but outlining check boxes, carefully selecting font colours, and other improvements can at least make the functions clearer. This is an area which will see continued improvements in 2.0.

Security Hardening

One problem, which has grown out of proportion with Wikipedia's actual importance, has been the security of its content. Any casual glance at the recent changes and mailing list discussions reveals an extraordinary effort spent on correcting vandalized content, discovering/claiming dummy user accounts, called "sock puppets", and in general, over-monitoring and micro-managing the users of the Wikipedias. As GetWiki developed, and due to more sensible policy on GetWiki, better measures needed to be put in place, rather than relying on a bunch of sysops to revert edits and discuss procedure. Many Wikis do not have the benefit of so many administrative users, and so cannot rely on the "eyeball" method of content security.

So, blocking non-account edits by default does wonders for preventing casual vandalism, by far the largest category. Using a previously blocked IP address or address range to block the generation of new accounts is another big step, and so is blocking any edits or uploads from blocked IP addresses or ranges, even after login from a non-blocked account. Allowing the "admin" user, who can promote/demote other users, to view the last used IP address-host-range and email information for each account is a big help. The result? There is now virtually no vandalism on a GetWiki site, at least compared to what happens daily on Wikipedia, and no need for teams of overzealous and biased sysops to patrol the site and try to "out" the "sock puppets".
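These rules can be sketched as a few small checks. This is an illustration under stated assumptions, not GetWiki's code: the function names and the blocked ranges (documentation-only addresses) are invented, and only the three rules themselves come from the text above.

```python
import ipaddress

# Hypothetical block list: one address range and one single address.
BLOCKED_RANGES = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.7/32"),
]

def ip_blocked(ip):
    """True if the address falls inside any blocked address or range."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in BLOCKED_RANGES)

def may_register(ip):
    """A blocked address or range cannot generate new accounts."""
    return not ip_blocked(ip)

def may_edit(ip, logged_in):
    """Non-account edits are refused by default, and a blocked address
    stays blocked even after login from a non-blocked account."""
    if not logged_in:
        return False
    return not ip_blocked(ip)

print(may_edit("192.0.2.10", logged_in=True))
print(may_edit("203.0.113.5", logged_in=True))
```

The checks run at request time, so no sysop has to notice and revert anything after the fact.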

Removing Biases

Looking through the MediaWiki code also reveals a high number of Pseudopedian biases which are inappropriate for other types of sites. From the use of certain server dependencies, to the names of variables, to the assumption of namespace mechanics, to the aforementioned security drawbacks based on groupthink culture, MediaWiki is heavily biased toward the way Wikipedians want to work. GetWiki has slowly removed many, and eventually all, of these biases, favouring settings which are easily customizable for the wiki in question.

One example: MediaWiki requires two namespaces on your wiki, MediaWiki (for messages) and Wikipedia (for many default settings), whereas GetWiki removes the Wikipedia namespace and makes the MediaWiki namespace optional. (There is an additional namespace placeholder corresponding to your export wiki setting, be it Wikipedia, Wikinfo, or any GetWiki 1.0+ or MediaWiki 1.1.0+ wiki.) More and more small bits of code have been translated, removed, added or otherwise modified to make GetWiki a generalized wiki engine, rather than a Wikipedian's wiki engine. Of course, there is nothing wrong with running a Pseudopedian's engine, but for the growing number of wikis which strive to distance themselves from Wikipedia, or in general maintain independence, using Wikipedia's software may seem limiting.

[ last updated: 2:01pm EDT - Wed, Aug 13 2008 ]
M.R.M. Parrott