Monday, January 6, 2014

Git and MediaWiki

I'm amazed again and again by the modular architecture of Git, the distributed version control system. One feature that I've just discovered is its MediaWiki integration. In short, you can clone the contents of a wiki into a local Git repository, pull, make changes to the wikitext, commit them, and even push them back to the wiki. Amazingly, all of this works without any special server-side support; the normal MediaWiki API is sufficient.
If you want to try it, you'll need dev-vcs/git-1.8.5.2 or later with the mediawiki USE flag enabled. (Because of missing dependencies, the USE flag is currently masked on all arches except amd64, but that will hopefully change soon.) The documentation for the module can be found on GitHub.
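Before cloning, it can be worth checking that the MediaWiki support was actually installed. Git dispatches "mediawiki::" URLs to a remote helper named git-remote-mediawiki, so a quick check might look like the sketch below. It assumes the helper sits either in Git's exec path or somewhere on PATH; the variable names are just for illustration.

```shell
# Look for the remote helper that handles "mediawiki::" URLs.
# Assumption: it is installed in Git's exec path or somewhere on PATH.
exec_path=$(git --exec-path 2>/dev/null)
if [ -x "$exec_path/git-remote-mediawiki" ] || command -v git-remote-mediawiki >/dev/null 2>&1; then
    helper_status="found"
else
    helper_status="missing"
fi
echo "git-remote-mediawiki: $helper_status"
```

If this reports "missing", the clone will most likely fail because Git cannot find a handler for the mediawiki transport.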
Here's a small demonstration. Since cloning a whole wiki is time-consuming and also (if done repeatedly) not very nice towards the wiki operators, let's clone only the pages of the "Desktop" category on the Gentoo wiki:
git clone -c remote.origin.categories='Desktop' \
       mediawiki::http://wiki.gentoo.org
This creates a directory wiki.gentoo.org. We change into it, run "git log --stat", and obtain the version history:
commit d10a150c78e743d1e62c39f5a7feadf81c552e28
Author: Tox2ik <Tox2ik@wiki.gentoo.org>
Date:   Wed Jan 1 14:52:24 2014 +0000

    Broke up a long sentence.

 Fontconfig.mw | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

commit 1da83052b8062239e265458c538b2efb32ca25a8
Author: El Salmon <El Salmon@wiki.gentoo.org>
Date:   Tue Dec 3 19:05:37 2013 +0000

    Checking configuration

 Fontconfig.mw | 4 ++++
 1 file changed, 4 insertions(+)

commit 9583e17c82508d166d9a30733765d3bdfd9d7f3e
Author: Emery <Emery@wiki.gentoo.org>
Date:   Fri Oct 25 04:53:35 2013 +0000

    Tense changes (changes to tense).

 Tallscreen_Monitor.mw | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

...
The files in the Git working copy are raw wikitext files, and can easily be edited by hand. So, we edit a file and commit that change locally into Git:
commit 6af2bfa0bba4cf825d4e52242709ffccfe341223
Author: Andreas K. Huettel (dilfridge) <dilfridge@gentoo.org>
Date:   Sat Jan 4 23:47:55 2014 +0100

    Add klayout to the scientific applications

 Recommended_applications.mw | 1 +
 1 file changed, 1 insertion(+)
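The edit-and-commit step is plain Git. Here is a minimal sketch that runs in a throwaway repository, so it needs neither the wiki nor a clone of it. The scratch directory, the user identity, and the page content are made-up stand-ins; in a real clone only the last few commands would be needed.

```shell
# Stand-in for the cloned wiki: a scratch repository with one wikitext page.
# (Hypothetical content; a real clone already contains the .mw files.)
scratch=$(mktemp -d)
cd "$scratch"
git init -q .
git config user.name "Example User"
git config user.email "user@example.org"
printf '* Existing entry\n' > Recommended_applications.mw
git add Recommended_applications.mw
git commit -q -m "Stand-in for the imported wiki history"

# The actual workflow: edit the wikitext, then stage and commit as usual.
printf '* klayout\n' >> Recommended_applications.mw
git add Recommended_applications.mw
git commit -q -m "Add klayout to the scientific applications"

# Show the new commit in the same form as the log excerpts above.
commit_count=$(git rev-list --count HEAD)
git --no-pager log --stat -1
```

Nothing here talks to the wiki; the change stays local until it is pushed.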
To be able to push the change back into the wiki, we need to log in there. So, we tell Git our wiki username:
git config remote.origin.mwLogin Dilfridge
Afterwards, a simple "git push" is enough to publish our change; Git will ask for the wiki password.
While this is definitely a cool feature, it has some limitations. The most obvious one that I have come across is the following: if different people clone from the wiki, they obtain different Git histories, i.e. the commits in their histories have different hashes. This means that sharing changesets via Git is only possible if the cloning from MediaWiki takes place only once, and the generated Git repository is then cloned in turn. In addition, I suspect that the author of a change in the wiki is determined by the username used when pushing the change, not by the author of the Git commit.
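The "import once, then share the result" workaround can be sketched without a wiki. In the example below, the wiki-import repository is a hypothetical stand-in for the single clone made from MediaWiki; all names and contents are made up. Cloning it with plain Git copies the commits unchanged, so both repositories agree on the hashes.

```shell
work=$(mktemp -d)
cd "$work"

# Stand-in for the one repository that was imported from the wiki.
git init -q wiki-import
git -C wiki-import config user.name "Example User"
git -C wiki-import config user.email "user@example.org"
printf 'Example page text\n' > wiki-import/Example.mw
git -C wiki-import add Example.mw
git -C wiki-import commit -q -m "Stand-in for imported wiki history"

# Everyone else clones this repository instead of the wiki itself.
git clone -q wiki-import colleague-copy

# Both repositories now report the same commit hash.
import_head=$(git -C wiki-import rev-parse HEAD)
copy_head=$(git -C colleague-copy rev-parse HEAD)
echo "import: $import_head"
echo "copy:   $copy_head"
```

Note that the copy's origin is the intermediate Git repository, not the wiki, so changes would have to travel through that repository.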
Anyway, I think this might be useful.

5 comments:

  1. maybe useful for mirroring? seems a bit heavy handed of a solution.

  2. I am getting "Warning: Error 3 from mediawiki: editconflict: Edit conflict detected." when trying to push the first small change. Nobody editing the wiki at that time or something like that. Do you know if there are any current issues with this in ~amd64 gentoo? My wiki works without authentication, does git-mw *need* that?
    Greets, Stefan

    Replies
    1. For any with the same problem: make sure you did
      git config remote.origin.mwLogin
      That solved the problem for me.

  3. Running Git 2.4.5 on Ubuntu 14.04, I got

    git clone -c remote.origin.categories='Desktop' mediawiki::http://wiki.gentoo.org
    Cloning into 'wiki.gentoo.org'...
    Searching revisions...
    No previous mediawiki revision found, fetching from beginning.
    Fetching & writing export data by pages...
    Listing pages on remote wiki...
    2: 301 Moved Permanently : error occurred when accessing http://wiki.gentoo.org/api.php after 1 attempt(s)
    Checking connectivity... fatal: bad object 0000000000000000000000000000000000000000
    fatal: remote did not send all necessary objects

    Replies
    1. It works if you use https instead of http in the wiki URL. Probably some configuration was changed locally to auto-redirect to https, and git doesn't understand that.
