
WIKI Software - Key Features


Pete

Hi folks

Copying this topic from another website where I host a WIKI to get more feedback.

WIKIs are confusing to me. They're usually listed as their own software package, however all they appear to be (to me at least) is a content management system with the following features:

  • Page history
  • Easy linking to other pages (links also don't break when pages are renamed??)
  • Page comments with threading (indentation)
  • Contributions from all levels of user (if desired)
  • An editor - WYSIWYG and/or [[tag]] based
  • Spam filter of some form
  • ...?
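On the "links don't break when pages are renamed" point, one way this could work is to store links by page ID rather than by title and only resolve the title when the page is displayed. A minimal sketch, assuming an in-memory store; the `Wiki` class and the `[[#id]]` link syntax are made up for illustration and are not how MediaWiki or ProcessWire actually do it:

```python
import re


class Wiki:
    """Tiny in-memory page store where links point at page IDs, not titles."""

    def __init__(self):
        self._titles = {}  # page id -> current title
        self._next_id = 1

    def create(self, title):
        page_id = self._next_id
        self._next_id += 1
        self._titles[page_id] = title
        return page_id

    def rename(self, page_id, new_title):
        # Only the title changes; stored links keep pointing at the same ID.
        self._titles[page_id] = new_title

    def link_to(self, page_id):
        # Stored form of a link (hypothetical syntax): just the ID.
        return f"[[#{page_id}]]"

    def render(self, text):
        # Swap each stored ID link for the page's current title at display time.
        return re.sub(
            r"\[\[#(\d+)\]\]",
            lambda m: self._titles[int(m.group(1))],
            text,
        )
```

Because the body text never contains the title, a rename needs no search-and-replace across other pages.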

I'm interested in learning what other features make a WIKI software different to a normal content management system. It occurs to me that there might be a gap in the market for something more user/admin-friendly and easier to upgrade.

This comes from MediaWiki being a total nightmare to upgrade - they now have an upgrade script (after how many years?) but features get changed or dropped all over the show and you're always looking for extensions to do what should be core pieces of functionality (spam filtering etc) and before long you end up with an installation and a dozen extensions to juggle every time it needs upgrading. There has to be a better way, and I believe we might be 80% there with a basic ProcessWire installation.

I'm not suggesting I would attempt this myself or any time soon if I did, but unless I'm missing something all WIKI software seems to follow the same general pattern.

From a data point of view, the bit where they tend to run into trouble is that all pages seem to be under the root with no categorisation, which leads to things like duplicate page name issues, plus not the best queries on the database behind the scenes (at a guess). Since everything in a WIKI can relate to everything else, tagging might be better than structured categorisation, as those who use PW's Page fieldtype are probably already aware :)

Also, the caching options in MediaWiki at least aren't as good as they should be - something ProCache would help with (could have everything static and load the user bar via AJAX or something).

I didn't intend to start off sounding like I was moaning about WIKI software, but I guess I am :)

If anyone involved in a WIKI would like to fill in some feature gaps from my list above that would be much appreciated.
 

To recap, because I do go on a bit, I would like to know:

  • What features I missed from the first list   
  • What features you would like to see in WIKI software

I feel like all of this is achievable in ProcessWire with some modules to handle page history (nearly there already I see), users, threaded page discussion etc. and it's all just a matter of some code to come up with something better.

This is just me thinking out loud for now as I've got way too much work on at the moment, but it's one of those things where I've been thinking for a while "surely it wouldn't be that hard to do in ProcessWire" - a WIKI is just a content management system after all.

Link to comment
Share on other sites

Ah, you have touched on a complete bug-bear with me!

I absolutely hate MediaWiki and its associated programming ethos - it alienates the very people who need to use it: highly intelligent, knowledgeable and very busy people who do not want to learn something new!

In practice, a wiki is basically just a knowledge base and as such can fall into two categories:

Extended FAQ / User Guide

Outside of Wikipedia, this is how it is most used - the PW Wiki is a prime example. 

This type of system works best in a strict hierarchical way - so, one person puts together the structure and everybody else adds content. The hierarchy could (and should) be defined in many ways - categories, book content (menu structure), index, tags and so on. So, a true user manual.

In this case, it is also best that contributors are restricted to how information is added and prompted for certain types of information.

I like the idea of blocks of information here. The most basic block, for example, would be title and any categorisation information. The next block would be a summary. Another might be a Question or Proposition. Blocks would be available for body content, sidebar information, cross linking, graphs, financial information, citations, news links and so on. It would be up to the administrator to decide which blocks are available for users (or groups of users, perhaps - see editorial below).

This sort of prompting and division of information means that consistency in content can be imposed - it is easy to make things LOOK the same, but making things READ the same is harder, especially if you give people a free for all system!

Encyclopedia

Wikipedia is the obvious example here.

The point of an encyclopaedia is that it is a reviewed and edited collection of authoritative articles written in the style of a named contributor. This means that the publishing system has to be very accessible to the contributor - whether that is in a free-for-all system or a traditional editorial system.

A member of my family who is a retired, noted professor, pointed out that he would contribute to Wikipedia if he could somehow just upload a Word document. But he has absolutely no interest in learning wiki markup and he finds the new (easier) system also confusing. He is a scientist too.

He has a point. If you want a system where anyone can join in, then you should offer them a range of ways to contribute - the publishing format should not be their problem, but the publisher's.

So, no wiki markup, intuitive toolbars with nice big logical buttons and the ability to upload and convert source material (even if some fiddling needs to be done following import).

Again, additional blocks are useful - not just as tools but as prompts.

Editorial

This is one thing that makes a system much more useful to a wider range of publishers, as it were.

One of the big differences between a CMS (as we know it) and a news system is the editorial hierarchy. Years ago I wrote my own CMS using Dreamweaver and the addon mysql tools. The editorial system allowed for:

  • Full editorial hierarchy including section editors, photo librarians, managing editors, staff authors, freelance authors etc
  • Article control - Authors wrote articles, then submitted them to an editor. The editor reviewed and could publish, archive or send back to the author for amendments. (Editors could also set an article to be reviewed by another editor if they needed help)
  • Shared Authorship - an article could be written collaboratively or by just one author.
  • Library system - research notes, images, contacts and so on could be put in a central library and access controlled
  • History trail - everything from who reviewed what and when
  • Legal - additional layer for legal review. Basically, if an editor marked an article for legal review, no one could publish until released by the legal team.
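The article control and legal points in the list above amount to a small state machine with role checks. A rough sketch of that idea, assuming the state names, action names and roles below (they are invented for illustration, not taken from the original system):

```python
from enum import Enum


class State(Enum):
    DRAFT = "draft"
    SUBMITTED = "submitted"
    LEGAL_REVIEW = "legal_review"
    PUBLISHED = "published"
    ARCHIVED = "archived"


# Allowed moves: (current state, action) -> next state
TRANSITIONS = {
    (State.DRAFT, "submit"): State.SUBMITTED,
    (State.SUBMITTED, "send_back"): State.DRAFT,
    (State.SUBMITTED, "publish"): State.PUBLISHED,
    (State.SUBMITTED, "archive"): State.ARCHIVED,
    (State.SUBMITTED, "flag_legal"): State.LEGAL_REVIEW,
    # Only a release puts an article back in the editor's hands.
    (State.LEGAL_REVIEW, "release"): State.SUBMITTED,
}

# Which roles may perform which action
PERMISSIONS = {
    "submit": {"author"},
    "send_back": {"editor"},
    "publish": {"editor"},
    "archive": {"editor"},
    "flag_legal": {"editor"},
    "release": {"legal"},
}


def transition(state, action, role):
    """Return the next state, or raise if the role or the move is not allowed."""
    if role not in PERMISSIONS.get(action, set()):
        raise PermissionError(f"{role} may not {action}")
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"cannot {action} from {state.value}") from None
```

Since there is no `publish` entry for `LEGAL_REVIEW`, nobody can publish a flagged article until the legal role releases it, which is the behaviour described in the last bullet.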

So, all the people and systems you would expect in a newspaper office or publishing house applied to a web system.

This might seem OTT for a wiki, but it is a logical extension and having the foundations of such a system built in at the bottom means it can be extended easily later.

Output

This is, or could be, where PW scores most highly. It strikes me that the output should be in small, simple blocks that means that someone can throw away the supplied template(s) and start from scratch using a very basic API - probably a set of function includes for things like lists, and then just the normal PW bits for most of the rest.

Link to comment
Share on other sites

Something that strikes me is that there are a number of WIKI contributors who like writing in Markdown/Textile/BBCode etc. and actually dislike WYSIWYG editors, sometimes even preferring HTML (WYSIWYG editors get a bit clumsy with tables, for example). So a solution should either force everyone to work one way, or let users choose their favourite method of editing, provided everything is stored in the DB as HTML and each editor can convert on the fly to the user's preferred editing style. That said, I would prefer to say use this or this - ONE markup-type editing system and ONE HTML editor rather than have too many things to keep track of, given their potential differences.

Uploading a Word document is a nice idea as you say, but parsing all the different versions Microsoft has produced over the years could be a headache (they changed it to an XML structure and have their own horrifying version of HTML as well with so much superfluous markup it's unreal!). Either way an editor would almost certainly need to tweak the contents afterwards (but I guess that's part of the job?).

Link to comment
Share on other sites

The parsing is a big issue.

I noticed in the Liferay wiki that they have either Creole markup or HTML with a WYSIWYG editor - however, they are not properly compatible with each other, so when you choose one system you cannot leap over halfway through the article.

Here is a mad idea.

Every field is actually two or more fields - one has the text in it and the other the markup.

So, you decide to write a paragraph with wiki markup. The text is entered into what looks like a text area. You tell it you are using wiki markup and then go ahead. In reality, two versions are stored - one just the text and the other the markup, or text plus markup perhaps.

If you then decided you wanted to do it html/wysiwyg - you can tell it that. It converts and creates a new markup field, this one with html in it. You then edit and save. Any text editing is saved again in the non-markup field and any mark up is stored in the TWO markup fields - html and wikimarkup.

OR - Any markup is stored as something like XML and then converted into whatever markup for the desired UI in memory on the fly. Possibly better that, if you have a chunky enough server.

This way you can allow people to edit in as many ways as you create converters/interfaces. The markup always ends up usable and the original text is always clean and intact. The user is only presented with one interface at a time, of course.
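To make the "store it neutrally, convert on the fly" variant a bit more concrete, here is a very small sketch. It assumes a paragraph is stored as a list of (text, style) spans instead of XML, and it only knows bold and italic; the function names are made up, and the wiki output uses MediaWiki-style quote marks:

```python
from html import escape

# Neutral stored form: a paragraph is a list of (text, style) spans.
# style is None, "bold" or "italic" (a deliberately tiny vocabulary).
paragraph = [
    ("Wikis are ", None),
    ("just", "italic"),
    (" a ", None),
    ("CMS", "bold"),
    (".", None),
]

HTML_TAGS = {"bold": "strong", "italic": "em"}
WIKI_MARKS = {"bold": "'''", "italic": "''"}


def to_plain(spans):
    # The clean text, readable without any markup knowledge.
    return "".join(text for text, _ in spans)


def to_html(spans):
    out = []
    for text, style in spans:
        safe = escape(text)
        tag = HTML_TAGS.get(style)
        out.append(f"<{tag}>{safe}</{tag}>" if tag else safe)
    return "".join(out)


def to_wiki(spans):
    out = []
    for text, style in spans:
        mark = WIKI_MARKS.get(style, "")
        out.append(f"{mark}{text}{mark}")
    return "".join(out)
```

Each editor interface would need its own parser going the other way (markup back into spans), which is where the real work would be; this only shows the output half.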

I quite like the idea of messing around with repeat fields for content.

So, you would write your article in blocks (paragraphs) - each would have a title field (optional), the text area defaulted to whatever your preferred editor interface is and an image upload.  You would have to mess with the user interface to make it look continuous rather than like a blocked form.

This is my thing about data integrity. I like the idea that somewhere in  the system the text is sitting there in as clean a form as possible and does not require the system to be working to view it and understand it. So in 500 years time, it still makes some sort of sense.

I agree about Word, but I think you would stick to modern versions - there are limits!

(Edit - this is a bit like using Lightroom where all your photo edits are stored in a separate file and applied when viewing)

Link to comment
Share on other sites

Do you see this as a profile or a module?

One thing about profiles which is slightly awkward is if you want their functionality added onto your existing installation.

Whereas a module could do exactly that. Though with something that will add fields and templates, there have to be some basic questions asked during installation:

1. Do you want this at the root of the site or under a parent? If the latter, name an existing page or one you would like created.

2. Are you using any of these field names? wiki_body, wiki_summary .... This is just a check, since if we are dealing with clever markup situations, then the fields used will probably be pretty specialised.

Also, it would need some clever image management system, since if wikis are about sharing information across the system, then that has got to include images and other media.

I did at some point come up with the idea of a clever image system, but can't remember what I did with it! It was using pages, but the idea was that each image had a record of which article it has been used by rather than being kept in an article related directory ... something like that.
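As far as I can reconstruct the idea, it boils down to a reverse lookup from image to the articles that use it. A minimal sketch, assuming plain names for images and articles; the `ImageLibrary` class is hypothetical:

```python
from collections import defaultdict


class ImageLibrary:
    """Keeps a record of which articles use each image."""

    def __init__(self):
        self._used_by = defaultdict(set)  # image name -> set of article names

    def attach(self, image, article):
        self._used_by[image].add(article)

    def detach(self, image, article):
        self._used_by[image].discard(article)

    def articles_using(self, image):
        return sorted(self._used_by.get(image, ()))

    def orphans(self, all_images):
        # Images that no article refers to any more.
        return sorted(i for i in all_images if not self._used_by.get(i))
```

The same record also answers the housekeeping question of which images are safe to delete.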

Link to comment
Share on other sites

I think it would have to be a profile to begin with as there would be numerous modules required as well as several custom templates. It's not something I would see as easy to add to a site as an afterthought. It is technically possible via the wonders of dependencies to install several modules at once, but the code involved just in getting everything installed, copying templates to the right place and *shudders* making assumptions (like that there are no similarly named templates in the system or templates folder) when copying things to the relevant place put me off that idea - at least in the first version.

Usually though, if I was going to write an installer that takes the above into account, I'd make sure to use prefixes in template names and files to avoid overwrites/conflicts - wiki_ would make sense there.
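The check I have in mind is nothing clever. A sketch, assuming the installer can get a plain list of the template or field names already on the site; `plan_install` is a made-up helper, not a ProcessWire function:

```python
def plan_install(existing_names, wanted_names, prefix="wiki_"):
    """Prefix the names we want to create and report any that already exist."""
    prefixed = [n if n.startswith(prefix) else prefix + n for n in wanted_names]
    conflicts = sorted(set(prefixed) & set(existing_names))
    return prefixed, conflicts
```

An empty conflict list means the installer can go ahead; anything else should stop and ask rather than overwrite.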

The file upload system in Mediawiki is what seems to handle all images and other files. It's simply a form where you select the file, give it a description, select which licensing is applicable etc and upload it, then use the filename to insert it into pages. It's pretty simplistic and would take about ten minutes in ProcessWire to come up with a better version that also categorises files as you upload them - for example you start with a root category, then in the form you specify a parent category (defaults to root) and/or a new category name and on save it checks for these and expands the image tree as necessary. This is another little module idea that means you don't eventually end up with a million images under one folder on the disk. That said, the categorisation you would use here isn't dissimilar to what you would use for the pages on the WIKI, so why not kill two birds with one stone there and use the "tag"/category pages that articles can be tagged against to store the relevant images under?
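The "expand the tree as necessary on save" part could look something like this. It is only a sketch using nested dictionaries in place of real category pages, and `file_into` is a hypothetical name:

```python
def file_into(tree, category_path, filename):
    """File an upload under a category path, creating missing categories."""
    node = tree  # the root category
    for name in category_path:
        # setdefault creates the child category only if it is not there yet.
        node = node.setdefault("children", {}).setdefault(name, {})
    node.setdefault("files", []).append(filename)
    return tree
```

An empty path files the upload under the root category, which matches the "defaults to root" behaviour described above.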

My biggest beef is that many WIKIs seem to end up forming a structure but it is by default an unstructured system. Wikipedia is an encyclopedia, so the first thing you do is use the search, but often WIKI operators (same with the PW WIKI actually) end up structuring a table of contents to navigate through the pages. The ideal is a system that does the latter, but can also be used for the former, so image/file uploads needs some thinking about as in the encyclopedia scenario you would be better leaving images attached to the article rather than trying to store them centrally.

I'm getting a headache again :)

Link to comment
Share on other sites

The more I think about it Joss, the more it's like ProcessWire gets it right already for most of this type of content - you usually upload an image for the page you're using it on, and most of the time it's not used elsewhere, so maybe if your pages are correctly categorised you can just use those categories/tags and a simple image manager module (a variation on this: http://processwire.com/talk/topic/3219-images-manager-alpha/ ) to see what images are where and search them for use in pages on the WIKI.

Link to comment
Share on other sites

I agree with that.

The thing about a wiki is that it is trying to emulate a library (not always very well).

ProcessWire has a base structure based around one idea - the page. That is extremely useful if used for absolutely everything.

Interestingly enough, the one thing it does not use it for automatically (as far as I am aware) is for an image. From my brief look, Soma's module is trying to do that - make an image into another page entry. That makes perfect sense.

If images were always actual page entries (in the same way that repeaters are pages really), then the categorisation system is there by default and PW's native hierarchical system makes the perfect image library.

Really, that is probably the starting point of a wiki built with ProcessWire - Everything entered or added is a submission to the library, whether that is text, a media file, an annotation, a comment, whatever. In our language, everything submitted is a page in our hierarchy.

That is a very powerful starting point.

From there, the next stage is not the interface but is to look at it from the librarian's or curator's point of view. How is each page categorised? How does it fit within a hierarchy, how is it accessible. To a certain extent this is not an internet thing, this already exists in bricks and mortar libraries and is a well tried and tested system.

After that is all properly sorted, THEN you can have fun working out the perfect interface to allow submissions to the library to be made in a very accessible manner that is completely agnostic - the final data being stored in the way the librarian needs rather than how the interface presents the data.

Link to comment
Share on other sites

I think this is teaching me about how a lot of WIKIs should be run but not necessarily how they are run. Many (or most?) start without categorisation - either because they start off small or because there are no options for categorisation - and then continue down that path which just exacerbates the issue.

I sort of disagree with treating everything as a page though purely for reasons of scalability in this case.

With images for example I'm reasonably sure that assumptions can be made as to the fields required (I can see people flinching reading this). My thinking is that images on WIKIs will number larger than the amount of pages themselves, so a new image fieldtype with a few extra fields would be more efficient and scalable there. I think I've listed the main fields with title, description, license (if any) as well as our categorisation idea and that's pretty much it. I would tie them to the category pages themselves as previously mentioned.

That said, it is possible to do it the way you suggest and then everything is its own page obviously, however you will end up eventually with potentially a million folders in your site/assets/files directory. It's been discussed before here on the forums, but at some point there is the suggestion that the file system doesn't like huge numbers of folders in one directory (really depends on the config). That's a wider issue though and I know last time it was discussed ryan had an idea on that front, but since it's not reared its head again I guess it's not been an issue for now, but for something like this it could become one relatively quickly.
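One common way around the "too many folders in one directory" worry is to spread the per-page folders over a couple of levels of subdirectories derived from the page ID. To be clear, this is a generic sketch and an assumption on my part, not how ProcessWire lays out site/assets/files; `sharded_dir` and its parameters are invented:

```python
from pathlib import PurePosixPath


def sharded_dir(page_id, base="site/assets/files", width=2, depth=2):
    """Build a nested folder path from the last digits of a numeric page ID."""
    digits = width * depth
    # Use the trailing digits so consecutive IDs spread across buckets.
    padded = str(page_id).zfill(digits)[-digits:]
    parts = [padded[i:i + width] for i in range(0, digits, width)]
    return str(PurePosixPath(base, *parts, str(page_id)))
```

With the defaults there are 100 x 100 buckets, so a million pages would put roughly a hundred folders in each bottom-level directory instead of a million in one.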

Of course, this method allows you to create any content type you like whereas my suggestion is rigid... I think I see which direction we're going to go in as PW is all about having the choice and as long as it's thought about carefully (and things are cached nicely at template level as a minimum) it shouldn't be a problem.

More thoughts to follow.

Link to comment
Share on other sites

Ah, yes, I was sort of thinking that you did not have a folder per image - this is where my tech knowledge is not good enough. Obviously, the actual behind-the-scenes bit needs to be as efficient as possible, but it would be nice if somehow all information, from the database point of view, were treated in the same way.

The other field(s) you need for images is their metadata, which is easy enough to extract on upload. It makes sense to extract this from any image since it will often contain copyright info, geographical information and so on. Useful stuff, and it can reduce form filling if available. You can also make it a restriction - a very tightly run library would not want images that did not have the required metadata and would abort the upload. It is one of those twists that, though many would not use it, makes for a complete system.
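The restriction could be a simple gate applied after the metadata has been read out of the file. A sketch, assuming the metadata has already been extracted into a dictionary; the required field names are just an example policy and `check_upload` is hypothetical:

```python
REQUIRED_FIELDS = ("copyright", "creator", "date_taken")  # example policy only


class MissingMetadata(Exception):
    """Raised when a strictly run library rejects an upload."""


def check_upload(metadata, required=REQUIRED_FIELDS, strict=True):
    """Return the list of missing fields; in strict mode, abort instead."""
    missing = []
    for name in required:
        value = metadata.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    if missing and strict:
        raise MissingMetadata("upload rejected, missing: " + ", ".join(missing))
    return missing
```

In non-strict mode the returned list could drive the upload form, so the user is only asked for what the file did not already contain.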

Edit:

There is also the ownership route of all information, whether that is text or image. The system should show what article the image has been used for, but also who uploaded it originally and on what date. Again, although not important for many applications, this sort of audit trail is needed for more academic use of a wiki.

Edit 2:

You are right about how wikis are run as opposed to should be run. I think there is an argument that says if you make something where anarchy can exist, inevitably it will exist.

Perhaps there should be the option on installation that says "I want this installed with full publishing rules" or "I want anarchy." If the former, then editorial controls, review before publishing, proper categorisation and indexing are all enabled. It becomes a highly controlled, properly managed environment. If the latter, then it is more like Mediawiki.

Link to comment
Share on other sites

On a related subject, I thought Wikia.com was a hosted WIKI site running different software than MediaWiki - turns out it's still MediaWiki and related to Wikipedia's founder (see the "Our History" section halfway down this page: http://www.wikia.com/About ).

What surprises me is that, for one, that site looks so much better than most WIKI sites, but also from a technical viewpoint what they are labelling in that section as "New Features" are things that can easily be done in ProcessWire - something that gives me hope that the ideas we're mulling over here aren't that far-fetched.

It almost seems like the competition have thrown in the towel somewhat doesn't it? The vast majority of WIKIs on the web use self-hosted MediaWiki or are on Wikia nowadays, but as we've said in the CMS field, there could be better alternatives in many situations than the top-billed products.

I am of course getting way ahead of myself there. For anything like this to get off the ground one of the first steps is to dissect an existing product (I'm only thinking database and file structure here) to see how they handle things and get an idea of potential pitfalls before we fall in them :)

Link to comment
Share on other sites

I think the database bit is less problematic. If you work from the fact that academia is about "bits" of information put together in such a way as to make a coherent argument, then that is pretty much the way ProcessWire works anyway - far more so than other systems which tend to take lots of information and shove it together with separators on a row somewhere. Possibly economical, but doesn't make sense.

If, however, you were creating a specific site for dissertations for example, for each you would have distinct blocks (fields) for Title, Long Premise, Executive Summary, Content, Chapters, Index, Bibliography, Author, Copyright and so on. Since every dissertation works in the same way you now have very logical blocks of data where you can ask very specific questions like "list all works where piglettes is listed in the index." 
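Once the blocks are separate fields, that kind of question becomes a one-line filter. A sketch with only three of the fields, assuming the index is stored as a set of terms; the `Dissertation` class and the sample records are invented:

```python
from dataclasses import dataclass, field


@dataclass
class Dissertation:
    title: str
    author: str
    index_terms: set = field(default_factory=set)


def works_indexing(works, term):
    """Titles of all works whose index lists the term (case-insensitive)."""
    wanted = term.lower()
    return [w.title for w in works if wanted in {t.lower() for t in w.index_terms}]


library = [
    Dissertation("Pigs in Literature", "A. Author", {"Piglettes", "truffles"}),
    Dissertation("A History of Beards", "B. Writer", {"stubble", "ZZ Top"}),
    Dissertation("Small Farm Economics", "C. Scholar", {"piglettes", "subsidies"}),
]
```

Calling `works_indexing(library, "piglettes")` returns the first and third titles. The same question asked of one big body field would need a text search and would also match passing mentions outside the index.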

Now, that sort of breakdown is very familiar to any PW user and I think is a really useful starting point.

So, I think the starting point has to be to ignore everything that has been done in the last 10 or so years and go back to the origins - look at how things like the British Library are organised and curated, look at how something like the London Times works, look at how a printed magazine works ....

Basically, look at how a potentially large body of work needs to be managed and then come up with a system that fulfils that need rather than rewrites the rules.

I think there is an additional usage - a lot of people use mediawiki and others as a sort of communal knowledge base as part of a project. The problem there is that there is a disconnect between the knowledge base and the project itself which helps no one. It would be interesting to find a way of bridging that gap. For instance, I like the idea of comments within code to become searchable notations within a knowledgebase so you can search for specific functionality via comments and not just by knowing what the function just happened to be called.
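As a taste of the comments-as-notations idea, here is a sketch that pulls the comments out of a piece of Python source and indexes their words by line number. It uses the standard library tokenizer, so it only understands Python's `#` comments; `comment_index` is a made-up name and other languages would need their own extractor:

```python
import io
import tokenize


def comment_index(source):
    """Map each lowercase word found in a # comment to the lines it appears on."""
    index = {}
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            for word in tok.string.lstrip("#").lower().split():
                word = word.strip(".,:;")
                if word:
                    index.setdefault(word, set()).add(tok.start[0])
    return index


sample = (
    "def f(img):\n"
    "    # Resize the thumbnail before caching.\n"
    "    return img  # cheap thumbnail path\n"
)
```

Searching the index for "thumbnail" finds the function even though nothing in its name says so, which is the gap between knowledge base and project I mean.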

Anyway, I am wandering. I think what needs to be nailed down is the non-technical side - how things should be submitted, stored and retrieved - and from there we can work out how that can be realised using PW.

Link to comment
Share on other sites

A bit of an aside here. 

When it comes to interfaces and accessibility, I think there are two distinct areas:

Editorial and Management.

Management would use the back end and would itself be subdivided between Administration and Print & Design

Administration would simply be everything to do with the application - updates, modules, and all that.

Print & Design would be about the overall look of the web publication plus (for more controlled applications like news sites) final formatting and layout. Print may also have final publication control in some environments.

On the front end it is all about authorship and editorial. The front end can be subdivided as required with multiple editorial roles, author roles, librarian roles and so on. Even legal, if that was required.

This seems over complicated for a wiki. However, allowing this sort of division of labour and roles at the outset does not mean that you cannot disable much of this if you want a simpler version - easier than somehow trying to add stuff later.

Link to comment
Share on other sites

Managing the Hierarchy could be fun, and I think PW default page structure is perfect for this, though I think a novel way of displaying it for management purposes might be worth investigating.

If you look at it, it is a family tree with Home at the top and then departments, followed by subject, followed by shelf followed by items, just like in a library.

If someone was clever, they could put together a jQuery tree that displayed to the Shelf level, but then changed display when you wanted to list everything on the shelf. For browsing/management, you simply work your way down the tree. At any point you can open the entire list that is beneath that point in the tree, or you could work your way down till that branch runs out of twigs!

Yep, it's a TwiggyPedia! (Sorry ...)

So, that is the standard way of laying it all out. But it is not the only way, and there is an argument to say that several ways should be employed all at once. The standard way is the equivalent of wandering into a library and looking around, heading eventually to the shelf that interests you. However, behind the scenes the library uses a different method, like the Dewey Decimal Classification:

http://en.wikipedia.org/wiki/Dewey_Decimal_Classification

These types of system use a number that is unique to each book and is based on category and other criteria. In libraries it makes books easier to return to the correct shelf, but of course in a digital wiki can be used for similar purposes.

Moving on from that is tagging. Now tagging is often seen as a flat relationship between otherwise non-related items, but I suppose it could also be hierarchical if the tagging system was.

My unformed theory here is that people are often inaccurate with their tags. If you do an article on Richard Branson, you may tag it "Entrepreneur" and "beard"

But, is this very bearded? A bit hairy? ZZ Top? Time for a thesaurus! So, why not have a tagging system that is based on a thesaurus? You type in Beard and you are offered a choice of synonyms - stronger ones at the top and weaker ones lower down. If you choose a weaker one, you could choose to dig deeper. The result would be a tag trail, starting with your first thought and then listing how you went from there. If you weighted them, then the tagging becomes hierarchical - you could choose to look at other articles that shared the strongest tags or the weakest tags.

Other categorisation can be geographical and of course time - WHEN is this about (as opposed to when it was written).

Now your wiki is becoming truly ordered.
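To show what I mean by a weighted trail, here is a toy version. The thesaurus entries and the strength numbers are completely invented, and multiplying the strengths along the trail is just one possible way of weighting it:

```python
# Hypothetical thesaurus: term -> list of (related term, strength 0..1)
THESAURUS = {
    "beard": [("stubble", 0.6), ("whiskers", 0.9), ("goatee", 0.5)],
    "stubble": [("five o'clock shadow", 0.8)],
}


def suggest(term):
    """Related terms for a tag, strongest first."""
    return sorted(THESAURUS.get(term, []), key=lambda pair: pair[1], reverse=True)


def tag_trail(choices):
    """Weight each pick in a chain of choices by the strengths along the way.

    Raises KeyError if a pick was not among the suggestions for the previous term.
    """
    trail = [(choices[0], 1.0)]
    for previous, picked in zip(choices, choices[1:]):
        strength = dict(THESAURUS.get(previous, []))[picked]
        trail.append((picked, round(trail[-1][1] * strength, 3)))
    return trail
```

The first thought gets full weight and each step further away gets less, so articles can be matched on their strongest tags or on their weakest ones.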

Link to comment
Share on other sites

Just thinking about the profile vs module thingy.

Another good reason for doing this as an addon, rather than a fresh installation, is that a powerful wiki could make up part of a bigger installation that included brochure sites, discussion boards, shops and all kinds of other things.

One of the problems with wikis is that they are separated off from the rest of the organisation's web, so there is no cross sharing of data and users. As an addon, rather than a site in its own right, it brings it closer to the rest.

Link to comment
Share on other sites
