[html4all] HTML tags on the wiki
Robert Burns
rob at robburns.com
Mon Aug 27 10:44:35 PDT 2007
Hello 4all,
I'm getting a little closer to figuring out the wiki issues. I think
I've uncovered how to add file uploads to the wiki. This will mean we
can use the built-in wiki markup for adding images. Such wiki markup
has alt= added to all images. Unfortunately it repeats the alt= on
the img in the title attribute of the img element's parent anchor
element. When adding the keyword 'caption' to the markup, the text is
then repeated a third time in the image's caption (along with img@alt
and a@title).
One option is to allow all HTML wrapped in <html> tags. This is
discouraged as a security risk when using a completely open wiki[1].
We could easily restrict access somewhat and it would probably be
safe to enable this feature.
The other thing I was looking for was an HTML tag whitelist. It turns
out that the whitelist is written in PHP and it has to be changed
before compiling the MediaWiki software[2]. This is something I think
perhaps DanC should do in setting up a new wiki for W3C. It's not
something I feel confident I could accomplish myself, but I might look
into it. Basically it would involve making changes to the whitelist and
then compiling our own HTML4All MediaWiki version.
There is a bug report to MediaWiki to add some of the obvious missing
tags (abbr, dfn, q, etc.)[2][3]. Obviously there's no danger in
allowing these tags and they can also improve accessibility for the
wiki. However, the presumption seems to be that adding <img> and
<object> would be dangerous. I'm not sure that's the case. They
may be, but I'd rather hear the arguments than simply assume they
are. Perhaps enabling <object> without <param> would keep it usable
while removing some of the bigger chances for a security hole. Some of
the <object> attributes might also pose a danger, such as codebase,
codetype, and classid. Again, I'm not saying these are security holes,
but they could be. Any security hole would have to exploit
software that users already have installed on their systems.
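To make the whitelist idea concrete, here is a minimal sketch in Python (not MediaWiki's actual PHP sanitizer). The ALLOWED table, the sanitize() helper, and the restriction to http/https URLs are all assumptions made for illustration. It lets <object> through while dropping codebase, codetype, and classid, and it shows <param> and any other unlisted tag as literal text.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Hypothetical whitelist (tag -> allowed attributes); NOT MediaWiki's real list.
ALLOWED = {
    "abbr": {"title"},
    "dfn": {"title"},
    "q": {"cite"},
    "img": {"src", "alt", "title"},
    "object": {"data", "type", "title"},  # codebase, codetype, classid left out
}
VOID_TAGS = {"img"}                 # elements that take no end tag
URL_ATTRS = {"src", "data", "cite"}


def is_safe_url(value):
    """Assumption for this sketch: only absolute http(s) URLs are accepted."""
    try:
        return urlsplit(value).scheme in ("http", "https")
    except ValueError:
        return False


class WhitelistSanitizer(HTMLParser):
    """Re-emit whitelisted tags and attributes; escape everything else."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            # Disallowed tags are displayed as text instead of being rendered.
            self.out.append(escape(self.get_starttag_text()))
            return
        parts = []
        for name, value in attrs:
            value = value or ""
            if name not in ALLOWED[tag]:
                continue  # drop attributes that are not whitelisted
            if name in URL_ATTRS and not is_safe_url(value):
                continue  # drop javascript: and other unexpected schemes
            parts.append(f' {name}="{escape(value)}"')
        self.out.append(f"<{tag}{''.join(parts)}>")

    def handle_startendtag(self, tag, attrs):
        # Treat <img ... /> like <img ...>; no separate end tag is emitted.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if tag in ALLOWED:
            self.out.append(f"</{tag}>")
        else:
            self.out.append(escape(f"</{tag}>"))

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def sanitize(markup):
    parser = WhitelistSanitizer()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)
```

A filter like this only narrows what markup gets through. Whether the remaining <img> and <object> attributes are safe still depends on the plugins and browsers that readers have installed, which is the open question above.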
Finally, I think perhaps the MediaWiki software approach to this
might be to instead add new syntax for images to differentiate
between alt=, title= and captions. Something like:
[Image:srcURI | caption:caption-text | title:title-text | some-alt-text ]
This way the MediaWiki software would convert that syntax into either
an img or object element depending on the configuration. Perhaps
filing a new bug on that would be helpful.
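As a rough illustration of how such syntax could map to distinct alt, title, and caption text, here is a small Python sketch. The syntax is only the proposal above, not anything MediaWiki supports, and render_image() and the output markup (the "thumb" and "caption" class names) are hypothetical. The sketch emits an img element only; an object variant would be a configuration switch.

```python
from html import escape


def render_image(markup):
    """Render the proposed [Image:src | caption:.. | title:.. | alt] syntax."""
    text = markup.strip()
    if not (text.startswith("[Image:") and text.endswith("]")):
        raise ValueError("not an Image markup")
    # Fields are separated by '|'; the first one is the image URI.
    fields = [f.strip() for f in text[len("[Image:"):-1].split("|")]
    src, rest = fields[0], fields[1:]
    caption = None
    title = None
    alt = ""
    for field in rest:
        if field.startswith("caption:"):
            caption = field[len("caption:"):].strip()
        elif field.startswith("title:"):
            title = field[len("title:"):].strip()
        else:
            alt = field  # the unlabeled field is the alt text
    img = f'<img src="{escape(src)}" alt="{escape(alt)}"'
    if title is not None:
        img += f' title="{escape(title)}"'
    img += ">"
    if caption is None:
        return img
    # The caption is emitted once, separately from alt and title.
    return f'<div class="thumb">{img}<p class="caption">{escape(caption)}</p></div>'


print(render_image(
    "[Image:http://example.org/cat.png | caption:Our cat"
    " | title:Click to enlarge | A tabby cat asleep ]"
))
```

Each piece of text appears exactly once in the output, so the alt text is no longer repeated in the title or the caption.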
I was also thinking of writing to DanC on this to alert him to the
fact that W3C may want to compile its own version of MediaWiki to add
support for more semantics and accessibility.
I'd like to hear from the group on what approach I should take.
Obviously if I or anyone else can handle a recompile and installation
of our own MediaWiki software that would be great. However, if we
can't do that, should we require login to the site and turn on all
HTML tags?
Take care,
Rob
[1]: Enable all HTML in <html></html> wrapper:
<http://www.mediawiki.org/wiki/Manual:%24wgRawHtml>
[2]: MediaWiki bug report on adding some additional HTML tags:
<http://bugzilla.wikimedia.org/show_bug.cgi?id=671>
[3]: Diff of proposed HTML tag white list:
<http://bugzilla.wikimedia.org/attachment.cgi?id=3331>