robinmassart

Web Technology

Whilst for many people the Internet is still just a fancy messaging system, I have been using the net as my instant access, user friendly, global knowledge base, information retrieval system since starting University in 1993. For whilst it has often been said that knowledge is power, in my opinion the key to success lies not so much in what you know, as in knowing where to find the information you need. So here I present some brief articles I have written to put forward my opinion on issues surrounding the Internet.

Browser History   Web Standards   XML - So What?   The Semantic Web  

03.02.2003. Browser History.
The history of web browsers started with the "invention" of the world-wide-web by Tim Berners-Lee at the beginning of the '90s. The first widely used browser was Mosaic. Netscape entered the browser market shortly after this, giving birth to what is commonly known as the "Browser Wars". For Microsoft quickly realized that browser technology had the ability to supplant the operating system as the base element for computer users. In short, what operating system you used would be immaterial as long as you had a good web browser. This of course threatened the whole of Microsoft's business model and they quickly developed a browser of their own, Internet Explorer, pushing its capabilities faster than web standards could be fully developed and completely integrating it into its Windows OS. Thus 90% of home computer users effectively had IE thrust on them overnight, with no way of removing it if they didn't like it. The rest, as they say, "is history". IE became the de-facto standard and Netscape more or less went bust, being bought by AOL (who, curiously, themselves relied on IE browser technology). The internet has suffered ever since (and Microsoft has landed in a number of courts around the world on anti-trust and abuse of market power charges).

Instead of one standard, the web ended up with a "one plus many" standard, where web developers had to cover a multitude of browsers, effectively at least doubling their work. Only recently have web standards made a strong comeback. And, thanks to the efforts of the largely standards-compliant browsers Mozilla, Opera and others, they are actually being followed. With universal support for HTML 4.01 (at the least) more or less upon us, the future of the web appears to be back on the right track.

15.05.2003. The case for Web Standards.
There is a lot of talk at present about the need for web standards and their use in real life situations. Many developers/designers, whilst appreciating the need for common standards, tend to ignore them if something can be more easily implemented in another way. The main reasons for this, they claim, are that not all functionality is available to them and that few browsers adequately support the standards. The first point is not applicable in most cases, whilst the latter is becoming increasingly obsolete with the latest generation of browsers from Mozilla, Opera and, to some extent, IE. With the correct use of HTML 4.01, CSS 2 and DOM 2, pretty much all needs of the web developer and designer can be met. Where not, a standards-compliant XML document complete with DTD and XSLT should do the trick. Thus the case for ignoring standards is very weak.

Why have standards in the first place? The need for standards is mainly business driven. They enable different businesses to interact in a common and pre-defined way. Much like there is only one "standard" kilogram, so there should only be one standard version of HTML 4.01. Imagine the confusion if every supermarket had its own definition of the kilogram. Well, this unthinkable situation has plagued the web for years. In the early days, when web standards were somewhat basic, the two main browser vendors decided on their own interpretation of HTML. They often added their own elements to increase functionality and usability. What resulted was an unprecedented mess for the web developers. Any web page worth its salt had to be written at least twice. The extra cost to business was and still is enormous.

In defence of the early browsers, much of the apathy towards standards was on the grounds that existing standards were limited and failed to keep up with the possibilities of the web. On the other hand, whilst it takes time to develop sound standards, those for HTML 4.01, CSS 2 and DOM 2 have been around since roughly the turn of the century. So it is in fact the browser vendors who have been dragging their heels, in particular the market leader, who understandably has little interest in fully supporting standards.

21.07.2003. XML - So What?
What is XML? XML stands for eXtensible Mark-up Language. Thus XML is purely a mark-up language which allows the developer to define their own elements, attributes and document structure. This is done through something called a DTD (Document Type Definition). Of course the world of XML is more than this, incorporating other acronyms such as XSLT, XPath, XPointer and a few more. All of these are useful for displaying, manipulating and extracting data from an XML document.
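As a minimal sketch of what "defining your own elements" looks like in practice, here is a small document with its DTD, read with Python's standard library. The ADDRESS vocabulary is borrowed from the example used later in this article; the exact DTD declarations and the sample values are my own assumptions, made up for illustration. Note that `xml.etree.ElementTree` is a non-validating parser: it reads past the DTD and only checks that the document is well-formed.

```python
import xml.etree.ElementTree as ET

# Hypothetical document. The DTD in the DOCTYPE declares which elements
# exist and how they may nest; the address values are invented.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE ADDRESS [
  <!ELEMENT ADDRESS (STREET, CITY, POSTCODE)>
  <!ELEMENT STREET (#PCDATA)>
  <!ELEMENT CITY (#PCDATA)>
  <!ELEMENT POSTCODE (#PCDATA)>
]>
<ADDRESS>
  <STREET>1 Example Road</STREET>
  <CITY>Exampleton</CITY>
  <POSTCODE>EX1 2MP</POSTCODE>
</ADDRESS>
"""

# Parse the document; ElementTree checks well-formedness only and does
# not validate the content against the DTD.
root = ET.fromstring(DOCUMENT)

# Collect the element names and text in document order.
children = [(child.tag, child.text) for child in root]

for tag, text in children:
    print(f"{tag}: {text}")
```

Checking the document against the DTD itself would need a validating parser, which the Python standard library does not provide.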

So, if XML is "just" another mark-up language, which can be adapted to suit your own needs, why is there all this fuss about XML? Well, following on from what was said about standards above, XML is undoubtedly an important development. For it is a standard for creating your own mark-up standard. Thus XML has produced offshoots such as MathML (used to mark up mathematical content) and WML (used to mark up wireless content). I created my own standard to mark up my guest book entries. On the one hand, XML is great for business since there is at last a standard for creating your own proprietary mark-up language. On the other hand, though, it is nothing new. XML is based on SGML, which has been around for a couple of decades and also enables this. SGML, however, is very complicated, whereas XML can be developed by people with little technical knowledge. Possibly this is where its real strength lies, since the XML structure could in principle be developed at boardroom level, leaving its implementation using XSLT and suchlike to the developer.

Mark-up, in essence, is a way of applying meaning to content. Thus the HTML element <P> tells the browser that its contents are to be displayed as a paragraph. And all systems that understand HTML will know this. However, business has been exchanging information for years without the need for XML. Thus, whilst XML is undoubtedly very useful for business, it is essentially nothing new, or at least not that revolutionary. A company could come up with its own format for defining content and happily use this instead of XML. Of course, if interacting with other companies, it would have to explain its format, but with XML businesses also have to agree on the structure of their XML document before information can be reliably exchanged.

If you have read this far you may well get the impression that I am a little skeptical about XML. This is not quite true. I believe XML is a very powerful tool for making business interactions a lot easier and more standardized. What I am skeptical about, though, is much of the hype I have read about XML, the worst of which is along the lines of "XML will empower computers to do business by themselves". Let's get one thing straight: XML cannot and never will do such a thing. Either I am completely missing the point or there's a lot of rubbish being written about XML by people who should know better. The problem which this creates is that people who have no reason to know better believe the hype and expect XML to do things it cannot do.

In XML, the program interpreting the XML code, whilst being able to determine whether the XML is correct through the DTD and how to display the content using XSLT, can have no idea about the actual meaning of the information it is handling. Thus, whilst an XML interpreter may be able to correctly display a tag such as <ADDRESS> and will be able to determine that an element of type ADDRESS can contain other elements such as <POSTCODE>, <STREET> and <CITY>, it has no way of knowing that an address is the location of someone's home or office. Of course it may well know what to do with the data because the businesses exchanging the data have previously agreed what these tags are for. However, the sort of claim described above implies that I can write my own XML file, say about aerodynamics, and send it to any XML interpreter. This will, upon processing the file, know that my <LIFT> tag refers to the aerodynamic upward force generated by a suitably shaped wing, and not to a cabin enabling vertical movement in a large building. This is what I understand by the claim, "XML will empower computers to do business by themselves". XML cannot and never will be able to do this.
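The <LIFT> point can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the two documents, the WING and BUILDING wrapper elements, the values, and the `describe` and `interpret` helpers are all hypothetical. It shows that a generic parser reports the same tag name for both documents, and that any "meaning" has to be supplied by a table which people agreed on beforehand.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents from unrelated domains that both use <LIFT>.
AERODYNAMICS = "<WING><LIFT>12000</LIFT></WING>"
BUILDING = "<BUILDING><LIFT>4</LIFT></BUILDING>"


def describe(xml_text):
    """Return all a generic parser can report: element names and text."""
    root = ET.fromstring(xml_text)
    lift = root.find("LIFT")
    return root.tag, lift.tag, lift.text


# Meaning only enters through a table agreed on in advance by humans.
# Here the (assumed) convention is that the parent element decides it.
AGREED_MEANINGS = {
    "WING": "upward aerodynamic force",
    "BUILDING": "cabin for vertical movement",
}


def interpret(xml_text):
    """Look up the agreed meaning of <LIFT>, or admit we have none."""
    parent, _, _ = describe(xml_text)
    return AGREED_MEANINGS.get(parent, "unknown")


print(describe(AERODYNAMICS))
print(describe(BUILDING))
print(interpret(AERODYNAMICS))
print(interpret("<RECIPE><LIFT>1</LIFT></RECIPE>"))
```

A document from a domain that nobody has agreed on in advance falls straight through to "unknown", which is the whole point: the parser itself contributes structure, never meaning.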

02.09.2003. The Semantic Web
According to SemanticWeb.org, the Semantic Web is a vision: "the idea of having data on the web defined and linked in a way that can be used by machines". In other words, I can take my <ADDRESS> tag from the article above and, assuming it has been created according to the (as yet undefined) Semantic Web spec, send it to any Semantic Web machine. This machine will then know that the tag refers to the location of someone's home or office. This is exactly the sort of thing implied by statements such as: "XML will empower computers to do business by themselves". Apparently those making the claims I was sceptical about had confused the capabilities of XML with what they had heard about the Semantic Web. Possibly the confusion arises because the intention of the Semantic Web is to have a syntax as simple as that of XML. Thus XML and the Semantic Web are often mentioned together. They are, however, most definitely not the same thing. In any case I am happy to have discovered that my scepticism was well founded.

I would love to now go on to give a detailed description of the Semantic Web. Sadly, I have to admit that, having read a fair amount of material concerning the Semantic Web, I am now even more confused than before! If you are interested, www.semanticweb.org is a good place to start. To me, my confusion just shows the complexity of the task ahead. For, whilst it is not difficult to understand the specs for HTML, XHTML, CSS and DOM and, with a bit of effort, those of XML and its components, trying to get to grips with the Semantic Web is another thing altogether for the Semantic Web novice. If you thought XML was overflowing with acronyms, you should take a look at the Semantic Web. On top of this, terms such as ontology, inference, assertion, axiom, intractability and higher order logic are commonly encountered and hint at the fact that this task encompasses linguistics, mathematics and metaphysics. The whole exercise appears to take on a philosophical nature. There's certainly a lot of debating going on.

It is important to note right now, though, that the Semantic Web is far from complete and certainly not available as a finished product along the lines of HTML or XML. On the other hand, the sheer scale of the undertaking, coupled with its very open approach, means that the possibilities could be quite frightening. It may well make the Matrix look like kids' stuff. Artificial Intelligence, though, is one thing the Semantic Web definitely is not. The Semantic Web is not attempting to make machines understand our language. Rather, it is trying to define a way to allow us to express ourselves in more detail to machines.