Semantics and the Web: An Awkward History

 

Hello. This is a story where we, the fans of meaning conveyed by markup, mostly lose after a long winning streak. To soften the edges a bit, I’m telling the story with Playmobil figures. This is a lot of sugar wrapped around a fundamentally bitter flavor. Hopefully the result pulls off that trick, like chocolate does. Balancing ingredients is hard, though, and I’m not sure the bitterness has had enough time to ferment. I hope this new telling of old stories will help you re-examine what you know, to revisit community and communities.

Playmobil, though amazing, has obvious limitations, so all characters and situations are extremely approximate. Many thanks to Kyle Weems, also known as CSSquirrel, for permission to use his comic. As always, my opinions are likely not my employer’s opinions.

Markup lets us add labels to content, typically text content, to create labeled structures. It is a relatively lightweight way to create information that we can share across multiple platforms, reuse in different contexts, explore in a text editor, and transform into something different. Compared to many other formats, markup is easy to process and manipulate with generic tools. It has just enough flexibility for many, though not all, use cases.

Sometimes it’s a good idea to go back to the origins to figure out how we got here. [Music] No, that’s not how it was. Moonwatcher and his friends are thrilling, but as someone reminded me when I first presented this bit, there was meaning before markup. In the beginning was the word, not the tag. Or did white space come first?

Most tellings of markup language history and theology start with Charles Goldfarb. Like all the primary characters you’ll see in this talk, he’s a point in which ideas converged. Generic coding, later known as markup, first emerged in the late 1960s, when William Tunnicliffe, Stanley Rice, and Norman Scharpf got the ideas going at the Graphic Communications Association, the GCA. Goldfarb’s implementations at IBM with his colleagues Edward Mosher and Raymond Lorie,
the G, M, and L, made him the point person for these conversations. Markup didn’t have angle brackets yet, and it looked different, but it is markup, with start and end tags. Putting labels into text and then having separate programs process that text and labels opened up new possibilities. Clearly these were good ideas, and IBM had the equipment and people to run with them. More and more people heard about what was possible here, and interest spread and spread and spread.

Tools for applying GML mostly came from IBM. GML work focused on publishing, especially within IBM itself, a vast publisher. They were also happy to sell you printers, compositors, and a variety of tools for creating and managing documents with GML.

There was something great starting here. First the GCA, then the American National Standards Institute, ANSI, and then the International Organization for Standardization brought together experts and implementers to build a more comprehensive set of tools. The military-industrial complex eagerly participated. There is a securely invisible figure there. Many professions saw opportunities and possibilities in this work. People who wanted to build things and clean things up were eager to join. The crowd grew bigger as SGML emerged, with supporting technologies like HyTime for hypertext, the Document Style Semantics and Specification Language, abbreviated DSSSL, and more appearing.

This looks more familiar. In those days, when storage was precious, markup minimization was attractive. You can identify the structure, the elements of this document, using the start tags. Who needs end tags when your software can figure out logical boundaries? Skipping end tags is the smallest and simplest form of minimization, and there are many more options. SGML offered lots of tricks to keep things small, offering convenience to both human authors and storage capacities. The price of that was software complexity and, to some extent, document type definition, or DTD, complexity.

The SGML tools landscape was a superset of the
GML landscape, with tools available from multiple vendors for a variety of different kinds of projects. You were hiring people, often consultants, as much as you were choosing software. Much of the software was expensive, but even the open source tools like sgmls, SP, and Jade presented challenges. You might grab an SGML parser that could process your style of markup, but then what? Even the tools for sale were generally software projects more than software tools. You could find a set of tools that fit your project, and assemble and customize them to meet your needs. Companies customized SGML editors to their own vocabularies and workflows. The details of HyTime’s hypertext connections depended on what and how you wanted to connect. DSSSL projects needed developers to create stylesheets that produced results. You could always store documents in a file system, but if you wanted useful querying, fragment management, or versioning, you probably needed to build and integrate something bigger. Some SGML tools were more complete than others, but even in the best situations, integration and maintenance were a lot of work.

One of my favorites was the TEI Pizza Chef, an interface for mixing and matching markup options. “What do you want in your documents?” is kind of like “What do you want on your pizza?” They’ve renamed it Roma, but the concept is similar.

SGML included a sample vocabulary built on a model from the earliest days of GML. The Association of American Publishers and others used it regularly. At CERN in 1989, Tim Berners-Lee extended that vocabulary slightly for the internet to help researchers communicate. His server, sharing real information, along with a simple browser, may have been more convincing than the diagram.

This comes a bit later, from 1999, but does a nice job of showing HTML in use. It has tags like SGML and uses attributes mostly for links. This document included tags at the end of elements.

The first few people liked it, and they brought more friends, including people who started writing more
browsers with new capabilities that brought in more people. As the web became more accessible, the crowd kept growing, reaching a larger and larger slice of not always technically sophisticated people. So many people so quickly, and the diversity of applications was overwhelming. Even Santa Claus showed up. Many respectable people started using the web for business. Kids, ponies, trash pandas, and more came onto the web quickly. There were apparently a few pirates too. The web could go as far as your imagination would take you, presenting information in many ways to many people. As the saying goes, on the internet no one knows you’re a dog. And of course everyone knows that the web is primarily a medium for spreading cat pictures. Pretty much from the beginning, of course, there were bots of various kinds. Proxies and spiders came first, but bots crawl HTML constantly. And of course there were people wearing tin foil hats.

People wanted to learn about the web, especially to share content on the web. HTML was the foundation, decorated by CSS and JavaScript. The HTML tooling opportunities were vast and came in many forms. The most visible part of the web story was the browser. When the web first appeared, so did a burst of browsers across all kinds of platforms. Lots of them were licensees of the same Spyglass Mosaic code, but many of them were original creations or added their own perspectives.

It wasn’t all about coding, though. Much like we continued to talk about people surfing the web, I had lots of conversations with people early in the web who described creating content in different ways. Obviously, creating websites was like publishing, learning from print. Though clearly electronic, some of those workflows, tools, and ideas transferred well. Or music: many voices coming together, integrating with flow among different themes. Web development was like cooking or baking: mixing ingredients, cooking them just right, and plating them for the world to enjoy. The web was a new medium, joining other 20th century media,
especially, but not only, electronic media. Many described their efforts in terms of fiber arts: gathering the threads of hypertext, spinning stories, weaving the web, stitching it all together to see how it fit and how it wore over time. Even though the web-safe palette was only 216 colors, I heard a lot about painting. Indeed, color, and design more broadly, became more and more important as the technology evolved. There were so many styles of tools ready for use in small workshops or gardens for code. Some of this work was demolition. Of course, construction was a key metaphor. “Under construction” became so common a phrase that it fell out of style from overuse. And some of the construction was at a truly massive scale, bigger than anything that computing had exposed directly to the public.

Because HTML was structured text, and computers had more and more audio capabilities, the web was more than just a visual medium. Screen readers had existed for a while, but the web opened new possibilities thanks to a structured foundation of markup and browser APIs, bringing content to far more people.

Also, because every HTML document was a text document, different approaches could create and edit it. Handwork with text editors was often how people learned, but it’s also how people even now maintain pretty complicated sets of documents. Text-based HTML editors with features like syntax highlighting, collapsing sections, and convenient search tools made it easier to work with markup. As projects got more complicated, turning into groups of documents with repetitive navigation, content, and style, template systems could ease the work. Over time, what-you-see-is-what-you-get, or WYSIWYG, content editors appeared, though some of them were powerful yet clunky. The resulting markup was not always precise.

A whole range of options quickly grew on the server side. The Common Gateway Interface, CGI, and its successors let developers build large projects. Solid foundations for different kinds of projects might use the same core
materials but also different tools at different scales. Dumpsters are available in a variety of sizes and styles. To be honest, though, these have always been available. Demolition of old projects, waste, and disasters are common and frequent parts of computing. Some dumpsters are more exciting than others.

The web was so popular that there were even browser wars. Being a browser vendor was hard. Over a hundred companies licensed the Spyglass Mosaic code, and in the end only one kept going. Microsoft Internet Explorer had massive backing from a company with other strong supporting projects. Netscape and Microsoft were pretty quickly the big two of the browser world, fighting over the treasure of the web. Both companies had browser teams focused on winning this space, evaluating changing landscapes for opportunities. While a lot of the battles were over strange issues, not everything was a fight. In an HTML 3.2 meeting, Netscape supposedly dropped blink, and Microsoft matched it by surrendering marquee. But most things were harder than that. Some were huge, determining the future of the web for decades to come.

It was clear that doing all formatting directly in HTML was a maintenance nightmare, but what should the formatting layer look like? The W3C’s Cascading Style Sheets defined a way to declare that this kind of element, identified by this selector, should have these properties applied. You didn’t need to be a programmer to use it. Microsoft backed CSS. Would the presentation layer for the web be the W3C’s declarative Cascading Style Sheets or more traditional imperative programming code? Netscape preferred JavaScript-Based Style Sheets. These applied the programming language that Netscape had created to the challenges of formatting. Instead of a list of declarations, JSS ran a JavaScript program. That imperative code set formatting properties using any JavaScript logic you wanted.

The battle was joined. It wasn’t just Netscape versus Microsoft. A lot of developers using their tools also took sides,
shaping their projects to fit one model or the other. Money and the shape of the web were enough to bring a lot of people into the battle, following leaders with corporate and technical agendas. Cascading Style Sheets won that battle, but every battle had its costs. Most people would recover, though many people left the web, or careers based on the web, out of frustration.

Next up was the crucial question of how code would interact with the contents of HTML documents. Could you script all parts of a document through a document object model, or would only certain pieces, layer elements, be available to script? Microsoft had built the Document Object Model into the Dynamic HTML of its Internet Explorer 4.0. A lot of developers, myself included, were eager to join the Dynamic HTML charge. Netscape required additional layer elements, and you could only script those layers. Developers quickly found that supporting both in a document was difficult. Layers were limiting, and the additional markup didn’t always work well with document editing and management tools. The Internet Explorer side advanced. A tight formation of DOM nodes was hard to defeat. Layers cost Netscape a lot and never made it to standardization. Instead, Netscape joined the Document Object Model work at the W3C. From the perspective of “let’s create meaningful markup with minimal clutter,” this result was a very good thing. Again, though, many of the casualties were developers.

The perpetual battle wasn’t about a specific technology. From the beginning of the web, developers had found and relied on bugs in browsers. Developers whose sites depended on the bugs wanted them to continue as they were. Many were parsing bugs. Some were layout bugs. So many possibilities. Netscape supported one set. Microsoft had done the same thing, but of course the details were different. As always, mostly web developers suffered, wondering why things that worked in one browser didn’t work in another.

In the midst of these headaches, a group of developers decided it was
time to do something different: not pick sides among the contestants, but intervene from the middle. The group tried to break up the battles, trumpeting the message that web standards would make developers’ lives easier to both Microsoft and Netscape, and others. Fighting over features mostly moved from browser releases to committee rooms. They also worked on getting developers to move away from table-based layout and spacer GIFs, which were hard to maintain, to cleaner HTML structures with CSS.

Despite its explosive growth and popularization of markup technology, the web’s use of markup did not go over well in all quarters. “The web sucks. It is lightweight, shallow, trivial, and disposable. It is simple enough that any idiot can use it, and this, weirdly enough, is considered good.” That was Tim Bray in 1995 in Wired magazine. The SGML community knew it could do more.

Team XML was forming. It wasn’t just two leaders. Jon Bosak and Tim Bray built on SGML ideas and a growing eagerness in the community to try something simpler. Growing interest in web standards created an opening for a new approach to neatly structured markup. Jon and Tim spearheaded a new process at the home of the web, the World Wide Web Consortium, the W3C. In a dark corner at the W3C, though certainly not secretly, the SGML Editorial Review Board, eventually the XML Working Group, put SGML on a diet, starting with SGML itself, though DSSSL and HyTime would come soon. They shrank the specification dramatically, with help from technical lead James Clark. Perhaps they didn’t literally take chainsaws to the specification or reduce it to a postcard, but they dropped a lot of features to produce something simple to standardize and implement across a wide variety of software platforms.

XML’s discipline made markup syntax much easier to process. XML lacked minimization rules, but it offered structural flexibility similar to SGML’s. Committees could describe markup vocabularies and structures, or, because it was easy to process, a browser was
enough to get you started. You could create your own personal markup formats.

Of course, not everyone stayed around for XML. Some people remained with strictly SGML work, preferring to keep key features but facing a slow fade as tools development shifted primarily to XML. The XML crowd grew, though not to the same size or diversity of uses as the HTML crowd. XHTML, which put HTML into XML syntax, helped spread XML to a wider crowd. The steering committee of the Web Standards Project said, “XML’s strict rules for conformance give XHTML a backbone that HTML has never had, and promise to open up new possibilities for XHTML processing, storage, and creation as the standard spreads. It’s a big step forward.” Well, okay, really that was me, quoted, though with their and premature, in a press release. Tim Bray was on the steering committee too, so I think his opinion of the web might have shifted by then.

Lots of people did come to XML itself, because it offers a standard and relatively simple way to create labeled information and documents. It wasn’t the first to propose that, but it was the first to demonstrate that it could be useful, powerful, and flexible all at the same time. And of course there were bots again. XML made some of their jobs easier, though in practice many kept their parsing powers much more flexible than XML required. Sometimes bot-like creations even appeared in person at XML conference hotels.

The SGML crowd had a bigger vision for XML on the web. XSLT, descended from DSSSL, promised to replace the browser’s rendering with CSS. XLink, descended from HyTime, described much more sophisticated linking. Only a few parts of that vision happened in web browsers, though. By 2000, the three most active browser vendors supported XML document display using CSS, but in different ways. Opera’s browser let you create hyperlinks with CSS extensions. If you wanted to try the lightest part of the emerging XLink standard, a much simpler descendant of HyTime, then Netscape could help. XSLT was a coming
attraction. Microsoft’s offerings were a little different. Internet Explorer 5 shipped with an early XSL engine but promised to catch up to official XSLT soon. They also included XML data islands and, though its importance wasn’t clear yet, an early version of XMLHttpRequest logic.

People wanted, though, to work with XML outside of the browser. While HTML focused on making things happen in the browser, XML offered a vast array of tools beyond that, if you were willing to program the connections yourself. XML’s release set off a huge wave of new tools, both large and small. XML parsers didn’t need to be part of a browser. They even came in multiple flavors: event-based SAX parsers and tree-based DOM parsers. XSLT processors let you transform XML to other XML, or HTML, or something else, whatever you wanted. XML databases let you store and manage XML documents wherever you wanted. Consultants could use even more precise tools to build exactly the systems you needed, though if you looked around, both information and tools were available to do it yourself. A whole set of websites appeared. A print XML magazine was even available at newsstands for a while.

It was an XML gold rush. Prospectors brought knowledge and tools to the mine, welcomed by media support for the latest and greatest. Weird scenes inside the gold mine ensued as companies and developers applied XML to everything they could: business-to-business interactions, common office applications, object serialization and storage, configuration files, content for the web, print, e-books, and elsewhere. It was a busy, busy time. XML and its surrounding technologies could be lucrative. Many companies and consultants hauled gold out of these mines, earned from complex projects. After a long day in the mines, community could help. There were often laptops and phones as well as mugs out, and bots learning from electronic conversations.

XML quickly moved to the core of web services and messaging. Some built directly on structures within XML documents, like SOAP, while
others took a more general approach, working with supporting protocols, like REST. Using HTTP’s key verbs of GET, PUT, POST, and DELETE, developers could build complex RESTful applications. For senders, REST combined an HTTP verb, a URL hosted on a web server, and a message. XML made a great payload for messages sent via REST, providing easily accessible internal structure. People sent piles of XML from place to place. Web services of many kinds were the new big thing. It was a heady time, with lots of excitement and dreams of more to come.

Of course, that enthusiasm changed things. That tiny spec quickly turned into a book, for me literally a notebook of specifications, and then a library of specifications, and more specifications, and yet more specifications. Some of the standards sprawl was specifically about the markup, but some of it was about building the semantic web or web services on top of that core. The layering wasn’t always neat, especially for namespaces and schemas. XML’s ambitions kept growing. Demolishing old projects to make way for the new XML was a common activity. Discussions that started with parsers and processors shifted to larger frameworks and prefab communications. Markup construction sites were busy.

One group of past supporters, though, abandoned the specifications as they grew: the browser vendors. They didn’t need these parts, and their customers weren’t asking for them. XML had appeared while browser vendors were still competing. As Internet Explorer dominated, Microsoft declared victory. Releases became less frequent, with fewer features. A decade later, when Chrome took over Internet Explorer’s dominant position, it initially sold itself on performance and stability, not new possibilities for markup.

A lot of XML work was really object serialization, as a close look at, say, Apple plist files, or any of the Microsoft Office formats that use XML, or the XML-on-the-wire web services approaches will reveal. A text-based technology that was meant to be an object serialization format could
have huge advantages. Douglas Crockford extracted a convenient, does-enough object notation format that was already part of JavaScript. The full JSON standard is 16 pages. Crockford presented JSON at XML 2007, and John Cowan replied from the audience, “I, for one, welcome our new JSON overlords.” Text, arrays, numbers, and strings are often all it takes to store and convey information. The crowd that had been using XML for APIs and object serialization stayed to hear and implement Crockford’s message. JSON syntax was also easy to turn into a neat subset of YAML, “YAML Ain’t Markup Language,” which had spun off the xml-dev mailing list earlier. JSON promised simpler thinking: all you need is this little bit, and you can get back to working with objects instead of documents. There wasn’t much to sell, exactly. JSON consulting is far more occasionally a thing than XML consulting was. But people lined up to get the simple connection to their tools. The same RESTful approach to building remote services, for example, worked just fine with JSON. All that really changed was the format of the messages themselves. So many new messages, so few angle brackets.

Now the wrecking ball came for XML integration. Not for all of it, of course, but the main structure got a major trim. At the same time, XHTML faltered. Many web developers were unenthusiastic about precision markup syntax, and a key crowd of programmers inside the browser makers wanted script everywhere possible. JavaScript found its moment. The W3C was deep into declarative and often XML models, but some powerful W3C members were not. Flash, Ajax, and then iPhone and Android apps drove endless conversations about applications. And who builds applications like those? Programmers, typically programmers used to imperative programming and scripting. Google accommodated those developers by pouring resources into its V8 engine, making sure that JavaScript could run fast.

After the W3C rejected their position paper, that crew from Opera and Netscape set up their own conversation
at WHATWG, and others were clearly interested. Google and Apple joined quickly, with Google’s Ian Hickson, Hixie, the most visible. What started as a proposal to add functionality to forms quickly turned into a full proposal for a next version of HTML, HTML5. As more and more people came over to the HTML5 side, the W3C at first tried to include their ideas. After all, the WHATWG core was W3C members. But eventually it surrendered. In 2010, the W3C halted work on XHTML. Official XForms work ended in 2015.

 

In 2019, the W3C stopped work on HTML, leaving it to the WHATWG Living Standard. The WHATWG core is small, but control of the gateway specification for the web and its implementation gives them huge power. This wasn’t, of course, a simple or friendly conversation. “Objective” is always a huge red flag for me, but I’ll leave it to you to evaluate this statement. I can’t tell the story as well as Kyle Weems already has in his CSSquirrel comics. Many parts of CSSquirrel only make sense if you know HTML and CSS culture deeply, but I think this moment of “Irrelevant! I will tell you how HTML5 is going to be, and you’ll like it” is clear enough on its own.

So far, WHATWG has triumphed. If you’re talking about contemporary HTML, their work is now primary. Even beyond the specifications, though, more and more HTML documents are collections of div elements. HTML hasn’t gone away. It’s still in there, but it gets less and less attention. Teaching and tooling changed. JavaScript replaced HTML as the primary focus of web development, while libraries and frameworks became a more common part of the conversation. Even standard simple uses of HTML switched to Markdown, AsciiDoc, and similar. At least they generate markup that’s more meaningful than layers of divs.

XML’s growth curve had already flattened. Developers had carved out the subsets they preferred to use. Now XML started shrinking fast. As JSON took over API duties, as HTML5 took over, and as developers shifted toward models and frameworks where all divs are created equal, markup, and especially the semantics it could represent, mattered less and less. The largest and most lucrative gold mines shifted to become all about JSON and apps. Web developers moved further toward divs as the primary structural element. Development resources shifted away from XML and markup. Even when it is still supported, XML is often deprecated. Markup was out of style. XML was old-fashioned. Semantic markup, presentational or not, fell out of HTML use. Worse, HTML was and is routinely derided
as unimportant because it was “not a programming language.” The WHATWG crew came to Balisage to tell the crowd that “the web ecosystem routed around the damage of XML’s influence by making HTML better suited for extensibility than ever before.” Which, well, web components aren’t really taking the world by storm so much either. At its former stronghold, the W3C, the XML Core Working Group ended in 2016, and the XSLT and XQuery working groups ended in 2018.

 

 

The web tool yard changed dramatically. HTML went from the center of everything to a tiny corner where beginners could get their feet wet. Fewer and fewer HTML elements are used more and more, with generic div and span often used even in cases where HTML offers something more specific, more accessible. Newcomers fill their pictures with these bland formulations and hammer labels onto them for styles and JavaScript to use. Designers still focus on creating visually appealing experiences, creating work that adapts to both larger and smaller canvases. Design is at least recognized as an important field of work, but frameworks, build tools, and programming culture constantly demand that styles be treated like, even as a part of, JavaScript.

But over here, there’s still some of the old power left, built right into your browsers. You don’t have to be old to use semantic markup or to focus on accessibility. All of this can be yours, and it’s easier to read anyway. Sometimes you can even build a conversation or a class around the possibilities. It’s a small crowd, though.

The valued parts of client-side web development in the browser, the parts that companies want to pay for, became front-end engineering. Construction and build metaphors are now about JavaScript. Other approaches still exist, but programmers doing engineering are the respectable part of web development now. Programming languages, not markup languages. There is still markup, though in many cases only the logic driven by JavaScript provides it meaning.

Way over on the other side of the tool yard, XML tools persevered. Only the 1.0 versions of XML and XSLT made it into browsers, but because XML hadn’t bonded itself tightly to web browsers, you could still get XML parts if you wanted. They might not always be maintained, of course. In some cases, JavaScript provided glue. Tools like Saxon-JS provide XSLT 3.0 support in the browser at just the cost of a 500 kilobyte download. It does, of course, also work on JSON.

We who value markup, who come to conferences
like this, are now effectively a remnant church, the ones who remain. I know that many people, for many good reasons, are not eager to be compared to Christian churches, especially churches that see themselves holding on as the last beacons in a sinful world. Christians may wonder why I’m comparing markup to churches. I apologize for that awkwardness and Christian-centricity, but the metaphor works too well. Like many remnants, there’s a strong dose of looking to inner strongholds for answers, a denial that we have lost anything, and contempt for the worldly. Some of the review comments for the paper behind this presentation felt very much in that zone: “The majority of human-authored content on a daily basis is produced in XML via MS Word, ODF-based products, which is to say nothing of the US Government Printing Office output. A large amount of web traffic does consist of JSON and other semantically bare formats, but does anyone care about the semantics of porn downloads?” And then a question: “And what about all the other folks making submissions about how, nevertheless, they persisted using semantic markup? Are they just technical chickens running around, not realizing their heads are cut off?”

I’ll try to focus on the bright side. A different reviewer suggested, “Rather than a eulogy for markup’s heyday, how about a look at how views of and use of markup have changed over time, and speculate on how they might change into the future?” I don’t exactly have that, but there is a sanctuary, or sanctuaries, actually. I’ll focus on this one, the Balisage conference, today, but also remember XML Prague, XML Summer School, Declarative Amsterdam, Markup UK, some web conferences, and more.

Come on in. Explore the stained glass for a moment and let your eyes adjust to the light. Dearly beloved, we are gathered here today to discuss the many things you can achieve with well-considered markup vocabularies and workflows. We have a long and welcoming history despite regular change. The crowd is somewhat grayer but still has some
amazing hats, a library of specifications outlining possibilities, and eagerness to talk about documents, data, standards, implementations, and more. The audience is, as always, a mix of old-timers, newcomers, and some bots. We’ve gathered for decades to share information with each other, to discuss sacred texts, even, maybe especially, forgotten texts. Sometimes we convene a panel of people with very different ideas who all want to share and compare.

It’s not just talk, of course. The markup flame stays lit. You can explore what are effectively church gardens, delightful test beds, places to try things out before scaling them up. A careful selection of tools and materials, and detailed descriptions of how to use them, make it easier to get things right. Weeding and watering keep things growing. We keep things looking tidy. The flowers of past work remain beautiful. Even inside the church there is the occasional straw man. Don’t you have some too?

Our meeting room celebrates technologies and people who have come before. Maybe we can find solidarity across approaches. The walls tell stories of technologies and people, triumphant and not triumphant. We learn from them all. Banners and posters celebrate organizations, specifications, and community gathering places. It’s not all just formal services. We gather for coffee, cake, and other enjoyable treats supporting conversation. We’re happy to talk about clean structures, minimally redundant markup, humans in the loop, conversions, overlap, creating more XSLT hackers, digital humanities, trees, graphs, ebooks, cookbooks, style sheets, projects gone wrong or right, and so much more. Even in this much smaller remnant, there are still fields of interest that only connect through markup, but not even always markup with angle brackets. Everyone is welcome, even, hopefully, some programmers who might perhaps be looking for a change.

As found on YouTube
