Wednesday, April 21, 2010

Changing Web browser usage patterns for a performance boost.

Brief

For the best performance in the browser, it is better to use natively supported methods and data formats: XSLT for initial rendering and subsequent DOM modification, instead of server-side templates or JS DOM rendering. Data shall be kept in memory in the format native to the browser, XML, instead of strings, JS hash maps or JSON. Behavior shall be defined in XML/SMIL, instead of JS or proprietary extensions via META, OBJECT, etc.

Caching, precompiling and prerendering on the deployable package level is not a standard yet, but it is always a candidate for a browser extension on a customized platform. An unprecedented number of browser platforms have appeared in recent years, especially on mobile and embedded devices, with more to come. It's time to make your own browser!

Intro

In an HTML browser the DOM primarily serves HTML rendering goals plus a little additional functionality. Initially HTML was a plain text rendering engine, and as new requirements appeared from actual web use, new functionality was added. Due to the dogmatic perception of HTML as the base, additions were taped on as extensions, each with its own language and functional presentation.

The misconception of tearing a web application into 3 independent parts (HTML DOM, CSS and JS) created an enormous gap, both in other dimensions (modularization alone spans all 3 tiers, plus security, authentication/signing, packaging, etc.) and in the ability to create browser engines with near-optimal performance.

Behind the 3 tiers of a web page

Besides the UI, the first thing that appeared to be in demand was the ability to refresh the page in order to keep content relevant. It was also used for various other reasons, like session timeout notification. That is when HTML started to accept non-UI stuff related to behavior. This exact case was covered by the META refresh tag. This first attempt came before 3-tiered HTML was idolized, and it had a declarative presentation. Special non-UI tags were quite convenient for plugging extra functionality into HTML. But all of them were encapsulated from each other and had almost nothing in common, from the logic to the language presentation. OBJECT, EMBED, APPLET, SCRIPT and STYLE have so little in common from all sides that integrating them with the browser in a common and convenient way is not possible. Each one presents a self-contained tier that can be tuned only as an isolated entity, without the ability to optimize the web app as a whole. Plugins easily recognized this vacuum and, to cover the gap, took over the whole web app. Powerful plugins like Flash, Java applets and Silverlight encapsulated the UI, the styling, the functionality and all other means necessary for a web app. And many web sites and devices were redesigned completely to use a technology more mature than HTML itself.

Substituting the browser with a plugin was a strong move, but no plugin had enough guts to replace the browser and HTML entirely, probably due to proprietary nature or complexity.

There was no common standard for treating the DOM behind HTML. That changed a bit when XHTML was introduced: along with the HTML namespace, other functionality could now be set on the DOM as an application API (DOM) model. That created a standard base for extending the web browser, but it did not change the 3-tiered pattern frozen into the web developer's mind: HTML/JS/CSS.

HTML

This set of tags presents the base structure of a web application in current Web 2.0 apps. It is produced on the back end by a web server framework or by a build for cloud distribution, and occasionally it is just simple HTML.

The bodies of reused components (reused modules | widgets | gadgets | web controls) are often prerendered (embedded) into the HTML tag set.
The page is also a kind of component in its own right.

Performance impact of the absence of a component concept:

  • Takes extra network bandwidth: the text is longer.
  • Inability to use discrete caching, although a component is a natural unit for separate caching.
  • Inability to use precompilation. A binary compiled template loads and runs faster. Compilation could be done on the client side or on the server side.
  • Increases parsing time/resources
  • Increases rendering time/resources

Development/maintenance: the page context and those components blend all together, creating chaos in

  • Naming conventions on all tiers.
    CSS selectors meant to work only with a dedicated component need to be aware of the whole page and of other components.
    JS operating on the DOM needs to know how to separate its own belongings from the rest of the app.
  • Collision of security restrictions, especially for controls embedded one into another. Editable/selectable are the simplest cases.
  • Mixing of errors and namespaces. A malformed tag (like <div/> ) will break not just its own control but the whole page.
Solution: use HTML templates, preferably with scope insulation capabilities. The only good solution is XSLT.
A surrogate for it would be an embedded or XHR-ed HTML template.
A surrogate for precompilation is an invisible but rendered page with all CSS applied, for example hidden behind style="visibility:hidden;width:0;height:0;position:fixed"
The real improvement would be rendering the template directly into the HTML DOM tree.
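
The caching and precompilation points above can be sketched outside the browser. This is only an illustration in Python, not XSLT and not any browser API: the registry TEMPLATE_SOURCES and the helpers get_template and render_into are hypothetical names, and "precompiled" here just means "parsed once and kept as a tree".

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical component templates; in a real package these would be separate files.
TEMPLATE_SOURCES = {
    "user-badge": '<div class="badge"><span class="name"/></div>',
}

_compiled = {}    # name -> parsed template tree (stands in for the compiled form)
parse_count = 0   # counts how often template text had to be parsed


def get_template(name):
    """Parse a component template once and reuse the parsed tree afterwards."""
    global parse_count
    if name not in _compiled:
        parse_count += 1
        _compiled[name] = ET.fromstring(TEMPLATE_SOURCES[name])
    return _compiled[name]


def render_into(parent, name, data):
    """Clone the cached template and attach it directly to the target tree."""
    node = copy.deepcopy(get_template(name))
    for slot in node.iter():
        key = slot.get("class")
        if key in data:
            slot.text = data[key]  # fill the slot named by its class
    parent.append(node)
    return node


page = ET.Element("body")
render_into(page, "user-badge", {"name": "Alice"})
render_into(page, "user-badge", {"name": "Bob"})
```

The template text is parsed only on first use; every later render clones the cached tree and appends it to the page tree, with no intermediate string.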

JavaScript

The holy grail of hackers and web developers. It feeds JS developers well, since nobody else can deal with it.

There is a category of simple to medium complexity tasks which could, and need to, be taken away from JS.
Defining:

  • popular events: drag, hover, mouse enter/leave
  • timer/interval
  • custom events
  • data retrieval (like in FORM): initiated, in progress, completed, interrupted, paused, error

Actions for event handlers:

  • animation: change a parameter/attribute or a referenced node by some formula, with the ability to use the existing DOM and paths relative to the component
  • data retrieval: get/post/etc., pause, resume (even after stop), stop
  • component insert, render, remove, pause, stop, resume, restart (with updated parameters)
  • timers, recurrent operations
  • component (and top page, i.e. Browser) actions: set url/params, back, forward, add to favorites, preserve locally (a la offline). Some parameters are not part of the component state now, encoding for example. An artificial parameter set could serve the same goal as a URL with hashes. The component shall define which state is subject to preserving in the navigation stack.

Current ways of connecting event handlers with DOM nodes:

  • embedded into HTML tags as attributes.
    CONS: mixes structure (DOM) with the functional tier (JS), hard to read and maintain.
        Absence of JS validation.
    PRO: HTML validation
  • attaching in JS via addEventListener or by setting a node attribute.
    CONS: absence of structure matching (DOM) validation;
        DOM node lifecycle synchronization is manual and difficult to maintain. Leads to memory leaks and calls to dead code.
    PRO: JS compilation validation, if it really matters for a non-strict language :)
  • Declarative tags
    • SCRIPT with the for attribute (IE only)
    • META refresh
    • SVG/SMIL animation
      PRO: no need for memory management; uses native implementation; no need for JS validation.
      CONS: undeveloped control and feature set.
  • CSS selectors + attached JS.
    Unfortunately IE only:
    • width:expression(document.body.clientWidth > 950 ? "950px": "100%" );
    • behavior:url(behave_typing.htc); - HTC for CSS
      PRO: no worries about lifecycle
      CONS: no JS validation

    This is simulated in JS frameworks, e.g. jQuery.live().

Requirements

  • Event handlers shall be attached to the matching nodes during the rendering process.
  • JS code shall be replaced with code in a strict language that operates on natively accessible entities (DOM nodes).
  • The strict language shall match one the browser already supports: XML with DTD.
  • The language shall be compilable into native code.
  • Selectors and template rules shall match the UI rendering engine: use XSLT.

Solution: use XSLT for rendering and for blending the UI with event handlers. That way the UI lifecycle matches that of the event handlers.
Replace JS with XML-driven rules. Validation is done on the XML level. For legacy browsers a JS implementation needs to be created.
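
To make "XML-driven rules" more concrete, here is a rough sketch in Python. The rule vocabulary (an on element with match, event and action attributes) is invented for this illustration and is not SMIL or any existing standard; matching uses the limited XPath subset of Python's ElementTree instead of a real XSLT engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule vocabulary: <on> binds an event to a named action for
# every node matched by a limited XPath expression.
RULES_XML = """
<rules>
  <on match=".//button[@class='close']" event="click" action="remove-component"/>
  <on match=".//img" event="mouseenter" action="fade-in"/>
</rules>
"""

COMPONENT_XML = """
<div>
  <img src="logo.png"/>
  <button class="close">x</button>
  <button class="ok">OK</button>
</div>
"""

KNOWN_EVENTS = {"click", "mouseenter", "mouseleave", "timer"}


def bind_rules(component, rules):
    """Return (node, event, action) triples for every node a rule matches."""
    bindings = []
    for rule in rules.findall("on"):
        event = rule.get("event")
        if event not in KNOWN_EVENTS:
            # Validation happens on the rule set, before anything runs.
            raise ValueError(f"unknown event: {event}")
        for node in component.findall(rule.get("match")):
            bindings.append((node, event, rule.get("action")))
    return bindings


component = ET.fromstring(COMPONENT_XML)
bindings = bind_rules(component, ET.fromstring(RULES_XML))
summary = sorted((n.tag, n.get("class", ""), e, a) for n, e, a in bindings)
```

Since the bindings are produced in the same pass that walks the component tree, a handler exists exactly as long as its node does, and a misspelled event name is rejected when the rules are loaded rather than at click time.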

CSS

A paradise for another kind of hackers (AKA web designers). The instant holy war over table-less vs. fluid layouts resulted in fixed pixel layouts on most web pages. Those few who managed to do it right never agreed with their opponents. I have always been curious: were those W3C standards created to serve social needs rather than the needs of web pages?

I could not imagine a language less modular and less optimization-friendly than CSS. If somebody needs to create a mess in web page code, there is your tool!
The tricks and usage guidelines around it somehow help to manage that monster. But in reality the majority of the web works on trust and "approximately acceptable" quality.
It has never been about 100% compatibility, even on CSS tests on their own. Refer to the Acid tests and their support across browsers.
Now add the complexity of a Web 2.0 app, with hundreds of developers sneaking onto your page via popular/open source frameworks.
HTML5/CSS3 will not be the cure there. The use and implementation patterns of this standard need to be redefined. CSS on its own does not carry anything useful except the HTML rendering parameters, targeting of rendering media, etc. In other words, the semantics make sense to a certain level, the syntax is trash.

Requirements for presentation layer definition.

  1. Modularity.
  2. Scope.
  3. Inheritance.
  4. Rule set language (CSS vs. XSLT+XPath)

Unfortunately none of the listed requirements is available in CSS. On the other hand, having them in XSLT costs nothing. And switching the theme/skin becomes possible not just over a hardcoded DOM UI structure, as with CSS, but with the DOM UI structure modified as well. The device-specific UI in that vision is just a set of XSLT rules applied to the same primary app.

All together

Optimizing the presentation layer by keeping only the rules actually used, and bundling only the resources used, matched to only the defined languages and/or themes, is a straightforward procedure in XML/XSLT, unlike in ANY currently existing HTML web framework.
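
As a rough illustration of "keeping only really used rules", the sketch below prunes a rule set against a known page structure at packaging time. It is written in Python with a deliberately tiny, made-up rule format (only tag and .class selectors); RULES, used_selectors and prune are hypothetical names, not part of any framework.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule format: selector -> declarations.
# Only "tag" and ".class" selectors are handled in this sketch.
RULES = {
    "p": {"margin": "0"},
    ".warning": {"color": "red"},
    ".legacy-banner": {"display": "none"},
    "table": {"border": "1px solid"},
}

PAGE_XML = '<body><p class="warning">Disk full</p><p>Retry later</p></body>'


def used_selectors(root):
    """Collect every tag and class selector that can match the given tree."""
    seen = set()
    for node in root.iter():
        seen.add(node.tag)
        for cls in node.get("class", "").split():
            seen.add("." + cls)
    return seen


def prune(rules, root):
    """Keep only the rules whose selector matches something in the tree."""
    seen = used_selectors(root)
    return {sel: decl for sel, decl in rules.items() if sel in seen}


bundle = prune(RULES, ET.fromstring(PAGE_XML))
```

Here the .legacy-banner and table rules never match the page, so they are left out of the bundle. This only works because the structure is known up front, which is exactly the assumption made for packaged applications.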

Caching and precompilation of XML and XSLT make it possible to run native code instead of parsing and interpreting the HTML page every time it is loaded and run.

Browser enhancement

I have been thinking about the ability to utilize WebKit (the Google Chrome engine) in embedded environments. WebKit happens to be the most active open source browser engine and will be the best candidate for embedding. It is already the base for several embedded browsers.
Embedding of WebKit has already been done in a few places, including ChromeFrame, a WebKit engine inside of Internet Explorer. Support for XML and XSL in WebKit is poor, but manageable. And because XML/XSLT technologies are well developed, the improvement does not require R&D and will be limited to integration. There is a broad selection of commercial and free open source products.

The speed and optimization of in-browser applications are limited by the W3C logic of HTML. That pattern has been altered a bit in embedded browsers like Ant Galio to push performance. Ant Galio also covers the ability to run multiple (sub)applications simultaneously and the insulation of HTML sub-applications. Most embedded platforms provide packaging and deployment support.

That needs to be covered during the WebKit port. Android OS also covers some subset of the needed functionality (?).

It appears that the half-way solutions enumerated above have not resolved the primary bottlenecks of HTML design patterns.
I will suggest more efficient and radical improvements to the embedded browser and to the way HTML applications are used.
  1. DOM operations by native compiled code, or at least without an interpreter (JS). The HTML DOM is currently rendered by either the HTML parser (slow in Galio) or JS (slow in comparison with native code). The proposal is to replace those methods with a strict rule engine. XSL is perfectly suitable for initial rendering. It is compilable into native code and supported by WebKit and other rendering engines. It needs an extension to be applicable to run-time DOM changes. At the moment you could render XML and use domDocument.clone() to pass the result back into HTML. This could be optimized by rendering directly into the HTML document.
  2. CSS engine. Unsurprisingly, there is no native code compilation for CSS rules. Current engines treat CSS as an independent, out-of-context rule set due to the unpredictability of the DOM structure. Once the DOM structure or the DOM creation rules are fixed during packaging, nothing prevents generating native code for applying CSS to this fixed DOM structure.
    What about a dynamic DOM? If the rules for DOM changes are known, more complex but still native code could be created.
    This is doable even in current engines by converting CSS into XSL.
  3. Event handling. The current problem is that no platform optimization can be done: JS does not allow replacing and optimizing the code sequence and methods. Some engines support a built-in rule set for event handlers without JS, expressed as a special XML sequence, like animation on mouse over or on a timer. See the clock sample in Chrome: no JS, it uses SMIL instead. Replacing JS with native compiled code and removing the need for dynamic event handlers could be done in the same way as SMIL.
  4. Reusable modules/widgets are not widely utilized at the moment. The only good solution is HTC (HTML Components by Microsoft). All others are available either on the server side or on the JS library level. None provides proper packaging with signed code and embedded resources (HTML, CSS, JS, images, etc.) for reusable HTML components. The closest match would be a Flash module or a Java applet. Having a module/widget defined in natively compiled code is the ideal solution. It does not exist yet. The logic is trivial, and we could alter WebKit to accept reusable web component resources from bundles. The native compiled code could be produced from XSL (for HTML, JS and CSS). It will definitely stay backwards compatible, usable in the old-fashioned way.
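
The CSS point in item 2 can be sketched as follows: once the DOM structure is fixed at packaging time, selector matching can be done once and the result shipped as a plain table. This is a Python illustration under strong simplifications (only tag and .class selectors, "later rule wins" instead of real CSS specificity); compile_styles and apply_compiled are hypothetical names, and no real engine is claimed to work this way.

```python
import xml.etree.ElementTree as ET

# (selector, declarations) in cascade order; later rules win in this sketch.
RULES = [
    ("p", {"color": "black", "margin": "0"}),
    (".warning", {"color": "red"}),
]

PAGE_XML = '<body><p class="warning">a</p><p>b</p></body>'


def matches(node, selector):
    """Match a node against a 'tag' or '.class' selector."""
    if selector.startswith("."):
        return selector[1:] in node.get("class", "").split()
    return node.tag == selector


def compile_styles(root, rules):
    """Packaging step: resolve all rules once, one style dict per node
    in document order."""
    table = []
    for node in root.iter():
        style = {}
        for selector, declarations in rules:
            if matches(node, selector):
                style.update(declarations)
        table.append(style)
    return table


def apply_compiled(root, table):
    """Run-time step: no selector matching, only a positional lookup."""
    for node, style in zip(root.iter(), table):
        if style:
            node.set("style", ";".join(f"{k}:{v}" for k, v in style.items()))


table = compile_styles(ET.fromstring(PAGE_XML), RULES)  # done when packaging
page = ET.fromstring(PAGE_XML)                          # done on every load
apply_compiled(page, table)
```

The positional table is valid only while the structure stays exactly as packaged; a dynamic DOM would need the change rules to be known as well, as noted in item 2.
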
The defined steps deal with all 3 basic components of an HTML document: structure (HTML), presentation (CSS, images, etc.) and behavior (JS). But this paradigm is already outdated, and multiple frameworks are resolving the bottlenecks. The list of reviews and proposed solutions is quite big and I will be glad to follow it, since it matches my vision of the web application declared at xmlaspect.org
The proposal could look too radical (and in some ways it is), but it is 99% based on existing standards and solutions. And the remaining percent of own development is definitely worth it for the impact on application performance and on the efficiency of the development process as well (that aspect I will cover some time later). The effort will pay back more if a compatibility layer for other browsers is published. That way many bug fixes and necessary tools like profilers will become available, and eventually the new technology will turn from proprietary into a standard. It is nice to have a standard-compliant platform and apps even before the standard is accepted. Not to mention branding the standard with your own name.

2010/04/18

©2010 Sasha Firsov
