The Linux Foundation



Revision as of 18:45, 28 April 2008 by Oedipus (Talk | contribs)



Minutes of the Open A11y Expert Handlers SIG Call 2008/04/21


  • Neil Soiffer (NS/chair)
  • Vladimir Bulatov (VB)
  • Pete Brunet (PB)
  • Gregory J. Rosmaita (GJR, scribe)
  • Janina Sajka (JS)
  • regrets: none logged

Agenda Review

Approval of Last Meeting's Minutes

Minutes of Expert Handlers Conference Call 2008/04/21

Note: These minutes have not yet been finalized.

TOPIC: Editing Scenarios / Changes to DOM

PB: have to consider integration between browser, handler and AT

PB: browser knows nothing about content, a plug-in could be used;

PB: plug-in would have built-in capabilities; respond to plug-in specific keystrokes

PB: also need relation between AT and an AT plug-in to convert specialized content to understandable text, speech, braille

NS: step back a moment - forget Expert Handler for now, what happens today when someone inputs text -- something updates the DOM in some way

PB: browser environment?

NS: sure

PB: text widget - when active and focused, works like most word processing programs: LeftArrow and RightArrow move character by character; UpArrow and DownArrow move to the adjacent line; Control+LeftArrow and Control+RightArrow move from word to word

NS: how does it know

PB: AT watches keystrokes;

GJR: Control+UpArrow and Control+DownArrow, moving up/down by paragraph, is an AT-supplied function, i.e. not typically an application-based keyboard command.

NS: receives keystroke event and checks DOM as to where it is

PB: either the DOM or in the case of IA2 can access caret location in the current object via IAccessibleText methods.

PB: Microsoft has done a lot of documentation on standard keystrokes

NS: need to know more about mac/apple

PB: listing of keystrokes needed

GJR: try checking AOL Developers' DHTML Style Guide

PB: keystroke set that doesn't interfere

JS: read the word moved to with Control+RightArrow - read the character with RightArrow
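A minimal sketch of the behaviour just described, assuming a plain text buffer with an integer caret offset. The function names `next_char` and `next_word` are hypothetical and stand in for whatever the application or AT actually does on RightArrow and Control+RightArrow.

```python
from __future__ import annotations

import re


def next_char(text: str, caret: int) -> tuple[int, str]:
    """RightArrow: move one character right and report the character at the caret."""
    caret = min(caret + 1, len(text))
    spoken = text[caret] if caret < len(text) else ""
    return caret, spoken


def next_word(text: str, caret: int) -> tuple[int, str]:
    """Control+RightArrow: move to the start of the next word and report that word."""
    # Skip the rest of the current word, then the whitespace that follows it.
    caret = re.compile(r"\S*\s*").match(text, caret).end()
    # The word to read is the run of non-space characters starting at the caret.
    word = re.compile(r"\S*").match(text, caret).group()
    return caret, word


if __name__ == "__main__":
    line = "one half plus x"
    print(next_char(line, 0))
    print(next_word(line, 0))
```

The point of returning both values is that the caret move and the text to read are decided together, which is the pairing the discussion goes on to assign to either the AT or the expert handler.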

NS: when moved by word, would read whole word - placed thing to read on MSAA interface; expert put out "here is what i think you should read after you do this"

PB: could do that

JS: where was, move focus, interpret?

NS: expert handler controls movement through key structure

NS: made expert handler decide what to speak

NS: example: land on fraction, state a over b rather than "fraction"; expert handler took control over output (what user hears)
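The fraction example can be sketched as follows. This is only an illustration, assuming the content is Presentation MathML and that the handler returns a string for the AT to speak; the `speak` function and the small operator table are hypothetical, not an existing expert handler API.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical spoken forms for a couple of operators.
OPERATOR_WORDS = {"+": "plus", "-": "minus"}


def local_name(element):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return element.tag.rsplit("}", 1)[-1]


def speak(element):
    """Return the text an expert handler might hand to the AT for this element."""
    name = local_name(element)
    children = list(element)
    if name == "mfrac" and len(children) == 2:
        # Say "a over b" rather than just announcing "fraction".
        return f"{speak(children[0])} over {speak(children[1])}"
    if name == "mo":
        symbol = (element.text or "").strip()
        return OPERATOR_WORDS.get(symbol, symbol)
    if name in ("mi", "mn"):
        return (element.text or "").strip()
    # math, mrow and anything unrecognised: speak the children in order.
    return " ".join(part for part in (speak(child) for child in children) if part)


if __name__ == "__main__":
    markup = f'<math xmlns="{MATHML_NS}"><mfrac><mi>a</mi><mi>b</mi></mfrac></math>'
    print(speak(ET.fromstring(markup)))
```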

NS: for text editor, is it the AT program that follows navigation through document tree

PB: AT knows due to standardization, this keystroke means to do x and AT will interpret what to speak when x encountered

PB: Expert Handler should be able to talk to AT by firing an event

NS: measure as opposed to note -- how would AT realize that? Expert Handler would determine "here is what to read"

PB: need to talk with AT vendors

NS: 2 separate calls

PB: flag as "speech" or "braille" better -- request whichever, so is ready for when user ready

NS: AT sees a key press, knows it needs to communicate -- is there an event that fires that says the POR (point of regard) has changed

GJR: not in MSAA, but comes with IA2

PB: built-in application mechanism

NS: no MathML but an image - MathML or TeX embedded in the image or buried in a parameter of an applet; so a math role says "this image really is math", and an application can go through and try to discover the embedded TeX or MathML - can convert to MathML - then gain better accessibility for MathML; two options: 1) screen readers read alt text (braille conventions marked by ARIA describedby?), JavaScript modifies the DOM and puts things there; 2) leave the DOM alone, define a standardized way for the AT to get information and communicate with it and the user

NS: first static approach, second is more dynamic - needed for synchronized rendering, braille, etc.

PB: best way would be to let author put up applet and use alt for marked up string; AT needs converter to interpret string; determine right sequence for text or braille; then deliver to TTS or braille display

NS: need standard interface for AT to discover EH (Expert Handler); EH has to ask for speech -- more complicated but more powerful

PB: 2 types of EH -- 1) AT EH - makes MathML understandable; and 2) UA EH - handles editing capability

JS: could be plug in for UA, OpenOffice or stand-alone

NS: text is already supported - what exists today: have form field; still needs to support text along with math; exist in wiki (TeX conversion to MathML) - plugins for wikis

NS: editing is usually either WYSIWYG or textual/linear

JS: pen-based or tablet-based computing model?

NS: Microsoft has a tablet-based math converter that works really well; not well known

NS: another product, MathTalk, where one speaks (somewhat stylized speech -- alpha plus bravo for a plus b) -- speak and math shows up; WYSIWYG editors, and text editors

NS: speech and handwriting probably only 1 percent of users;

JS: but have been applied to produce MathML

NS: speech turns into TeX which can then be converted to MathML

NS: WYSIWYG: MathType, Word's Math editor, and TeX dominant

NS: "TeX-lite" interface on wikis -- type between $ (dollar signs) and will be rendered by wiki as MathML; when edit, goes back to TeX
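A rough sketch of the dollar-sign convention, assuming single `$` delimiters with no escaping. It only separates TeX spans from ordinary text and puts them back together for editing; the actual TeX to MathML conversion a wiki plug-in performs is not attempted here, and both function names are hypothetical.

```python
import re

# A TeX span is anything between two dollar signs.
MATH_SPAN = re.compile(r"\$([^$]+)\$")


def split_wiki_text(source):
    """Split wiki source into ("text", ...) and ("math", ...) segments, in order."""
    segments = []
    position = 0
    for match in MATH_SPAN.finditer(source):
        if match.start() > position:
            segments.append(("text", source[position:match.start()]))
        segments.append(("math", match.group(1)))
        position = match.end()
    if position < len(source):
        segments.append(("text", source[position:]))
    return segments


def join_wiki_text(segments):
    """Rebuild the editable source: math segments go back between dollar signs."""
    return "".join(body if kind == "text" else f"${body}$" for kind, body in segments)


if __name__ == "__main__":
    source = r"The ratio $\frac{a}{b}$ is small."
    print(split_wiki_text(source))
    print(join_wiki_text(split_wiki_text(source)) == source)
```

Keeping the original TeX in each math segment is what lets the page return to TeX when it is edited.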

JS: limited list of what things can be

NS: TeX-like syntax or MathML

PB: EH plug-in to browser; is there concept of objects as move through structure, or one big textual object?

PB: when caret moves

NS: text

NS: objects inter-related - might move into child or parent - ask where am i in a fraction - how to get info?

PB: An example of structure is a listitem in a list. The listitem has text on the listitem acc name.

NS: that will be flat, but contained in structure

PB: listitem has accessiblename from which you can get text; can query list to find out how many children it has

NS: need something more than "there are 4 children" -- how to make sense in a musical context
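One way to picture an answer to "where am I in a fraction" is to name each child by its role in its parent rather than by its index. The sketch below assumes un-namespaced Presentation MathML for brevity; the role table and `describe_position` are hypothetical, and a real handler would need an equivalent table per markup language.

```python
import xml.etree.ElementTree as ET

# Hypothetical role names for the children of a few MathML elements.
CHILD_ROLES = {
    "mfrac": ["numerator", "denominator"],
    "msup": ["base", "superscript"],
    "msub": ["base", "subscript"],
    "mroot": ["base", "index"],
}


def describe_position(root, target):
    """List the roles target plays in each enclosing structure, innermost first."""
    # ElementTree has no parent pointers, so build a child -> parent map first.
    parents = {child: parent for parent in root.iter() for child in parent}
    path = []
    node = target
    while node in parents:
        parent = parents[node]
        index = list(parent).index(node)
        roles = CHILD_ROLES.get(parent.tag)
        if roles and index < len(roles):
            path.append(f"{roles[index]} of {parent.tag}")
        node = parent
    return path


if __name__ == "__main__":
    root = ET.fromstring(
        "<math><mfrac><msup><mi>x</mi><mn>2</mn></msup><mi>y</mi></mfrac></math>"
    )
    print(describe_position(root, root.find(".//mn")))
```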

NS: structured markup is what EH is

PB: structure then exposed as a11y call to AT

NS: each markup language has own set of names

JS: if focus on one - limited list of what things can be -- those elements grouped together make a larger document structure; bindings definable and queriable; list of terms that incorporate children

NS: MathML DTD or Schema describing structure

GJR: XML Events 2 "implements" attribute for this

PB: event can be fired from numerator, or one of terms of numerator -- as each expression subobject gets control -- depending on how the browser EH plug-in is currently navigating the expression -- under control of the user via the EH plug-in UI -- thus user has control over speech output

NS: complex expressions - may need summary -- not sure how to deal with that

NS: can easily imagine chemical structural language

JS: speak whole thing by default, let user stop and localize by subparts/piece by piece

JS: command that localizes level of navigation

JS: expose models first

PB: big picture: two EH plug-ins, one for the browser, one for the AT. The browser EH provides UI to navigate through the special content; the AT EH knows how to transform specialized markup into speech or braille; browser EH provides keyboard interface for objects/elements - the granularity can be controlled by the user - the typical navigation requests would be previous, current, and next item at the currently specified level of granularity - when objects in the browser EH get focus they fire focus events; the AT asks the object for its acc name, which would be expressed in a standardized markup language for a particular discipline (math, music, chem), and then the AT would call the AT EH for a transformation of a specific kind (Braille, TTS) and then the AT would output the TTS/Braille to the TTS or Braille engine.
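The flow in this summary can be sketched end to end. Everything here is an assumption made for illustration: the three class names, the use of a callback in place of a real focus event, tree depth as the measure of granularity, and the markup string standing in for the acc name. Only a speech transformation is sketched; braille is left unimplemented.

```python
import xml.etree.ElementTree as ET


class AtExpertHandler:
    """AT-side EH: turns discipline markup into text for the speech engine."""

    def transform(self, markup, kind):
        if kind != "speech":
            # Braille translation is out of scope for this sketch.
            raise NotImplementedError(kind)
        return self._speak(ET.fromstring(markup))

    def _speak(self, element):
        children = list(element)
        if element.tag == "mfrac" and len(children) == 2:
            return f"{self._speak(children[0])} over {self._speak(children[1])}"
        if not children:
            return (element.text or "").strip()
        return " ".join(self._speak(child) for child in children)


class AssistiveTechnology:
    """Receives focus events, asks the AT EH for a transformation, queues output."""

    def __init__(self, expert_handler, kind="speech"):
        self.expert_handler = expert_handler
        self.kind = kind
        self.output = []  # stands in for the TTS or braille engine

    def handle_focus(self, acc_name):
        self.output.append(self.expert_handler.transform(acc_name, self.kind))


class BrowserExpertHandler:
    """Browser-side EH: previous/current/next at a user-chosen granularity."""

    def __init__(self, markup, on_focus):
        self.root = ET.fromstring(markup)
        self.on_focus = on_focus
        self.set_granularity(0)

    def set_granularity(self, depth):
        level = [self.root]
        for _ in range(depth):
            level = [child for node in level for child in node]
        if not level:
            raise ValueError("no items at that depth")
        self.items, self.index = level, 0

    def current(self):
        # Focus event: expose the focused item's markup as its acc name.
        self.on_focus(ET.tostring(self.items[self.index], encoding="unicode"))

    def next(self):
        self.index = min(self.index + 1, len(self.items) - 1)
        self.current()

    def previous(self):
        self.index = max(self.index - 1, 0)
        self.current()


if __name__ == "__main__":
    at = AssistiveTechnology(AtExpertHandler())
    browser = BrowserExpertHandler(
        "<math><mfrac><mi>a</mi><mi>b</mi></mfrac><mo>+</mo><mn>1</mn></math>",
        at.handle_focus,
    )
    browser.current()
    browser.set_granularity(1)
    browser.current()
    browser.next()
    print(at.output)
```

The browser EH never produces speech and the AT never parses markup itself, which keeps the two plug-ins separable in the way the summary describes.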

GJR: check to see if what is needed is addressed sufficiently in the XML Events 2 "implements" attribute discussions and the new Element Traversal Specification (W3C Working Draft)

Excerpt From Element Traversal Specification

This specification defines the ElementTraversal interface, which allows script navigation of the elements of a DOM tree, excluding all other nodes in the DOM, such as text nodes. It also provides an attribute to expose the number of child elements of an element. It is intended to provide a more convenient alternative to existing DOM navigation interfaces, with a low implementation footprint.

1. Introduction
This section is informative.
The DOM Level 1 Node interface defines 11 node types, but most commonly authors wish to operate solely on nodeType 1, the Element node. Other node types include the Document element and Text nodes, which include whitespace and line breaks. DOM 1 node traversal includes all of these node types, which is often a source of confusion for authors and which requires an extra step for authors to confirm that the expected Element node interfaces are available. This introduces an additional performance constraint.
ElementTraversal is an interface which allows the author to restrict navigation to Element nodes. It permits navigation from an element to its first element child, its last element child, and to its next or previous element siblings. Because the implementation exposes only the element nodes, the memory and computational footprint of the DOM representation can be optimized for constrained devices.
The DOM Level 1 Node interface also defines the childNodes attribute, which is a live list of all child nodes of the node; the childNodes list has a length attribute to expose the total number of child nodes of all nodeTypes, useful for preprocessing operations and calculations before, or instead of, looping through the child nodes. The ElementTraversal interface has a similar attribute, childElementCount, that reports only the number of Element nodes, which is often what is desired for such operations.

2. ElementTraversal interface
This section is normative.
The ElementTraversal interface is a set of attributes which allow an author to easily navigate between elements in a document. In conforming implementations of Element Traversal, all objects that implement Element must also implement Element Traversal. Four of the attributes, firstElementChild, lastElementChild, previousElementSibling, and nextElementSibling, each provide a live reference to another element with the defined relationship to the current element, if the related element exists. The fifth attribute, childElementCount, exposes the number of child elements of an element, for preprocessing before navigation. A conforming User Agent must implement all five attributes. A User Agent may implement similar interfaces in other specifications, but such implementation is not required for conformance to this specification, if the User Agent is designed for a minimal code footprint.
This interface must be implemented on all elements, regardless of their namespace. For the purpose of Element Traversal, an entity reference node which represents an element must be treated as an element node. Navigation must be irrespective of namespace, e.g. if an element in the HTML namespace is followed by element in the SVG namespace, the nextElementSibling attribute on the HTML element will return the SVG element.
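To make the five attributes concrete, the sketch below emulates them over Python's `xml.dom.minidom`, which exposes only the DOM Level 1 style node navigation. The helper function names are hypothetical Python equivalents of the attributes named in the excerpt, not part of the specification or of minidom.

```python
from xml.dom import Node, minidom


def _scan(node, step):
    """Follow `step` (a sibling attribute name) until an Element node or None."""
    while node is not None and node.nodeType != Node.ELEMENT_NODE:
        node = getattr(node, step)
    return node


def first_element_child(element):
    return _scan(element.firstChild, "nextSibling")


def last_element_child(element):
    return _scan(element.lastChild, "previousSibling")


def next_element_sibling(element):
    return _scan(element.nextSibling, "nextSibling")


def previous_element_sibling(element):
    return _scan(element.previousSibling, "previousSibling")


def child_element_count(element):
    return sum(1 for child in element.childNodes if child.nodeType == Node.ELEMENT_NODE)


if __name__ == "__main__":
    document = minidom.parseString(
        '<body xmlns:svg="http://www.w3.org/2000/svg">\n'
        "  <p>text</p>\n"
        "  <!-- comment -->\n"
        "  <svg:svg/>\n"
        "</body>"
    )
    body = document.documentElement
    # childNodes counts whitespace text and the comment; the element count does not.
    print(len(body.childNodes), child_element_count(body))
    paragraph = first_element_child(body)
    # Navigation ignores namespaces: the sibling of <p> is the SVG element.
    print(paragraph.tagName, next_element_sibling(paragraph).tagName)
```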

Summary of Action Items Assigned at the 2008/04/21 Handlers Call

ACTION: Neil contact Aaron Leventhal and Alexander Surkov about Expert Handlers expectations/desires from an implementor's viewpoint; invite to next telecon (2008/04/28)
