September 2017: The DTD is written with DtdEdd and html401 (Dec. 2018 with WHHT). The names in the markup are therefore lowercase and those of the declarations uppercase (cf. HTML1). Because the RWR of html401 are used, all element names that do not occur in the RWR are written in uppercase in the EDD.
July 2017: Renewed with the new DtdEdd, which uses elenam, partentnam, genentnam and attlistnam instead of Tag.
2015: This DTD was written as the internal subset of an SGML document type declaration.
I comment out the document type declaration and the subset (second line and last line), so that the DTD can be used as an external subset.
DTD for HTML+
Markup minimisation should be avoided, otherwise the default <!SGML>
declaration is fine. Browsers should be forgiving of markup errors,
while authoring tools *should* enforce compliance with the DTD.
id This attribute allows authors to name elements such as headers
and paragraphs as potential destinations for links. Note that
links don't specify points, but rather extended objects.
charset This allows authors to switch to a different character set for
quotations, lists, etc. This is particularly useful for oriental
languages which need two-byte character codes, e.g. see RFC 1468,
"Japanese Character Encoding for Internet Messages".
<!ENTITY % foo "X | Y | Z"> is a macro definition for parameters and in
subsequent statements, the string "%foo;" is expanded to "X | Y | Z"
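The macro expansion described above can be sketched in a few lines of Python. This is only an illustration, not a conforming SGML parser: the helper names `collect_entities` and `expand` are hypothetical, and the regular expressions assume double-quoted replacement text without nested quotes. It does follow the SGML rule that the first declaration of an entity name is the binding one.

```python
import re

# Matches declarations such as: <!ENTITY % foo "X | Y | Z">
ENTITY_DECL = re.compile(r'<!ENTITY\s+%\s+([\w.-]+)\s+"([^"]*)"\s*>')
# Matches references such as: %foo;
ENTITY_REF = re.compile(r'%([\w.-]+);')


def expand(text, entities):
    """Replace every known %name; reference with its replacement text."""
    return ENTITY_REF.sub(lambda m: entities.get(m.group(1), m.group(0)), text)


def collect_entities(dtd_text):
    """Gather parameter entities in document order.

    References inside a replacement text are expanded with the entities
    seen so far; a later redeclaration of the same name is ignored.
    """
    entities = {}
    for name, value in ENTITY_DECL.findall(dtd_text):
        entities.setdefault(name, expand(value, entities))
    return entities


dtd = (
    '<!ENTITY % foo "X | Y | Z">\n'
    '<!ENTITY % bar "A | %foo;">\n'
    '<!ENTITY % foo "ignored">\n'
)
entities = collect_entities(dtd)
expanded = expand("<!ELEMENT list (%bar;)*>", entities)
```

Here `expanded` holds the element declaration with `%bar;` replaced by its fully expanded text.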
Various classes of SGML text types:
CDATA text which doesn't include markup or entity references
RCDATA text with entity references but no markup
PCDATA text occurring in a context in which markup and entity references may occur
%cextra; and %pextra; are designed to allow document-specific
extensions to the HTML+ DTD, e.g.
<!DOCTYPE htmlplus [
<!ENTITY % cextra "| PROPNAME">
<!ELEMENT PROPNAME - - CDATA>
]>
Use the RENDER element to specify how the browser should
display new elements in terms of existing ones, e.g.
<RENDER tag="PROPNAME" style="I">
Browsers should render the following types of emphasis
distinctly when the obvious rendering is impractical
I = italic, B = bold, U = underline, S = strikethru,
TT = teletype font, SUP = superscript, SUB = subscript,
REV = reverse video for highlighting hit areas in the result of a query
Q = inline quote (render according to local conventions)
Where entity names occur more than once because of the marked sections, I give a partentnam only to the first occurrence.
these entities are used to simplify element definitions
standard ISO/WWW icons courtesy of Bert Bos and Kevin Hughes, see
These can be used in place of default symbols for list items or as
part of hypertext links, and save time needed to download images.
Browsers can define them in terms of library images or as URL/URNs.
<!ENTITY ftp SDATA "ftp" -- ftp server -->
<!ENTITY gopher SDATA "gopher" -- gopher server -->
<!ENTITY telnet SDATA "telnet" -- telnet connection -->
<!ENTITY archive SDATA "archive" -- archive server -->
<!ENTITY filing.cabinet SDATA "filing.cabinet" -- filing cabinet -->
<!ENTITY folder SDATA "folder" -- folder or directory -->
<!ENTITY fixed.disk SDATA "fixed.disk" -- fixed media drive -->
<!ENTITY disk.drive SDATA "disk.drive" -- removable media drive -->
<!ENTITY document SDATA "document" -- unspecified document type -->
<!ENTITY unknown.document SDATA "unknown.document" -- unrecognised document type -->
<!ENTITY text.document SDATA "text.document" -- text/plain, text/html etc. -->
<!ENTITY binary.document SDATA "binary.document" -- binary data -->
<!ENTITY binhex.document SDATA "binhex.document" -- binhex format -->
<!ENTITY audio SDATA "audio" -- audio sequence -->
<!ENTITY film SDATA "film" -- film or animation, such as an MPEG movie -->
<!ENTITY image SDATA "image" -- photograph, drawing or graphic of any kind -->
<!ENTITY map SDATA "map" -- geographical or a schematic map -->
<!ENTITY form SDATA "form" -- fill-out form -->
<!ENTITY mail SDATA "mail" -- email messages -->
<!ENTITY parent SDATA "parent" -- parent of current document -->
<!ENTITY next SDATA "next" -- next document in current sequence -->
<!ENTITY previous SDATA "previous" -- previous document in current sequence -->
<!ENTITY home SDATA "home" -- home document -->
<!ENTITY toc SDATA "toc" -- table of contents -->
<!ENTITY glossary SDATA "glossary" -- glossary of terms etc. -->
<!ENTITY index SDATA "index" -- searchable index -->
<!ENTITY summary SDATA "summary" -- summary -->
<!ENTITY calculator SDATA "calculator" -- A calculator -->
<!ENTITY caution SDATA "caution" -- Warning sign -->
<!ENTITY clock SDATA "clock" -- A clock -->
<!ENTITY compressed.document SDATA "compressed.document">
<!ENTITY diskette SDATA "diskette" -- A diskette -->
<!ENTITY display SDATA "display" -- A computer screen -->
<!ENTITY fax SDATA "fax" -- A fax machine -->
<!ENTITY mail.in SDATA "mail.in" -- mail-in tray -->
<!ENTITY mail.out SDATA "mail.out" -- mail-out tray -->
<!ENTITY mouse SDATA "mouse" -- mouse/pointing device -->
<!ENTITY printer SDATA "printer" -- hardcopy device -->
<!ENTITY tn3270 SDATA "tn3270" -- tn3270 terminal session -->
<!ENTITY trash SDATA "trash" -- waste paper basket -->
<!ENTITY uuencoded.document SDATA "uuencoded.document" -- uuencoded data -->
Basic types of elements:
<!ELEMENT tagname - - CONTENT> elements needing end tags
<!ELEMENT tagname - O CONTENT> elements with optional end tags
<!ELEMENT tagname - O EMPTY> elements without content or end tags
Because I have dropped end-tag minimization (and start-tag minimization) by setting the minimization feature to OMITTAG NO in the SGML declaration, the - - and - O are omitted (Lothar Seidel).
The content definition is:
- an entity definition as defined above
- a tagname
- (brackets enclosing the above)
These may be combined with the operators:
A* A occurs zero or more times
A+ A occurs one or more times
A|B implies either A or B
A? A occurs zero or one times
A,B implies first A then B
A&B both A and B, in either order (A B or B A)
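One way to get a feel for these operators is to map each onto a regular expression over the sequence of child element names. The sketch below is an informal aid, not how an SGML parser validates content; the helper `matches` and the table `OPERATORS` are hypothetical names. For the `&` connector it follows ISO 8879, where both operands must occur, in either order.

```python
import re

# Each child element name is written as "NAME " so names cannot run together.
A, B = "A ", "B "

# Hand-written regular expression for each content model operator.
OPERATORS = {
    "A*": f"(?:{A})*",             # zero or more times
    "A+": f"(?:{A})+",             # one or more times
    "A?": f"(?:{A})?",             # zero or one times
    "A|B": f"(?:{A}|{B})",         # either A or B
    "A,B": f"{A}{B}",              # first A, then B
    "A&B": f"(?:{A}{B}|{B}{A})",   # both, in either order
}


def matches(model, children):
    """Return True if the list of child names satisfies the model."""
    text = "".join(child + " " for child in children)
    return re.fullmatch(OPERATORS[model], text) is not None
```

For example, `matches("A&B", ["B", "A"])` holds, while `matches("A,B", ["B", "A"])` does not, because the `,` connector fixes the order.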
Browsers *must* tolerate missing DIVn tags, e.g. the presence of an <H1> tag implies a DIV1 element enclosing it and the following text.
The SGML standard unfortunately doesn't permit such inferences due to a decision made to simplify writing general SGML parsers.
It would be nice to allow an optional prologue and epilogue, before and after the divisions respectively. Unfortunately, the epilogue would lead to an ambiguous content model (according to sgmls).
<-- What was too little in the HTML1 draft is too much in this experimental version. If HTML committed itself to any particular structure in the BODY, it could no longer become what it will become in html4: the document type that is able to take on every conceivable structure. -->
Paragraphs which act as containers for the following text. Browsers *must* be capable of inferring missing <P> start tags from the content model. Basically, if the parser comes across unexpected %text; then there's a missing <P>.
List items for UL and OL lists
By default UL list items have bullets while OL list items are numbered in a style dependent on nesting. For extra impact specify an explicit image with the icon attribute or a string with the label attribute. This attribute can also be used with SGML entities to specify standard icons, e.g. label="&folder;"
Hypertext Links from points within document nodes
The HREF attribute specifies the link destination as a URL or URN. In figures, the SHAPE attribute defines the extent of the link as a polygonal region, and is used with the FIG element.
The PRINT attribute determines how the browser should deal with links when printing this document. This makes it possible for users to print a document and related subdocuments with a single menu action. If PRINT="Section", then the link is followed and printed as a follow-on section after the current document. If PRINT="Footnote" and the linked document is sufficiently small then it is included as a footnote. If PRINT="Reference" then the document's URL (and title) is included in a footnote or in a list of references at the end of the document.
The TITLE attribute may be used for links in which the destination node doesn't define a title itself, e.g. non-html documents.
The REL attribute is used to specify how the browser interprets the link when this document is being used as a hypertext path. REL="Path" causes the linked document to be treated as a path and inserted into the current path, while REL="Node" treats the linked document as a node on the current path. REL="Embed" is a hint to embed the referenced node into the current document.
The SIG attribute allows authors to specify a digital signature of linked documents to check that they haven't been changed. It starts with a prefix denoting the algorithm used, in particular SIG="md5:2l3k4j2lkj423l" denotes the MD5 signature 2l3k4j2lkj423l, which is encoded using the standard MIME base64 representation.
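A browser-side check of such a signature might look like the following Python sketch. It assumes that the signature is the base64 encoding of the raw 16-byte MD5 digest of the linked document's bytes; the text does not spell out exactly what is hashed, so treat that as an assumption. The function names `make_sig` and `check_sig` are hypothetical. A real MD5 digest encodes to 24 base64 characters, so the value shown in the text is best read as a placeholder.

```python
import base64
import hashlib


def make_sig(data: bytes) -> str:
    """Build a SIG value: algorithm prefix plus base64 of the MD5 digest."""
    digest = hashlib.md5(data).digest()
    return "md5:" + base64.b64encode(digest).decode("ascii")


def check_sig(sig: str, data: bytes) -> bool:
    """Return True if the fetched bytes match the signature in SIG."""
    algorithm, _, encoded = sig.partition(":")
    if algorithm.lower() != "md5":
        # Only the md5 prefix is mentioned in the text; others are unknown.
        raise ValueError(f"unsupported signature algorithm: {algorithm!r}")
    return base64.b64decode(encoded) == hashlib.md5(data).digest()
```

A browser would call `check_sig` with the SIG attribute of the link and the bytes it retrieved, and warn the user when the result is false.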
Other kinds of relationships between documents
There is a set of standard RELationship types which alter the browser's navigation menu:
UseIndex searchable index
UseGlossary shared glossary
Contents shared contents page
Previous previous document in a hypertext path
Next next document in a hypertext path
Bookmark named with the title attribute
Made Defines who is the "maker" of this document
Help provides help on this document
Annotation an additional note on current document
Reply a note with equal footing to current document
Subdocument defines parent->child relationship
Parent defines child->parent relationship
StyleSheet an associated style sheet
Bookmarks allow authors to define a set of useful links which are to be accessed via a menu, rather than as conventional in-line hypertext links. Previous and Next links are inserted by the browser when interpreting a separate document as a path. See above description of REL="Node" and REL="Path" for <A>.
The FROM attribute makes it possible to specify annotation links separately from the document text flow. The FROM attribute specifies an ID for the source of a link, while the HREF attribute specifies its destination. HTTP servers can use the WWW-Link: header to "insert" such annotations into documents.
Servers should read the document head to generate HTTP headers corresponding to META elements, e.g. if the document contains:
<meta name="Expires" value="Tue, 04 Dec 1993 21:29:02 GMT">
The server should include the HTTP date format header field:
Expires: Tue, 04 Dec 1993 21:29:02 GMT
Other likely names are "Summary", "Keywords", "Created", "Owner" (a name) and "Reply-To" (an email address)
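The server-side step described above can be sketched with Python's standard `html.parser` module. This is a minimal illustration under stated assumptions: the class `MetaCollector` and the function `meta_to_headers` are hypothetical names, only META elements carrying both a name and a value attribute are considered, and no check is made that the element actually sits in the document head.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect (name, value) pairs from META elements."""

    def __init__(self):
        super().__init__()
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lowercase.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        value = attributes.get("value")
        if name and value is not None:
            self.pairs.append((name, value))


def meta_to_headers(document):
    """Return HTTP header lines derived from the META elements."""
    collector = MetaCollector()
    collector.feed(document)
    collector.close()
    return [f"{name}: {value}" for name, value in collector.pairs]
```

Feeding it the META element from the example yields the single header line `Expires: Tue, 04 Dec 1993 21:29:02 GMT`.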
These elements can be used to construct IAFA style templates which describe the contents of the server. The templates are then used for automatic indexing of servers.
The effectiveness of indexing is dependent on using well defined keywords drawn from standard sets.
The Document-Icon header is used to specify a URL for use as an icon in the hotlist and history list.
A pre-pass is needed to count columns and determine min/max widths before sizing to match the window size.
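The pre-pass can be sketched as follows. The sizing policy is an assumption made for illustration, since the text does not prescribe one: widths are measured in characters, a column's minimum width is its longest word, its maximum width is its longest unwrapped cell, and any spare width is shared in proportion to each column's slack. The function name `column_widths` is hypothetical.

```python
def column_widths(rows, window):
    """Return one width per column for a table given as lists of cell text."""
    ncols = max(len(row) for row in rows)   # pre-pass: count columns
    mins = [0] * ncols
    maxs = [0] * ncols
    for row in rows:                        # pre-pass: min/max per column
        for i, cell in enumerate(row):
            longest_word = max((len(word) for word in cell.split()), default=0)
            mins[i] = max(mins[i], longest_word)
            maxs[i] = max(maxs[i], len(cell))
    if sum(maxs) <= window:
        return maxs                         # everything fits unwrapped
    if sum(mins) >= window:
        return mins                         # cannot shrink any further
    spare = window - sum(mins)
    slack = sum(maxs) - sum(mins)           # positive in this branch
    return [lo + spare * (hi - lo) // slack for lo, hi in zip(mins, maxs)]
```

Because of the integer division, the widths may add up to slightly less than the window, which a renderer could hand to the last column.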
Browsers should tolerate an omission of the first <TR> tag, as it is implied by the context.
The form contents are sent to the server upon pressing a submit button. Forms can be associated with scripts, e.g. to make one selection field affect which options are enabled for other fields. Clicking on a selection or typing into a text field results in events which are processed by the script. Event handlers are associated with each field or with the form itself. The script language is deliberately restricted to avoid any security issues.
Fields can be disabled (greyed out) or marked as being in error.
The MESSAGE element may be used by the server to set error messages.
Servers can store state information in forms with hidden input fields. These are not displayed and can be used to hold transaction handles etc.
Types of INPUT field:
text: one line text fields, size gives visible width of field in chars
where value may grow beyond this up to MAX (MAXLENGTH) chars.
password: like text fields but with no echo of typed characters
checkbox: for simple yes/no choices
radio: for one from many choices, each radio button in a group
has the same NAME but a different VALUE.
submit: Sends form to server. If the SRC attribute specifies an
icon the point clicked is sent to the server as per ISMAP.
The default NAME for this field is "Submit". The VALUE is used
as the button label. The NAME/VALUE are submitted with the form
contents to allow servers to work out which button was pressed.
reset: Resets fields to their initial values. The VALUE is used as the
button label. The SRC attribute can be used for an icon button.
int: for input of integers, SIZE attribute gives width of field
float: for input of floating point numbers
date: for input of dates
url: for input of universal resource locators
hidden: used by server for state info, opaque to client
range: integer range from MIN to MAX, rendered as a slider etc.
scribble: Pen input, which may include time and pressure info, the
background can be initialised to an image with the SRC attribute
audio: sound input with up to MAX seconds
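How a client might assemble the NAME/VALUE pairs for submission can be sketched as below. Several points are assumptions rather than part of the DTD: fields are modelled as plain dictionaries, the pairs are serialised as a URL-encoded query string, and only field types with textual values are covered (scribble and audio would need a binary encoding). The function name `form_pairs` is hypothetical.

```python
from urllib.parse import urlencode

# Field types whose value is sent only when the field is selected.
TOGGLES = {"checkbox", "radio"}


def form_pairs(fields, pressed="Submit"):
    """Collect (NAME, VALUE) pairs in document order for one submission."""
    pairs = []
    for field in fields:
        kind = field["type"]
        if kind == "reset":
            continue  # reset only restores initial values, nothing is sent
        if kind == "submit":
            name = field.get("name", "Submit")  # default NAME per the text
            if name == pressed:
                pairs.append((name, field.get("value", "")))
        elif kind in TOGGLES:
            if field.get("checked"):
                pairs.append((field["name"], field["value"]))
        else:
            pairs.append((field["name"], field.get("value", "")))
    return pairs


fields = [
    {"type": "text", "name": "user", "value": "ann"},
    {"type": "radio", "name": "size", "value": "S"},
    {"type": "radio", "name": "size", "value": "L", "checked": True},
    {"type": "hidden", "name": "txn", "value": "42"},
    {"type": "submit", "value": "Buy"},
    {"type": "reset", "value": "Clear"},
]
query = urlencode(form_pairs(fields))
```

The hidden field travels with the visible ones, and the pressed button contributes its own pair so that the server can tell which button was used.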
Multiline text input fields. We will probably want to generalise this to accept arbitrary clipboard data, e.g. hypertext and images, in addition to plain text.
The EDIT attribute when present allows you to type and edit the selected option.
The SRC attribute allows for graphical menus, e.g. users wanting to buy a house could click on each of the areas on a map that they were interested in.
The SHAPE attribute defines a region in the image that is specified by the SRC attribute for the SELECT element
Scripts executed by the client need a way of displaying warning/error messages. We define an element so that the server too can initialise this one-per-form message area.
Clients should preferably avoid displaying the message in-line, as the window size may prevent the user from seeing the message.
figures which subsume the role of the earlier IMG element.
Behaves identically to IMG for align = top, middle or bottom. Otherwise figure is inserted after next line break (soft or hard). For align=left, the image is left aligned and text is flowed on the right of the image, and similarly for align=right, with no text flow for align=center (the default). The caption is placed under the image.
Finer control of the vertical positioning relative to the text line is possible with the baseline attribute. When present, the figure acts like the IMG element but is shifted so that the baseline occurs at the specified number of pixels above the bottom of the image. If this is given as a floating point number, it is interpreted as a fraction of the image height and must lie in the range 0.0 to 1.0.
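The two readings of the baseline value can be sketched as a small helper. It assumes that a value counts as floating point when it contains a decimal point, which the text does not state explicitly, and it rounds the fractional result to whole pixels; the function name `baseline_offset` is hypothetical.

```python
def baseline_offset(value: str, image_height: int) -> int:
    """Return the baseline position in pixels above the image bottom."""
    if "." in value:
        fraction = float(value)
        if not 0.0 <= fraction <= 1.0:
            raise ValueError("fractional baseline must lie in 0.0 to 1.0")
        return round(fraction * image_height)
    return int(value)  # plain integer: already a pixel count
```

For a 200 pixel high image, `baseline="0.25"` and `baseline="50"` would then place the baseline at the same height.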
The <A> element is used for shaped buttons handled by the browser, while the ISMAP mechanism sends pointer clicks/drags to the server. The text contained by this element is used for text-only displays, and authors should remember to provide effective descriptions, including label text for shaped buttons.
<!-- img is left in for at least the short term -->
Proposal for representing formulae
Delimiters should stretch to match the size of the delimited object. <SUB> and <SUP> are used for subscripts and superscripts
X <SUP>i</SUP>Y<SUP>j</SUP> is X followed by Y with leading superscript i and trailing superscript j
i.e. the space following the X disambiguates the binding.
Invisible brackets which may also be used for numerators and denominators:
<BOX>1 + X<OVER>Y</BOX> is the fraction (1 + X)/Y
<BOX><OVER>X + Y</BOX> is X + Y with a line above it (empty numerator)
Horizontal line between numerator and denominator
The symbol attribute allows authors to supply an entity name for an arrow symbol etc.
LaTeX-like arrays. The align attribute specifies a single letter for each column, which also determines how the column should be aligned, e.g. align="ccc"
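As a rough illustration of the per-column letters, the sketch below lays out one array row in fixed-width text. Only "c" appears in the text; the letters "l" and "r" are assumed by analogy with LaTeX, and the function name `format_row` and the fixed column width are hypothetical.

```python
# One alignment letter per column, as in LaTeX: left, centre, right.
ALIGNMENTS = {"l": str.ljust, "c": str.center, "r": str.rjust}


def format_row(cells, align, width):
    """Pad each cell to the column width according to its alignment letter."""
    if len(cells) != len(align):
        raise ValueError("need exactly one alignment letter per column")
    return " ".join(
        ALIGNMENTS[letter](cell, width) for cell, letter in zip(cells, align)
    )
```

With `align="ccc"` every cell of a three-column row is centred within its column.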