o !bd@sHdZddlmZddlZddlmZddlZddlZddlm Z ddl m Z m Z gdZ dZGd d d eZGd d d eZejZejZejZejZejZejZejZejZejZejZejZd dZGdddeZ Gdddej!Z"zddl#m"Z"Wn e$yYnwe"j%Z%ddZ&GdddeZ'e'dZ(Gdddej!Z)dS)z#Core classes for markup processing.)reduceN)chain) stringrepr) stripentities striptags)StreamMarkupescapeunescapeAttrs NamespaceQNamezrestructuredtext enc@s eZdZdZgZiZddZdS)StreamEventKindz#A kind of event on a markup stream.cCs|j|t||SN) _instances setdefaultstr__new__)clsvalr-/usr/lib/python3/dist-packages/genshi/core.pyr$zStreamEventKind.__new__N)__name__ __module__ __qualname____doc__ __slots__rrrrrrrs  rc@seZdZdZddgZedZedZedZedZ edZ ed Z ed Z ed Z ed Zed ZedZd%ddZddZddZddZd&ddZd'ddZd(ddZdd Zd!d"Zd#d$ZdS))ra.Represents a stream of markup events. This class is basically an iterator over the events. Stream events are tuples of the form:: (kind, data, position) where ``kind`` is the event kind (such as `START`, `END`, `TEXT`, etc), ``data`` depends on the kind of event, and ``position`` is a ``(filename, line, offset)`` tuple that contains the location of the original element or text in the input. If the original location is unknown, ``position`` is ``(None, -1, -1)``. Also provided are ways to serialize the stream to text. The `serialize()` method will return an iterator over generated strings, while `render()` returns the complete generated text at once. Both accept various parameters that impact the way the stream is serialized. events serializerSTARTENDTEXTXML_DECLDOCTYPESTART_NSEND_NS START_CDATA END_CDATAPICOMMENTNcCs||_||_dS)a:Initialize the stream with a sequence of markup events. :param events: a sequence or iterable providing the events :param serializer: the default serialization method to use for this stream :note: Changed in 0.5: added the `serializer` argument N)rr)selfrrrrr__init__Js zStream.__init__cC t|jSr)iterrr+rrr__iter__V zStream.__iter__cCstt|||jdS)aOverride the "bitwise or" operator to apply filters or serializers to the stream, providing a syntax similar to pipes on Unix shells. Assume the following stream produced by the `HTML` function: >>> from genshi.input import HTML >>> html = HTML('''

Hello, world!

''', encoding='utf-8') >>> print(html)

Hello, world!

A filter such as the HTML sanitizer can be applied to that stream using the pipe notation as follows: >>> from genshi.filters import HTMLSanitizer >>> sanitizer = HTMLSanitizer() >>> print(html | sanitizer)

Hello, world!

Filters can be any function that accepts and produces a stream (where a stream is anything that iterates over events): >>> def uppercase(stream): ... for kind, data, pos in stream: ... if kind is TEXT: ... data = data.upper() ... yield kind, data, pos >>> print(html | sanitizer | uppercase)

HELLO, WORLD!

Serializers can also be used with this notation: >>> from genshi.output import TextSerializer >>> output = TextSerializer() >>> print(html | sanitizer | uppercase | output) HELLO, WORLD! Commonly, serializers should be used at the end of the "pipeline"; using them somewhere in the middle may produce unexpected results. :param function: the callable object that should be applied as a filter :return: the filtered stream :rtype: `Stream` )r)r_ensurer)r+functionrrr__or__Ys,z Stream.__or__cGsttj|f|S)aLApply filters to the stream. This method returns a new stream with the given filters applied. The filters must be callables that accept the stream object as parameter, and return the filtered stream. The call:: stream.filter(filter1, filter2) is equivalent to:: stream | filter1 | filter2 :param filters: one or more callable objects that should be applied as filters :return: the filtered stream :rtype: `Stream` )roperatoror_)r+filtersrrrfiltersz Stream.filtercKsBddlm}|dur|jpd}|jdd|i|}|||||dS)aReturn a string representation of the stream. Any additional keyword arguments are passed to the serializer, and thus depend on the `method` parameter value. :param method: determines how the stream is serialized; can be either "xml", "xhtml", "html", "text", or a custom serializer class; if `None`, the default serialization method of the stream is used :param encoding: how the output string should be encoded; if set to `None`, this method returns a `unicode` object :param out: a file-like object that the output should be written to instead of being returned as one big string; note that if this is a file or socket (or similar), the `encoding` must not be `None` (that is, the output must be encoded) :return: a `str` or `unicode` object (depending on the `encoding` parameter), or `None` if the `out` parameter is provided :rtype: `basestring` :see: XMLSerializer, XHTMLSerializer, HTMLSerializer, TextSerializer :note: Changed in 0.5: added the `out` parameter r)encodeNxmlmethod)r;encodingoutr) genshi.outputr9r serialize)r+r;r<r=kwargsr9 generatorrrrrenders  z Stream.rendercCsddlm}|||||S)aTReturn a new stream that contains the events matching the given XPath expression. >>> from genshi import HTML >>> stream = HTML('foobar', encoding='utf-8') >>> print(stream.select('elem')) foobar >>> print(stream.select('elem/text()')) foobar Note that the outermost element of the stream becomes the *context node* for the XPath test. That means that the expression "doc" would not match anything in the example above, because it only tests against child elements of the outermost element: >>> print(stream.select('doc')) You can use the "." expression to match the context node itself (although that usually makes little sense): >>> print(stream.select('.')) foobar :param path: a string containing the XPath expression :param namespaces: mapping of namespace prefixes used in the path :param variables: mapping of variable names to values :return: the selected substream :rtype: `Stream` :raises PathSyntaxError: if the given path expression is invalid or not supported r)Path) genshi.pathrCselect)r+path namespaces variablesrCrrrrEs !z Stream.selectr:cKs6ddlm}|dur|jpd}||fi|t|S)aGenerate strings corresponding to a specific serialization of the stream. Unlike the `render()` method, this method is a generator that returns the serialized output incrementally, as opposed to returning a single string. Any additional keyword arguments are passed to the serializer, and thus depend on the `method` parameter value. :param method: determines how the stream is serialized; can be either "xml", "xhtml", "html", "text", or a custom serializer class; if `None`, the default serialization method of the stream is used :return: an iterator over the serialization results (`Markup` or `unicode` objects, depending on the serialization method) :rtype: ``iterator`` :see: XMLSerializer, XHTMLSerializer, HTMLSerializer, TextSerializer r)get_serializerNr:)r>rIrr2)r+r;r@rIrrrr?s  zStream.serializecCs|SrrBr/rrr__str__zStream.__str__cCs |jddS)N)r<rJr/rrr __unicode__ zStream.__unicode__cCs|Srrr/rrr__html__szStream.__html__r)NNN)NN)r:)rrrrrrr r!r"r#r$r%r&r'r(r)r*r,r0r4r8rBrEr?rKrMrOrrrrr(s0  .   $ rccst|}zt|}Wn tyYdSwt|tus"t|dkrBt|g|D]}t|dr4|}nt t |df}|Vq(dS|V|D]}|VqGdS)z@Ensure that every item on the stream is actually a markup event.NtotupleNrS) r.next StopIterationtypetuplelenrhasattrrQr"six text_type)streameventrrrr2s$    r2c@sVeZdZdZgZddZddZddZdd Zd d Z d d Z dddZ ddZ dS)r a?Immutable sequence type that stores the attributes of an element. Ordering of the attributes is preserved, while access by name is also supported. >>> attrs = Attrs([('href', '#'), ('title', 'Foo')]) >>> attrs Attrs([('href', '#'), ('title', 'Foo')]) >>> 'href' in attrs True >>> 'tabindex' in attrs False >>> attrs.get('title') 'Foo' Instances may not be manipulated directly. Instead, the operators ``|`` and ``-`` can be used to produce new instances that have specific attributes added, replaced or removed. To remove an attribute, use the ``-`` operator. The right hand side can be either a string or a set/sequence of strings, identifying the name(s) of the attribute(s) to remove: >>> attrs - 'title' Attrs([('href', '#')]) >>> attrs - ('title', 'href') Attrs() The original instance is not modified, but the operator can of course be used with an assignment: >>> attrs Attrs([('href', '#'), ('title', 'Foo')]) >>> attrs -= 'title' >>> attrs Attrs([('href', '#')]) To add a new attribute, use the ``|`` operator, where the right hand value is a sequence of ``(name, value)`` tuples (which includes `Attrs` instances): >>> attrs | [('title', 'Bar')] Attrs([('href', '#'), ('title', 'Bar')]) If the attributes already contain an attribute with a given name, the value of that attribute is replaced: >>> attrs | [('href', 'http://example.org/')] Attrs([('href', 'http://example.org/')]) cCs |D] \}}||kr dSqdS)zReturn whether the list includes an attribute with the specified name. :return: `True` if the list includes the attribute :rtype: `bool` TFr)r+nameattr_rrr __contains__^s zAttrs.__contains__cCs$t||}t|turt|S|S)zReturn an item or slice of the attributes list. >>> attrs = Attrs([('href', '#'), ('title', 'Foo')]) >>> attrs[1] ('title', 'Foo') >>> attrs[1:] Attrs([('title', 'Foo')]) )rW __getitem__rVslicer )r+iitemsrrrrbjs zAttrs.__getitem__cCstt|||S)zReturn a slice of the attributes list. >>> attrs = Attrs([('href', '#'), ('title', 'Foo')]) >>> attrs[1:] Attrs([('title', 'Foo')]) )r rW __getslice__)r+rdjrrrrfxszAttrs.__getslice__csTtdd|Dtfdd|DtfddDfdd|DS)a)Return a new instance that contains the attributes in `attrs` in addition to any already existing attributes. Any attributes in the new set that have a value of `None` are removed. :return: a new instance with the merged attributes :rtype: `Attrs` cSsg|] \}}|dur|qSrr.0anavrrr sz Attrs.__or__..cs(g|]\}}|vr|dur||fqSrrrhr/rrrl cs(g|]\}}|vr|||fqSr)get)risnsv)removereplacerrrls cs(g|]\}}|vr|vr||fqSrrrh)rqr+rrrlrm)setdictr )r+attrsr)rqrrr+rr4s z Attrs.__or__cCs |sdSdddd|DS)NzAttrs()z Attrs([%s])z, cSsg|]}t|qSr)reprriitemrrrrlz"Attrs.__repr__..)joinr/rrr__repr__szAttrs.__repr__cs(ttjr ftfdd|DS)zReturn a new instance with all attributes with a name in `names` are removed. :param names: the names of the attributes to remove :return: a new instance with the attribute removed :rtype: `Attrs` cs g|] \}}|vr||fqSrr)rir^rnamesrrrls z!Attrs.__sub__..) isinstancerZ string_typesr )r+r}rr|r__sub__s z Attrs.__sub__NcCs"|D] \}}||kr|Sq|S)aReturn the value of the attribute with the specified name, or the value of the `default` parameter if no such attribute is found. :param name: the name of the attribute :param default: the value to return when the attribute does not exist :return: the attribute value, or the `default` value if that attribute does not exist :rtype: `object` r)r+r^defaultr_valuerrrrns z Attrs.getcCstddd|DdfS)a[Return the attributes as a markup event. The returned event is a `TEXT` event, the data is the value of all attributes joined together. >>> Attrs([('href', '#'), ('title', 'Foo')]).totuple() ('TEXT', '#Foo', (None, -1, -1)) :return: a `TEXT` event :rtype: `tuple` cSsg|]}|dqS)r)rixrrrrlryz!Attrs.totuple..rR)r"rzr/rrrrQs z Attrs.totupler) rrrrrrarbrfr4r{rrnrQrrrrr (s3   r c@sreZdZdZgZddZddZddZdd ZeZ d d Z dd dZ e dddZ ddZdddZddZdS)rzeMarks a string as being safe for inclusion in HTML/XML output without needing to be escaped. cCsttj|t|SrrrZr[__add__r r+otherrrrrrzMarkup.__add__cCsttjt||Srrrrrr__radd__rzMarkup.__radd__cCs`t|trtt|tt|}nt|ttfr#ttt|}nt|}t t j ||Sr) r~rtzipkeysmapr valueslistrWrrZr[__mod__)r+argsrrrrs zMarkup.__mod__cCsttj||Sr)rrZr[__mul__)r+numrrrrzMarkup.__mul__cCsdt|jtj|fS)Nz<%s %s>)rVrrZr[r{r/rrrr{szMarkup.__repr__Tcs$fdd|D}ttj||S)aAReturn a `Markup` object which is the concatenation of the strings in the given sequence, where this `Markup` object is the separator between the joined elements. Any element in the sequence that is not a `Markup` instance is automatically escaped. :param seq: the sequence of strings to join :param escape_quotes: whether double quote characters in the elements should be escaped :return: the joined `Markup` object :rtype: `Markup` :see: `escape` csg|]}t|dqS))quotes)r rw escape_quotesrrrlszMarkup.join..)rrZr[rz)r+seqr escaped_itemsrrrrzsz Markup.joincCsd|s|St||ur |St|dr||S|dddddd}|r.|dd }||S) aCreate a Markup instance from a string and escape special characters it may contain (<, >, & and "). >>> escape('"1 < 2"') If the `quotes` parameter is set to `False`, the " character is left as is. Escaping quotes is generally only required for strings that are to be used in attribute values. >>> escape('"1 < 2"', quotes=False) :param text: the text to escape :param quotes: if ``True``, double quote characters are escaped in addition to the other special characters :return: the escaped `Markup` string :rtype: `Markup` rO&&<<>>"")rVrYrOrr)rtextrrrrr s     z Markup.escapecCs2|sdSt|dddddddd S) zReverse-escapes &, <, >, and " and returns a `unicode` object. >>> Markup('1 < 2').unescape() '1 < 2' :return: the unescaped string :rtype: `unicode` :see: `genshi.core.unescape` rrrrrrrrr)rZr[rrr/rrrr s zMarkup.unescapeFcCstt||dS)aReturn a copy of the text with any character or numeric entities replaced by the equivalent UTF-8 characters. If the `keepxmlentities` parameter is provided and evaluates to `True`, the core XML entities (``&``, ``'``, ``>``, ``<`` and ``"``) are not stripped. :return: a `Markup` instance with entities removed :rtype: `Markup` :see: `genshi.util.stripentities` )keepxmlentities)rr)r+rrrrr"s zMarkup.stripentitiescCs tt|S)zReturn a copy of the text with all XML/HTML tags removed. :return: a `Markup` instance with all tags removed :rtype: `Markup` :see: `genshi.util.striptags` )rrr/rrrr0s zMarkup.striptagsN)T)F)rrrrrrrrr__rmul__r{rz classmethodr r rrrrrrrs   "  r)rcCst|ts|S|S)aoReverse-escapes &, <, >, and " and returns a `unicode` object. >>> unescape(Markup('1 < 2')) '1 < 2' If the provided `text` object is not a `Markup` instance, it is returned unchanged. >>> unescape('1 < 2') '1 < 2' :param text: the text to unescape :return: the unescsaped string :rtype: `unicode` )r~rr )rrrrr Cs r c@seZdZdZddZddZddZdd Zd d Zd d Z ddZ ddZ ddZ e Z ddZejddkr>> html = Namespace('http://www.w3.org/1999/xhtml') >>> html Namespace('http://www.w3.org/1999/xhtml') >>> html.uri 'http://www.w3.org/1999/xhtml' The `Namespace` object can than be used to generate `QName` objects with that namespace: >>> html.body QName('http://www.w3.org/1999/xhtml}body') >>> html.body.localname 'body' >>> html.body.namespace 'http://www.w3.org/1999/xhtml' The same works using item access notation, which is useful for element or attribute names that are not valid Python identifiers: >>> html['body'] QName('http://www.w3.org/1999/xhtml}body') A `Namespace` object can also be used to test whether a specific `QName` belongs to that namespace using the ``in`` operator: >>> qname = html.body >>> qname in html True >>> qname in Namespace('http://www.w3.org/2002/06/xhtml2') False cCst||ur|St|Sr)rVobjectr)rurirrrrs  zNamespace.__new__cCs|jfSrrr/rrr__getnewargs__rLzNamespace.__getnewargs__cC|jSrrr/rrr __getstate__zNamespace.__getstate__cCs ||_dSrrr+rrrr __setstate__r1zNamespace.__setstate__cCst||_dSr)rZr[rrrrrr,szNamespace.__init__cCs |j|jkSr) namespacer)r+qnamerrrrarNzNamespace.__contains__cCs ||k Srrrrrr__ne__r1zNamespace.__ne__cCs t|tr |j|jkS|j|kSr)r~r rrrrr__eq__s   zNamespace.__eq__cCst|jd|S)N})r r)r+r^rrrrbrzNamespace.__getitem__cCr-r)hashrr/rrr__hash__r1zNamespace.__hash__rcCsdt|jt|jfS)N%s(%s))rVrrrr/rrrr{zNamespace.__repr__cCsdt|j|jfS)N%s(%r))rVrrr/rrrr{scCs |jdS)Nzutf-8)rr9r/rrrrKrNzNamespace.__str__cCrrrr/rrrrMrzNamespace.__unicode__N)rrrrrrrrr,rarrrb __getattr__rsys version_infor{rKrMrrrrr Xs$'  r z$http://www.w3.org/XML/1998/namespacec@sJeZdZdZddgZddZddZejdd krd d Z d Sd d Z d S)r aA qualified element or attribute name. The unicode value of instances of this class contains the qualified name of the element or attribute, in the form ``{namespace-uri}local-name``. The namespace URI can be obtained through the additional `namespace` attribute, while the local name can be accessed through the `localname` attribute. >>> qname = QName('foo') >>> qname QName('foo') >>> qname.localname 'foo' >>> qname.namespace >>> qname = QName('http://www.w3.org/1999/xhtml}body') >>> qname QName('http://www.w3.org/1999/xhtml}body') >>> qname.localname 'body' >>> qname.namespace 'http://www.w3.org/1999/xhtml' r localnamecCst||ur|S|d}|dd}t|dkr.tj|d|}ttj|\|_|_ |Stj||}dt||_|_ |S)zCreate the `QName` instance. :param qname: the qualified name as a string of the form ``{namespace-uri}local-name``, where the leading curly brace is optional {rrz{%sN) rVlstripsplitrXrZr[rrrr)rrpartsr+rrrrs    z QName.__new__cCs |dfS)Nr)rr/rrrrrNzQName.__getnewargs__rrcCsdt|jt|dfS)Nrr)rVrrrr/rrrr{szQName.__repr__cCsdt|j|dfS)Nrr)rVrrr/rrrr{rN) rrrrrrrrrr{rrrrr s  r )*r functoolsrr itertoolsrr5rZ genshi.compatr genshi.utilrr__all__ __docformat__rrrrr r!r"r#r$r%r&r'r(r)r*r2rWr r[rgenshi._speedups ImportErrorr r r XML_NAMESPACEr rrrrsL    Zz [