o $amp @sdZddlZddlZddlZddlZddlZddlmZmZddlm Z ddl m Z ddl m Z ddlmZmZddlmZdd lmZmZmZmZmZmZmZmZmZmZmZdd lm Z m!Z!m"Z"dd l#m$Z$dd l%m&Z&dd l'm(Z(ddl)m*Z*m+Z+m,Z,ddl-m.Z.ddl/m0Z0ddl1m2Z2ddl3m4Z4ddl5m6Z6ddl7m8Z8ddl9m:Z:ddl;mm?Z?m@Z@mAZAmBZBmCZCddlDmEZEe=FeGZHeIdZJeddeKfdeKfdeeLffZMeddeNfd eeMffZOed!deKfdeKfdeLfd"eKfd#eKfd$eLffZPed%d&eNfdeNffZQeeOeeNeKeKeLffZRd'd(iZSdZTd)ZUd*ZVd+e&d,eLfd-d.ZWGd/d0d0e ZXd1e'j'j(d2eKd,eYfd3d4ZZGd5d6d6e0Z[Gd7d8d8Z\Gd9d:d:eZ]Gd;d<dd?Z_d=e.d@e2d,dfdAdBZ`d=e.d,eeKeffdCdDZadS)Ez sphinx.builders.linkcheck ~~~~~~~~~~~~~~~~~~~~~~~~~ The CheckExternalLinksBuilder class. :copyright: Copyright 2007-2021 by the Sphinx team, see AUTHORS. :license: BSD, see LICENSE for details. N)datetimetimezone)parsedate_to_datetime) HTMLParser)path) PriorityQueueQueue)Thread) AnyDict GeneratorList NamedTupleOptionalPatternSetTupleUnioncast)unquoteurlparse urlunparse)nodes)Element)Response)ConnectionError HTTPErrorTooManyRedirects)Sphinx) DummyBuilder)Config)RemovedInSphinx50Warning)BuildEnvironment)__)SphinxPostTransform) encode_uriloggingrequests)darkgray darkgreenpurplered turquoise) get_node_linez ([a-z]+:)?// Hyperlinkuridocnamelineno CheckRequest next_check hyperlink CheckResultstatusmessagecode RateLimitdelayAcceptz/text/html,application/xhtml+xml;q=0.9,*/*;q=0.8gN@nodereturncCstjdtddt|p dS)z PriorityQueue items must be comparable. The line number is part of the tuple used by the PriorityQueue, keep an homogeneous type for comparison. znode_line_or_0() is deprecated. stacklevelr)warningswarnr!r-)r=rD;/usr/lib/python3/dist-packages/sphinx/builders/linkcheck.pynode_line_or_0Fs rFcs@eZdZdZdeddffdd Zdededdfd d ZZS) AnchorCheckParserz9Specialized HTML parser that looks for a specific anchor. search_anchorr>Ncst||_d|_dSNF)super__init__rHfound)selfrH __class__rDrErKSs  zAnchorCheckParser.__init__tagattrscCs0|D]\}}|dvr||jkrd|_dSqdS)N)idnameT)rHrL)rMrPrQkeyvaluerDrDrEhandle_starttagYs z!AnchorCheckParser.handle_starttag) __name__ __module__ __qualname____doc__strrKr rV __classcell__rDrDrNrErGPsrGresponseanchorcCsPt|}|jdddD]}t|tr|}|||jr nq ||jS)zReads HTML data from a response object `response` searching for `anchor`. Returns True if anchor was found, False otherwise. iT) chunk_sizedecode_unicode)rG iter_content isinstancebytesdecodefeedrLclose)r]r^parserchunkrDrDrE check_anchor`s  ric @sleZdZdZdZedZd.ddZede e fdd Z ede e e e ffd d Zede e fd d ZedeefddZedeeeffddZedeee eefffddZd.ddZdedeefddZdedefddZdede efddZdedefddZ de!ddfd d!Z"d"ed#ed$ed%ed&eddf d'd(Z#d)e$ddfd*d+Z%d.d,d-Z&dS)/CheckExternalLinksBuilderz+ Checks for broken external links. linkcheckzCLook for any errors in the above output or in %(outdir)s/output.txtr>NcCs8i|_t|_i|_i|_tdt|_t |_ dS)Ng@) hyperlinksset_good_broken _redirectedsocketsetdefaulttimeoutr_wqueuer_rqueuerMrDrDrEinitzs  zCheckExternalLinksBuilder.initcC.tjd|jjdftdddd|jjDS)N%s.%s is deprecated.anchors_ignorer?r@cSg|]}t|qSrDrecompile.0xrDrDrE z.)rBrCrOrWr!configlinkcheck_anchors_ignorerurDrDrEry z(CheckExternalLinksBuilder.anchors_ignorecCrw)Nrxauthr?r@cSg|] \}}t||fqSrDr{rpattern auth_inforDrDrErz2CheckExternalLinksBuilder.auth..)rBrCrOrWr!rlinkcheck_authrurDrDrErszCheckExternalLinksBuilder.authcCrw)Nrx to_ignorer?r@cSrzrDr{r~rDrDrErrz7CheckExternalLinksBuilder.to_ignore..)rBrCrOrWr!rlinkcheck_ignorerurDrDrErrz#CheckExternalLinksBuilder.to_ignorecC"tjd|jjdftdd|jS)Nrxgoodr?r@)rBrCrOrWr!rnrurDrDrEr zCheckExternalLinksBuilder.goodcCr)Nrxbrokenr?r@)rBrCrOrWr!rorurDrDrErrz CheckExternalLinksBuilder.brokencCr)Nrx redirectedr?r@)rBrCrOrWr!rprurDrDrErrz$CheckExternalLinksBuilder.redirectedcCs tjd|jjdftdddS)Nrx check_threadr?r@rBrCrOrWr!rurDrDrErs  z&CheckExternalLinksBuilder.check_threadr]cCs:tjd|jjdftddt|j|jddi}||S)Nrx limit_rater?r@) rBrCrOrWr! HyperlinkAvailabilityCheckWorkerenvrr)rMr]workerrDrDrErs  z$CheckExternalLinksBuilder.limit_ratecCr)Nrxrqueuer?r@)rBrCrOrWr!rtrMr]rDrDrEr z CheckExternalLinksBuilder.rqueuecCs tjd|jjdftddgS)Nrxworkersr?r@rrrDrDrErs z!CheckExternalLinksBuilder.workerscCr)Nrxwqueuer?r@)rBrCrOrWr!rsrrDrDrErrz CheckExternalLinksBuilder.wqueueresultcCs|j|jd}t||j|j|j|j|jd}| ||jdkr#dS|jdkr/|jdkr/dS|jr=t j d|j|jdd|jdkrb|jrVt t d |jd |jdSt t d |jdS|jd krt t d |j| d |j||j|jdS|jdkrt td |j|jdS|jdkr|jjs|jjrt jtd|j|j|j|jfdnt td|jtd|j| d|j||j|jd |jdS|jdkrPzdtfdtfdtfdtfdtfd|j\}}Wntydt}}Ynw||d<|jjr&t jd|jd|d|j|j|jfdnt |d|j|d|d|j| d||j||j|jd|jdStd|j)N)filenamer1r6r8r/info uncheckedworkingoldz(%16s: line %4d) T)nonlignoredz -ignored- z: localz -local- z ok rzbroken link: %s (%s))locationz broken z - r permanentlyz with Foundzwith See Other temporarily)i-i.i/i3i4zwith unknown codetextz redirect z to z redirected zUnknown status %s.)rdoc2pathr0dictr1r6r8r/r7write_linkstatloggerrr( write_entryr)appquietwarningiserrorwarningr#r+r*r,KeyErrorrlinkcheck_allowed_redirects ValueError)rMrrlinkstatrcolorrDrDrEprocess_resultsr    "    "    z(CheckExternalLinksBuilder.process_resultwhatr0rliner/cCs|jd||||fdS)Nz%s:%s: [%s] %s ) txt_outfilewrite)rMrr0rrr/rDrDrErsz%CheckExternalLinksBuilder.write_entrydatacCs"|jt||jddS)N ) json_outfilerjsondumps)rMrrDrDrEr"sz(CheckExternalLinksBuilder.write_linkstatc Cst|j|j|}tdtt|jdd2|_ tt|jdd|_ | |j D]}| |q+Wdn1s=wYWdn1sLwY|jrZd|j_dSdS)Nz output.txtwz output.jsonr<)HyperlinkAvailabilityCheckerrrrropenrjoinoutdirrrcheckrlrror statuscode)rMcheckerrrDrDrEfinish&s   z CheckExternalLinksBuilder.finishr>N)'rWrXrYrZrSr#epilogrvpropertyr rryrr rrrr[rr rintrrrrfloatrrrr rrr5rrrrrrDrDrDrErjrs<      7 rjc @steZdZ ddedededdfddZddd Zdd d Zd e e e fde e ddffd dZde defddZdS)rNrrbuilderr>cCs^||_||_||_i|_g|_dd|jjD|_|r%|j|_|j |_ dSt |_t |_ dS)NcSrzrDr{r~rDrDrEr?rz9HyperlinkAvailabilityChecker.__init__..) rrr rate_limitsrrrrtrrsrrr)rMrrrrDrDrErK4s  z%HyperlinkAvailabilityChecker.__init__cCsHt|jjD]}t|j|j|j|j|j|j}| |j |qdSN) rangerlinkcheck_workersrrrrrrstartrappend)rMithreadrDrDrEinvoke_threadsHs z+HyperlinkAvailabilityChecker.invoke_threadscCs.|j|jD] }|jttddqdSrI)rrrputr2CHECK_IMMEDIATELY)rMrrDrDrEshutdown_threadsPs  z-HyperlinkAvailabilityChecker.shutdown_threadsrlccs|d}|D]$}||jr!t|j|j|jdddVq |jt t |d|d7}q d}||krD|j V|d7}||ks6| dS)NrrrFr<)rvaluesis_ignored_urir/r5r0r1rrr2rrgetr)rMrl total_linksr4donerDrDrErUs      z"HyperlinkAvailabilityChecker.checkr/cstfdd|jDS)Nc3s|]}|VqdSr)match)rpatr/rDrE isz>HyperlinkAvailabilityChecker.is_ignored_uri..)anyrrMr/rDrrErhsz+HyperlinkAvailabilityChecker.is_ignored_urirr)rWrXrYr"r rjrKrrr r[r.r r5rboolrrDrDrDrEr3s   $rcsjeZdZdZ ddededededeee fde d dffd d Z dd d Z de d eefddZZS)rz;A worker class for checking the availability of hyperlinks.Nrrrrrrr>cs||_||_||_||_||_dd|jjD|_dd|jjD|_|r2|j |_ |j |_ |j |_ n t |_ i|_ i|_ t jdddS)NcSrzrDr{r~rDrDrEr{sz=HyperlinkAvailabilityCheckWorker.__init__..cSrrDr{rrDrDrEr}rT)daemon)rrrrrrryrrrnrorprmrJrK)rMrrrrrrrNrDrErKos& z)HyperlinkAvailabilityCheckWorker.__init__c sijjr jjd<dtffdd dttttfffdd dtdtdtffd d d tdttttfffd d } j}z|\}durZWdS\}}Wnt yo|\}}}YnwdurvdSt j }zj |j }Wn tyYnw|tkrttjt|djqH||\}}} |dkrttdtdn jt||||| jqI)Ntimeoutr>csht}d|j|jfd|j|jfdg}|D]}|jjvr1tt}|jj||SqiS)Nz%s://%sz%s://%s/*)rschemenetlocrlinkcheck_request_headersrDEFAULT_REQUEST_HEADERSupdate)url candidatesuheadersrrDrEget_request_headerss zAHyperlinkAvailabilityCheckWorker.run..get_request_headersc sdvrdd\}}jD] }||rd}nqn}d}z|dWn ty4t|}YnwjD] \}}|rCnq8d}d<zt|rvjjrvt j |fdj|d}| t |t |}|suttd|nHzt j|fdj|d}| Wn3tttfy}z$t|tr|jjd krt j |fdj|d}| WYd}~nd}~wwWnty)}z]|jjd krWYd}~d S|jjd kr|j}|durjt|d WYd}~d Sdt|dfWYd}~S|jjdkrdt|dfWYd}~Sdt|dfWYd}~Sd}~wtyB}z dt|dfWYd}~Sd}~wwt|j} zj| =Wn tyXYnw|j !d|!dkrgdS|j } |rs| d|7} || r{dS|j"r|j"dj} d| | fSd| dfS)N#r<asciirT)streamrrzAnchor '%s' not found)allow_redirectsrrii)rz - unauthorizedrF) rate-limitedrrrrir/rrrr)#splitryrencode UnicodeErrorr%rrlinkcheck_anchorsr'rraise_for_statusrir Exceptionr#headrrrrbr] status_coderrrr2r[rrrrrrstriphistory) req_urlr^rexrrr]rLerrr3rnew_urlr8)allowed_redirectrr4kwargsrMr/rDrE check_uris                   z7HyperlinkAvailabilityCheckWorker.run..check_urirrcs4jjD]\}}||r||rdSqdS)NTF)rritemsr)rrfrom_urlto_urlrurDrErs z>HyperlinkAvailabilityCheckWorker.run..allowed_redirectr0cs,tdks dr dSds4trdStj|}tt |r-dSdj <dSj vr;dSj vrHd j dfSj vr\d j dj d fSt jjD]}\}}}|d krpnqb|d kr|j n|d kr|j <n |d kr||fj <|||fS) Nr)rzmailto:ztel:)rrr)zhttp:zhttps:rr)rrr)rrrrrr<r)len startswithuri_rerrdirnamerrexistsrrornrprrlinkcheck_retriesadd)r0srcdir_r6rr8)rrMr/rDrErs8         z3HyperlinkAvailabilityCheckWorker.run..checkTFrz-rate limited- z | sleeping...)rlinkcheck_timeoutr rr[rrrrrrrrr3rtimesleepQUEUE_POLL_SECSrr2 task_donerrr(rr5) rMr check_requestr3r0r1rr6rr8rD)rrrr4rrMr/rErunsH &`$%        z$HyperlinkAvailabilityCheckWorker.runr]c Csd}|jd}|rAzt|}Wn*ty:zt|}Wn ttfy(Ynwt|}|tt j  }Ynwt |}t |jj}|dur|jj}z|j|}Wn tyat}Ynw|j} d| }||krs| |krs|}||krydSt |}t|||j|<|S)Nz Retry-Afterg@)rrrrr TypeErrorr timestampnowrutc total_secondsrrrrrlinkcheck_rate_limit_timeoutrr DEFAULT_DELAYr:r9) rMr]r3 retry_afterr:untilr max_delay rate_limitlast_wait_timerDrDrErRsB        z+HyperlinkAvailabilityCheckWorker.limit_raterr)rWrXrYrZr"r rr r[r9rjrKr$rrrrr\rDrDrNrErls  Frc@s&eZdZdZdZdeddfddZdS)HyperlinkCollector)rk rr>Nc Kstt|jj}|j}|jtjD]*}d|vrq|d}|j d|}|r'|}t |}t ||j j |}||vr;|||<q|jtjD].} | dd}|rqd|vrq|j d|}|r]|}t | }t ||j j |}||vrq|||<qCdS)Nrefurilinkcheck-process-urir?z://)rrjrrrldocumenttraverser referenceemit_firstresultr-r.rr0imager) rMrrrlrefnoder/newurir1uri_infoimgnoderDrDrEr$}s4 zHyperlinkCollector.run)rWrXrYbuildersdefault_priorityr r$rDrDrDrEr1ysr1rcCsHt|}|jdkr"|jr"|jd}|s"d|j}t|j|dSdS)zRewrite anchor name of the hyperlink to github.com The hyperlink anchors in github.com are dynamically generated. This rewrites them before checking and makes them comparable. z github.comz user-content-)fragmentN)rhostnamerArr_replace)rr/parsedprefixedrArDrDrErewrite_github_anchors  rFrc Cst|jjD]B\}}z6zt||jjt|<Wntjy9}ztt d|j |j WYd}~nd}~wwW|jj |q|jj |wdS)zFCompile patterns in linkcheck_allowed_redirects to the regexp objects.z=Failed to compile regex in linkcheck_allowed_redirects: %r %sN) listrrrr|r}errorrrr#rmsgpop)rrrrexcrDrDrE#compile_linkcheck_allowed_redirectss  rLcCs|t|t|dgd|did|dgd|did|ddd|dddtg|dd d|d d d|d d gd|ddd|d|jdtdddd d dS)Nrrrrrr<rrrTrz^!r*gr@r4z config-initedr2)prioritybuiltin)versionparallel_read_safeparallel_write_safe) add_builderrjadd_post_transformr1add_config_valuer add_eventconnectrL)rrDrDrEsetups$   rX)brZrr|rqrrBrr email.utilsr html.parserrosrqueuerr threadingr typingr r r r rrrrrrr urllib.parserrrdocutilsrdocutils.nodesrr'rrequests.exceptionsrrrsphinx.applicationrsphinx.builders.dummyr sphinx.configr sphinx.deprecationr!sphinx.environmentr" sphinx.localer#!sphinx.transforms.post_transformsr$ sphinx.utilr%r&sphinx.util.consoler(r)r*r+r,sphinx.util.nodesr- getLoggerrWrr}rr[rr.rr2r5r9CheckRequestTyperrr!r+rFrGrrirjrrr1rFrLrXrDrDrDrEs     4                   B9$