�
��abc@s�dZddlZddlZddlZddlmZddlmZmZm Z ddl
mZddlm
Z
ddlmZdd lmZd
efd��YZdS(s
Module containing the UniversalDetector detector class, which is the primary
class a user of ``chardet`` should use.
:author: Mark Pilgrim (initial port to Python)
:author: Shy Shalom (original C code)
:author: Dan Blanchard (major refactoring for 3.0)
:author: Ian Cordasco
i����Ni(tCharSetGroupProber(t
InputStatetLanguageFiltertProbingState(tEscCharSetProber(tLatin1Prober(tMBCSGroupProber(tSBCSGroupProbertUniversalDetectorcBs�eZdZdZejd�Zejd�Zejd�Zidd6dd6d d
6dd6d
d6dd6dd6dd6Z e
jd�Zd�Z
d�Zd�ZRS(sq
The ``UniversalDetector`` class underlies the ``chardet.detect`` function
and coordinates all of the different charset probers.
To get a ``dict`` containing an encoding and its confidence, you can simply
run:
.. code::
u = UniversalDetector()
u.feed(some_bytes)
u.close()
detected = u.result
g�������?s[�-�]s(|~{)s[�-�]sWindows-1252s
iso-8859-1sWindows-1250s
iso-8859-2sWindows-1251s
iso-8859-5sWindows-1256s
iso-8859-6sWindows-1253s
iso-8859-7sWindows-1255s
iso-8859-8sWindows-1254s
iso-8859-9sWindows-1257siso-8859-13cCsqd|_g|_d|_d|_d|_d|_d|_||_t j
t�|_d|_
|j�dS(N(tNonet_esc_charset_probert_charset_proberstresulttdonet _got_datat_input_statet
_last_chartlang_filtertloggingt getLoggert__name__tloggert_has_win_bytestreset(tselfR((sI/usr/lib/python2.7/site-packages/pip/_vendor/chardet/universaldetector.pyt__init__Qs cCs�idd6dd6dd6|_t|_t|_t|_tj|_d|_ |j
rg|j
j�nx|jD]}|j�qqWdS(s�
Reset the UniversalDetector and all of its probers back to their
initial states. This is called by ``__init__``, so you only need to
call this directly in between analyses of different documents.
tencodinggt
confidencetlanguagetN(
R RtFalseR
RRRt
PURE_ASCIIRRR
RR(Rtprober((sI/usr/lib/python2.7/site-packages/pip/_vendor/chardet/universaldetector.pyR^s cCsy|jr
dSt|�sdSt|t�s;t|�}n|js{|jtj�rwidd6dd6dd6|_n�|jtj tj
f�r�idd6dd6dd6|_n�|jd �r�id
d6dd6dd6|_nl|jd�ridd6dd6dd6|_n<|jtjtjf�rOid
d6dd6dd6|_nt
|_|jddk r{t
|_dSn|jtjkr�|jj|�r�tj|_q�|jtjkr�|jj|j|�r�tj|_q�n|d|_|jtjkr�|js(t|j�|_n|jj|�tjkrui|jjd6|jj�d6|jj d6|_t
|_qun�|jtjkru|j!s�t"|j�g|_!|jt#j$@r�|j!j%t&��n|j!j%t'��nx`|j!D]U}|j|�tjkr�i|jd6|j�d6|j d6|_t
|_Pq�q�W|j(j|�rut
|_)qundS(s�
Takes a chunk of a document and feeds it through all of the relevant
charset probers.
After calling ``feed``, you can check the value of the ``done``
attribute to see if you need to continue feeding the
``UniversalDetector`` more data, or if it has made a prediction
(in the ``result`` attribute).
.. note::
You should always call ``close`` when you're done feeding in your
document if ``done`` is not already ``True``.
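The string constants that follow this docstring (``UTF-8-SIG``, ``UTF-32``, ``UTF-16``) and the ``codecs.BOM_*`` names suggest that ``feed`` first checks whether the data begins with a byte-order mark. A minimal standard-library sketch of that check follows. It assumes the result dict has the ``encoding``, ``confidence``, and ``language`` keys named in the ``close`` docstring and that a BOM match is reported with confidence 1.0. ``sniff_bom`` is a hypothetical helper name, not part of chardet, and the two ``X-ISO-10646-UCS-4`` byte orders are left out because their marker bytes are not legible in this dump.

```python
import codecs


def sniff_bom(data):
    """Return a chardet-style result dict if data starts with a known BOM, else None.

    Hypothetical helper for illustration; not chardet's own code.
    """
    if data.startswith(codecs.BOM_UTF8):
        encoding = "UTF-8-SIG"
    # UTF-32 must be tested before UTF-16: the UTF-32-LE BOM (FF FE 00 00)
    # begins with the UTF-16-LE BOM (FF FE).
    elif data.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        encoding = "UTF-32"
    elif data.startswith((codecs.BOM_LE, codecs.BOM_BE)):
        encoding = "UTF-16"
    else:
        return None
    # Assumed shape: same three keys the detector's result dict uses.
    return {"encoding": encoding, "confidence": 1.0, "language": ""}
```

The order of the branches matters more than their number: checking UTF-16 first would misreport little-endian UTF-32 input.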
Ns UTF-8-SIGRg�?RRRsUTF-32s��sX-ISO-10646-UCS-4-3412s��sX-ISO-10646-UCS-4-2143sUTF-16i����(*R
tlent
isinstancet bytearrayRt
startswithtcodecstBOM_UTF8RtBOM_UTF32_LEtBOM_UTF32_BEtBOM_LEtBOM_BEtTrueR RRRtHIGH_BYTE_DETECTORtsearcht HIGH_BYTEtESC_DETECTORRt ESC_ASCIIR
RRtfeedRtFOUND_ITtcharset_nametget_confidenceRRRRtNON_CJKtappendRRtWIN_BYTE_DETECTORR(Rtbyte_strR ((sI/usr/lib/python2.7/site-packages/pip/_vendor/chardet/universaldetector.pyR1os~
c Cs>|jr|jSt|_|js5|jjd�n1|jtjkrhidd6dd6dd6|_n�|jtj krfd}d}d}xD|jD]9}|s�q�n|j�}||kr�|}|}q�q�W|rf||j
krf|j}|jj�}|j�}|jd �r?|jr?|jj||�}q?ni|d6|d6|jd6|_qfn|jj�tjkr7|jddkr7|jjd
�x�|jD]�}|s�q�nt|t�rx^|jD]+}|jjd|j|j|j��q�Wq�|jjd|j|j|j��q�Wq7n|jS(
s�
Stop analyzing the current document and come up with a final
prediction.
:returns: The ``result`` attribute, a ``dict`` with the keys
`encoding`, `confidence`, and `language`.
sno data received!tasciiRg�?RRRgsiso-8859s no probers hit minimum thresholds%s %s confidence = %sN(R
RR+RRtdebugRRRR.R RR4tMINIMUM_THRESHOLDR3tlowerR$RtISO_WIN_MAPtgetRtgetEffectiveLevelRtDEBUGR"Rtprobers( Rtprober_confidencetmax_prober_confidencet
max_proberR R3tlower_charset_nameRtgroup_prober((sI/usr/lib/python2.7/site-packages/pip/_vendor/chardet/universaldetector.pytclose�s`
(Rt
__module__t__doc__R;tretcompileR,R/R7R=RtALLRRR1RG(((sI/usr/lib/python2.7/site-packages/pip/_vendor/chardet/universaldetector.pyR3s"
m(RIR%RRJtcharsetgroupproberRtenumsRRRt escproberRtlatin1proberRtmbcsgroupproberRtsbcsgroupproberRtobjectR(((sI/usr/lib/python2.7/site-packages/pip/_vendor/chardet/universaldetector.pyt<module>$s
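The constants embedded in the class body pair each ``iso-8859-*`` name with a ``Windows-125x`` name, and ``close`` refers to ``ISO_WIN_MAP``, ``_has_win_bytes``, and a ``startswith`` test on ``iso-8859``. Together these suggest that an ISO-8859 guess is swapped for its Windows code page when Windows-only bytes were seen. The sketch below illustrates that idea under stated assumptions: the key/value pairing is read off the constant order in the dump, the byte range ``0x80-0x9F`` for ``WIN_BYTE_DETECTOR`` is an assumption because the pattern is not legible here, and ``remap_iso_to_windows`` is a hypothetical helper name, not part of chardet.

```python
import re

# Pairing inferred from the constant order in the bytecode dump.
ISO_WIN_MAP = {
    "iso-8859-1": "Windows-1252",
    "iso-8859-2": "Windows-1250",
    "iso-8859-5": "Windows-1251",
    "iso-8859-6": "Windows-1256",
    "iso-8859-7": "Windows-1253",
    "iso-8859-8": "Windows-1255",
    "iso-8859-9": "Windows-1254",
    "iso-8859-13": "Windows-1257",
}

# Assumed range: bytes that are C1 control codes in ISO-8859 but
# printable characters in the Windows code pages.
WIN_BYTE_DETECTOR = re.compile(b"[\x80-\x9f]")


def remap_iso_to_windows(charset_name, data):
    """Prefer the Windows code page when data contains Windows-only bytes.

    Hypothetical helper for illustration; not chardet's own code.
    """
    lower_name = charset_name.lower()
    if lower_name.startswith("iso-8859") and WIN_BYTE_DETECTOR.search(data):
        # Fall back to the original name when the map has no entry.
        return ISO_WIN_MAP.get(lower_name, charset_name)
    return charset_name
```

For example, ``remap_iso_to_windows("ISO-8859-1", b"\x93quoted\x94")`` returns ``"Windows-1252"``, while the same call with data containing only bytes above ``0x9F`` leaves the name unchanged.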