fp{ 6UddlmZddlZddlZddlZddlZddlZddlZddlZddl m Z m Z ddl m Z mZmZmZmZmZmZmZmZe rddlZddlmZedZnedZedZej6dZeeegefZGd d Z Gd d e Z!Gd de Z"Gdde Z#Gdde Z$Gdde Z%Gdde Z&Gdde Z'GddeeefZ(Gdde Z)de iZ*de+d<d d!dZ,e e!e%e"e$e#e&e'e)f D] Z-e,e- y)") annotationsN)FutureThreadPoolExecutor) TYPE_CHECKINGAnyCallableClassVarGeneric NamedTupleOptional OrderedDictTypeVar) ParamSpecPTfsspeccHeZdZUdZdZded<d dZd dZd dZddZ dd Z y ) BaseCacheagPass-though cache: doesn't keep anything, calls every time Acts as base class for other cachers Parameters ---------- blocksize: int How far to read ahead in numbers of bytes fetcher: func Function of the form f(start, end) which gets bytes from remote as specified size: int How big this file is none ClassVar[str]namecf||_d|_||_||_d|_d|_d|_yNr) blocksizenblocksfetchersize hit_count miss_counttotal_requested_bytes)selfrrrs W/var/lib/jenkins/workspace/mettalog/venv/lib/python3.12/site-packages/fsspec/caching.py__init__zBaseCache.__init__:s4"   %&"ct|d}| |j}||jk\s||k\ry|j||S)Nrr$)rrr!startstops r"_fetchzBaseCache._fetchDs@ =E <99D DII $||E4((r$c.d|_d|_d|_y)zAReset hit and miss counts for a more ganular report e.g. by file.rN)rrr r!s r" _reset_statszBaseCache._reset_statsMs%&"r$c|jdk(r|jdk(ryd|j|j|j|jfzS)z2Return a formatted string of the cache statistics.rz3 , %s: %d hits, %d misses, %d total requested bytes)rrrr r+s r" _log_statszBaseCache._log_statsSsM >>Q 4??a#7D II NN OO  & & H   r$cd|jjd|jd|jd|jd|j d|j d|jdS) Nz ) __class____name__rrrrrr r+s r"__repr__zBaseCache.__repr___s .. ! ! "#!^^,-!\\N+!YYK(!^^,-!__-.$$($>$>#?@   r$NrintrFetcherrr5returnNoner' int | Noner(r:r7bytesr7r8)r7str) r2 __module__ __qualname____doc__r__annotations__r#r)r,r/r3r$r"rr(s, !D- ')'    r$rcdeZdZdZdZ d d fd Zd dZd dZd dZd dZ xZ S) MMapCachezmemory-mapped sparse file cache Opens temporary file, which is filled blocks-wise when data is requested. Ensure there is enough disc space in the temporary location. This cache method might only work on posix mmapct||||| tn||_||_|j |_yN)superr#setblockslocation _makefilecache)r!rrrrKrJr1s r"r#zMMapCache.__init__ws: GT2%~ce6   ^^% r$c4ddl}ddl}|jdk(r tS|j)t j j|js|j |j}t|_ nt|jd}|j|jdz |jd|jnt|jd}|j|j|jS)Nrzwb+1zr+b)rEtempfiler bytearrayrKospathexists TemporaryFilerIrJopenseekwriteflushfileno)r!rErQfds r"rLzMMapCache._makefiles 99>;  == t}}(E}}$++-!e $--/ GGDIIM " HHTN HHJdmmU+Btyydii00r$c tjd|d||d}| |j}||jk\s||k\ry||jz}||jz}t ||dzDcgc]}||j vs|}}t ||dzDcgc]}||j vs|}}|xj t|z c_|xjt|z c_|r|jd}||jz}t||jz|j} |xj| |z z c_ tjd|d|d| d|j|| |j|| |j j||r|j||Scc}wcc}w) NzMMap cache fetching -rr$rOzMMap get block #z ())loggerdebugrrrangerJrlenrpopminr rrMadd) r!r'end start_block end_blockineedhitssstartsends r"r)zMMapCache._fetchs +E7!C59: =E ;))C DII #t~~- 4>>)  i!m<Ua@TUU i!m<QaT[[@PQQ 3t9$ #d)# A'Fv. :D  & &$- 7 & LL+A3b$qA B&*ll64&@DJJvd # KKOOA zz%$$!VQs2F;F;G1Gc@|jj}|d=|S)NrM)__dict__copyr!states r" __getstate__zMMapCache.__getstate__s  ""$ 'N r$cd|jj||j|_yrG)rpupdaterLrMrrs r" __setstate__zMMapCache.__setstate__s" U#^^% r$)NN) rr5rr6rr5rKz str | NonerJzset[int] | Noner7r8)r7zmmap.mmap | bytearrayr'r:rgr:r7r;r7dict[str, Any]rsrzr7r8) r2r>r?r@rr#rLr)rtrw __classcell__r1s@r"rDrDlsj D $"& & & & &  &  &  &1,%8 &r$rDc0eZdZdZdZdfd ZddZxZS)ReadAheadCachea!Cache which reads only when we get beyond a block of data This is a much simpler version of BytesCache, and does not attempt to fill holes in the cache or keep fragments alive. It is best suited to many small reads in a sequential order (e.g., reading lines from a file). readaheadcRt||||d|_d|_d|_y)Nr$r)rHr#rMr'rgr!rrrr1s r"r#zReadAheadCache.__init__s) GT2  r$c^|d}|||jkDr |j}||jk\s||k\ry||z }||jk\rM||jkr>|xjdz c_|j||jz ||jz S|j|cxkr|jkrOnnL|xj dz c_|j||jz d}|t |z}|j}n|xj dz c_d}t|j||jz}|xj||z z c_ |j|||_||_|jt |jz|_||jd|zSNrr$rO) rr'rgrrMrrcrerr r)r!r'rglparts r"r)zReadAheadCache._fetchse =E ;# /))C DII # %K DJJ 3$((? NNa N::edjj033CD D ZZ5 +488 + OOq O::edjj023D TNAHHE OOq OD$))S4>>12 ""cEk1"\\%-  ::DJJ/djj!n$$r$r4rxr2r>r?r@rr#r)r|r}s@r"rrs D %r$rc0eZdZdZdZdfd ZddZxZS)FirstChunkCachezCaches the first block of a file only This may be useful for file types where the metadata is stored in the header, but is randomly accessed. firstcD||kDr|}t||||d|_yrG)rHr#rMrs r"r#zFirstChunkCache.__init__s( t I GT2#' r$cn|xsd}||jkDrtjdyt||j}||jkr&|j |xj dz c_||jkDr@|xj|z c_|jd|}|d|j|_||dS|jd|j|_|xj|jz c_|j ||}||jkDrA|xj||jz z c_||j|j|z }|xjdz c_ |S|xj dz c_|xj||z z c_|j||S)Nrz,FirstChunkCache: requested start > file sizer$rO) rr`rarerrMrr rr)r!r'rgdatars r"r)zFirstChunkCache._fetchsh  499  LLG H#tyy! 4>> !zz!1$'..#5.<<3/D!%&6!7DJ<'!\\!T^^< **dnn<*::eC(DT^^#**cDNN.BB* T^^S99 NNa NK OOq O  & &#+ 5 &<<s+ +r$r4rxrr}s@r"rrs D(,r$rceZdZdZdZ d d fd ZdZd dZd dZddZ dfd Z dd Z xZ S) BlockCachea Cache holding memory as a set of blocks. Requests are only ever made ``blocksize`` at a time, and are stored in an LRU cache. The least recently accessed block is discarded when more than ``maxblocks`` are stored. Parameters ---------- blocksize : int The number of bytes to store in each block. Requests are only ever made for ``blocksize``, so this should balance the overhead of making a request against the granularity of the blocks. fetcher : Callable size : int The total size of the file being cached. maxblocks : int The maximum number of blocks to cache for. The maximum memory use for this cache is then ``blocksize * maxblocks``. blockcachect||||tj||z |_||_t j||j|_ yrG) rHr#mathceilr maxblocks functools lru_cache _fetch_block_fetch_block_cachedr!rrrrr1s r"r#zBlockCache.__init__7sR GT2yy !12 "#A9#6#6y#A$BSBS#T r$c6|jjSz The statistics on the block cache. Returns ------- NamedTuple Returned directly from the LRU Cache used internally. r cache_infor+s r"rzBlockCache.cache_info?''2244r$c$|j}|d=|S)Nrrprrs r"rtzBlockCache.__getstate__Js  ' ( r$c|jj|tj|d|j|_y)Nr)rprvrrrrrrs r"rwzBlockCache.__setstate__Os< U##J9#6#6u[7I#J   $  r$c|d}| |j}||jk\s||k\ry||jz}||jz}t||dzD]}|j||j ||||S)Nrr$rOstart_block_numberend_block_number)rrrbr _read_cache)r!r'rgrr block_numbers r"r)zBlockCache._fetchUs =E ;))C DII ##dnn4$..0""46F6JKL  $ $\ 2L  1-    r$c@||jkDrtd|d|jd||jz}||jz}|xj||z z c_|xjdz c_t j d|t|!||}|S)= Fetch the block of data for `block_number`. 'block_number=(' is greater than the number of blocks (r_rOzBlockCache fetching block %d) r ValueErrorrr rr`inforHr))r!rr'rgblock_contentsr1s r"rzBlockCache._fetch_blockls $,, & /))-a9  t~~-dnn$ ""cEk1" 1 2LAs3r$c ||jz}||jz}|xjdz c_||k(r|j|}|||S|j||dg}|jt |jt |dz||j |j|d|dj|Sz Read from our block cache. Parameters ---------- start, end : int The start and end byte positions. start_block_number, end_block_number : int The start and end block numbers. rONr$rrrextendmaprbappendjoin r!r'rgrr start_posend_posblockouts r"rzBlockCache._read_cache~sDNN* & ! !1 1334FGE7+ +++,>? KLC JJ,,,q02BC  JJt//0@A(7K L88C= r$ rr5rr6rr5rr5r7r8ryr{rx)rr5r7r; r'r5rgr5rr5rr5r7r;) r2r>r?r@rr#rrtrwr)rrr|r}s@r"rrs, DMOUU'.U69UFIU U 5   .$&!&!"&!8;&!OR&! &!r$rcZeZdZUdZdZded< d d fd Zd dZd dZxZ S) BytesCacheaKCache which holds data in a in-memory bytes object Implements read-ahead by the block size, for semi-random reads progressing through the file. Parameters ---------- trim: bool As we read more data, whether to discard the start of the buffer when we are more than a blocksize ahead of it. r;rrc`t||||d|_d|_d|_||_y)Nr$)rHr#rMr'rgtrim)r!rrrrr1s r"r#zBytesCache.__init__s2 GT2 !% # r$c|d}| |j}||jk\s||k\ry|jc||jk\rT|jH||jkr9||jz }|xjdz c_|j|||z|z S|j r$t |j||j z}n|}||k(s||jkDry|j||jkrh|j||jkDrM|xj||z z c_|xjdz c_|j|||_||_n|jJ|jJ|xjdz c_||jkr|j|j|z |j kDr8|xj||z z c_|j|||_||_n4|xj|j|z z c_|j||j}||_||jz|_n|j||jkDr|j|jkDrn||jz |j kDr7|xj||z z c_|j|||_||_nR|xj||jz z c_|j|j|}|j|z|_|jt|jz|_||jz }|j|||z|z }|jrq|j|jz |j dzz}|dkDrC|xj|j |zz c_|j|j |zd|_|Sr) rr'rgrrMrrer rrrcr)r!r'rgoffsetbendnewrnums r"r)zBytesCache._fetchsJ =E ;))C DII # JJ "#$dhhTZZ'F NNa N::fv|e';< < >>tyy#"67DD 5=EDII- JJ %$**"4 HH dhh  & &$, 6 & OOq OeT2DJDJ::) ))88' '' OOq Otzz!88#txx#~'F..$,>.!%eT!:DJ!&DJ..$**u2DD.,,udjj9C!&DJ!$tzz!1DJ%$/88dii'488^dnn4..$,>.!%eT!:DJ!&DJ..$/A.,,txx6C!%c!1DJ::DJJ/#jj&3,"67 9988djj(dnnq.@ACQw dnns22 !ZZ(<(>?  r$c,t|jSrG)rcrMr+s r"__len__zBytesCache.__len__s4::r$)T) rr5rr6rr5rboolr7r8rx)r7r5) r2r>r?r@rrAr#r)rr|r}s@r"rrsT "D-!IM'.69AE GRr$rcXeZdZUdZdZded< d dfd Zd dZxZS) AllBytesz!Cache entire contents of the fileallrrct|||||P|xjdz c_|xj|jz c_|j d|j}||_y)NrOr)rHr#rr rrr)r!rrrrr1s r"r#zAllBytes.__init__sY GT2 < OOq O  & &$)) 3 &<<499-D r$cJ|xjdz c_|j||S)NrO)rrr&s r"r)zAllBytes._fetchs! !yyt$$r$)NNNN) rr:rzFetcher | Nonerr:rz bytes | Noner7r8r9 r2r>r?r@rrAr#r)r|r}s@r"rr sX+D-!%"&!         %r$rc\eZdZUdZdZded< d dfd Zd fd ZxZS) KnownPartsOfAFilea Cache holding known file parts. Parameters ---------- blocksize: int How far to read ahead in numbers of bytes fetcher: func Function of the form f(start, end) which gets bytes from remote as specified size: int How big this file is data: dict A dictionary mapping explicit `(start, stop)` file-offset tuples with known bytes. strict: bool, default True Whether to fetch reads that go beyond a known byte-range boundary. If `False`, any read that ends outside a known part will be zero padded. Note that zero padding will not be used for reads that begin outside a known byte-range. partsrrc t||||||_|rt|j }|dg}|j |dg} |ddD]m\} } |d\} } | | k(r&| | f|d<| dxx|j | | fz cc<9|j | | f| j |j | | fott|| |_ yi|_ y)NrrO) rHr#strictsortedkeysrdrdictzipr)r!rrrrr_ old_offsetsoffsetsrJr'r(start0stop0r1s r"r#zKnownPartsOfAFile.__init__=s GT2   -K"1~&Ghh{1~./F*12 t ' E>#)4.GBK2J$((E4="99JNNE4=1MM$((E4="9: /S&12DIDIr$ct|d}| |j}d}|jjD]t\\}}}||cxkr|ksn||z }||||z|z }|jr||cxkr|kr3nn0|d||z t |z zz }|xj dz c_|cS|}n|j td||fdtjd||fdtjd|d ||xj||z z c_ |xjdz c_ |t|=||zS) Nrr$rOz&Read is outside the known file parts: z. z%. IO/caching performance may be poor!z!KnownPartsOfAFile cache fetching r^)rritemsrrcrrrwarningswarnr`rar rrHr)) r!r'r(rloc0loc1roffr1s r"r)zKnownPartsOfAFile._fetch[sX =E <99D"&))//"3 LT4$u#t#dl3te!34{{dd&:d&: 7dUlSX&=>>CNNa'NJ !E'#40 << Eudm_TVWX X  4eT]OD2 3   8qGH ""dUl2" 1UW^E4000r$)NT) rr5rr6rr5rz&Optional[dict[tuple[int, int], bytes]]rrrrr9rr}s@r"rr$se,"D-!8<   5   <+1+1r$rcPeZdZdZGddeZd d dZd dZd dZddZ ddZ y ) UpdatableLRUzg Custom implementation of LRU cache that allows updating keys Used by BackgroudBlockCache c6eZdZUded<ded<ded<ded<y)UpdatableLRU.CacheInfor5rlmissesmaxsizecurrsizeN)r2r>r?rArBr$r" CacheInfors    r$rctj|_||_||_d|_d|_tj|_ yr) collectionsr _cache_func _max_size_hits_misses threadingLock_lock)r!funcmax_sizes r"r#zUpdatableLRU.__init__s<+6+B+B+D  !  ^^% r$c^|rtd|j|j5||jvrH|jj ||xj dz c_|j|cdddS ddd|j |i|}|j5||j|<|xjdz c_t|j|jkDr|jjdddd|S#1swYxYw#1swY|SxYw)Nz Got unexpected keyword argument rOFlast) TypeErrorrrr move_to_endrrrrcrpopitem)r!argskwargsresults r"__call__zUpdatableLRU.__call__s >v{{}oNO O ZZ )t{{" ''- a {{4(  ) )" ) T,V, ZZ 0 &DKK  LLA L4;;$..0 ###/  0   ) ) 0  sA D)A#D"D"D,cb|j5||jvcdddS#1swYyxYwrG)rr)r!rs r" is_key_cachedzUpdatableLRU.is_key_cacheds* ZZ '4;;& ' ' 's%.c|j5||j|<t|j|jkDr|jj ddddy#1swYyxYw)NFr)rrrcrr)r!rrs r"add_keyzUpdatableLRU.add_keysV ZZ 0 &DKK 4;;$..0 ###/ 0 0 0s AA$$A-c|j5|j|jt|j|j |j cdddS#1swYyxYw)N)rrrlr)rrrrcrrrr+s r"rzUpdatableLRU.cache_infosP ZZ >>T[[)ZZ|| "   s AAA&N))rzCallable[P, T]rr5r7r8)rzP.argsrzP.kwargsr7r)rrr7r)rrrrr7r8r7r) r2r>r?r@r rr#rrr rrBr$r"rrs, J &&'0 r$rceZdZUdZdZded< d d fd ZddZddZddZ dd Z ddfd Z dd Z xZ S)BackgroundBlockCachea Cache holding memory as a set of blocks with pre-loading of the next block in the background. Requests are only ever made ``blocksize`` at a time, and are stored in an LRU cache. The least recently accessed block is discarded when more than ``maxblocks`` are stored. If the next block is not in cache, it is loaded in a separate thread in non-blocking way. Parameters ---------- blocksize : int The number of bytes to store in each block. Requests are only ever made for ``blocksize``, so this should balance the overhead of making a request against the granularity of the blocks. fetcher : Callable size : int The total size of the file being cached. maxblocks : int The maximum number of blocks to cache for. The maximum memory use for this cache is then ``blocksize * maxblocks``. backgroundrrct||||tj||z |_||_t |j||_td|_ d|_ d|_ tj|_y)NrO max_workers)rHr#rrrrrrrr_thread_executor_fetch_future_block_number _fetch_futurerr_fetch_future_lockrs r"r#zBackgroundBlockCache.__init__ss GT2yy !12 "#/0A0A9#M 2q A6:'37"+.."2r$c6|jjSrrr+s r"rzBackgroundBlockCache.cache_inforr$c<|j}|d=|d=|d=|d=|d=|S)Nrrrrrrrrs r"rtz!BackgroundBlockCache.__getstate__s<  ' ( $ % . / / " & ' r$c|jj|t|j|d|_t d|_d|_d|_tj|_ y)NrrOr) rprvrrrrrrrrrrrrs r"rwz!BackgroundBlockCache.__setstate__sZ U##/0A0A5CU#V 2q A*.'!"+.."2r$c|d}| |j}||jk\s||k\ry||jz}||jz}d}d}|j5|j|jJ|jj rbt jd|jj|jj|jd|_d|_nKt||jcxkxr|knc}|r&|j}|j}d|_d|_ddd|?t jd|jj|j|t||dzD]}|j||dz} |j5|j]| |jkrN|jj| s3| |_|jj!|j"| d|_ddd|j%||||S#1swYxYw#1swY+xYw)Nrr$z3BlockCache joined background fetch without waiting.z(BlockCache waiting for background fetch.rOasyncr)rrrrrdoner`rrr rrrbrrrsubmitrr) r!r'rgrrfetch_future_block_number fetch_future must_joinrend_block_plus_1s r"r)zBackgroundBlockCache._fetch sZ =E ;))C DII ##dnn4$..0$(!  $ $ 2!!-66BBB%%**,KK UV,,44**113T5T5T7;D3)-D&!%*::,+,!I !594S4S1'+'9'9 ;?7-1*7 2<  # KKB C  $ $ , ,##%'@  ""46F6JKL  $ $\ 2L ,a/  $ $ ""*$ 400>>?OP2B/%)%:%:%A%A%%'7&"   1-    o 2 2X  sC"H>7A*I >I IcB||jkDrtd|d|jd||jz}||jz}tj d|||xj ||z z c_|xj dz c_t|!||}|S)rrrr_z!BlockCache fetching block (%s) %drO) rrrr`rr rrHr))r!rlog_infor'rgrr1s r"rz!BackgroundBlockCache._fetch_blockVs $,, & /))-a9  t~~-dnn$ 7<P ""cEk1" 1s3r$c ||jz}||jz}|xjdz c_||k(r|j|}|||S|j||dg}|jt |jt |dz||j |j|d|dj|Srrrs r"rz BackgroundBlockCache._read_cachehsDNN* & ! !1 1,,-?@E7+ +++,>? KLC JJ,,,q02BC  JJt//0@A(7K L88C= r$rrr ryr<rx)sync)rr5r#r=r7r;r)r2r>r?r@rrAr#rrtrwr)rrr|r}s@r"rrs2'D-&MO 3 3'. 369 3FI 3  3 53J X$(!(!"(!8;(!OR(! (!r$rz!dict[str | None, type[BaseCache]]cachescr|j}|s |tvrtd|dt||t|<y)z'Register' cache implementation. Parameters ---------- clobber: bool, optional If set to True (default is False) - allow to overwrite existing entry. Raises ------ ValueError zCache with name z is already known: N)rr&r)clsclobberrs r"register_cacher*s= 88D tv~+D83Fvd|nUVVF4Lr$)F)r(ztype[BaseCache]r)rr7r8). __future__rrrloggingrrSrrconcurrent.futuresrrtypingrrrr r r r r rrEtyping_extensionsrrr getLoggerr`r5r;r6rrDrrrrrrrrr&rAr*crBr$r"r2s]" 9   +#A A CL   8 $ C:u$ %A A HS& S&l+%Y+%\+,i+,\F!F!RbbJ%y%0b1 b1J971a4=9xK!9K!` )-) (   A1 r$