path: root/ripple/fossil/src/chunker/mod.rs
Age | Commit message | Author
2022-06-12 | ripple/fossil/chunker: use hex_literal rather than hex escapes | edef
A little more explicit, and a bit more readable. Change-Id: Id462c46236e1de547aabd36409260bd1c99f6f88
2022-05-03 | ripple/fossil/chunker: clean up SAFETY comments | edef
stdlib code seems to place these before the blocks, so let's copy their style. Change-Id: Ic77ed43bc8c6807c5ddb126e624f263f8bca5b66
2022-05-02 | ripple/fossil/chunker: iterate smarter | edef
This drops the manual `len <= MIN_CHUNK_SIZE` check, and instead combines it into acquiring the to-be-scanned chunk. The pointer-based design doesn't need the iterator to be enumerated from the start of the buffer, so we don't need to use take/skip. Throughput improves about 5%. Change-Id: Ic430c7afde68bf1acfba1a2137a0b8ac064176ea
2022-05-02 | ripple/fossil/chunker: use const computation for DISCRIMINATOR | edef
While `const fn` isn't permitted float computation, regular `const` is. This deals with LLVM's reluctance to inline discriminator_from_average, without forcing us to hardcode a magic number. Change-Id: Ibdbfa4c733468517a2feff1ec0deedd1d9b70d47
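The const-item approach described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from mod.rs: `AVG_CHUNK_SIZE` and its value are hypothetical, and the coefficients are the ones casync's chunker is believed to use, so treat them as an assumption and check them against upstream.

```rust
/// Hypothetical target average chunk size; not taken from the fossil source.
const AVG_CHUNK_SIZE: usize = 64 * 1024;

/// Runtime version of the computation. The coefficients are assumed to
/// match casync's discriminator formula and should be verified upstream.
fn discriminator_from_average(avg: f64) -> u32 {
    (avg / (-1.42888852e-7 * avg + 1.33237515)) as u32
}

/// The same expression evaluated in a plain `const` item, so the result is
/// a compile-time constant regardless of what the optimiser decides to inline.
const DISCRIMINATOR: u32 = {
    let avg = AVG_CHUNK_SIZE as f64;
    (avg / (-1.42888852e-7 * avg + 1.33237515)) as u32
};
```

Keeping the formula in the source, rather than pasting its result, means the constant follows automatically if the average chunk size changes.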
2022-05-02 | ripple/fossil/chunker: remove hasher initialisation bounds check | edef
We already check for `self.buffer.len() <= MIN_CHUNK_SIZE`, but LLVM doesn't seem to notice. This boosts throughput by 35%. Change-Id: I1a0e07d276dcc285f8dec3149a629cb6e865c286
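The kind of change described here might look like the sketch below. The function names, constants, and window layout are hypothetical stand-ins, not the real `Chunker` internals; the point is only that an explicit length check can justify an unchecked slice access, with the SAFETY comment placed before the `unsafe` block.

```rust
// Hypothetical sizes, chosen small for illustration.
const MIN_CHUNK_SIZE: usize = 16;
const WINDOW_SIZE: usize = 8;

/// Checked variant: the slice index performs its own bounds check, which
/// the optimiser may or may not prove redundant.
fn initial_window_checked(buffer: &[u8]) -> Option<&[u8]> {
    if buffer.len() <= MIN_CHUNK_SIZE {
        return None;
    }
    Some(&buffer[MIN_CHUNK_SIZE - WINDOW_SIZE..MIN_CHUNK_SIZE])
}

/// Unchecked variant: relies on the length check made just above.
fn initial_window_unchecked(buffer: &[u8]) -> Option<&[u8]> {
    if buffer.len() <= MIN_CHUNK_SIZE {
        return None;
    }
    // SAFETY: buffer.len() > MIN_CHUNK_SIZE was checked above, and
    // WINDOW_SIZE <= MIN_CHUNK_SIZE, so the range lies within the buffer.
    Some(unsafe { buffer.get_unchecked(MIN_CHUNK_SIZE - WINDOW_SIZE..MIN_CHUNK_SIZE) })
}
```

Both variants return the same window for any input; only the generated code differs, so a change like this is worth keeping only if a benchmark shows a gain.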
2022-05-02 | ripple/fossil/chunker: factor out Chunker::cut | edef
Change-Id: I4fed55703cd02833f377ed0bbc659f3fcfdb949f
2022-05-02 | ripple/fossil/chunker: eliminate indexing and bounds checks | edef
This improves performance by ~12%. Change-Id: I5612b624da77b138fcfb44cbb439b0106580ed70
2022-05-02 | ripple/fossil/chunker: hardcode the discriminator | edef
This improves performance by ~17%. I had *expected* that rustc would have reduced it to a constant already, but alas. Change-Id: I5c15fe90244da64498d2d6562262db58242ffb24
2022-05-01 | ripple/fossil/chunker: DRY out super::* from the tests | edef
Change-Id: I7f7c5556dda64f0055f1b6d2da37c36b5c684092
2022-05-01 | ripple/fossil/chunker: DRY out WINDOW_SIZE | edef
Change-Id: Ib5a0bc2fb5b725dfe1f7f4557838529711407203
2022-05-01 | ripple/fossil/chunker: simplify and test Chunker::size_hint | edef
Full test coverage for fossil/chunker! :) Change-Id: I0436a266220bbed6d85c291dcca827d1770294dd
2022-05-01 | ripple/fossil/chunker: test early cutoff for <= MIN_CHUNK_SIZE | edef
Full test coverage for Chunker::next! Change-Id: I4f3dbad7e0a56f46d5714e0dd8e07f00ce255928
2022-05-01 | ripple/fossil/chunker: drop unused derives | edef
Free test coverage win! :) Change-Id: I9bab30e0f0da2810c770cbd8ba5603f0eb2b28e7
2022-05-01 | ripple/fossil/chunker: clarify MAX_CHUNK_SIZE test | edef
Change-Id: Ia5adb5a9056fd0e9ddcd8667c56129219b9d6f52
2022-05-01 | ripple/fossil/chunker: handle and test boundary condition correctness | edef
This ensures that MIN_CHUNK_SIZE-sized chunks can actually be emitted, and adds tests for both MIN_CHUNK_SIZE and MAX_CHUNK_SIZE chunks. The behaviour for all cases now verifiably matches casync. Change-Id: Ie0bfaf50ec02658069da83ebb30210e6e1963de6
2022-05-01 | ripple/fossil/chunker: use b-strings for test data | edef
This takes up 80% less vertical space for something that isn't readable to humans to begin with. Change-Id: I04aa27755f0b8d6cdaa83d766f8bf0ecbe3b7a46
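For readers unfamiliar with the notation, a byte-string literal packs the same bytes as an array literal into a single line. The data below is made up for illustration and is not the chunker's actual test vector.

```rust
/// Test bytes written as an array literal (hypothetical data).
const AS_ARRAY: [u8; 4] = [0xde, 0xad, 0xbe, 0xef];

/// The same bytes as a byte-string literal using hex escapes.
const AS_BSTRING: &[u8; 4] = b"\xde\xad\xbe\xef";
```

With rustfmt placing one array element per line for long arrays, the vertical savings grow with the size of the test data. The later switch to the `hex_literal` crate expresses the same bytes without the `\x` escapes; it needs an external dependency, so it is not shown here.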
2022-04-28 | ripple/fossil: deduplicate using content-defined chunking | edef
This implements casync-style content-defined chunking and deduplication. Change-Id: I42a9b98e1140bed462a5ae1e0aba508bebc9fa0e
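As a rough illustration of content-defined chunking in general, the sketch below cuts a buffer wherever a hash of the trailing window meets a condition, subject to minimum and maximum chunk sizes. It is deliberately simplified: it uses a plain rolling byte sum rather than the hash casync uses, and every constant and function name here is hypothetical rather than taken from mod.rs.

```rust
// Hypothetical parameters, chosen small so the behaviour is easy to inspect.
const MIN_CHUNK_SIZE: usize = 16;
const MAX_CHUNK_SIZE: usize = 64;
const WINDOW_SIZE: usize = 8;
const DISCRIMINATOR: u32 = 13;

/// Returns the length of the next chunk at the start of `data`.
fn find_cut(data: &[u8]) -> usize {
    // Inputs no longer than the minimum are emitted whole.
    if data.len() <= MIN_CHUNK_SIZE {
        return data.len();
    }
    let limit = data.len().min(MAX_CHUNK_SIZE);
    // Seed the rolling sum with the window that ends at MIN_CHUNK_SIZE,
    // so a chunk of exactly MIN_CHUNK_SIZE bytes is possible.
    let mut sum: u32 = data[MIN_CHUNK_SIZE - WINDOW_SIZE..MIN_CHUNK_SIZE]
        .iter()
        .map(|&b| u32::from(b))
        .sum();
    let mut end = MIN_CHUNK_SIZE;
    loop {
        // Cut when the window hash hits the target residue, or at the limit.
        if sum % DISCRIMINATOR == DISCRIMINATOR - 1 || end == limit {
            return end;
        }
        // Slide the window one byte: add the incoming byte, drop the oldest.
        sum = sum + u32::from(data[end]) - u32::from(data[end - WINDOW_SIZE]);
        end += 1;
    }
}

/// Splits `data` into consecutive chunks using `find_cut`.
fn chunks(mut data: &[u8]) -> Vec<&[u8]> {
    let mut out = Vec::new();
    while !data.is_empty() {
        let (head, tail) = data.split_at(find_cut(data));
        out.push(head);
        data = tail;
    }
    out
}
```

Because each boundary depends only on the bytes in a small window, an insertion early in a file shifts later boundaries along with the content instead of invalidating every subsequent chunk, which is what makes deduplication by chunk hash effective.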