path: root/ripple/fossil/src
Age | Commit message | Author
2022-05-03 | ripple/fossil: handle missing blobs more gracefully | edef
Change-Id: I8a928b57ecc81bea31d757e73b9ece9474628db4
2022-05-03 | ripple/fossil/mount: expose arbitrary directories by digest | edef
This makes the filesystem more like eg /nix/store.
~/src/ripple> ./target/release/add fossil
puma5z7rnb4tmnqk8ixgryobay9ifg8txh69635snkgx8dis6quo
~/src/ripple> ./target/release/mount &
~/src/ripple> ls mnt/puma5z7rnb4tmnqk8ixgryobay9ifg8txh69635snkgx8dis6quo
benches build.rs Cargo.toml src
Change-Id: Ic35f81ffec521f49ce2e4a414919e1ff717d7041
2022-05-03 | ripple/fossil/mount: make node references more flexible | edef
Essentially, memtree::Node becomes more of a NodeRef, and Node gets renamed to NodeBuf. This permits calling Node::find with an arbitrary owned Directory, without having to move it into the enum. Change-Id: I93838932a00f2e2073e3c7ddf7ce8d302ed4ed59
2022-05-03 | ripple/fossil/mount: memtree API cleanup | edef
This replaces the tuples with a DirectoryEntry struct. Change-Id: I42a49fee03f7abfac9143c48106ebeb964814ca9
2022-05-03 | ripple/fossil: fix clippy nit | edef
Change-Id: I2c9b2a15ac066ec2295d54665afd301f396efdc1
2022-05-03 | ripple/fossil: don't expose protobufs in the frontend | edef
Previously, the CLI took Directory protobufs as input or wrote them as output. Now we just deal in store hashes. Change-Id: I5e0f0f33929ede43d971080c33bdb865f1832b2e
2022-05-03 | ripple/fossil: add digest_from_str and digest_str | edef
These convert digests to and from zbase32 for user-facing uses. Change-Id: Ibd2db960044a97812d18d1a3c107521d78bd7f24
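For illustration only, a minimal z-base-32 round trip might look like the sketch below. The function names `encode` and `decode` are hypothetical stand-ins, not the signatures of `digest_str` and `digest_from_str`, and the padding rules (zero-fill on encode, reject non-canonical input on decode) are assumptions. Under those assumptions a 32-byte digest becomes 52 characters, which matches the length of the digests shown elsewhere in this log.

```rust
/// The z-base-32 alphabet.
const ALPHABET: &[u8; 32] = b"ybndrfg8ejkmcpqxot1uwisza345h769";

/// Encode bytes as z-base-32: 5 bits per character, most significant bits first.
/// (Hypothetical helper, not the crate's actual API.)
fn encode(data: &[u8]) -> String {
    let mut out = String::with_capacity((data.len() * 8 + 4) / 5);
    let mut acc: u32 = 0; // bit accumulator; only the low `bits` bits matter
    let mut bits: u32 = 0;
    for &byte in data {
        acc = (acc << 8) | u32::from(byte);
        bits += 8;
        while bits >= 5 {
            bits -= 5;
            out.push(ALPHABET[((acc >> bits) & 31) as usize] as char);
        }
    }
    if bits > 0 {
        // Pad the final partial group with zero bits.
        out.push(ALPHABET[((acc << (5 - bits)) & 31) as usize] as char);
    }
    out
}

/// Decode z-base-32, returning None on unknown characters or non-canonical padding.
fn decode(s: &str) -> Option<Vec<u8>> {
    let mut out = Vec::with_capacity(s.len() * 5 / 8);
    let mut acc: u32 = 0;
    let mut bits: u32 = 0;
    for c in s.bytes() {
        let value = ALPHABET.iter().position(|&a| a == c)? as u32;
        acc = (acc << 5) | value;
        bits += 5;
        if bits >= 8 {
            bits -= 8;
            out.push((acc >> bits) as u8); // truncation keeps the low 8 bits
        }
    }
    // Leftover bits must be fewer than one character's worth, and all zero.
    if bits >= 5 || acc & ((1 << bits) - 1) != 0 {
        return None;
    }
    Some(out)
}

fn main() {
    let digest = [0xabu8; 32];
    let text = encode(&digest);
    println!("{} ({} chars)", text, text.len());
    assert_eq!(decode(&text).as_deref(), Some(&digest[..]));
}
```

A linear scan over the alphabet is fine for short, user-facing strings; a 256-entry lookup table would be the usual choice if decoding were on a hot path.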
2022-05-03 | ripple/fossil/chunker: clean up SAFETY comments | edef
stdlib code seems to place these before the blocks, so let's copy their style. Change-Id: Ic77ed43bc8c6807c5ddb126e624f263f8bca5b66
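As a small sketch of the comment placement described above (the function `first_byte` is a made-up example, not code from the chunker), the `// SAFETY:` comment sits directly before the `unsafe` block and states why the invariant holds:

```rust
/// Return the first byte of a slice, if any. Hypothetical example function.
fn first_byte(data: &[u8]) -> Option<u8> {
    if data.is_empty() {
        return None;
    }
    // SAFETY: `data` is non-empty (checked above), so index 0 is in bounds.
    let byte = unsafe { *data.get_unchecked(0) };
    Some(byte)
}

fn main() {
    println!("{:?}", first_byte(b"fossil"));
}
```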
2022-05-03 | ripple/fossil/add: only accept a single directory arg | edef
This changes it from building an implicit top-level directory containing all its args, to simply accepting a directory. Change-Id: Iaf00e07d8568367b9eb27f365e8a2eaac3576974
2022-05-03 | ripple/fossil: expose add_directory | edef
Change-Id: I8e976279bd7aaaaf325129dc5c68a6ca5c750dc6
2022-05-03 | ripple/fossil: don't recursively fsync | edef
`add` takes about 10 seconds to run on a full LLVM tree, unless it were to spend 4 minutes mostly waiting for a series of tiny fsyncs. It did. Change-Id: I492604bae68e3472f8626a112a33d023947e0e86
2022-05-03 | ripple/fossil: make store path configurable | edef
Change-Id: Ic410619a6115a7059b79593c6fade38236d4e8c1
2022-05-03 | ripple/fossil: use clap | edef
This adds clap to all our binaries. Only add currently takes any args, but previously, the others did not reject args as they should. Change-Id: I6257fb3b218c624ee0124f6ed7313a579db88c4c
2022-05-02 | ripple/fossil/chunker: iterate smarter | edef
This drops the manual `len <= MIN_CHUNK_SIZE` check, and instead combines it into acquiring the to-be-scanned chunk. The pointer-based design doesn't need the iterator to be enumerated from the start of the buffer, so we don't need to use take/skip. Throughput improves about 5%. Change-Id: Ic430c7afde68bf1acfba1a2137a0b8ac064176ea
2022-05-02 | ripple/fossil/chunker: use const computation for DISCRIMINATOR | edef
While `const fn` isn't permitted float computation, regular `const` is. This deals with LLVM's reluctance to inline discriminator_from_average, without forcing us to hardcode a magic number. Change-Id: Ibdbfa4c733468517a2feff1ec0deedd1d9b70d47
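A sketch of the idea, not the chunker's actual code: a plain `const` item can evaluate floating-point arithmetic at compile time, so the value does not depend on the optimiser inlining a helper. The name `AVG_CHUNK_SIZE`, its value, and the formula constants (recalled from casync's discriminator-from-average calculation) are assumptions here.

```rust
/// Assumed average chunk size for this sketch (hypothetical value).
const AVG_CHUNK_SIZE: usize = 64 * 1024;

/// Computed once at compile time. The coefficients are recalled from casync
/// and should be treated as an assumption, not as the crate's real constants.
const DISCRIMINATOR: u32 =
    (AVG_CHUNK_SIZE as f64 / (-1.42888852e-7 * AVG_CHUNK_SIZE as f64 + 1.33237515)) as u32;

fn main() {
    println!("discriminator = {}", DISCRIMINATOR);
}
```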
2022-05-02 | ripple/fossil/chunker: remove hasher initialisation bounds check | edef
We already check for `self.buffer.len() <= MIN_CHUNK_SIZE`, but LLVM doesn't seem to notice. This boosts throughput by 35%. Change-Id: I1a0e07d276dcc285f8dec3149a629cb6e865c286
2022-05-02 | ripple/fossil/chunker: factor out Chunker::cut | edef
Change-Id: I4fed55703cd02833f377ed0bbc659f3fcfdb949f
2022-05-02 | ripple/fossil/chunker: eliminate indexing and bounds checks | edef
This improves performance by ~12%. Change-Id: I5612b624da77b138fcfb44cbb439b0106580ed70
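To illustrate the general technique (this is not the chunker's code, which other entries describe as pointer-based), the sketch below computes a sliding-window sum twice: once with indexing, where each `buf[i]` carries a potential bounds check, and once with zipped slice iterators, where there is no index to check. The rolling sum stands in for the real rolling hash, and the small `WINDOW_SIZE` is an assumption.

```rust
const WINDOW_SIZE: usize = 4; // assumed small window for the sketch

/// Indexed version: each `buf[i]` may compile to a bounds check.
fn rolling_sums_indexed(buf: &[u8]) -> Vec<u32> {
    let mut out = Vec::new();
    if buf.len() < WINDOW_SIZE {
        return out;
    }
    let mut sum = 0u32;
    for i in 0..WINDOW_SIZE {
        sum += u32::from(buf[i]);
    }
    out.push(sum);
    for i in WINDOW_SIZE..buf.len() {
        sum = sum + u32::from(buf[i]) - u32::from(buf[i - WINDOW_SIZE]);
        out.push(sum);
    }
    out
}

/// Iterator version: incoming and outgoing bytes are walked in lockstep.
fn rolling_sums_iter(buf: &[u8]) -> Vec<u32> {
    if buf.len() < WINDOW_SIZE {
        return Vec::new();
    }
    let (head, tail) = buf.split_at(WINDOW_SIZE);
    let mut sum: u32 = head.iter().map(|&b| u32::from(b)).sum();
    let mut out = vec![sum];
    // `tail` yields the byte entering the window, `buf` the byte leaving it.
    for (&incoming, &outgoing) in tail.iter().zip(buf) {
        sum = sum + u32::from(incoming) - u32::from(outgoing);
        out.push(sum);
    }
    out
}

fn main() {
    let data = b"content-defined chunking";
    assert_eq!(rolling_sums_indexed(data), rolling_sums_iter(data));
    println!("{:?}", rolling_sums_iter(data));
}
```

Whether the compiler actually removes the checks in the indexed version depends on the surrounding code, so any speedup should be confirmed with a benchmark.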
2022-05-02 | ripple/fossil/chunker: hardcode the discriminator | edef
This improves performance by ~17%. I had *expected* that rustc would have reduced it to a constant already, but alas. Change-Id: I5c15fe90244da64498d2d6562262db58242ffb24
2022-05-01 | ripple/fossil: benchmark Chunker using Criterion | edef
Performance hovers around 300MiB/s on my machine. Change-Id: I387ccbf065c0b667824ede0675e6a295722f6d4b
2022-05-01 | ripple/fossil/chunker: DRY out super::* from the tests | edef
Change-Id: I7f7c5556dda64f0055f1b6d2da37c36b5c684092
2022-05-01 | ripple/fossil/chunker: DRY out WINDOW_SIZE | edef
Change-Id: Ib5a0bc2fb5b725dfe1f7f4557838529711407203
2022-05-01 | ripple/fossil/chunker: simplify and test Chunker::size_hint | edef
Full test coverage for fossil/chunker! :) Change-Id: I0436a266220bbed6d85c291dcca827d1770294dd
2022-05-01 | ripple/fossil/chunker: remove Rolling::try_from_slice | edef
We never actually use this directly, and the resulting branch is test coverage noise. Change-Id: Id32b056ca0cd57965d829085d768012e5a9e05ce
2022-05-01 | ripple/fossil/chunker: test early cutoff for <= MIN_CHUNK_SIZE | edef
Full test coverage for Chunker::next! Change-Id: I4f3dbad7e0a56f46d5714e0dd8e07f00ce255928
2022-05-01 | ripple/fossil/chunker: drop unused derives | edef
Free test coverage win! :) Change-Id: I9bab30e0f0da2810c770cbd8ba5603f0eb2b28e7
2022-05-01 | ripple/fossil/chunker: clarify MAX_CHUNK_SIZE test | edef
Change-Id: Ia5adb5a9056fd0e9ddcd8667c56129219b9d6f52
2022-05-01 | ripple/fossil/chunker: handle and test boundary condition correctness | edef
This ensures that MIN_CHUNK_SIZE-sized chunks can actually be emitted, and adds tests for both MIN_CHUNK_SIZE and MAX_CHUNK_SIZE chunks. The behaviour for all cases now verifiably matches casync. Change-Id: Ie0bfaf50ec02658069da83ebb30210e6e1963de6
2022-05-01 | ripple/fossil/chunker: use b-strings for test data | edef
This takes up 80% less vertical space for something that isn't readable to humans to begin with. Change-Id: I04aa27755f0b8d6cdaa83d766f8bf0ecbe3b7a46
2022-04-28 | ripple/fossil: use to_be_bytes rather than write_u64 | edef
Change-Id: I7eb02482772f48ca9f486f514b89652a9c5730cd
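A minimal sketch of the standard-library call named in the subject line. `u64::to_be_bytes` returns a fixed `[u8; 8]`, so no writer or `Result` is involved. The helper name `encode_offset` and the idea that the value is an offset or key are assumptions for illustration.

```rust
/// Encode a u64 as 8 big-endian bytes (hypothetical helper name).
fn encode_offset(n: u64) -> [u8; 8] {
    n.to_be_bytes()
}

fn main() {
    let mut record = Vec::new();
    record.extend_from_slice(&encode_offset(42));
    println!("{:02x?}", record);
}
```

Big-endian encodings also compare bytewise in the same order as the numbers themselves, which can matter when they are used as keys in an ordered store.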
2022-04-28 | ripple/fossil: ensure durability of add_path | edef
sled doesn't actually promise not to eat your data until you invoke flush. This is observable under normal circumstances as add_path occasionally just not committing anything to the store. While we're at it, ensure we're syncing the chunks file data to disk, so the database *might* actually be consistent. We're not going for full crash-correctness yet, mostly for performance and complexity reasons. Change-Id: I6cc69a013dc18cd50df26e73801b185de596565c
2022-04-28 | ripple/fossil: deduplicate using content-defined chunking | edef
This implements casync-style content-defined chunking and deduplication. Change-Id: I42a9b98e1140bed462a5ae1e0aba508bebc9fa0e
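The following is a deliberately simplified sketch of content-defined chunking, not the casync algorithm or this crate's `Chunker`. It swaps in a gear-style shift-and-add hash for casync's windowed rolling hash, uses a bit mask in place of the discriminator, and picks tiny size limits so the behaviour is easy to observe. All names and constants are assumptions. The point it shows is that cut points depend on content rather than on fixed offsets, subject to minimum and maximum chunk sizes.

```rust
const MIN_CHUNK_SIZE: usize = 16; // assumed toy limits
const MAX_CHUNK_SIZE: usize = 256;
/// Cut when the top 6 hash bits are zero: roughly 1 in 64 positions.
const MASK: u32 = 0xFC00_0000;

/// Per-byte mixing, a stand-in for a table of random values.
fn mix(byte: u8) -> u32 {
    (u32::from(byte) + 1).wrapping_mul(0x9E37_79B1)
}

/// Length of the next chunk at the front of `data`.
fn cut(data: &[u8]) -> usize {
    if data.len() <= MIN_CHUNK_SIZE {
        return data.len();
    }
    let limit = data.len().min(MAX_CHUNK_SIZE);
    let mut hash: u32 = 0;
    for (i, &byte) in data[..limit].iter().enumerate() {
        // Old bytes shift out of the 32-bit hash, so only recent input matters.
        hash = (hash << 1).wrapping_add(mix(byte));
        if i + 1 >= MIN_CHUNK_SIZE && hash & MASK == 0 {
            return i + 1;
        }
    }
    limit
}

/// Split `data` into content-defined chunks.
fn chunks(mut data: &[u8]) -> Vec<&[u8]> {
    let mut out = Vec::new();
    while !data.is_empty() {
        let (chunk, rest) = data.split_at(cut(data));
        out.push(chunk);
        data = rest;
    }
    out
}

/// Deterministic pseudo-random test input (simple LCG).
fn sample_data(len: usize) -> Vec<u8> {
    let mut state: u32 = 12345;
    (0..len)
        .map(|_| {
            state = state.wrapping_mul(1_664_525).wrapping_add(1_013_904_223);
            (state >> 24) as u8
        })
        .collect()
}

fn main() {
    let data = sample_data(4096);
    let parts = chunks(&data);
    println!("{} bytes split into {} chunks", data.len(), parts.len());
}
```

Deduplication then follows from storing each chunk under a hash of its contents, so identical chunks are written only once.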
2022-04-28 | ripple/fossil/mount: don't serve short reads | edef
FUSE doesn't actually respect the usual read() contract, so this resulted in us serving truncated files. Change-Id: I8bdb0bd7f03162fb78774f3f84daeefc5ba5e3b1
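One common way to avoid handing back short reads is to keep reading until the buffer is full or the source reports end of file. The sketch below shows that loop against `std::io::Read`. It is an illustration under assumptions, not the mount code: `read_full` and the `Trickle` reader (which never returns more than 3 bytes per call) are hypothetical names.

```rust
use std::io::{self, Read};

/// Fill `buf` as far as possible; returns fewer bytes only at end of input.
fn read_full<R: Read>(reader: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match reader.read(&mut buf[filled..]) {
            Ok(0) => break, // end of input
            Ok(n) => filled += n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(filled)
}

/// A reader that returns at most 3 bytes per call, to simulate short reads.
struct Trickle<'a>(&'a [u8]);

impl Read for Trickle<'_> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = buf.len().min(3).min(self.0.len());
        buf[..n].copy_from_slice(&self.0[..n]);
        self.0 = &self.0[n..];
        Ok(n)
    }
}

fn main() -> io::Result<()> {
    let mut source = Trickle(b"hello world");
    let mut buf = [0u8; 8];
    let n = read_full(&mut source, &mut buf)?;
    println!("read {} bytes: {:?}", n, &buf[..n]);
    Ok(())
}
```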
2022-04-28 | ripple/fossil: outline write_blob_inner | edef
Change-Id: I1c4a5d16cf10c464f9835c961481da221aa0d12e
2022-04-26 | ripple/fossil: use the default sled tree for global metadata, not blobs | edef
Change-Id: I358a5c354c19cc8cb0a75629758fda476629406d
2022-04-25 | ripple/fossil/mount: implement incremental file reads | edef
Change-Id: Iae189c3107a6841bcbdd75bb57dde785f9548130
2022-04-25 | ripple/fossil: implement incremental blob reading | edef
Change-Id: I69ae53824e149133aa6bb61dda201f972c840b1f
2022-04-25 | ripple/fossil: store bao outboard tree in the blob metadata | edef
One step closer to genuine incremental blob reads. Change-Id: I796710820c1b69baad91a6dc65f9d7f0dee311d3
2022-04-25 | ripple/fossil: store blob metadata as protobuf | edef
Change-Id: I12c590d842471bf543f16fd21056224d8a7c0857
2022-04-24 | ripple/fossil/mount: implement stateful file handles | edef
This will primarily allow us to amortise metadata lookups. Change-Id: Ic92781bf1ded5af62f6e955322bb89623afb2061
2022-04-24 | ripple/fossil: implement io::Seek for RawBlob | edef
Change-Id: I625530fe2f4db89be5889e46f0a5ed50727c8cd1
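For readers unfamiliar with the trait, here is a minimal sketch of an `io::Seek` implementation on an in-memory reader. `MemBlob` is a hypothetical stand-in; how `RawBlob` actually tracks its position is not shown in this log.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Hypothetical in-memory blob with a read position.
struct MemBlob {
    data: Vec<u8>,
    pos: u64,
}

impl Seek for MemBlob {
    fn seek(&mut self, pos: SeekFrom) -> io::Result<u64> {
        let (base, offset) = match pos {
            SeekFrom::Start(n) => {
                self.pos = n;
                return Ok(n);
            }
            SeekFrom::End(n) => (self.data.len() as u64, n),
            SeekFrom::Current(n) => (self.pos, n),
        };
        // Apply the signed offset without wrapping in either direction.
        let target = if offset >= 0 {
            base.checked_add(offset as u64)
        } else {
            base.checked_sub(offset.unsigned_abs())
        };
        match target {
            Some(n) => {
                self.pos = n;
                Ok(n)
            }
            None => Err(io::Error::new(
                io::ErrorKind::InvalidInput,
                "seek to a negative or overflowing position",
            )),
        }
    }
}

impl Read for MemBlob {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Seeking past the end is allowed; reads there return 0 bytes.
        let start = self.pos.min(self.data.len() as u64) as usize;
        let n = buf.len().min(self.data.len() - start);
        buf[..n].copy_from_slice(&self.data[start..start + n]);
        self.pos += n as u64;
        Ok(n)
    }
}

fn main() -> io::Result<()> {
    let mut blob = MemBlob { data: b"hello world".to_vec(), pos: 0 };
    blob.seek(SeekFrom::End(-5))?;
    let mut tail = String::new();
    blob.read_to_string(&mut tail)?;
    println!("{}", tail);
    Ok(())
}
```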
2022-04-23 | ripple/fossil: implement a more lightweight blob store backend | edef
The sled chunk store works fine, but has pretty awful performance, since sled just doesn't make a particularly good large blob store. This change replaces it with a file-based store using sled purely as a metadata store. Adding LLVM's source tree takes about 120s before, and about 15s after. Change-Id: I5fb22ea79a006fa6bcf5351921038f57f2484112
2022-04-20 | ripple/fossil: prefer anyhow::Result over io::Result | edef
Change-Id: I4a94b84ef456b427422757a899fdce6198fd01a1
2022-04-19 | ripple/fossil: use bao to one-shot verify hashes | edef
Change-Id: I77ace8ee9f69ccb92afaa0a41d69538d28f11583
2022-04-19 | ripple: upgrade blake3 (0.3.8 -> 1.3.1) | edef
Change-Id: I75f2e0ff57e09b026fd1aaaeb86b041ddb8238f4
2022-04-19 | ripple/fossil: prepare for seekable, streaming blob reading | edef
This implements blob reading in terms of RawBlob, a fairly naive streaming blob reader. For now, we still only use it for simple one-shot reads. Change-Id: Iecd4f926412b474ca6f3dde8c6055c0c3781301f
2022-04-19 | ripple/fossil: add read_write test | edef
Change-Id: I88d13d9dd7055b8370706df7b3dd4479a0891399
2022-04-18 | ripple/fossil: use blake3::CHUNK_LEN as chunk size | V
This will pave the way for BLAKE3 verified streaming, so we won't have to read objects into memory in their entirety. Change-Id: Ic68dee2ad81448db4969b8c423f0876f0e0272e0
2022-04-15 | ripple/fossil/mount: drop bmap | edef
We'll never write a direct block-mapped filesystem, so this won't become relevant. Change-Id: I512ac44fc40f3969c2d54b93a1e2725628f46ed4
2022-04-15 | ripple/fossil/mount: move writing methods into their own section | edef
Change-Id: Ia5b63f3be625c738c58a915bc46114cac7acc0a5