jermp/sshash
📖 🧬 SSHash is a compressed, associative, exact, and weighted dictionary for k-mers.
102Stars
18Forks
28Claude Commits
C++Language
bioinformaticsdictionaryhashingk-merminimal-perfect-hash
First Claude commit: May 4, 2026Last Claude commit: 1mo agoDiscovered: May 7, 2026
Recent Claude Commits
docs(build-algorithm): rephrase 'O(buffer)' as 'proportional to the buffer size'
f19ffcc1mo agoauthor_emaildocs(build-algorithm): use real CLI flag names (-g, -d)
730f0ad1mo agoauthor_emaildocs: add build-algorithm.md describing the streaming build pipeline
5aff9f71mo agoauthor_emailalways build via streaming-save; mmap the saved file for --check
033d59f1mo agoauthor_emailremove dead bucket_type / minimizers_tuples_iterator
659b9561mo agoauthor_emailbuild_stats: format step timings as seconds with [sec] unit
030f1d01mo agoauthor_emailfinalize_stats: print total bits/kmer also in streaming-save flow
d8328511mo agoauthor_emailRevert "build: add --no-streaming-save flag for in-RAM save path"
d7dc21d1mo agoauthor_emailbuild: add --no-streaming-save flag for in-RAM save path
a35c3641mo agoauthor_emailfactor out buffered_record_stream<T>; remove duplicated read loops
b3e49c91mo agoauthor_emailbump pthash submodule to master tip a95e814
365758b1mo agoauthor_emailtighten ram-proportional buffer caps from ram/4 to ram/8
c550d531mo agoauthor_emailbump pthash to claude/fix-pthash-memory-estimate-NhgTI
f68fa771mo agoauthor_emailclamp --ram-limit to a 4 GiB floor
8e4b0d81mo agoauthor_emailmphf: scale avg_partition_size to honor -t; cap only when pathological
869f9011mo agoauthor_emailmphf thread cap: only kick in when budget would actually be exceeded
e18b9b41mo agoauthor_emailcap pthash mphf num_threads by --ram-limit
dddee471mo agoauthor_emailspill the codewords + per-skew-partition MPHFs to disk
2c73e091mo agoauthor_emailspill the step-7 compact_vectors to disk; concatenate at save
27c71e81mo agoauthor_emailstrings_offsets: stream to disk during build
fe443261mo agoauthor_emailremove all remaining mmap from the SSHash build path
a6ac6141mo agoauthor_emailstep 7.1 + 7.2 phase A: drop mmap; single ifstream pass
5f9ec801mo agoauthor_emailstep 7.2 phase C: stream per-partition kmers from disk; external-memory MPHF
03012011mo agoauthor_emailstep 7.1: drop redundant tuples copy; point bucket_type into mmap
e5d26121mo agoauthor_emailstreaming dictionary save (no full strings in RAM at any point)
54e98eb1mo agoauthor_emailstrings: stream to disk during build, read via small windows
63ea2bd1mo agoauthor_emailstep 7.2: stream over strings instead of random access
259b7fc1mo agoauthor_emailfixed issue with minimizers_tuples_iterator
54aa00b1mo agoauthor_email