Spooky Allocator Issues and Fixes
Recently we started noticing performance issues in the main branch of Ceph that were ultimately traced back to a commit from last summer that changed parts of our AVL and hybrid disk allocator implementations in BlueStore. Strangely, the issue affected only some of the NVMe drives in our test lab and not others. The quick fix was to always update and save the allocator's cursor position, even when a search fails, so that we don't repeat the same failing search in fast-fit mode for every allocation request. An interesting offshoot of this is that it may be much nicer to limit fast-fit searches by elapsed time rather than by byte distance or number of iterations.
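
To make the two ideas above concrete, here is a minimal sketch of a fast-fit search that saves its cursor even on failure and gives up after a time budget. This is a toy model, not Ceph's allocator code: `ToyAllocator`, `fast_fit`, the `std::map` of free extents, and the `budget` parameter are all hypothetical names chosen for illustration, and a real allocator would fall back to another strategy (such as best-fit) when the fast-fit search returns nothing.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <map>
#include <optional>

// Hypothetical toy allocator for illustration only; not Ceph's AvlAllocator.
struct ToyAllocator {
  using Clock = std::chrono::steady_clock;

  std::map<uint64_t, uint64_t> free_extents;  // offset -> length, sorted by offset
  uint64_t cursor = 0;                        // where the next fast-fit search starts

  // Walk forward from the cursor and take the first extent that is large
  // enough. The search stops once the time budget has expired.
  std::optional<uint64_t> fast_fit(uint64_t want, Clock::duration budget) {
    assert(want > 0);
    const auto deadline = Clock::now() + budget;

    for (auto it = free_extents.lower_bound(cursor);
         it != free_extents.end(); ++it) {
      if (it->second >= want) {
        const uint64_t off = it->first;
        const uint64_t len = it->second;
        free_extents.erase(it);
        if (len > want) {
          free_extents.emplace(off + want, len - want);  // keep the remainder
        }
        cursor = off + want;
        return off;
      }
      // Save progress even though this extent was too small, so the next
      // request does not rescan the same region.
      cursor = it->first + it->second;
      if (Clock::now() >= deadline) {
        return std::nullopt;  // out of time; caller falls back to another strategy
      }
    }
    cursor = 0;  // reached the end; the next search starts from the beginning
    return std::nullopt;
  }
};
```

With a zero budget the search inspects one extent, records how far it got, and returns; a later call with a larger budget resumes from that saved position rather than starting over. Bounding the search by time rather than by bytes or iterations means the cost of a failed search stays roughly the same regardless of how fragmented the free space is or how fast the tree walk happens to be on a given machine.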