last updated: Dec 17, 2027
11 minute read
Lessons in Performance: Caching, Batching, and Fuzzy Matching
tl;dr: I built my own fuzzy finder called ff.nvim
About a half a year ago, I started down a rabbit hole of building frecency-based fuzzy finders. At
the time, I was using fzf-lua as my primary picker, and I
was interested in modifying fzf's rankings to consider frecency scores (i.e. how frequently and
recently a file has been opened).
I began by using fzf-lua's fzf_exec function to shell out to
fre, a command-line frecency tracker. But I try to eliminate
external dependencies when I can, and so I began writing my own lua implementation of the
algorithm fre uses internally. The result was
fzf-lua-frecency, a plugin for fzf-lua.
The plugin's frecency logic worked as follows (modified from the README):
- When opened, files are given a score of `1`. This score decays exponentially over time with a half-life of 30 days - i.e. a score of `1` will decay to `0.5` in 30 days.
- Scores are not stored directly. Instead, an `mpack`-encoded file keeps track of the `date_at_score_one` for each file, which represents the time at which the file's score will decay to `1`. Using the `date_at_score_one`, the current time, and the decay rate, we can derive a file's current score.
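To make that derivation concrete, here's a minimal sketch of the math in Lua (the function names are mine, and I'm assuming each open adds `1` to the file's current score; the plugin's internals may differ in the details):

```lua
-- A sketch of deriving scores from date_at_score_one. Assumes each
-- open adds 1 to the current score; names are illustrative.
local half_life = 30 * 24 * 60 * 60 -- 30 days, in seconds

-- The score is exactly 1 at date_at_score_one: higher before, lower after.
local function current_score(date_at_score_one, now)
  return 2 ^ ((date_at_score_one - now) / half_life)
end

-- Opening a file bumps its score by 1, then stores the updated
-- date_at_score_one: the time at which the new score decays back to 1.
local function on_open(date_at_score_one, now)
  local new_score = current_score(date_at_score_one, now) + 1
  return now + half_life * math.log(new_score) / math.log(2)
end

-- first open: score is 1, so date_at_score_one is simply now
local d1 = os.time()
-- a second open one day later pushes date_at_score_one further out
local d2 = on_open(d1, d1 + 24 * 60 * 60)
vim.print(current_score(d2, d1 + 24 * 60 * 60)) -- ~1.98 (0.98 decayed + 1)
```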
The frecency algorithm was straightforward to implement, but the combination of using fzf to
fuzzily filter frecency-ranked files had some caveats I didn't expect:
By default, `fzf` will filter out results based on the current input and sort the results to display the most relevant items first. With the `--no-sort` option enabled, `fzf` will continue to filter out results based on the current input, but it will not sort the list of results itself as you type. `fzf-lua-frecency` defaults `--no-sort` to `true` since the list provided to `fzf` is already sorted: frecent files first (in order), everything else after. As a result, with `--no-sort` enabled, frecent items will always rank first in the list of results - even if another entry has a better fuzzy score based on the current input.
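For a concrete picture, handing a pre-sorted list to fzf with sorting disabled looks roughly like this via fzf-lua's `fzf_exec` (a sketch based on my reading of the fzf-lua API; the entries and action are placeholders of mine, not fzf-lua-frecency's actual code):

```lua
-- A sketch: feed an already-frecency-sorted list to fzf and keep its
-- order stable. The entries below are placeholders; fzf-lua-frecency
-- builds the real list with frecent files first.
local entries = { 'frecent_a.lua', 'frecent_b.lua', 'other_c.lua' }

require('fzf-lua').fzf_exec(entries, {
  fzf_opts = {
    -- fzf still filters on each keystroke, but won't re-sort the
    -- results, so frecent files stay pinned to the top
    ['--no-sort'] = true,
  },
  actions = {
    ['default'] = function(selected)
      vim.cmd.edit(selected[1])
    end,
  },
})
```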
Given a user input, fzf-lua-frecency would eliminate entries based on their fuzzy scores and sort
entries based on their frecency scores. That worked, but what I really wanted was a fuzzy finder that
would rank files using both their fuzzy match quality and their frecency score together. The
difference may sound subtle, but the user experience is drastically different. For the former, the
order of the search results never changes: entries are only eliminated as you type, with frecency
always determining rank. For the latter, every keystroke can reorder the files completely, allowing
a file with a great fuzzy match to rank above a frecent file with a poor match. The ranking becomes
more responsive to what you're actually typing.
Around this time, I came across fff.nvim, a new fuzzy file finder with a great name. Inspired, I
started work on ff.nvim - I figured it would be a bit more
barebones and deserved one fewer f.
First attempt
At a high level, my plan was that the fuzzy finder would, for a given input:
- Pull the files in the `cwd`
- Calculate a file's "fuzzy" score against that input
- Calculate a file's existing frecency score
- Calculate a file's misc score (i.e. is the file open, is it modified, etc)
- Scale the values to the same min-max range
- Weigh each score by some multiple and add them together, i.e. `0.7 * fuzzy + 0.3 * (frecency + misc)` (sketched below)
- Sort the files based on their total score
- Render the files in their sorted order with filetype icons
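As a rough sketch of the scale-weigh-sort steps (the helper names and example values are mine, not ff.nvim's actual internals; only the `0.7`/`0.3` weighting comes from the plan above):

```lua
-- raw component scores for a couple of example files
local files = {
  { name = 'init.lua',   fuzzy = 180, frecency = 0.4, misc = 0 },
  { name = 'picker.lua', fuzzy = 100, frecency = 2.1, misc = 1 },
}

-- min-max scale one component across all files to [0, 1]
local function scale(key)
  local min, max = math.huge, -math.huge
  for _, f in ipairs(files) do
    min, max = math.min(min, f[key]), math.max(max, f[key])
  end
  for _, f in ipairs(files) do
    f[key] = max == min and 0 or (f[key] - min) / (max - min)
  end
end

scale('fuzzy'); scale('frecency'); scale('misc')

-- weigh the scaled components: fuzzy quality dominates, but
-- frecency and misc can lift a weaker fuzzy match
local function total_score(f)
  return 0.7 * f.fuzzy + 0.3 * (f.frecency + f.misc)
end

-- best-scoring files first
table.sort(files, function(a, b) return total_score(a) > total_score(b) end)
```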
Coming off the heels of fzf-lua-frecency, I initially wanted to build my fuzzy finder using fzf
as well. Using the --disabled flag, fzf can become a standalone file picker with no filtering or
sorting of its own - which was ideal, since I wanted to do that myself. junegunn has a great
walkthrough that uses fzf as a picker
for results supplied by rg. In my case, I would supply the picker results with a lua script:
nvim --headless -l [path-to-lua-script]. To render the fzf picker UI within Neovim, I used the
approach outlined in my last post where I open a terminal buffer which
runs the fzf command.
As a proof of concept it worked, but boy was it slow. The core issue was that every keystroke
executed a standalone lua script which fed the search results to fzf through stdout. But since
every script invocation was independent, there was no easy way to pass state between different
script calls - which meant there was no easy way to cache data from one keystroke to another. If I
wanted my fuzzy finder to be performant, I needed caching, and to cache, I needed to move past fzf
and build my own picker.
A basic picker UI
Building a basic picker UI turned out to be fairly straightforward, which I think is a testament to Neovim's API design and overall hackability. At a high level, my barebones picker:
- Creates two buffers: one for the user input, one for the results (`:h nvim_create_buf`)
- Opens two windows, setting the input window to a height of `1` (`:h nvim_open_win`, `:h nvim_win_set_height`)
- Creates an autocommand to close both windows when closing either (`:h nvim_create_autocmd`, `:h nvim_win_is_valid`, `:h nvim_win_close`)
- Populates the results buffer based on the current value of the input buffer (`:h TextChanged`, `:h TextChangedI`, `:h nvim_buf_get_lines`, `:h nvim_buf_set_lines`)
- Sets insert mode keymaps for the input buffer that perform some action in the context of the results buffer (`:h vim.keymap.set`, `:h nvim_win_call`)
Using this approach, I was able to write a basic picker in ~50 lines of code.
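To make that concrete, here's a condensed sketch of such a skeleton (my own simplified version; ff.nvim's actual picker handles more details like sizing, highlights, and cleanup):

```lua
-- A condensed picker skeleton: an input window stacked above a
-- results window, re-rendering results on every keystroke.
local function open_picker(get_results)
  local input_buf = vim.api.nvim_create_buf(false, true)
  local results_buf = vim.api.nvim_create_buf(false, true)

  local width = math.floor(vim.o.columns * 0.6)
  local results_win = vim.api.nvim_open_win(results_buf, false, {
    relative = 'editor', row = 3, col = 5, width = width,
    height = math.floor(vim.o.lines * 0.4), style = 'minimal', border = 'single',
  })
  local input_win = vim.api.nvim_open_win(input_buf, true, {
    relative = 'editor', row = 1, col = 5, width = width,
    height = 1, style = 'minimal', border = 'single',
  })

  -- when either window closes, close the other as well
  vim.api.nvim_create_autocmd('WinClosed', {
    pattern = { tostring(input_win), tostring(results_win) },
    callback = function()
      for _, win in ipairs { input_win, results_win } do
        if vim.api.nvim_win_is_valid(win) then
          vim.api.nvim_win_close(win, true)
        end
      end
    end,
  })

  -- repopulate the results buffer on every keystroke
  vim.api.nvim_create_autocmd({ 'TextChanged', 'TextChangedI' }, {
    buffer = input_buf,
    callback = function()
      local input = vim.api.nvim_buf_get_lines(input_buf, 0, 1, false)[1]
      vim.api.nvim_buf_set_lines(results_buf, 0, -1, false, get_results(input))
    end,
  })

  -- <CR> in the input buffer opens the top result
  vim.keymap.set('i', '<CR>', function()
    local top = vim.api.nvim_buf_get_lines(results_buf, 0, 1, false)[1]
    vim.cmd.stopinsert()
    vim.api.nvim_win_close(input_win, true)
    if top and top ~= '' then vim.cmd.edit(top) end
  end, { buffer = input_buf })

  vim.cmd.startinsert()
end

-- usage: a naive substring "picker" over files in the cwd
open_picker(function(input)
  return vim.tbl_filter(function(f)
    return f:find(input, 1, true) ~= nil
  end, vim.fn.glob('**/*', false, true))
end)
```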
Caching
Now that I had a picker that could persist state between keystrokes, I could begin caching.
In my mind, there were three categories of how long data stays fresh - which is just another way of saying how long it can be cached.
- Data that's fresh for an entire vim session
- Data that's fresh for an entire picker session
- Data that's only fresh for a single keystroke
Referencing the pseudocode from earlier, here's how I thought about caching for each step:
- Pull the files in the `cwd`
  - Can be cached for the entire vim session, or at least once per `cwd` change
- Calculate a file's "fuzzy" score against that input
  - Can't be cached, depends on the keystroke
- Calculate a file's existing frecency score
  - Can be cached per picker session (can't be cached per vim session because a file's frecency score would update if it's opened in a picker session)
- Calculate a file's misc score (i.e. is the file open, is it modified, etc)
  - The list of open buffers can be cached per picker session
- Scale the values to the same min-max range
  - Can't be cached, depends on a score which depends on the keystroke
- Weigh each score by some multiple and add them together, i.e. `0.7 * fuzzy + 0.3 * (frecency + misc)`
  - Can't be cached, depends on scores which depend on the keystroke
- Sort the files based on their total score
  - Can't be cached, depends on the keystroke
- Render the files in their sorted order with filetype icons
  - Icons can be cached per picker session
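Here's a small sketch of how those two lifetimes could be wired up (the cache tables and the `DirChanged` invalidation are my own illustration, not necessarily how ff.nvim organizes it):

```lua
-- Two cache scopes with different lifetimes (names are illustrative)
local vim_session = {}    -- fresh for the whole vim session, e.g. cwd files
local picker_session = {} -- fresh for a single picker session

-- memoize a computation into the given cache table
local function cached(cache, key, compute)
  if cache[key] == nil then
    cache[key] = compute()
  end
  return cache[key]
end

-- wipe the picker-scoped cache whenever a new picker session starts
local function on_picker_open()
  picker_session = {}
end

-- the file list survives the vim session, but a cwd change invalidates it
vim.api.nvim_create_autocmd('DirChanged', {
  callback = function()
    vim_session.files = nil
  end,
})

-- usage: compute the cwd's files at most once per cwd
local files = cached(vim_session, 'files', function()
  return vim.fn.glob('**/*', false, true)
end)
```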
These caching optimizations helped, but the picker still felt slow - the UI was blocked as the results were calculated, which would prevent the latest keystrokes from rendering in the input buffer.
Batching to yield to the main thread
To avoid blocking the UI from updating as picker results were processed, I began to process the entries in small batches, yielding to the main thread after each batch. In other words, before batching, the flow looked like:
- User types `ab`
- The Neovim UI renders `a`
- Results for `a` are processed - slow!
- Results for `a` are rendered
- The Neovim UI renders `ab`
- Results for `ab` are processed
- Results for `ab` are rendered

After batching:
- User types `ab`
- The Neovim UI renders `a`
- Batch #1 of the results for `a` is processed - fast!
- The fuzzy finder yields back to the main thread
- The Neovim UI renders `ab`
- Batch #2 of the results for `a` is processed
- ...
- Results for `a` are rendered
- Batch #1 of the results for `ab` is processed
- Batch #2 of the results for `ab` is processed
- ...
- Results for `ab` are rendered
Yielding to the main thread sounds complex but is actually quite simple: finish the current function
and schedule the next batch with `vim.schedule`. A simple batching function can look like:
```lua
local batch_size = 2
local step

--- @generic T
--- @param list T[]
--- @param on_iteration fun(entry: T):nil
--- @param on_complete fun():nil
step = function(list, on_iteration, on_complete, start)
  vim.print('step')
  start = start or 1
  for i = start, math.min(#list, start + batch_size - 1) do
    on_iteration(list[i])
  end
  start = start + batch_size
  if start > #list then
    on_complete()
  else
    vim.schedule(function() step(list, on_iteration, on_complete, start) end)
  end
end

step(
  { 1, 2, 3, 4, 5 },
  function(entry) vim.print(entry) end,
  function() vim.print "completed" end
)

-- step
-- 1
-- 2
-- step
-- 3
-- 4
-- step
-- 5
-- completed
```
I'm working on another post about improving the ergonomics of batching functions - stay tuned.
An unexpected bottleneck
With caching and batching in place, average processing times were significantly down. However, I began to notice an interesting pattern when inspecting the logs I had set up: processing would take significantly longer for a short input than for a lengthier one. I found this surprising, since the fuzzy algorithm should need to compare the input against every file regardless of the input's length. Moreover, a shorter input has fewer letters to take into account during those comparisons - it should be faster, not slower.
I investigated further by having Claude ingest `matchfuzzypos`'s source code (I'm not too
comfortable reading C), which found that the algorithm is much quicker to filter out results that
don't have a fuzzy match at all than it is to calculate the score and indices for results which
do have a fuzzy match. For short inputs, nearly every file will have a fuzzy match against the
user input - more files are going to contain an `a` followed by a `b` than an `a` followed by a `b`
followed by a `c`. For lengthier inputs, very few files will have a fuzzy match against the user
input, which lets the algorithm speed along to the files that do have a match. So, how to speed
up processing for shorter inputs?
The solution I came up with is simple yet effective: I cap the number of files that are fuzzy
matched at a fixed constant. Once `n` files are found to have a fuzzy match, I skip the remaining
files and move on to sorting and rendering.
Although this heuristic has the potential to miss the "target" file (the file that the user is searching for), I've found that it doesn't in practice. For short inputs, it's unlikely that the needle will have a fuzzy score ranking it higher than its haystack neighbors - it's hard to locate a file with a short user input based on fuzzy matching alone.
That's where frecency scores come into play: a file with a high frecency score can very well rank higher than its non-frecent neighbors - even with a very short user input - if it's been opened frequently and recently enough. So to ensure that frecent files are considered even with the capping heuristic in place, I simply calculate fuzzy scores for files with an existing frecency score before calculating scores for non-frecent files.
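Put together, a capped, frecency-first matching pass might look like this sketch (`MAX_MATCHES` and `fuzzy_score` are hypothetical stand-ins, not ff.nvim's actual names):

```lua
-- A sketch of the capping heuristic. MAX_MATCHES and fuzzy_score are
-- hypothetical; ff.nvim's actual names and constants may differ.
local MAX_MATCHES = 1000

--- @param files string[] ordered with frecent files first
--- @param input string the current user input
--- @param fuzzy_score fun(file: string, input: string): number
local function fuzzy_match_capped(files, input, fuzzy_score)
  local matched = {}
  for _, file in ipairs(files) do
    local score = fuzzy_score(file, input)
    if score > 0 then
      matched[#matched + 1] = { file = file, score = score }
      -- stop early once enough matches are found; because frecent
      -- files are scored first, they are never skipped by the cap
      if #matched >= MAX_MATCHES then break end
    end
  end
  return matched
end
```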
This solution worked great: benchmarks for shorter inputs were down to the same speed as lengthier inputs, and I was ready to call it a day.
Conclusion
With all the optimizations in place, I average ~20ms on a codebase of 60k files on my local machine - a speed which feels pretty great. It took a bit to get there, but the results are worth it.
Caching eliminated wasted computation, batching kept the UI responsive, and the frecency-first heuristic turned the worst case (short inputs) into just another average case. None of these optimizations alone would have been enough, but together they transformed an interesting prototype into something I reach for on the daily. It's stable, fast, and, most importantly, something I wrote myself. Thanks for reading.