last updated: Dec 17, 2027
11 minute read
Lessons in Performance: Caching, Batching, and Fuzzy Matching
tl;dr: I built my own fuzzy finder called ff.nvim
About a half a year ago, I started down a rabbit hole of building frecency-based fuzzy finders. At
the time, I was using fzf-lua as my primary picker, and I
was interested in modifying fzf's rankings to consider frecency scores (i.e. how frequently and
recently a file has been opened).
I began by using fzf-lua's fzf_exec function to shell out to
fre, a command-line frecency tracker. But I try to eliminate
external dependencies when I can, and so I began writing my own lua implementation of the
algorithm fre uses internally. The result was
fzf-lua-frecency, a plugin for fzf-lua.
The plugin's frecency logic worked as follows (modified from the README):
- When opened, files are given a score of `1`. This score decays exponentially over time with a half-life of 30 days - i.e. a score of `1` will decay to `0.5` in 30 days.
- Scores are not stored directly. Instead, an `mpack`-encoded file keeps track of the `date_at_score_one` for each file, which represents the time at which the file's score will decay to `1`. Using the `date_at_score_one`, the current time, and the decay rate, we can derive a file's current score.
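To make that derivation concrete, here's a minimal sketch of the math in Lua (the function names are mine, and I'm assuming each open adds `1` to the file's current score; the plugin's internals may differ in the details):

```lua
-- A sketch of deriving scores from date_at_score_one. Assumes each
-- open adds 1 to the current score; names are illustrative.
local half_life = 30 * 24 * 60 * 60 -- 30 days, in seconds

-- The score is exactly 1 at date_at_score_one: higher before, lower after.
local function current_score(date_at_score_one, now)
  return 2 ^ ((date_at_score_one - now) / half_life)
end

-- Opening a file bumps its score by 1, then stores the updated
-- date_at_score_one: the time at which the new score decays back to 1.
local function on_open(date_at_score_one, now)
  local new_score = current_score(date_at_score_one, now) + 1
  return now + half_life * math.log(new_score) / math.log(2)
end

-- first open: score is 1, so date_at_score_one is simply now
local d1 = os.time()
-- a second open one day later pushes date_at_score_one further out
local d2 = on_open(d1, d1 + 24 * 60 * 60)
vim.print(current_score(d2, d1 + 24 * 60 * 60)) -- ~1.98 (0.98 decayed + 1)
```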
The frecency algorithm was straightforward to implement, but the combination of using fzf to
fuzzily filter frecency-ranked files had some caveats I didn't expect:
By default, `fzf` will filter out results based on the current input and sort the results to display the most relevant items first. With the `--no-sort` option enabled, `fzf` will continue to filter out results based on the current input, but it will not sort the list of results itself as you type. `fzf-lua-frecency` defaults `--no-sort` to `true` since the list provided to `fzf` is already sorted: frecent files first (in order), everything else after. As a result, with `--no-sort` enabled, frecent items will always rank first in the list of results - even if another entry has a better fuzzy score based on the current input.
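For a concrete picture, handing a pre-sorted list to fzf with sorting disabled looks roughly like this via fzf-lua's `fzf_exec` (a sketch based on my reading of the fzf-lua API; the entries and action are placeholders of mine, not fzf-lua-frecency's actual code):

```lua
-- A sketch: feed an already-frecency-sorted list to fzf and keep its
-- order stable. The entries below are placeholders; fzf-lua-frecency
-- builds the real list with frecent files first.
local entries = { 'frecent_a.lua', 'frecent_b.lua', 'other_c.lua' }

require('fzf-lua').fzf_exec(entries, {
  fzf_opts = {
    -- fzf still filters on each keystroke, but won't re-sort the
    -- results, so frecent files stay pinned to the top
    ['--no-sort'] = true,
  },
  actions = {
    ['default'] = function(selected)
      vim.cmd.edit(selected[1])
    end,
  },
})
```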
Given a user input, fzf-lua-frecency would eliminate entries based on their fuzzy scores and sort
entries based on their frecency scores. That worked, but what I really wanted was a fuzzy finder that
would rank files using both their fuzzy match quality and their frecency score together. The
difference may sound subtle, but the user experience is drastically different. For the former, the
order of the search results never changes: entries are only eliminated as you type, with frecency
always determining rank. For the latter, every keystroke can reorder the files completely, allowing
a file with a great fuzzy match to rank above a frecent file with a poor match. The ranking becomes
more responsive to what you're actually typing.
Around this time, I came across fff.nvim, a new fuzzy file finder with a great name. Inspired, I
started work on ff.nvim - I figured it would be a bit more
barebones and deserved one fewer f.
First attempt
At a high level, my plan was that the fuzzy finder would, for a given input:
- Pull the files in the `cwd`
- Calculate a file's "fuzzy" score against that input
- Calculate a file's existing frecency score
- Calculate a file's misc score (i.e. is the file open, is it modified, etc)
- Scale the values to the same min-max range
- Weigh each score by some multiple and add them together, i.e. `0.7 * fuzzy + 0.3 * (frecency + misc)` (sketched below)
- Sort the files based on their total score
- Render the files in their sorted order with filetype icons
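As a rough sketch of the scale-weigh-sort steps (the helper names and example values are mine, not ff.nvim's actual internals; only the `0.7`/`0.3` weighting comes from the plan above):

```lua
-- raw component scores for a couple of example files
local files = {
  { name = 'init.lua',   fuzzy = 180, frecency = 0.4, misc = 0 },
  { name = 'picker.lua', fuzzy = 100, frecency = 2.1, misc = 1 },
}

-- min-max scale one component across all files to [0, 1]
local function scale(key)
  local min, max = math.huge, -math.huge
  for _, f in ipairs(files) do
    min, max = math.min(min, f[key]), math.max(max, f[key])
  end
  for _, f in ipairs(files) do
    f[key] = max == min and 0 or (f[key] - min) / (max - min)
  end
end

scale('fuzzy'); scale('frecency'); scale('misc')

-- weigh the scaled components: fuzzy quality dominates, but
-- frecency and misc can lift a weaker fuzzy match
local function total_score(f)
  return 0.7 * f.fuzzy + 0.3 * (f.frecency + f.misc)
end

-- best-scoring files first
table.sort(files, function(a, b) return total_score(a) > total_score(b) end)
```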
Coming off the heels of fzf-lua-frecency, I initially wanted to build my fuzzy finder using fzf
as well. Using the --disabled flag, fzf can become a standalone file picker with no filtering or
sorting of its own - which was ideal, since I wanted to do that myself. junegunn has a great
walkthrough that uses fzf as a picker
for results supplied by rg. In my case, I would supply the picker results with a lua script:
nvim --headless -l [path-to-lua-script]. To render the fzf picker UI within Neovim, I used the
approach outlined in my last post where I open a terminal buffer which
runs the fzf command.
As a proof of concept it worked, but boy was it slow. The core issue was that every keystroke
executed a standalone lua script which fed the search results to fzf through stdout. But since
every script invocation was independent, there was no easy way to pass state between different
script calls - which meant there was no easy way to cache data from one keystroke to another. If I
wanted my fuzzy finder to be performant, I needed caching, and to cache, I needed to move past fzf
and build my own picker.
A basic picker UI
Building a basic picker UI turned out to be fairly straightforward, which I think is a testament to Neovim's API design and overall hackability. At a high level, my barebones picker:
- Creates two buffers: one for the user input, one for the results (`:h nvim_create_buf`)
- Opens two windows, setting the input window to a height of `1` (`:h nvim_open_win`, `:h nvim_win_set_height`)
- Creates an autocommand to close both windows when closing either (`:h nvim_create_autocmd`, `:h nvim_win_is_valid`, `:h nvim_win_close`)
- Populates the results buffer based on the current value of the input buffer (`:h TextChanged`, `:h TextChangedI`, `:h nvim_buf_get_lines`, `:h nvim_buf_set_lines`)
- Sets insert mode keymaps for the input buffer that perform some action in the context of the results buffer (`:h vim.keymap.set`, `:h nvim_win_call`)
Using this approach, I was able to write a basic picker in ~50 lines of code.
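To make that concrete, here's a condensed sketch of such a skeleton (my own simplified version; ff.nvim's actual picker handles more details like sizing, highlights, and cleanup):

```lua
-- A condensed picker skeleton: an input window stacked above a
-- results window, re-rendering results on every keystroke.
local function open_picker(get_results)
  local input_buf = vim.api.nvim_create_buf(false, true)
  local results_buf = vim.api.nvim_create_buf(false, true)

  local width = math.floor(vim.o.columns * 0.6)
  local results_win = vim.api.nvim_open_win(results_buf, false, {
    relative = 'editor', row = 3, col = 5, width = width,
    height = math.floor(vim.o.lines * 0.4), style = 'minimal', border = 'single',
  })
  local input_win = vim.api.nvim_open_win(input_buf, true, {
    relative = 'editor', row = 1, col = 5, width = width,
    height = 1, style = 'minimal', border = 'single',
  })

  -- when either window closes, close the other as well
  vim.api.nvim_create_autocmd('WinClosed', {
    pattern = { tostring(input_win), tostring(results_win) },
    callback = function()
      for _, win in ipairs { input_win, results_win } do
        if vim.api.nvim_win_is_valid(win) then
          vim.api.nvim_win_close(win, true)
        end
      end
    end,
  })

  -- repopulate the results buffer on every keystroke
  vim.api.nvim_create_autocmd({ 'TextChanged', 'TextChangedI' }, {
    buffer = input_buf,
    callback = function()
      local input = vim.api.nvim_buf_get_lines(input_buf, 0, 1, false)[1]
      vim.api.nvim_buf_set_lines(results_buf, 0, -1, false, get_results(input))
    end,
  })

  -- <CR> in the input buffer opens the top result
  vim.keymap.set('i', '<CR>', function()
    local top = vim.api.nvim_buf_get_lines(results_buf, 0, 1, false)[1]
    vim.cmd.stopinsert()
    vim.api.nvim_win_close(input_win, true)
    if top and top ~= '' then vim.cmd.edit(top) end
  end, { buffer = input_buf })

  vim.cmd.startinsert()
end

-- usage: a naive substring "picker" over files in the cwd
open_picker(function(input)
  return vim.tbl_filter(function(f)
    return f:find(input, 1, true) ~= nil
  end, vim.fn.glob('**/*', false, true))
end)
```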
Caching
Now that I had a picker that could persist state between keystrokes, I could begin caching.
In my mind, there were three categories of how long data stays fresh - which is just another way of saying how long it can be cached.
- Data that's fresh for an entire vim session
- Data that's fresh for an entire picker session
- Data that's only fresh for a single keystroke
Referencing the pseudocode from earlier, here's how I thought about caching for each step:
- Pull the files in the `cwd`
  - Can be cached for the entire vim session, or at least once per `cwd` change
- Calculate a file's "fuzzy" score against that input
  - Can't be cached, depends on the keystroke
- Calculate a file's existing frecency score
  - Can be cached per picker session (can't be cached per vim session because a file's frecency score would update if it's opened in a picker session)
- Calculate a file's misc score (i.e. is the file open, is it modified, etc)
  - The list of open buffers can be cached per picker session
- Scale the values to the same min-max range
  - Can't be cached, depends on a score which depends on the keystroke
- Weigh each score by some multiple and add them together, i.e. `0.7 * fuzzy + 0.3 * (frecency + misc)`
  - Can't be cached, depends on scores which depend on the keystroke
- Sort the files based on their total score
  - Can't be cached, depends on the keystroke
- Render the files in their sorted order with filetype icons
  - Icons can be cached per picker session
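Here's a small sketch of how those two lifetimes could be wired up (the cache tables and the `DirChanged` invalidation are my own illustration, not necessarily how ff.nvim organizes it):

```lua
-- Two cache scopes with different lifetimes (names are illustrative)
local vim_session = {}    -- fresh for the whole vim session, e.g. cwd files
local picker_session = {} -- fresh for a single picker session

-- memoize a computation into the given cache table
local function cached(cache, key, compute)
  if cache[key] == nil then
    cache[key] = compute()
  end
  return cache[key]
end

-- wipe the picker-scoped cache whenever a new picker session starts
local function on_picker_open()
  picker_session = {}
end

-- the file list survives the vim session, but a cwd change invalidates it
vim.api.nvim_create_autocmd('DirChanged', {
  callback = function()
    vim_session.files = nil
  end,
})

-- usage: compute the cwd's files at most once per cwd
local files = cached(vim_session, 'files', function()
  return vim.fn.glob('**/*', false, true)
end)
```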
These caching optimizations helped, but the picker still felt slow - the UI was blocked as the results were calculated, which would prevent the latest keystrokes from rendering in the input buffer.
Batching to yield to the main thread
To avoid blocking the UI from updating as picker results were processed, I began to process the entries in small batches, yielding to the main thread after each batch. In other words, before batching, the flow looked like:
- User types `ab`
- The Neovim UI renders `a`
- Results for `a` are processed - slow!
- Results for `a` are rendered
- The Neovim UI renders `ab`
- Results for `ab` are processed
- Results for `ab` are rendered

After batching:
- User types `ab`
- The Neovim UI renders `a`
- Batch #1 of the results for `a` is processed - fast!
- The fuzzy finder yields back to the main thread
- The Neovim UI renders `ab`
- Batch #2 of the results for `a` is processed
- ...
- Results for `a` are rendered
- Batch #1 of the results for `ab` is processed
- Batch #2 of the results for `ab` is processed
- ...
- Results for `ab` are rendered
Yielding to the main thread sounds complex but is actually quite simple: finish the current function
and schedule the next batch with `vim.schedule`. A simple batching function can look like:
```lua
local batch_size = 2
local step

--- @generic T
--- @param list T[]
--- @param on_iteration fun(entry: T):nil
--- @param on_complete fun():nil
step = function(list, on_iteration, on_complete, start)
  vim.print('step')
  start = start or 1
  for i = start, math.min(#list, start + batch_size - 1) do
    on_iteration(list[i])
  end
  start = start + batch_size
  if start > #list then
    on_complete()
  else
    vim.schedule(function() step(list, on_iteration, on_complete, start) end)
  end
end

step(
  { 1, 2, 3, 4, 5 },
  function(entry) vim.print(entry) end,
  function() vim.print "completed" end
)

-- step
-- 1
-- 2
-- step
-- 3
-- 4
-- step
-- 5
-- completed
```
I'm working on another post about improving the ergonomics of batching functions - stay tuned.
An unexpected bottleneck
With caching and batching in place, average processing times were significantly down. However, I began to notice an interesting pattern when inspecting the logs I had set up: processing would take significantly longer for a short input than for a lengthier one. I found this surprising, since the fuzzy algorithm should need to compare the input against every file regardless of the input's length. Moreover, a shorter input has fewer letters to take into account during those comparisons - it should be faster, not slower.
I investigated further by having Claude ingest `matchfuzzypos`'s source code (I'm not too
comfortable reading C), which found that the algorithm is much quicker to filter out results that
don't have a fuzzy match at all than it is to calculate the score and indices for results which
do have a fuzzy match. For short inputs, nearly every file will have a fuzzy match against the
user input - more files are going to contain an `a` followed by a `b` than an `a` followed by a `b`
followed by a `c`. For lengthier inputs, very few files will have a fuzzy match against the user
input, which lets the algorithm speed along to the files that do have a match. So, how to speed
up processing for shorter inputs?
The solution I came up with is simple yet effective: I cap the number of files that are fuzzy
matched at a fixed constant. Once `n` files are found to have a fuzzy match, I skip the remaining
files and move on to sorting and rendering.
Although this heuristic has the potential to miss the "target" file (the file that the user is searching for), I've found that it doesn't in practice. For short inputs, it's unlikely that the needle will have a fuzzy score ranking it higher than its haystack neighbors - it's hard to locate a file with a short user input based on fuzzy matching alone.
That's where frecency scores come into play: a file with a high frecency score can very well rank higher than its non-frecent neighbors - even with a very short user input - if it's been opened frequently and recently enough. So to ensure that frecent files are considered even with the capping heuristic in place, I simply calculate fuzzy scores for files with an existing frecency score before calculating scores for non-frecent files.
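Put together, a capped, frecency-first matching pass might look like this sketch (`MAX_MATCHES` and `fuzzy_score` are hypothetical stand-ins, not ff.nvim's actual names):

```lua
-- A sketch of the capping heuristic. MAX_MATCHES and fuzzy_score are
-- hypothetical; ff.nvim's actual names and constants may differ.
local MAX_MATCHES = 1000

--- @param files string[] ordered with frecent files first
--- @param input string the current user input
--- @param fuzzy_score fun(file: string, input: string): number
local function fuzzy_match_capped(files, input, fuzzy_score)
  local matched = {}
  for _, file in ipairs(files) do
    local score = fuzzy_score(file, input)
    if score > 0 then
      matched[#matched + 1] = { file = file, score = score }
      -- stop early once enough matches are found; because frecent
      -- files are scored first, they are never skipped by the cap
      if #matched >= MAX_MATCHES then break end
    end
  end
  return matched
end
```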
This solution worked great: benchmarks for shorter inputs were down to the same speed as lengthier inputs, and I was ready to call it a day.
Conclusion
With all the optimizations in place, I average ~20ms on a codebase of 60k files on my local machine - a speed which feels pretty great. It took a bit to get there, but the results are worth it.
Caching eliminated wasted computation, batching kept the UI responsive, and the frecency-first heuristic turned the worst case (short inputs) into just another average case. None of these optimizations alone would have been enough, but together they transformed an interesting prototype into something I reach for on the daily. It's stable, fast, and, most importantly, something I wrote myself. Thanks for reading.