## Work

• Not that much this week, again; though it’s because I was covering support on Tuesday, and then off sick on Wednesday and Thursday.

• I changed the metric we use to evaluate search performance from DCG to nDCG, and wrote a long commit message about why, which got me wondering what my most well-explained commit was. I came up with this script to rank my commits by the ratio of diff size to commit message size:

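for hash in $(git log --format='%H' --no-merges --author=mike@barrucadu.co.uk); do
  # size of the commit message alone, in bytes
  msg_len=$(git show --format='%B' --no-patch "$hash" | wc -c)
  # size of the commit message plus the diff, in bytes
  full_len=$(git show --format='%B' "$hash" | wc -c)
  # diff size divided by message size, then the one-line summary
  echo -e "$(echo "($full_len - $msg_len) / $msg_len" | bc -l)\t$(git show --oneline --no-patch "$hash")"
done | sort -h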


Smaller numbers mean more explanation than diff. My most well-explained commit in search-api is:

commit f9410c9a7c4c8c4664125103d225eb83ddfba967
Date:   Fri Apr 5 12:04:09 2019 +0100

Bump limit of default sidekiq queue to 8

We've occasionally seen a build up of sidekiq jobs (in both search-api
and rummager).  I experimented by bumping the limit on the "default"
queue to 8 via the app console during one such spike: the jobs cleared
much faster than in rummager (which still had the limit of 4), and the
Elasticsearch search latency increased by maybe a couple of
milliseconds - it's hard to say because any increase is small enough
to be obscured by the natural variability of the metric.

These limits were added to solve a problem, but that problem occurred
with an almost totally different search architecture, so I think it's
worth experimenting with the limits a bit.  They could perhaps be
increased further, but let's see how this change performs for now.

diff --git a/config/sidekiq.yml b/config/sidekiq.yml
index 46a23a0b..1706eef3 100644
--- a/config/sidekiq.yml
+++ b/config/sidekiq.yml
@@ -12,4 +12,4 @@ production:
   - bulk
 :limits:
   bulk: 4
-  default: 4
+  default: 8

My least well-explained commits all seem to be refactoring commits, and that trend holds across a few different repositories (including non-work ones).

## Miscellaneous

• I got a letter from HMRC saying I’d paid £2037 too much tax last year, so as soon as I claimed that back I bought an Oculus Quest and some games: Beat Saber, I Expect You To Die, and Moss. It arrived on Saturday morning, and I’ve spent a lot of this weekend in VR. Unlike other VR headsets, the Quest is entirely self-contained and doesn’t need external sensors or a connection to a computer (though next month it is getting the ability to play PC VR games if connected by a cable). I’m really impressed with how well it works, though some things—like browsing the web in VR—are a bit clunky, due to the limited number of buttons on the controllers. But for gaming it’s great.

• I read a few books this week:

  • The Fires of Heaven (by Robert Jordan), the fifth book in the Wheel of Time series.
  • The Dark Warrior and The Bloody Valkyrie, volumes 2 and 3 of Overlord (by Kugane Maruyama).

• I had an idea for how to make a searchable local academic database: for each paper, write a little metadata (bibliographic data + a list of references) and use tesseract to extract the contents of the PDF as text. Plug all that into Elasticsearch, and you’ve got your own academic database. I had a go, and it seems like it would work pretty well.
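
A minimal sketch of what that pipeline could look like, not the code I actually wrote. Everything specific here is an assumption: the metadata is a hand-written JSON file per paper, `pdftoppm` (from poppler-utils) turns the PDF pages into images because tesseract reads images rather than PDFs, `jq` 1.6 or later merges the OCR text into the metadata, and Elasticsearch is listening on `localhost:9200` with an index called `papers`. The function names `pdf_to_text` and `index_paper` are made up for illustration.

```shell
# Assumed defaults: a local Elasticsearch and an index called "papers".
ES_URL="${ES_URL:-http://localhost:9200}"
ES_INDEX="${ES_INDEX:-papers}"

# Print the OCR'd text of a PDF on stdout.
# $1 = path to the PDF.
pdf_to_text() (
  pdf="$1"
  workdir=$(mktemp -d)
  # Rasterise every page to $workdir/page-<n>.png at 300 dpi.
  pdftoppm -r 300 -png "$pdf" "$workdir/page"
  # OCR each page image in glob order; "stdout" makes tesseract print the text.
  for page in "$workdir"/page-*.png; do
    tesseract "$page" stdout
  done
  rm -rf "$workdir"
)

# Merge the OCR text into the metadata and index the result.
# $1 = document id, $2 = metadata JSON file, $3 = path to the PDF.
index_paper() (
  id="$1" meta="$2" pdf="$3"
  text_file=$(mktemp)
  pdf_to_text "$pdf" > "$text_file"
  # Add a "text" field to the metadata object, then PUT it as document $id.
  jq --rawfile text "$text_file" '. + {text: $text}' "$meta" |
    curl -sS -X PUT "$ES_URL/$ES_INDEX/_doc/$id" \
      -H 'Content-Type: application/json' --data-binary @-
  rm -f "$text_file"
)
```

Usage would be something like `index_paper some-paper-id some-paper.json some-paper.pdf`, after which the paper's metadata and its full text are both searchable from the one index. Keeping the list of references in the metadata file means citations can be queried too, but how to model those is left open here.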