Lei Huang

is a software engineer based in Berlin.
He enjoys solving hard problems through code.
He loves mountains, but circumstances have him currently wandering the flatlands of Brandenburg.
He speaks English, Chinese, und ein bisschen Deutsch.

Learn Batching From LLM

I was reviewing my old notebook and came across a piece of code that I took from somewhere many years ago. I asked Claude 3.7 Sonnet about this code, and here’s the conversation:

Lei: Take a look at the following code:

    let timeout = null;
    const queue = new Set();

    function process() {
      for (const task of queue) {
        task();
      }
      queue.clear();
      timeout = null;
    }

    function enqueue(task) {
      if (timeout === null) timeout = setTimeout(process, 0);
      queue.add(task);
    }

I’m not sure how this code is useful. One scenario I can think of is modifying the DOM. Doing DOM manipulation in one batch might avoid reflow. But I’m fuzzy on the details. Can you give me a concrete example? ...
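To make the batching behavior in the excerpt above a little more tangible, here is a minimal sketch. It is not the answer from the conversation, just an illustration under a few assumptions: the flush function is renamed to `flushQueue` (to avoid shadowing Node's global `process`), and `render`, `renderCount`, and `log` are hypothetical names made up for the demo.

```javascript
let timeout = null;
const queue = new Set();

// Runs every queued task once, then resets the queue and the timer handle.
function flushQueue() {
  for (const task of queue) {
    task();
  }
  queue.clear();
  timeout = null;
}

// Schedules a single flush per tick; a Set stores each function reference once.
function enqueue(task) {
  if (timeout === null) timeout = setTimeout(flushQueue, 0);
  queue.add(task);
}

// Hypothetical usage: the same render function is requested many times in one tick.
let renderCount = 0;
const render = () => {
  renderCount += 1;
};

for (let i = 0; i < 100; i++) {
  enqueue(render); // same reference every time, so it is queued only once
}

const log = [];
enqueue(() => log.push("second task")); // joins the same pending batch
```

Nothing runs while the loop is executing. Once the current call stack is done, the timer fires a single time, `render` runs once, and the second task runs in the same batch.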

Category Mapping with Embedding

Recently, I faced a daunting task: migrating all of our existing product deals to a brand new, more comprehensive, and standardized set of canonical product categories. This was critical for improving product discoverability, ensuring consistent marketing, and enabling better reporting. Think of it as moving from a somewhat disorganized, ad-hoc filing system to a meticulously organized, hierarchical library catalog. The problem? Our system had tens of thousands of deals, each with existing category assignments that were often inconsistent, incomplete, or simply didn’t map cleanly to the new structure. Manually re-categorizing everything was out of the question. It would have taken an absurd amount of time and been incredibly prone to errors. I needed an automated solution, but a simple keyword-based approach wouldn’t work. The nuances of product descriptions and the potential mismatches between the old and new categories demanded something far more intelligent. ...

Time-slicing With Coroutine

I wrote about time-slicing with the CPS technique in the last blog post. The solution I proposed has two drawbacks:

The control over task scheduling is too weak. Task slicing relies entirely on hacking the JavaScript engine’s event loop, and it’s impossible to arbitrarily pause and resume. This makes it impossible to time the slices precisely; you can only set them based on subjective experience (the example I provided uses 500 as the interval). However, a slice of 500 tasks might still be too long, blocking the main thread for too long. Or it might be too short, not fully utilizing the current call stack.

setTimeout’s timing is inaccurate, and the actual time interval will deviate. As a result, the delays of each task accumulate, significantly increasing the total task completion time.

An alternative solution: Coroutine

If a computational task can suspend itself and yield execution to other tasks, it’s a coroutine. ...
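As a rough illustration of the coroutine idea, here is a small sketch that uses a JavaScript generator as the task that suspends itself. It is not the implementation from the post; `sumTo`, `runSlice`, `runToCompletion`, and the injectable `now` clock are all hypothetical names chosen for this example.

```javascript
// A task written as a generator: every `yield` is a point where it can be paused.
function* sumTo(n) {
  let total = 0;
  for (let i = 1; i <= n; i++) {
    total += i;
    yield; // hand control back to whoever is driving the coroutine
  }
  return total;
}

// Advances the coroutine until it finishes or the time budget is spent.
// It always performs at least one step. `now` is injectable so the clock
// can be faked in tests.
function runSlice(coroutine, budgetMs, now = () => Date.now()) {
  const deadline = now() + budgetMs;
  let step = coroutine.next();
  while (!step.done && now() < deadline) {
    step = coroutine.next();
  }
  return step; // { done, value } of the last step taken
}

// Runs one slice, then lets the event loop breathe before the next slice.
function runToCompletion(coroutine, budgetMs, onDone) {
  const step = runSlice(coroutine, budgetMs);
  if (step.done) {
    onDone(step.value);
  } else {
    setTimeout(() => runToCompletion(coroutine, budgetMs, onDone), 0);
  }
}
```

Because the slice is bounded by elapsed time rather than by a fixed task count, the budget no longer has to be guessed from experience. A call such as `runToCompletion(sumTo(1e6), 8, console.log)` would work in slices of roughly 8 ms each.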

Time-slicing With Continuation-passing Style

Background story

In January this year, I applied for an overseas dev job. I was asked to finish a project as homework. The project required a ton of computations on the front end, which posed a challenge because I had to ensure that no operation blocked the main thread. I didn’t want to move the computation to a worker thread because the data was consumed in the main thread. Serializing and de-serializing a large set of data also has a performance cost. Furthermore, to improve indexing performance, I used Map to store the results, which is not JSON-serializable. ...

Typewriter Effect With RxJS

Background

I recently rewrote my blog website from scratch in Gatsby. This time, I didn’t use a starter template, so I had to make a lot of design decisions. When I wrote the bio section on the home page, initially I put a long heading there as a one-sentence introduction. As I was gazing at the screen, I felt something was wrong. It was too wordy. But after I took out a few words, I was not satisfied with what was left. ...

Implementing A Trie In JavaScript

Recently, I encountered a situation where I needed to perform text searches in a large set of data. Normally, I would do it in a filter function, but that would be slow when the list is too long. There’s a data structure for this situation. Meet the Trie! A Trie (pronounced "try") is a tree-like data structure created for efficient text searches.

How does it work

A Trie takes all words and rearranges them in a tree hierarchy. For example, the list of words ['abet', 'abode', 'abort'] will be transformed into a structure like this: ...
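For readers who want something to poke at, here is a minimal sketch of a Trie with prefix search, using the word list from the excerpt above. It is not the implementation from the post; the `Trie` class, its node shape (`children`, `isWord`), and the `insert` and `startsWith` methods are assumptions made for this illustration.

```javascript
class Trie {
  constructor() {
    this.root = { children: new Map(), isWord: false };
  }

  // Adds a word one character at a time, creating nodes as needed.
  insert(word) {
    let node = this.root;
    for (const ch of word) {
      if (!node.children.has(ch)) {
        node.children.set(ch, { children: new Map(), isWord: false });
      }
      node = node.children.get(ch);
    }
    node.isWord = true;
  }

  // Returns every stored word that begins with the given prefix.
  startsWith(prefix) {
    let node = this.root;
    for (const ch of prefix) {
      node = node.children.get(ch);
      if (!node) return []; // no stored word has this prefix
    }
    const results = [];
    const walk = (current, path) => {
      if (current.isWord) results.push(path);
      for (const [ch, child] of current.children) {
        walk(child, path + ch);
      }
    };
    walk(node, prefix);
    return results;
  }
}

const trie = new Trie();
for (const word of ["abet", "abode", "abort"]) {
  trie.insert(word);
}
```

A lookup only walks the nodes along the prefix and the subtree below it, so `trie.startsWith("abo")` never touches the branch that holds "abet".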

Introduction to Lenses in JavaScript

When I was reading Eric Elliott's article on Lenses, I was curious about how such beautiful magic can be fully implemented in JavaScript. It was a tough exploration. Many of the tutorials online are about Haskell, which cannot be easily translated to JavaScript. I read the source code of Ramda and finally grokked how it works. ...

Parabolic Curve Animation With RxJS

I came across this article (written in Chinese) the other day. It was about parabolic curve animation in vanilla JS. I wondered how this could be implemented with RxJS. Below is the result of my investigation. Imagine we take the perspective of a slow-motion camera. What humans see as a smooth animation is just an object placed at a different position at every fragment of a time period. This can be expressed as a ‘stream’ of object positions. The mechanism of an animation can be simplified to mapping every fragment of a time period to a position in space. In practice, time cannot be fragmented indefinitely; what we want is an approximation of an “atomic time unit”. The browser provides us with a tool to achieve this: the requestAnimationFrame API. ...
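The idea of mapping each time fragment to a position can be sketched without RxJS or a browser. The snippet below is only an illustration, not the approach from the post: `positionAt`, `frameTimes`, and the velocity and gravity values are hypothetical, and the fixed 60 frames per second stands in for the timestamps that requestAnimationFrame would commonly supply.

```javascript
// Maps an elapsed time (ms) to a point on a parabola.
// vx, vy (px/s) and gravity (px/s^2) are made-up values for the demo.
function positionAt(elapsedMs, { vx = 200, vy = -400, gravity = 800 } = {}) {
  const t = elapsedMs / 1000; // seconds
  return {
    x: vx * t,
    y: vy * t + 0.5 * gravity * t * t,
  };
}

// Stand-in for requestAnimationFrame: evenly spaced frame timestamps.
function frameTimes(durationMs, fps = 60) {
  const frameCount = Math.floor((durationMs * fps) / 1000);
  return Array.from({ length: frameCount + 1 }, (_, i) => (i * 1000) / fps);
}

// The "stream" of positions: one point per frame.
const positions = frameTimes(1000).map((t) => positionAt(t));
```

In a browser, each timestamp would come from a requestAnimationFrame callback, and each resulting point would be applied to the element, for example through a CSS transform.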