Rui-Jie Zhu 朱芮捷

Ph.D. Candidate, Electrical and Computer Engineering
University of California, Santa Cruz
Advisor: Jason K. Eshraghian

Previously, I worked on spiking neural networks, contributing to snnTorch and SpikingJelly and building SpikeGPT. My research has since shifted to efficient sequence modeling architectures and how to scale them. I received my Bachelor's degree from the University of Electronic Science and Technology of China (2023).

Find me on GitHub, Google Scholar, and X (Twitter).

Email: ridger@ucsc.edu

Research

I am interested in building scalable and efficient sequence modeling architectures as an alternative to standard Transformers. On the architecture side, I have contributed to the development of linear attention and recurrent models that achieve Transformer-level quality at a fraction of the cost.

What I care about most is hands-on experience with scaling. My personal scaling trajectory spans three orders of magnitude in compute.

For each of these runs, I watched every checkpoint from the first to the last, witnessing a model go from random initialization to intelligence. That is what I enjoy most. The journey is the reward.

Please refer to the publications page for the full list.

Talks & Media

Experience


This website is adapted from Tianyu Gao's design, which is in turn adapted from Gregory Gundersen's.