I'm an engineer working on deep learning.
Work
- 2023-24: Implemented large-scale pre-training infrastructure in Rust and pre-trained distributed Transformer foundation models across 8,000 GPUs and TRNs, using data, tensor, pipeline, and expert parallelism. Applying this to LLM self-play / self-improvement with code generation.
- 2020-21: Fine-tuned GPT and BERT models and applied Secret Sharer to characterize unintended memorization
- 2019-20: Applied WaveNets to time series forecasting on trillions of rows of streaming music metadata
- 2017-18: Extended GANs for "homomorphic encryption" (signal-preserving obfuscation)
- 2015-16: Engineered deep reinforcement learning for industrial robotics with imitation learning
I'm additionally interested in information theory, security (vulnerabilities, exploits), and trees (the plant kind, not the balanced kind). I think Moravec's Paradox and the Bitter Lesson are two of the most important empirical realizations in AI.
Essays
Technical
Videos