Sitemap

A list of all the posts and pages found on the site. For the robots out there, an XML version is available for digesting as well.

Pages

Posts

Paged Attention and vLLM

6 minute read

Paged Attention is the memory-management optimization on which the vLLM inference engine is built. Here is a summary of the PagedAttention paper and the key features of vLLM that make it so powerful.

Are Autoencoders Fundamentally Denoisers?

6 minute read

The core idea behind autoencoders is to bottleneck information flow so that the network is forced to prioritize which information to propagate to the next layer (by restricting the number of dimensions in the latent space). In this project, I explore how this bottleneck can serve as a useful denoising tool.
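
As a rough illustration of the bottleneck idea (a minimal PyTorch sketch with assumed layer sizes and a toy denoising setup, not the project's actual code):

```python
import torch
import torch.nn as nn

class BottleneckAutoencoder(nn.Module):
    """Minimal autoencoder: the narrow latent layer forces the network
    to keep only the most salient structure of the input."""
    def __init__(self, input_dim=784, latent_dim=16):  # sizes are illustrative
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),          # the bottleneck
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, input_dim),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

# Denoising setup: reconstruct the clean input from a noisy copy.
model = BottleneckAutoencoder()
x_clean = torch.rand(32, 784)                    # toy batch
x_noisy = x_clean + 0.1 * torch.randn_like(x_clean)
loss = nn.functional.mse_loss(model(x_noisy), x_clean)
```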

Implementing GPT from Scratch

9 minute read

This article is a conceptual explanation of building a language model from scratch using the decoder-only transformer architecture. It is based on Andrej Karpathy's GPT from scratch. The code for this conceptual guide can be found here.

Review: A Mathematical Framework for Transformer Circuits

7 minute read

This paper provides a mental model for reasoning about the internal workings of transformers and attention heads in deep neural networks. The insights help in understanding and analyzing the behavior of large models.

Portfolio

Predicting AI bias using SAEs

A comparative analysis of how Sparse Autoencoders and MLP activations encode gender information inside LLMs.

Beyond Lang

Voice calling with real-time speech translation and transcription.

Publications

Talks

Teaching

Teaching experience 1

Undergraduate course, University 1, Department, 2014

This is a description of a teaching experience. You can use markdown like any other post.

Teaching experience 2

Workshop, University 1, Department, 2015

This is a description of a teaching experience. You can use markdown like any other post.