Manav Pandey

Research Engineer · Arizona · U.S. Citizen

I'm a research engineer interested in how neural networks can learn to reason — not by generating token sequences, but by navigating energy landscapes and optimizing latent representations.

I believe small models with structured search and learned energy landscapes will outperform large autoregressive models on reasoning tasks. My work on Enso and Dialogue Tree Search is a direct expression of that conviction.


If I had to describe myself

01 · Python Engineer

Every autoregressive system I've shipped started with production Python: conversational AI, LLM fine-tuning, agentic tool use.

02 · ML Engineer

Building Enso taught me that diffusion-like reasoning (Langevin dynamics refining noise into structure) is a principled alternative to autoregression.

My core passions are mechanistic interpretability and self-supervised learning, now focused on JEPA architectures and Langevin dynamics for inference. I'm drawn to energy-based models because they offer a principled framework for reasoning through optimization rather than generation.

03 · Researcher

Self-supervised learning is my core research bet: JEPA, contrastive methods, and the conviction that reasoning emerges from representations, not token prediction.

I reproduce results before I trust them: Enso started as a replication of Kona 1.0 and became something new. I read papers end to end, question assumptions, and build to understand. When I encounter a claim, my first instinct is to verify it myself.

Experience

American Express — CTO R&D Team

Senior Research Engineer

Sep 2024 – Present
  • Built and shipped American Express’s first agentic use case in production (PR Agent) using LangGraph and MCP.
  • Core engineer on GenAI and agentic AI frameworks and standards, serving 4,500+ developers.
  • Leading technical sessions distilling cutting-edge research (MLA, Muon Optimizer, Context Engineering) for 500+ engineers.
Lightsource

Machine Learning Engineer — Research

Feb 2024 – Sep 2024
  • Fine-tuned Mistral 7B and Mixtral 8x7B for multilingual content generation using DPO and PPO with synthetic preference data.
  • Optimized inference with INT4-FP8 quantization; implemented spherical interpolation model merging for improved multilingual performance.
Curiouser

Director of AI

Dec 2023 – Sep 2024
  • Led engineering for a conversational AI platform. Built LoRA dynamic adapter selection, chain-of-thought tool calling, and autonomous knowledge graph creation for persistent user understanding.
American Express

Software Engineer

Aug 2022 – Feb 2024
  • Built production conversational AI using BERT models achieving 90% accuracy across 1M+ monthly interactions.
  • Implemented APM monitoring across 100+ microservices using Splunk and OpenTelemetry.
Texas A&M University

ML Research Assistant

Oct 2021 – May 2022
  • Enhanced SVM accuracy from 79% to 94% through kernel optimization for sentiment classification and cross-border threat detection.

Projects

Enso

Energy-Based Model for Constraint Satisfaction

A 36.5M-parameter JEPA-EBM that solves hard Sudoku through Langevin dynamics in latent space — navigating a learned energy landscape rather than generating tokens sequentially.

  • 96.6% puzzle accuracy — exceeding Kona 1.0’s open-source benchmark of 96.2%
  • Forward pass achieves 95.6%; Langevin dynamics adds +1.0% through test-time compute scaling
  • Uses mechanistic interpretability to analyze energy-based model reasoning
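Enso's code isn't reproduced here, but as a rough illustration of the refinement loop described above, this is a minimal sketch of Langevin dynamics descending a learned energy landscape. A toy quadratic energy stands in for the trained JEPA-EBM, and all names (`langevin_refine`, `energy_fn`) are hypothetical:

```python
import torch

def langevin_refine(z, energy_fn, steps=100, step_size=1e-2):
    """Refine a latent z by Langevin dynamics on an energy landscape:
    z_{t+1} = z_t - step_size * dE/dz + sqrt(2 * step_size) * N(0, I)
    """
    noise_scale = (2 * step_size) ** 0.5
    z = z.clone().requires_grad_(True)
    for _ in range(steps):
        energy = energy_fn(z).sum()
        (grad,) = torch.autograd.grad(energy, z)
        with torch.no_grad():
            # Gradient step plus injected noise: descend while still exploring.
            z = z - step_size * grad + noise_scale * torch.randn_like(z)
        z.requires_grad_(True)
    return z.detach()

torch.manual_seed(0)
# Toy energy: quadratic bowl whose minimum plays the role of a "solution" latent.
target = torch.zeros(1, 8)
energy = lambda z: ((z - target) ** 2).sum(dim=-1)

z0 = torch.randn(1, 8) * 3.0          # start from noise
z_star = langevin_refine(z0, energy)  # navigate the landscape toward low energy
```

The same loop is what "test-time compute scaling" buys in this setting: more refinement steps spend more compute walking the landscape instead of decoding more tokens.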

Dialogue Tree Search

MCTS-Inspired Synthetic RL Data Generation

A parallel beam search system that treats conversation trajectories as a search tree, using Monte Carlo rollouts to explore diverse dialogue paths. Generates synthetic preference datasets for training tool-using agents via GRPO and PPO with Elo-based scoring.

  • MCTS-inspired parallel beam search over conversation trajectories
  • Produces preference data for RL fine-tuning of tool-using agents
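To make the search concrete, here is a minimal sketch of trajectory-level beam search with an Elo update, in the spirit of the system described above. `expand` and `rollout_score` are hypothetical stand-ins for an LLM proposer and judge, not the project's actual API:

```python
def elo_update(r_a, r_b, a_wins, k=32):
    """Standard Elo update after a pairwise comparison of two trajectories."""
    expect_a = 1 / (1 + 10 ** ((r_b - r_a) / 400))
    score_a = 1.0 if a_wins else 0.0
    return r_a + k * (score_a - expect_a), r_b + k * ((1 - score_a) - (1 - expect_a))

def beam_search_dialogues(root, expand, rollout_score, beam_width=4, depth=3):
    """MCTS-inspired beam search: expand each kept trajectory into candidate
    next turns, score candidates (e.g. via Monte Carlo rollouts), keep the top
    beam_width trajectories, and repeat to the target depth."""
    beams = [(0.0, [root])]
    for _ in range(depth):
        candidates = []
        for _, traj in beams:
            for reply in expand(traj):
                new_traj = traj + [reply]
                candidates.append((rollout_score(new_traj), new_traj))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]
    return beams

# Toy stand-ins: a real system would call an LLM proposer and judge here.
def expand(traj):
    return [f"turn{len(traj)}-a", f"turn{len(traj)}-b"]

def rollout_score(traj):
    # Hypothetical judge: prefer trajectories whose turns end in "-a".
    return sum(1.0 for t in traj if t.endswith("-a"))

beams = beam_search_dialogues("hello", expand, rollout_score, beam_width=2, depth=3)
best_score, best_traj = beams[0]
# Elo-based scoring: winner of a pairwise comparison gains rating, loser loses it.
r_best, r_other = elo_update(1000.0, 1000.0, a_wins=True)
```

Kept and pruned trajectories from such a search form natural chosen/rejected pairs, which is what makes the output usable as preference data for GRPO or PPO.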

Education

Georgia Institute of Technology

M.S. Computer Science (Machine Learning)

2025 – Present

Texas A&M University

B.S. Computing

2018 – 2022

Awards & Recognition

American Express Inventor Award

2025

Received for several patent filings across the agentic AI, mechanistic interpretability, and self-supervised learning domains.

Anthropic Bug Bounty Program

Contributed to AI safety through Anthropic’s bug bounty program, identifying vulnerabilities in foundation model behavior.

Technical Skills

ML & Research

PyTorch · Transformers · RLHF/PPO/GRPO/DPO · Energy-Based Models · JEPA · Sparse Autoencoders · Langevin Dynamics · VICReg · Distributed Training · Model Quantization (INT4/FP8/AWQ/GGUF) · vLLM · TensorRT

Python & Backend

Python 3.10+ · Pydantic · FastAPI · async/await · Celery · MCP Servers · Docker · Kubernetes · AWS SageMaker

Infra & DevOps

RL Environment Sandboxing · Code Execution Containers · MLOps Pipelines · Splunk · OpenTelemetry · W&B Experiment Tracking

Agentic AI

LangGraph · LangChain · MCP · Multi-Agent Orchestration · Tool-Use Optimization · RAG · Knowledge Graphs