toooold
The Least Action Nature of AdvJudge-Zero: A Lagrangian Perspective on LLM Steering
3 weeks 3 days ago
In December 2025, Tony, Yuhao, and I published AdvJudge-Zero: Binary Decision Flips in LLM-as-a-Judge via Adversarial Control Tokens (https://arxiv.org/abs/2512.17375). This post clarifies the underlying mathematical mechanics of our method, stripping away heuristic explanations to focus purely on Lagrangian optimization and the Principle of Least Action in discrete sequence generation.
The Mechanism of Logit Gap Steering: A Unified View of Prompts, Vectors, and Low-Rank Adaptation
1 month 1 week ago
It has been a few months since my colleague Tony and I published our paper on Logit Gap Steering. In that work, we demonstrated a practical method for steering LLM behavior—specifically bridging the gap between “Refusal” and “Compliance”—by optimizing token sequences.
LLM as My Pair Researcher: Prover–Validator Collaboration and the Road to Logit-Gap Steering
1 month 3 weeks ago
Research is a deeply personal and tailored process; it’s not something that regular prompting can replicate. ChatGPT, or any other LLM or AI agent, can’t simply find the research gap or invent a groundbreaking idea for you. What this post shares is how I work with AI as a collaborator to transform a wild intuition into a concrete new research direction: logit gap steering.
The Physics of mHC: Why Deep Learning Needs Energy Conservation
2 months 1 week ago
When I first read the Manifold-Constrained Hyper-Connections (mHC) paper (https://www.arxiv.org/abs/2512.24880), I didn’t see it as just another optimization trick or a clever use of Sinkhorn iterations. I saw it the other way round: this is physics.
The Mathematics of Baby Shower Games: Solomonoff Inference in Action
1 year 3 months ago
Last weekend, I found myself applying data science in an unexpected setting: a baby shower. The host announced what seemed like a simple party game - guessing the circumference of the mother-to-be’s baby bump. What made this particularly interesting was that I could see everyone else’s guesses on a decorated board, transforming a simple estimation game into a fascinating exercise in probability theory and strategic decision-making.
The missing knowledge snippets of AI
1 year 3 months ago
This is a living blog post that collects knowledge snippets about AI to bridge the gaps among textbooks, papers, and other blog posts. Most of the content was first posted on my LinkedIn.
Building a Lightweight Financial Agent: A Flexible Approach to Tool Use and Orchestration
1 year 6 months ago
In the rapidly evolving field of AI agents, there’s a growing trend towards complex frameworks and libraries. However, for many practical applications, a simpler, more flexible approach can be just as effective. This blog post introduces a lightweight financial agent framework that demonstrates how powerful tool use and orchestration can be achieved without relying on heavy libraries such as LangChain, LlamaIndex, or CrewAI.
Agentic Systems: Snake Oil or the Future of AI?
1 year 6 months ago
The short answer: it’s more nuanced—agentic systems have potential, but only if implemented correctly.
Those Magnificent underdogs competing ChatGPT
2 years 11 months ago
Disclaimer: the open source community and the AI community evolve so fast that this blog post can only include content up to early April 2023. The cover image is generated using Midjourney.
Understand Twitter’s Recommendation system with a diagram using GPT-4
2 years 11 months ago
Disclaimer: the following blog post is mostly generated by GPT-4. The image is generated by Midjourney. I used the following prompt to produce a diagram and a short blog post for highlights:
Collaborating with an AI Assistant to Discover a JPEG2000 Decoder Vulnerability
2 years 11 months ago
How I teamed up with ChatGPT, an AI language model, to identify and analyze a critical security vulnerability in a JPEG2000 decoder.
Make a CatGPT out of ChatGPT
3 years 1 month ago
Since ChatGPT is good at hallucination, why not make a CatGPT out of it? Let’s engineer the prompt step by step:
Guess the size of an atomic bomb and an iOS supply chain attack
3 years 6 months ago
Disclaimer: Nothing in this blog is related to the author’s current day-to-day work. The content is neither affiliated nor sponsored by any companies. The story in this post is based on a true event that happened in two parts, six years apart, and is full of nostalgia.
Zero hacking problem: do we really protect the customers?
3 years 7 months ago
Disclaimer: Nothing in this blog is related to the author’s day-to-day work. The content is neither affiliated nor sponsored by any companies.
Linkedin spam: a case study of robust feature engineering
3 years 7 months ago
Disclaimer: Nothing in this blog is related to the author’s day-to-day work. The content is neither affiliated nor sponsored by any companies. I am not employed by LinkedIn.
Measure the unmeasurable: botnet and German tanks
3 years 7 months ago
Disclaimer: Nothing in this blog is related to the author’s day-to-day work. The content is neither affiliated nor sponsored by any companies. The story in this post is NOT based on a true event.
Nine cars, twenty-five horses and beyond
3 years 9 months ago
TL;DR