AI Research Papers

Educational analyses of cutting-edge AI research. Explore key findings, data visualizations, and practical implications from the latest advances in machine learning, NLP, and artificial intelligence.


Technical Deep-Dive arXiv 2026

Deep Delta Learning: Rethinking Residual Connections with Geometric Transformations

Yifan Zhang Jan 2026
Deep Learning · Neural Architecture · Machine Learning

DDL replaces the standard additive skip connection with a learnable Delta Operator (a rank-1 Householder transformation) that dynamically interpolates between identity, projection, and reflection. This enables networks to model complex, non-monotonic dynamics while preserving training stability.

Delta Operator generalizes identity shortcuts via rank-1 Householder transformations
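The geometry behind that finding can be checked directly. A minimal NumPy sketch (my own illustration, not the paper's code) of a rank-1 Householder-style operator with a gate β: β = 0 recovers the identity shortcut, β = 1 projects out the v direction, and β = 2 is a full Householder reflection.

```python
import numpy as np

def delta_op(x, v, beta):
    """Apply H(beta) = I - beta * v v^T to x, with v normalized to unit length.

    beta = 0 -> identity, beta = 1 -> projection onto v's orthogonal
    complement, beta = 2 -> Householder reflection across that hyperplane.
    """
    v = v / np.linalg.norm(v)          # ensure v is a unit vector
    return x - beta * v * (v @ x)      # rank-1 update, O(d) per application

x = np.array([3.0, 4.0])
v = np.array([1.0, 0.0])

print(delta_op(x, v, 0.0))  # identity:   [ 3.  4.]
print(delta_op(x, v, 1.0))  # projection: [ 0.  4.]
print(delta_op(x, v, 2.0))  # reflection: [-3.  4.]
```

In a network, β would be produced per-layer (or per-token) by a small learnable gate, which is what lets the block interpolate smoothly between the three regimes.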
Important Finding arXiv 2025

Recursive Language Models: Processing Unlimited Context Through Code

Alex L. Zhang, Tim Kraska et al. Dec 2025
Natural Language Processing · Large Language Models · Inference Scaling

LLMs have fixed context windows, but real-world documents can be millions of tokens. Recursive Language Models (RLMs) let models treat their prompts as programmable objects, recursively calling themselves over snippets to handle inputs 100x beyond their context limits while outperforming long-context baselines.

Breaks the context window barrier: handles documents 100x longer than the model's native limit without any architectural changes
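The recursive pattern can be sketched in a few lines. Note this is a simplification: actual RLMs let the model write code in a REPL to inspect and slice its own prompt, whereas this sketch hard-codes a recursive map-reduce over chunks, with `llm` as a stand-in for a model call.

```python
def recursive_answer(llm, query, context, limit=4000, chunk=2000):
    """Answer `query` over a context of any length by recursive reduction.

    `llm(prompt) -> str` stands in for a model call. If the context fits
    the window (`limit` chars here; tokens in practice), call the model
    once; otherwise summarize each chunk recursively, then recurse on
    the concatenated summaries. Terminates as long as the model's
    outputs are shorter than its inputs.
    """
    if len(context) <= limit:
        return llm(f"Context:\n{context}\n\nQuestion: {query}")
    parts = [context[i:i + chunk] for i in range(0, len(context), chunk)]
    notes = [recursive_answer(llm, query, p, limit, chunk) for p in parts]
    return recursive_answer(llm, query, "\n".join(notes), limit, chunk)

# Toy model: "answers" by keeping the first 200 chars of its prompt.
toy_llm = lambda prompt: prompt[:200]
out = recursive_answer(toy_llm, "What is this about?", "x" * 100_000)
print(len(out) <= 200)  # True: unbounded input reduced to one bounded call
```

The point of the structure is that no single model call ever sees more than `limit` characters, so the input length is unbounded while the per-call cost stays fixed.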
Important Finding arXiv 2025

GenZ: Using Foundation Models as Feature Generators

Marko Jojic, Nebojsa Jojic Dec 2025
Machine Learning · Foundation Models · Statistical Modeling

Foundation models struggle at direct prediction tasks like pricing or recommendations. GenZ shows how to use LLMs as semantic feature extractors within traditional statistical models, achieving 3.2x better house price predictions and cold-start recommendations equivalent to 4,000 user ratings.

Stop asking LLMs to guess numbers. Use them to answer yes/no questions, then let a simple regression model do the math. You get interpretable features and accurate predictions.
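That recipe can be sketched end to end. Everything here is hypothetical scaffolding (the question set, the keyword "oracle" standing in for real LLM calls, and the toy prices); the shape of the pipeline is the point: yes/no answers become 0/1 features, and ordinary least squares does the prediction.

```python
import numpy as np

# Hypothetical question set; in GenZ-style use, each answer would come
# from a foundation-model call about the listing's text.
QUESTIONS = ["Has a garden?", "Recently renovated?", "Near a school?"]

def featurize(listing, ask):
    """Turn yes/no answers into a 0/1 feature vector (with a bias term)."""
    return np.array([1.0] + [1.0 if ask(listing, q) else 0.0 for q in QUESTIONS])

def fit_prices(listings, prices, ask):
    """Ordinary least squares over the LLM-derived binary features."""
    X = np.stack([featurize(l, ask) for l in listings])
    w, *_ = np.linalg.lstsq(X, np.array(prices), rcond=None)
    return w

# Toy oracle: keyword match on the question's last word (stands in for an LLM).
stub_ask = lambda listing, q: q.split()[-1].rstrip("?").lower() in listing.lower()

listings = ["big garden, renovated kitchen", "near a school",
            "garden near a school", "plain flat"]
prices = [180.0, 120.0, 170.0, 100.0]
w = fit_prices(listings, prices, stub_ask)
print(round(float(featurize("sunny garden", stub_ask) @ w)))  # 150
```

Because the features are named questions, the fitted weights are directly interpretable: each coefficient is the price premium the model attributes to a "yes" answer.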
Important Finding arXiv 2025

Stream-DiffVSR: Low-Latency Streamable Video Super-Resolution

Hau-Shiang Shiu, Chin-Yang Lin et al. Dec 2025
Computer Vision · Generative Models · Video Processing

A diffusion-based framework that achieves 130x faster video super-resolution than previous methods, processing 720p frames in 0.328 seconds while improving perceptual quality. The first practical diffusion approach for real-time video enhancement.

130x faster than previous diffusion-based video super-resolution methods
Important Finding arXiv 2025

End-to-End Test-Time Training for Long Context

Arnuv Tandon, Karan Dalal et al. Dec 2025
Large Language Models · Machine Learning · Continual Learning

A novel approach that treats long-context language modeling as a continual learning problem, achieving 2.7× faster inference than full attention at 128K tokens while maintaining comparable performance, but with a critical weakness on retrieval tasks.

Treats long documents as a 'learning problem': instead of building complex attention patterns, the model 'studies' the context and stores knowledge in its weights
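The "store knowledge in weights" idea can be illustrated with a toy fast-weight memory, not the paper's architecture: at inference time, take gradient steps on a small weight matrix over pairs drawn from the context, then answer from those weights alone rather than by attending over the context.

```python
import numpy as np

def study(context_ids, vocab, lr=0.5, epochs=50):
    """'Study' the context: SGD on a fast-weight linear associative memory.

    W maps a one-hot token to a prediction of the next token; each pass
    takes a squared-error gradient step per (token, next-token) pair.
    """
    W = np.zeros((vocab, vocab))
    for _ in range(epochs):
        for a, b in zip(context_ids, context_ids[1:]):
            x = np.eye(vocab)[a]
            err = np.eye(vocab)[b] - W @ x     # prediction error
            W += lr * np.outer(err, x)         # gradient step on 0.5*||err||^2
    return W

context = [0, 1, 2, 0, 1, 2, 0, 1]             # repeating pattern 0 -> 1 -> 2
W = study(context, vocab=3)
print(int(np.argmax(W @ np.eye(3)[2])))        # after token 2 comes token 0
```

The contrast with attention: querying `W` is O(vocab) regardless of context length, while full attention pays a cost that grows with every token of context retained.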
Important Finding arXiv 2025

SAGA: Autonomous Goal-Evolving Agents for Scientific Discovery

Yuanqi Du, Botao Yu et al. Dec 2025
AI Agents · Scientific Discovery · Machine Learning

SAGA automates objective evolution in AI-driven science: instead of optimizing fixed objectives, it evolves the objectives themselves. This bi-level framework where LLMs propose, implement, and refine scientific goals has achieved strong results in antibiotic design, materials discovery, DNA engineering, and chemical process optimization.

Automating objective evolution (not just solution optimization) is the key to scientific discovery
Emerging Research arXiv 2025

UltraShape 1.0: High-Fidelity 3D Shape Generation via Scalable Geometric Refinement

Tanghui Jia, Dongyu Yan et al. Dec 2025
Computer Vision · Generative Models · 3D Graphics

A two-stage 3D diffusion framework that generates detailed geometry by first creating coarse structures then refining them with voxel-based methods. Trained entirely on public datasets, it matches proprietary approaches in quality.

Two-stage pipeline separates coarse structure generation from fine detail synthesis
Important Finding arXiv 2025

Black-Box On-Policy Distillation: Learning from Closed-Source LLMs

Tianzhu Ye, Li Dong et al. Nov 2025
Large Language Models · Knowledge Distillation · Machine Learning

Generative Adversarial Distillation (GAD) enables training smaller models from proprietary LLMs like GPT-5 using only text outputs. By framing distillation as an adversarial game between student and discriminator, GAD achieves what was previously impossible: a 14B parameter model matching its closed-source teacher.

14B student model matches GPT-5-Chat teacher on LMSYS-Chat benchmark
Important Finding arXiv 2025

Depth Anything 3: Recovering the Visual Space from Any Views

Haotong Lin, Sili Chen et al. Nov 2025
Computer Vision · 3D Vision · Deep Learning

A simplified approach to visual geometry that predicts spatially consistent depth from any number of images. Using just a plain transformer and single prediction target, DA3 outperforms prior methods by 44% on camera pose accuracy while matching Depth Anything 2's quality.

44.3% improvement in camera pose accuracy over prior state-of-the-art
Important Finding arXiv 2025

RAG-Anything: Unified Multimodal Retrieval for Real-World Documents

Zirui Guo, Xubin Ren et al. Oct 2025
Information Retrieval · Multimodal AI · Natural Language Processing

Traditional RAG systems only handle text, but real documents contain images, tables, and equations. RAG-Anything introduces a dual-graph architecture that treats all content types as interconnected knowledge entities, achieving 13+ percentage point improvements over baselines on long documents.

Solves the 'long document' problem: maintains high accuracy on 200+ page reports where standard RAG systems fail
Important Finding SSRN 2025

The Fold Principle: A Universal Pattern from Cosmos to Cognition

Jonas Jakob Gebendorfer Oct 2025
Artificial Intelligence · Complexity Theory · Neuroscience

A new theory proposing that intelligence emerges when systems 'hold tension' between contradictory constraints instead of collapsing to simple answers. This framework explains why Chain-of-Thought works, why jailbreaks happen, and how to build better AI.

Intelligence is not computational power. It's the capacity to hold productive tension between competing ideas without immediately collapsing to one answer.
Technical Deep-Dive arXiv 2025

Verbalized Sampling: Unlocking LLM Creativity with a Simple Prompt

Jiayi Zhang, Simon Yu et al. Oct 2025
Prompting · LLM Alignment · Machine Learning

Stanford researchers discovered that adding ~20 words to any prompt can boost LLM creativity by 1.6-2x. Verbalized Sampling bypasses RLHF's mode collapse by asking models to generate probability distributions instead of single answers, recovering the diverse capabilities lost during alignment.

RLHF causes mode collapse due to typicality bias in human preference data, not algorithmic limitations
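The intervention itself is just a prompt transformation, so it is easy to sketch. The wording below is illustrative, not the paper's verbatim template: the essential move is asking for several candidates with verbalized probabilities instead of one modal answer, then sampling from that returned distribution.

```python
def verbalize(prompt, k=5):
    """Wrap a prompt in the Verbalized Sampling style.

    Instead of eliciting the single most typical response (the mode
    that RLHF-tuned models collapse to), ask the model to enumerate k
    candidate responses with estimated probabilities; the caller can
    then sample from that distribution to recover diversity.
    """
    return (
        f"Generate {k} responses to the following prompt, each with its "
        f"estimated probability.\n\nPrompt: {prompt}"
    )

print(verbalize("Tell me a joke about coffee."))
```

Because the change lives entirely in the prompt, it works on closed models with no retraining, which is what makes the 1.6-2x diversity gain notable.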
Critical Warning arXiv 2025

On the Theoretical Limitations of Embedding-Based Retrieval

Orion Weller, Michael Boratko et al. Aug 2025
Information Retrieval · Machine Learning Theory · Natural Language Processing

A theoretical and empirical analysis proving that vector embeddings face fundamental dimensional constraints when handling diverse retrieval tasks, with state-of-the-art models achieving only 8-19% recall on the new LIMIT benchmark.

Vector search has hard mathematical limits that no amount of training can fix
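A toy version of the dimensionality argument (my illustration, not the paper's LIMIT benchmark): with d-dimensional embeddings and dot-product scoring, only a limited set of document rankings is realizable. For d = 1 the score is q · d_i, so every query ranks documents by their embedding value ascending or descending, and no training can reach the other orderings.

```python
from itertools import permutations

docs = [0.1, 0.5, 0.9]                      # any three 1-D document embeddings
reachable = set()
for x in range(-30, 31):                    # sweep many 1-D query embeddings
    q = x / 10
    order = tuple(sorted(range(3), key=lambda i: -q * docs[i]))  # rank by score
    reachable.add(order)
print(len(reachable), "of", len(list(permutations(range(3)))))  # 2 of 6
```

The paper generalizes this: for any fixed embedding dimension there exist combinations of relevant-document sets that no assignment of vectors can retrieve, which is why LIMIT drives state-of-the-art embedders down to 8-19% recall.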
Critical Warning ACL 2025

Collapse of Dense Retrievers: Short, Early, and Literal Biases Outranking Factual Evidence

Mohsen Fayyaz, Ali Modarressi et al. Jul 2025
Information Retrieval · Natural Language Processing · Machine Learning

A comprehensive study revealing critical vulnerabilities in dense retrieval models used in RAG systems, showing how biases for shorter documents, early positions, and literal matches cause models to ignore factual evidence.

AI search models lazily prefer short, repetitive text over detailed, correct answers
Important Finding NVIDIA Research 2025

Small language models are the future of agentic AI

Peter Belcak, Greg Heinrich et al. Jun 2025
Language Models · AI Agents · Efficient AI

NVIDIA researchers argue that sub-10B parameter models are better suited for AI agent tasks than frontier LLMs. With 10-30x lower inference costs and comparable performance on tool-calling, the economics of agentic AI may favor specialized small models over general-purpose giants.

Challenges the 'bigger is better' dogma: sub-10B models match or beat GPT-4o on specialized agent tasks
Important Finding arXiv 2025

DeepSeek-R1 & V3: The Open-Source Reasoning Revolution

DeepSeek-AI Jan 2025
Large Language Models · Reasoning Models · Machine Learning

DeepSeek shook the AI industry by matching OpenAI o1's reasoning capabilities at a fraction of the cost—then open-sourcing everything. DeepSeek-V3's efficient MoE architecture and R1's pure reinforcement learning approach demonstrate that frontier AI doesn't require frontier budgets.

DeepSeek-R1 matches OpenAI o1-1217 on reasoning benchmarks using pure RL
Important Finding arXiv 2025

Kimi k1.5: Scaling Reinforcement Learning with LLMs

Kimi Team, Flood Sung Jan 2025
Large Language Models · Reinforcement Learning · Reasoning

Moonshot AI's breakthrough in reasoning through simplicity. By removing complex RL components like Monte Carlo tree search and value functions, Kimi k1.5 matches OpenAI's o1 on math and coding benchmarks while enabling efficient short-response models that outperform GPT-4o by up to 550% on AIME math problems.

Matches OpenAI o1 on AIME (77.5) and MATH-500 (96.2) without MCTS or value functions
Important Finding ICLR 2025

Learning Dynamics of LLM Finetuning: Why Your Model Hallucinates and Forgets

Yi Ren, Danica J. Sutherland Jan 2025
Machine Learning · Natural Language Processing · LLM Training

ICLR 2025 Outstanding Paper reveals why finetuning makes LLMs hallucinate and why training too long actually hurts performance. A three-term decomposition framework explains the hidden mechanics of SFT, DPO, and RLHF.

Explains why finetuning causes hallucinations: the model borrows phrases from one answer to respond to unrelated questions
Important Finding arXiv 2024

The AI Scientist: Fully Automated Scientific Discovery

Chris Lu*, Cong Lu* et al. Aug 2024
AI Agents · Scientific Discovery · Machine Learning

The AI Scientist is a comprehensive framework enabling LLMs to autonomously conduct scientific research: generating ideas, writing code, running experiments, creating visualizations, writing papers, and simulating peer review. All for under $15 per paper.

First comprehensive framework for fully automated open-ended scientific discovery