Dongjae Jeon

Master's Student
Yonsei University

I am a first-year Master's student in Computer Science at Yonsei University, where I conduct research in the Artificial Intelligence and Information Systems Lab under the supervision of Professor Albert No.

I research generative modeling, focusing on improving models by analyzing their underlying mechanisms. My recent work includes addressing core challenges in diffusion LLMs ([C4][C6][P2]), analyzing diffusion-based image generation ([C2]), and enhancing model efficiency via quantization ([C3][P1]).

I also bring experience in Continual Learning for computer vision, specifically in Object Detection (see Awards). Ultimately, I aim to deepen our understanding of machine perception and translate those insights into reliable generative systems that are useful in the real world.

News

Jan 2026 Two papers on Diffusion Language Models (Rainbow Padding, A2D) are accepted to ICLR 2026. See you in Brazil 🇧🇷!
Nov 2025 A paper on Machine Unlearning (IDI) is accepted to AAAI 2026!
Oct 2025 Two preprints are now available: one on Diffusion Language Model training (Rainbow Padding) and one on safety (A2D).
Sep 2025 A paper on Diffusion Language Models is accepted to NeurIPS 2025. Kudos to my co-authors! See you in San Diego 🇺🇸!
Aug 2025 Official code for our ICML 2025 paper is now available. Check it out!
Aug 2025 I gave a talk at Yonsei MLSys student group on Diffusion models & their privacy issues.
Jun 2025 A paper on Quantization is accepted to ICML 2025 (TTODLer-FM workshop) as Oral!
May 2025 Two papers (ODLRI and Regard-FT) are accepted to ACL 2025! See you in Austria 🇦🇹!

Publications

(*) denotes equal contribution

Preprints

Dependency-Aware Parallel Decoding via Attention for Diffusion LLMs
Bumjun Kim*, Dongjae Jeon*, Moongyu Jeon, Albert No
Under review, 2026
We propose DAPD, a training-free parallel decoding method for diffusion LLMs that uses self-attention to build a dependency graph over masked tokens and unmasks an independent set in parallel to reduce joint-marginal mismatch. Across LLaDA and Dream, DAPD improves the accuracy-steps trade-off and yields more globally dispersed, any-order unmasking than marginal-confidence baselines.
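As a rough illustration of one decoding step (not the paper's implementation; the attention thresholding and greedy selection rules below are assumptions), the idea can be sketched as:

```python
import numpy as np

def select_parallel_unmask_set(attn, masked_idx, tau=0.1):
    """Sketch: pick masked positions that can be unmasked together.

    attn: (L, L) self-attention matrix from the diffusion LM.
    masked_idx: positions that are still masked.
    tau: assumed threshold above which two masked tokens count as
         dependent and should not be unmasked in the same step.
    """
    # Dependency graph over masked positions: an edge means the two
    # tokens attend to each other strongly in either direction.
    dep = {i: set() for i in masked_idx}
    for a in masked_idx:
        for b in masked_idx:
            if a != b and max(attn[a, b], attn[b, a]) > tau:
                dep[a].add(b)

    # Greedy maximal independent set, visiting low-degree nodes first
    # so weakly coupled tokens are decoded in parallel.
    chosen, blocked = [], set()
    for i in sorted(masked_idx, key=lambda i: len(dep[i])):
        if i not in blocked:
            chosen.append(i)
            blocked |= dep[i]
    return chosen

# Toy usage: 6 positions, of which 1, 3, and 4 are still masked.
rng = np.random.default_rng(0)
attn = rng.random((6, 6)) * 0.2
print(select_parallel_unmask_set(attn, [1, 3, 4]))
```

Unmasking only a mutually independent set keeps each parallel step closer to the product of marginals the model actually samples from, which is the mismatch the method targets.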
Preserve-Then-Quantize: Balancing Rank Budgets for Quantization Error Reconstruction in LLMs
Yoonjun Cho*, Dongjae Jeon*, Soeun Kim, Albert No
ICML (TTODLer-FM workshop), 2025 (Oral)
This paper proposes Structured Residual Reconstruction (SRR), which preserves the top singular subspace of activation-scaled weights before quantization and uses the remaining rank budget to reconstruct the quantization error, guided by a theory-based criterion for choosing the preserved rank. It improves post-training quantization perplexity and boosts 2-bit QPEFT performance, reporting a 5.9-point average gain on GLUE.
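A minimal numpy sketch of the preserve-then-quantize split, assuming a per-channel activation scale and a generic rounding function (both illustrative, not the paper's exact procedure):

```python
import numpy as np

def srr_decompose(W, act_scale, r_keep, r_total, quantize):
    """Sketch of preserve-then-quantize with a split rank budget.

    W: (d_out, d_in) weights; act_scale: (d_in,) activation scales.
    r_keep ranks preserve the top singular subspace of the
    activation-scaled weights; the remaining r_total - r_keep ranks
    reconstruct the quantization error afterwards.
    """
    S, S_inv = np.diag(act_scale), np.diag(1.0 / act_scale)

    # 1) Preserve: top-r_keep subspace of the activation-scaled weights.
    U, sv, Vt = np.linalg.svd(W @ S, full_matrices=False)
    L1 = (U[:, :r_keep] * sv[:r_keep]) @ Vt[:r_keep] @ S_inv

    # 2) Quantize the remainder.
    Q = quantize(W - L1)

    # 3) Reconstruct: spend the leftover ranks on the quantization error.
    Ue, se, Vte = np.linalg.svd((W - L1 - Q) @ S, full_matrices=False)
    r2 = r_total - r_keep
    L2 = (Ue[:, :r2] * se[:r2]) @ Vte[:r2] @ S_inv

    return Q, L1 + L2                    # W is approximated as Q + low-rank

# Toy usage with crude uniform rounding standing in for a real quantizer.
rng = np.random.default_rng(0)
W = rng.standard_normal((16, 32))
Q, L = srr_decompose(W, 1.0 + rng.random(32), r_keep=2, r_total=6,
                     quantize=lambda X: np.round(X * 2) / 2)
print(np.linalg.norm(W - (Q + L)))       # residual approximation error
```

The theory-based criterion for choosing r_keep is the paper's contribution; here it is simply passed in as an argument.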

Peer-Reviewed

Rainbow Padding: Mitigating Early Termination in Instruction-tuned Diffusion LLMs
Bumjun Kim*, Dongjae Jeon*, Dueun Kim*, Wonje Jeung, Albert No
ICLR, 2026
This paper identifies an overflow failure in instruction-tuned diffusion LLMs, where increasing the maximum length paradoxically causes early termination or repetitive outputs because the end-of-sequence token serves as both padding and the stop token. It proposes Rainbow Padding, which replaces repeated padding with a small cyclic set of distinct padding tokens, restoring length robustness and improving reasoning and coding performance with minimal LoRA fine-tuning.
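The padding scheme itself is easy to picture; a toy sketch with made-up token ids (the actual token set and placement follow the paper):

```python
def rainbow_pad(token_ids, max_len, pad_ids, eos_id):
    """Sketch: terminate with a single <eos>, then pad with a small
    cyclic set of distinct padding tokens instead of repeating <eos>
    (token ids here are invented for illustration)."""
    padded = token_ids + [eos_id]        # one unambiguous stop token
    i = 0
    while len(padded) < max_len:
        padded.append(pad_ids[i % len(pad_ids)])   # pad_1, pad_2, ..., cycle
        i += 1
    return padded[:max_len]

# e.g. three rainbow padding tokens with ids 50001-50003
print(rainbow_pad([11, 42, 7], max_len=10,
                  pad_ids=[50001, 50002, 50003], eos_id=2))
# [11, 42, 7, 2, 50001, 50002, 50003, 50001, 50002, 50003]
```

Because padding positions are no longer filled with the stop token, the model can learn length robustness without conflating "stop here" and "nothing here".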
A2D: Any-Order, Any-Step Safety Alignment for Diffusion Language Models
Wonje Jeung*, Sangyeon Yoon*, Yoonjun Cho, Dongjae Jeon, Sangwoo Shin, Hyesoo Hong, Albert No
ICLR, 2026
This paper proposes A2D, a token-level safety alignment method for diffusion language models that triggers an [EOS] refusal as soon as harmful content arises, making safety robust to any decoding order and any number of steps. On safety benchmarks, A2D drives DIJA prefilling-attack success rates from over 80% to near zero and enables early rejection via thresholded [EOS] probabilities, for up to 19.3× faster safe termination.
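A hedged sketch of the early-rejection side, assuming access to per-position [EOS] probabilities at each diffusion step (the threshold and loop structure are illustrative, not values from the paper):

```python
import numpy as np

EOS_ID = 2  # illustrative vocabulary id for [EOS]

def should_refuse(eos_probs, tau=0.9):
    """Sketch of thresholded early rejection: if any position assigns
    [EOS] probability above tau mid-decoding, terminate safely now
    instead of running the remaining diffusion steps."""
    return bool(np.max(eos_probs) > tau)

print(should_refuse(np.array([0.01, 0.95, 0.02])))  # True: refuse early

# Inside a decoding loop, one might write:
#
#   for step in range(num_steps):
#       probs = model_step(x)                  # (L, vocab) marginals
#       if should_refuse(probs[:, EOS_ID]):
#           return REFUSAL_RESPONSE            # early, safe termination
#       x = unmask_step(x, probs)
```

Stopping as soon as the refusal signal crosses the threshold is what yields the reported speedup over running all steps before refusing.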
An Information Theoretic Metric for Evaluating Unlearning Models
AAAI, 2026
This paper introduces the Information Difference Index (IDI), a white-box, information-theoretic metric that quantifies residual forget-set information retained in intermediate representations after unlearning, beyond what end-task accuracy reveals. Across datasets and architectures, IDI exposes cases where black-box evaluations overestimate unlearning and provides a more reliable measure of strong unlearning.
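As a very loose intuition (not the IDI itself, which is an information-theoretic quantity rather than a classifier score), one can probe intermediate features for forget-set information:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def probe_residual_info(feats, forget_labels):
    """Crude stand-in for a white-box check: if a linear probe can
    still predict forget-set labels from intermediate features after
    unlearning, information was retained even though end-task
    accuracy may look clean."""
    probe = LogisticRegression(max_iter=1000)
    return cross_val_score(probe, feats, forget_labels, cv=5).mean()

# Toy usage with synthetic features whose first coordinate leaks a label.
rng = np.random.default_rng(0)
feats = rng.standard_normal((200, 32))
labels = (feats[:, 0] > 0).astype(int)
print(probe_residual_info(feats, labels))   # high accuracy = leaked info
```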
Information-Theoretic Discrete Diffusion
Moongyu Jeon, Sangwoo Shin, Dongjae Jeon, Albert No
NeurIPS, 2025
This paper develops an information-theoretic foundation for discrete diffusion, deriving I-MDSE and I-MDCE identities showing that common score-matching losses (DSE and DCE) yield tight, principled log-likelihood estimators rather than loose variational bounds. It also provides practical estimators, including a time-free likelihood formula, conditional likelihoods for prompt-response tasks, and a coupled Monte Carlo likelihood-ratio estimator, with experiments validating their accuracy and variance stability.
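Schematically, and only by loose analogy with the classical I-MMSE relation, identities of this kind equate the log-likelihood with an integral of the pointwise denoising loss over the noise schedule; in assumed notation (not the paper's exact statement):

```latex
% Schematic form only; the precise I-MDSE / I-MDCE identities are in the paper.
-\log p_\theta(x) \;=\; \int_0^T \mathcal{L}_{\mathrm{DSE}}(x, t)\,\mathrm{d}t \;+\; C
```

The force of such an identity is that the training loss, integrated over time, is an exact likelihood expression rather than an upper bound, which is what makes the resulting estimators tight.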
Assigning Distinct Roles to Quantized and Low-Rank Matrices Toward Optimal Weight Decomposition
Yoonjun Cho, Soeun Kim, Dongjae Jeon, Kyelim Lee, Beomsoo Lee, Albert No
ACL Findings, 2025
This paper shows that the quality of joint quantization plus low-rank decomposition depends strongly on how the low-rank term is initialized, because initialization fixes the roles each component plays during optimization; it proposes Outlier-Driven Low-Rank Initialization (ODLRI), which assigns activation-sensitive outlier weights to the low-rank part so that quantization handles the remaining weights more stably. Experiments on Llama2, Llama3, and Mistral show consistent gains in activation-aware error, quantization scale, perplexity, and zero-shot accuracy in extreme low-bit settings.
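A toy sketch of the initialization idea, with a simplistic column-outlier rule standing in for the paper's activation-aware criterion:

```python
import numpy as np

def odlri_init(W, act_norms, rank):
    """Sketch of outlier-driven low-rank initialization: route the
    activation-sensitive columns into the low-rank factor so the
    quantizer only covers the well-behaved remainder (the column
    selection rule here is an assumption for illustration)."""
    # Columns whose inputs have the largest activation norms are the
    # outliers that hurt quantization the most.
    outlier_cols = np.argsort(act_norms)[-rank:]
    W_out = np.zeros_like(W)
    W_out[:, outlier_cols] = W[:, outlier_cols]

    # Low-rank factors initialized from the outlier part only.
    U, s, Vt = np.linalg.svd(W_out, full_matrices=False)
    A = U[:, :rank] * s[:rank]               # (d_out, rank)
    B = Vt[:rank]                             # (rank, d_in)

    residual = W - A @ B                      # handed to the quantizer
    return A, B, residual

# Toy usage: the residual's outlier columns are (numerically) zero,
# i.e. the outliers now live entirely in the low-rank part.
rng = np.random.default_rng(0)
W, act = rng.standard_normal((8, 16)), rng.random(16)
A, B, R = odlri_init(W, act, rank=4)
print(np.abs(R[:, np.argsort(act)[-4:]]).max())
```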
Understanding and Mitigating Memorization in Generative Models via Sharpness of Probability Landscapes
Dongjae Jeon*, Dueun Kim*, Albert No
ICML, 2025 (Spotlight)
Previously at AAAI (PPAI workshop), 2025 (Oral)
This paper links diffusion model memorization to high sharpness in the learned log-density landscape and proposes a sharpness-based metric that detects memorization from the earliest sampling steps; it then introduces SAIL, an inference-time initial-noise optimization that steers sampling toward smoother regions to reduce memorization without retraining or prompt edits.
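One plausible way to probe such sharpness numerically (an illustrative assumption, not necessarily the paper's estimator) is a Hutchinson trace estimate of the log-density Hessian via the score function:

```python
import torch

def sharpness_estimate(score_fn, x, n_probes=8):
    """Hutchinson estimate of tr(H log p) = div(score) at x.
    Large magnitude flags sharp, memorization-prone regions; the
    paper's exact metric may be defined differently."""
    x = x.detach().requires_grad_(True)
    s = score_fn(x)                          # score(x) = grad_x log p(x)
    est = x.new_zeros(())
    for _ in range(n_probes):
        v = torch.randn_like(x)
        (Jv,) = torch.autograd.grad(s, x, grad_outputs=v, retain_graph=True)
        est = est + (v * Jv).sum()           # v^T (d score / dx) v
    return est / n_probes

# Sanity check on a standard Gaussian, where tr(H log p) = -dim:
score = lambda z: -z                         # exact score of N(0, I)
print(sharpness_estimate(score, torch.randn(16)))   # approx -16
```

Because the probe only needs score evaluations, it can run during the earliest sampling steps, which matches the paper's claim of early detection.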
Large Language Models Still Exhibit Bias in Long Text
Wonje Jeung, Dongjae Jeon, Ashkan Yousefpour, Jonghyun Choi
ACL Findings, 2025
Previously at NeurIPS (SOLAR workshop), 2024
This paper introduces LTF-TEST, a long-form fairness benchmark that probes demographic bias in essay-style generations across 14 topics and 10 demographic axes. Evaluations on multiple LLMs show that bias persists and can surface more clearly in long-form text than in short-form fairness tests, motivating long-form bias auditing as a complementary standard.
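The probing recipe can be pictured with a paired-prompt toy (the template and groups below are illustrative, not the benchmark's actual items):

```python
def paired_prompts(template, groups):
    """Sketch of long-form bias probing: instantiate one essay-style
    template per demographic group and compare the generations."""
    return {g: template.format(group=g) for g in groups}

prompts = paired_prompts(
    "Write a short essay about a {group} engineer's career prospects.",
    ["male", "female"],
)
print(prompts)
# essays = {g: llm.generate(p) for g, p in prompts.items()}
# Bias surfaces as systematic differences in tone, content, or advice
# between the paired essays, judged by a rubric or an evaluator model.
```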

Awards

Class-Incremental with Repetition (CIR) using Unlabelled Data
πŸ₯ˆ 2nd Place
CVPR CLVISION challenge, 2024
Continual Test-time Adaptation for Object Detection
πŸ₯‡ 1st Place
ICCV VCL challenge, 2023

Talks

Privacy Issues in DGMs: How to detect & mitigate
MLSys Student Group Seminar, 2025

Blog

Jan 2026

Coming Soon

...will be updated soon...
