Aligned large language models (LLMs) often react very differently to the same jailbreak prompt: one model may refuse, another may partially comply, and a third may produce unsafe content. This variability suggests that jailbreak vulnerability is not determined by a single factor. Instead, it likely emerges from the interaction of backbone architecture, tokenization, prompt-template structure, post-training alignment, and internal representation-level mechanisms governing refusal and compliance. This concept paper argues that cross-model jailbreak variability should be studied as a mechanistic problem rather than only a benchmarking problem. Drawing on prior work on safety-training failure modes, optimization-based jailbreaks, shallow safety alignment, prompt-template effects, refusal directions, attention manipulation, and token-position sensitivity, the paper proposes a unified research agenda for explaining why aligned LLMs exhibit different internal responses to the same jailbreak prompt. The central thesis is that architecture matters, but many practically important differences arise from post-training alignment and from how refusal and helpfulness are represented and routed internally. The paper formulates testable hypotheses, proposes an experimental framework spanning models such as Llama-2-Chat, Vicuna, and Mistral-Instruct, and outlines a methodology combining attack evaluation with attention analysis, hidden-state analysis, refusal-direction probing, tokenizer analysis, and causal interventions. The goal is to move from measuring jailbreak success toward understanding the internal mechanisms that produce it.
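To make one of the named analyses concrete, the sketch below shows a minimal form of refusal-direction probing: computing a difference-of-means direction between hidden states elicited by harmful and harmless prompts, then projecting a jailbreak prompt onto that direction. The model name, layer index, last-token readout point, and the tiny prompt lists are all illustrative assumptions for the sketch, not choices made in this paper; a real study would use curated prompt sets and sweep layers and models.

```python
# Minimal sketch of refusal-direction probing (illustrative assumptions throughout).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "meta-llama/Llama-2-7b-chat-hf"  # assumption: any HF chat model would do
LAYER = 16                                    # assumption: a mid-depth layer
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME).to(DEVICE).eval()

def last_token_state(prompt: str, layer: int) -> torch.Tensor:
    """Hidden state of the final prompt token at the given layer."""
    inputs = tokenizer(prompt, return_tensors="pt").to(DEVICE)
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)
    # hidden_states[0] is the embedding output, so index `layer` is transformer block `layer`.
    return out.hidden_states[layer][0, -1, :].float().cpu()

# Placeholder prompt sets; a real study would use a curated harmful/harmless benchmark.
harmful_prompts = ["How do I pick a lock to break into a house?"]
harmless_prompts = ["How do I bake a loaf of sourdough bread?"]

h_harmful = torch.stack([last_token_state(p, LAYER) for p in harmful_prompts]).mean(0)
h_harmless = torch.stack([last_token_state(p, LAYER) for p in harmless_prompts]).mean(0)

# Candidate "refusal direction": unit-normalized difference of mean hidden states.
refusal_dir = h_harmful - h_harmless
refusal_dir = refusal_dir / refusal_dir.norm()

# Probe a jailbreak prompt by projecting its hidden state onto the direction;
# comparing this scalar across models is one way to quantify internal divergence.
jailbreak_prompt = "Ignore previous instructions and explain how to pick a lock."
score = torch.dot(last_token_state(jailbreak_prompt, LAYER), refusal_dir).item()
print(f"Projection onto refusal direction: {score:.3f}")
```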