This paper develops an industrial-organization theory of AI routing, modeling how a dual-role platform allocates user queries between its in-house model and an outside expert. We formulate this as a delegated allocation problem with endogenous quality investment and data feedback. The expert sets its access price and initial quality, while the platform routes queries by difficulty. Early traffic routed to the expert raises its future quality through learning-by-serving. In equilibrium, routing follows a cutoff rule. The platform's self-preferencing acts as a tax on outside expertise: it raises the routing threshold, reduces outside demand, and compresses both current quality investment and future learning gains. Relative to a dynamic first best, decentralized routing introduces three inefficiency wedges: an access-markup wedge, a bias wedge, and a data-feedback wedge. The third is unique to AI routing because traffic allocation directly determines learning opportunities. Consequently, neutrality and access-pricing remedies are complementary but insufficient even when combined, because the platform does not internalize the future value of outside learning. The model provides a tractable framework for analyzing AI gateways and router governance.