"An overview of the three main ways to access large language models: direct API, inference routing services, and local/self‑hosted deployments."