
Engineering · Feb 2026

Distributed Inference at Scale

When you're forecasting a handful of series, a single machine handles it fine. But enterprise workloads often involve thousands of series with long horizons and multiple quantiles. That's where things used to get slow.

We rebuilt the inference pipeline to distribute work across multiple compute nodes automatically. The system looks at your job configuration, estimates the total compute, and provisions the right number of workers. You don't think about infrastructure. Submit the job and the platform handles the rest.
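To make the sizing step concrete, here's a minimal sketch of the kind of heuristic the scheduler could use — assume compute grows linearly in series × horizon × quantiles. The constant, function name, and per-worker capacity are illustrative assumptions, not the platform's actual scheduler logic.

```python
import math

# Assumed per-worker capacity in abstract "compute units" (illustrative only).
SERIES_UNITS_PER_WORKER = 50_000

def estimate_workers(num_series: int, horizon: int, num_quantiles: int) -> int:
    """Pick a worker count for a forecast job, assuming total compute
    scales linearly with series x horizon x quantiles."""
    total_units = num_series * horizon * num_quantiles
    return max(1, math.ceil(total_units / SERIES_UNITS_PER_WORKER))

# 10,000 series, 48-step horizon, 9 quantiles -> 4,320,000 units -> 87 workers
print(estimate_workers(10_000, 48, 9))
```

The point of the `max(1, ...)` floor is that even a tiny job gets one worker; everything above that scales with the job's actual size, which is what makes costs linear in the work.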

We also added a cost estimation endpoint. Before you submit a forecast job, you can call the estimator to get projected runtime and compute cost. No guessing whether a large job will take five minutes or fifty.
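As a rough picture of what a pre-flight estimate might look like, here's a sketch that mirrors the idea locally — a linear cost model producing projected runtime and cost. The rates, field names, and function signature are all hypothetical, not the endpoint's real schema.

```python
# Illustrative rate assumptions -- not real pricing.
COST_PER_UNIT_USD = 0.000002       # assumed price per compute unit
UNITS_PER_WORKER_MINUTE = 100_000  # assumed per-worker throughput

def estimate_job(num_series: int, horizon: int, num_quantiles: int,
                 workers: int) -> dict:
    """Project runtime and cost for a forecast job under a linear model."""
    units = num_series * horizon * num_quantiles
    return {
        "projected_runtime_min": round(units / (workers * UNITS_PER_WORKER_MINUTE), 1),
        "projected_cost_usd": round(units * COST_PER_UNIT_USD, 2),
    }

# Same job as before, spread over 10 workers.
print(estimate_job(10_000, 48, 9, workers=10))
```

Note that under this model adding workers shortens the runtime but leaves the cost unchanged — the job consumes the same total compute either way, which is the "costs scale linearly with the work" property described below.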

The practical upshot: jobs that previously took 40 minutes now finish in under 5. Costs scale linearly with the work, and you always know what you're signing up for before you hit submit.