YOLO on physical infrastructure

Waste monitoring in the field. Cameras, weather, variable light. The real world is messier than the dataset.

Countercheck was YOLO in a logistics facility. Wasteer is YOLO on outdoor waste infrastructure. The model architecture is similar. The deployment conditions are completely different.

The environment

Logistics facilities are indoors. Variable but bounded: fluorescent lighting, known camera angles, controlled temperature. The distribution shift is slow and mostly predictable.

Waste monitoring is outdoors. Cameras mounted on containers, in parking structures, on building facades. Direct sunlight at midday. Full darkness at night. Rain on the lens. Frost. Leaves blowing through the frame. Graffiti on the container that wasn't there last week. The distribution shift is constant and unpredictable.

The model that performed well in the Countercheck environment needed significant retraining to work on waste infrastructure. Not because YOLO is the wrong architecture. Because the input distribution is fundamentally different.

The lighting problem is bigger

Indoor variable lighting is manageable with augmentation: random brightness, contrast shifts, shadow simulation. Outdoor lighting involves dynamic range that exceeds what indoor augmentation can replicate.

A container in direct midday sun in summer produces images with blown-out highlights and deep shadows simultaneously. The same camera at 2am in winter produces near-black images with only the container edges visible under IR illumination. The model has to handle both, plus everything between.

The practical solution is multi-scale augmentation during training combined with image normalization in the preprocessing pipeline. The normalization step adapts input images to a consistent range before inference. It helps significantly on the extreme cases and doesn't help on the genuinely novel ones, which is where manual data collection from the deployment environment is unavoidable.
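A minimal sketch of what that normalization step could look like, assuming a simple percentile-based contrast stretch (the function name and percentile cutoffs are illustrative, not the production pipeline's values):

```python
import numpy as np

def normalize_exposure(img: np.ndarray, low_pct: float = 2.0,
                       high_pct: float = 98.0) -> np.ndarray:
    """Stretch pixel intensities so the low/high percentiles map to 0..255.

    Clips blown-out midday highlights and lifts near-black night frames
    into a consistent range before inference.
    """
    img = img.astype(np.float32)
    lo, hi = np.percentile(img, [low_pct, high_pct])
    if hi - lo < 1e-6:  # flat frame: nothing to stretch
        return np.zeros(img.shape, dtype=np.uint8)
    out = (img - lo) / (hi - lo)
    return (np.clip(out, 0.0, 1.0) * 255).astype(np.uint8)
```

The point of using percentiles rather than the raw min/max is robustness: a single hot pixel or IR reflection doesn't dictate the whole frame's mapping.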

Weather as a systematic bias

Rain on a camera lens produces consistent artifacts: droplets that appear as bright spots, streaks that obscure parts of the frame, a general reduction in image sharpness. These are not random noise. They're consistent patterns that correlate with weather conditions.

We built a weather-awareness component. When the preprocessing step detects rain artifacts (classifiable with a lightweight model trained on labeled wet vs. dry frames), the confidence thresholds on the primary model adjust. In rain conditions, the system is more conservative: higher confidence required to classify fill level, flagging for human review at lower certainty thresholds.

This is a systems approach to a data problem. We're not training the model to be better at rain. We're routing rain-condition inputs to a different decision path.
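The routing logic itself is small. A sketch, with made-up threshold values (the real system's thresholds and the rain classifier are not shown; `rain_probability` stands in for the lightweight wet/dry model's output):

```python
from dataclasses import dataclass

# Illustrative thresholds only, not the production values.
DRY_THRESHOLDS = {"accept": 0.70, "review": 0.45}
RAIN_THRESHOLDS = {"accept": 0.85, "review": 0.60}  # more conservative in rain

@dataclass
class Decision:
    label: str  # "accept", "review", or "discard"

def route_detection(confidence: float, rain_probability: float,
                    rain_cutoff: float = 0.5) -> Decision:
    """Pick a decision path based on the rain classifier's output."""
    thresholds = RAIN_THRESHOLDS if rain_probability >= rain_cutoff else DRY_THRESHOLDS
    if confidence >= thresholds["accept"]:
        return Decision("accept")
    if confidence >= thresholds["review"]:
        return Decision("review")  # flag for human review
    return Decision("discard")
```

The same detection at confidence 0.75 is accepted on a dry frame but flagged for review on a rainy one. The primary model is untouched; only the decision path changes.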

Installation variance

Every installation is different. Camera angle, mounting height, distance from the container, orientation relative to the sun: all of these vary and all of them affect model performance. A model calibrated for one installation doesn't transfer automatically to another.

We've standardized on installation guidelines that reduce variance: specific mounting height ranges, recommended camera angles, minimum clearance requirements. Even with the guidelines, significant variance remains. The current solution is per-installation calibration: a short on-site capture session after installation produces installation-specific fine-tuning data.
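Guidelines like these are easy to encode as a pre-flight check at install time. A sketch with hypothetical ranges (the real guideline numbers differ):

```python
# Hypothetical guideline ranges, for illustration only.
GUIDELINES = {
    "mount_height_m": (2.5, 4.0),
    "camera_angle_deg": (30.0, 60.0),
    "distance_m": (3.0, 8.0),
}

def check_installation(params: dict) -> list:
    """Return a list of guideline violations for one installation."""
    violations = []
    for key, (lo, hi) in GUIDELINES.items():
        value = params.get(key)
        if value is None:
            violations.append(f"{key}: missing")
        elif not (lo <= value <= hi):
            violations.append(f"{key}: {value} outside [{lo}, {hi}]")
    return violations
```

An installer runs this against the measured parameters before leaving the site; anything flagged gets corrected while the ladder is still up.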

This doesn't scale well at 30 installations. At 300, it becomes untenable. The roadmap includes automated calibration from the first week of operational data. We haven't built it yet.

What three years of logistics CV taught me that applies here

Measure what you're deploying against, not what you trained against. The gap between training distribution and deployment distribution is the number that matters. In outdoor physical infrastructure, that gap is large and variable. Budget accordingly.

With gusto, Fatih.