Satellite Image Annotation: How AI Learns to Read the Physical World
Satellite Image Annotation: How AI Learns to Read the Physical World There is more intelligence packed into a single aerial image than most people realise. An overhead shot of a commercial property contains roads, parking areas, rooftops, vegetation, pathways, utility structures, shadows, boundaries, all layered on top of each other, all spatially related, all meaning something different depending on what the AI system is trying to understand. The models being built on top of satellite imagery today are genuinely impressive. They can estimate outdoor surface areas, identify vegetation density, detect parking utilisation, flag infrastructure changes, and generate property proposals — all from aerial data. But none of that happens because the model is clever. It happens because someone first sat down and taught the model what it was looking at. Feature by feature. Surface by surface. Property by property. That is satellite image annotation. And it is a lot more complex than it looks from the outside. Why Satellite Imagery Is a Different Kind of Annotation Problem Most computer vision annotation deals with objects that are familiar and visually distinct. A car is a car. A person is a person. The boundaries are usually clear and the categories are usually obvious. Satellite imagery does not work that way. From an aerial perspective: The visual complexity is high, and the margin for annotation error is low — because in geospatial AI, annotations do not just train a model to recognise objects. They train it to understand spatial relationships, measure surfaces, and reason about how a property is structured. A polygon that is slightly off does not just mislabel an object. It corrupts an area calculation. It breaks a surface estimate. It feeds wrong data into a proposal or planning workflow. This is why satellite image annotation demands a different level of precision and domain awareness than most annotation categories. What Satellite Image Annotation Actually Involves # Semantic Segmentation The foundation of most geospatial AI datasets is image semantic segmentation — classifying every visible region in the image by what it is. Grass, roads, rooftops, vegetation, pavements, parking areas, water surfaces, open land — each pixel region gets assigned to a category that tells the model how that surface type is distributed across the property. This is what allows AI systems to do: The challenge with image semantic segmentation in satellite imagery is consistency. The same surface type can look different depending on lighting, season, image resolution, and shadow coverage. Without tight ontology guidelines and disciplined annotator training, the same patch of ground gets labeled differently across different images — and that inconsistency compounds into model unreliability at scale. # Instance Segmentation Semantic segmentation tells the model what something is. Image instance segmentation tells it how many there are and where each one begins and ends. For property intelligence applications, this distinction matters enormously. A parking lot is not just a parking area — it is a collection of individual spaces, each of which can be occupied or empty, marked or unmarked, accessible or standard. A cluster of trees is not just vegetation, it is individual trees with distinct canopy boundaries that affect shading calculations and landscaping estimates. Image instance segmentation is what allows AI to: For applications like parking utilisation analysis, solar panel detection, or automated site planning, this layer of annotation is not optional — it is the core of what makes the model useful. # Polyline Annotation Not everything in a satellite image is a region. Some of the most important property features are linear — and those require a different annotation approach entirely. Roads, pathways, driveways, curbs, fence lines, utility lines, property boundaries — these are structural elements that define how a property is connected and how it is divided. Polyline annotation traces these linear features with precision, giving the model the vocabulary to understand: A model that cannot reliably trace a driveway from the road to the building entrance cannot support automated site analysis. A model that loses a property boundary mid-line cannot measure a lot accurately. # Bounding Box Annotation Not every object in a satellite dataset needs precise boundary annotation. For fast object localisation; vehicles, containers, rooftop equipment, utility poles, movable assets, 2D bounding box annotation is often the right tool. It is faster to produce, easier to scale, and sufficient for detection tasks where the exact object boundary matters less than knowing the object is there and roughly where it sits. In large-scale infrastructure monitoring or asset tracking workflows, 2D bounding box annotation forms a practical and efficient detection layer without the overhead of full segmentation. # Attribute-Based Annotation Detection and segmentation tell the model what is there and where. Attribute annotation tells it what kind of thing it is. These attribute labels are what push geospatial AI from detection into property-level intelligence. A model that knows a parking area exists is useful. A model that knows how many spaces it contains, what condition they are in, and how many are accessible is genuinely valuable for the platforms being built on top of it. Where Satellite Annotation Gets Hard — And How We Handle It Elevated structures cast shadows that hide what is underneath them. A tree canopy obscures the pathway running beneath it. A building shadow falls across a road and makes the surface boundary ambiguous. This is not just a visual problem, it is a consistency problem. If two annotators handle occluded regions differently, the model learns contradictory things about the same situation. The way we handle it is through clear occlusion protocols built into the annotation guidelines before the project starts. Every edge case, partial occlusion, full shadow coverage, seasonal variation gets a defined handling rule. Annotators are not making judgment calls on the fly. They are following a documented decision framework, and QA reviews specifically check for occlusion consistency across the dataset. This is the challenge that catches most teams off guard. From the air, concrete and asphalt are nearly indistinguishable without contextual cues. Dry vegetation and open gravel
Satellite Image Annotation: How AI Learns to Read the Physical World Read Post »