Artificial Intelligence

Artificial Intelligence (or AI) is a widely misused term, and these days it seems to cover everything from robots to automation to chatbots. Every few years there is an AI breakout technology that grabs headlines, and of course right now that's ChatGPT (or Generative AI and Large Language Models in general). For those of us in, or on the edges of, the Geospatial industry, our main focus for AI has been Machine Learning, Deep Learning and Computer Vision (note that the lines between these and other AI technologies are blurry and often overlapping).

I’m not going to go into the technical details of each AI component (I’d struggle if I tried), but I will cover some practical considerations.


Generative AI / Large Language Models – Text (ChatGPT, Google Gemini, Claude etc.)

Certainly the most famous at the moment, and for the most part experienced via a chat interface, there is no escaping how impressive ChatGPT and Large Language Models (LLMs) can be. Ask them any question and they will provide a convincing reply, often peppered with titbits of knowledge that even professionals in the subject might not know. Push a little deeper, though, and you might encounter some weaknesses, such as the propensity of the AI to make up answers or add figures that have no accurate source. Pushed even further, the AI is known to “hallucinate”, simply making everything up so that it can provide an answer. You should think of an AI like a human that is too afraid to be wrong and so invents things, using words that make them sound real.

With regard to what ChatGPT (and other tools) can do, I like to consider three things:

There are systems that claim to be AI (for example ADAS systems in modern cars), but these are not powered by anything like ChatGPT. However, there are examples where you can use the “read/write” concept to at least support an action. A simple example would be providing a plain English question to a chat interface, which the interface then converts to an SQL query to create an output (for example a map).
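To make that concrete, below is a minimal sketch of the plain-English-to-SQL idea in Python. It assumes an OpenAI-style chat completions API and a hypothetical PostGIS table called buildings; the model name, schema and question are all illustrative rather than taken from a real system.

```python
# Minimal sketch: turn a plain English question into SQL for a hypothetical PostGIS table.
# The table, columns and model name are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SCHEMA_HINT = "Table: buildings(id integer, height_m numeric, geom geometry(Polygon, 4326))"

def question_to_sql(question: str) -> str:
    """Ask the model to translate a natural language question into a single SQL query."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model name
        messages=[
            {"role": "system",
             "content": "You translate questions into PostGIS SQL. "
                        "Return only the SQL, nothing else. " + SCHEMA_HINT},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content.strip()

sql = question_to_sql("Show me all buildings taller than 50 metres")
print(sql)  # review (or restrict to read-only) before running against a real database
```

In practice the generated SQL would be validated, or limited to read-only queries, before being executed and the results passed on to whatever is drawing the map.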

Geospatial examples

Data Analysis and Interpretation: 

LLMs can understand and interpret natural language queries related to geospatial data analysis, allowing users to extract insights and make informed decisions from complex datasets.


Information Retrieval: 

LLMs can retrieve relevant geospatial information from vast repositories of data in response to natural language queries, facilitating access to valuable insights and knowledge.


Geospatial Data Annotation and Labelling:

LLMs can assist in annotating and labelling geospatial data by understanding natural language instructions, helping to improve the quality and accuracy of datasets used for training AI algorithms.


Geospatial Data Documentation and Reporting:

LLMs can generate comprehensive documentation and reports from geospatial data analysis results, summarising findings and insights in a clear and understandable format for stakeholders.
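
As a small illustration of the reporting point above, the sketch below hands some summary statistics from a hypothetical analysis to an LLM and asks for a short written summary; the statistics, model name and prompt wording are all invented for the example.

```python
# Sketch: summarise (made-up) geospatial analysis results as a plain English report.
from openai import OpenAI

client = OpenAI()

analysis_results = {
    "area_of_interest": "Sample District",  # placeholder values only
    "buildings_detected": 12450,
    "mean_building_footprint_m2": 86.4,
    "buildings_in_flood_zone": 312,
}

prompt = (
    "Write a short, plain English summary for stakeholders based on these "
    f"geospatial analysis results: {analysis_results}"
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model name
    messages=[{"role": "user", "content": prompt}],
)

print(response.choices[0].message.content)
```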

Computer Vision

This will be familiar to anyone in the Geospatial industry, and it has perhaps had the biggest impact (good and bad). In simple terms, it is the ability of a computer to understand what it sees in a photo or video. Seeing and understanding is something that humans take for granted, but computers really struggle: they see pixels and colours but struggle to work out what the photo is of. Computer Vision enables a computer to understand what a photo shows, although we are still at the very early stages. A human could see a photo of another human and instantly recognise things like their gender, hair and skin colour. They could probably also make assumptions about their age, nationality and emotion. If they knew the person, they would immediately recognise them. A computer might be able to do some of this today, but not all that well (although things are improving all the time).
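
To show what “understanding” an image looks like in code, here is a short sketch using an off-the-shelf pretrained classifier from torchvision; the image filename is a placeholder, and real-world systems are considerably more involved than this.

```python
# Sketch: ask a pretrained model what it "sees" in a photo.
# Uses torchvision's stock ResNet-50; the image path is a placeholder.
import torch
from PIL import Image
from torchvision.models import resnet50, ResNet50_Weights

weights = ResNet50_Weights.DEFAULT
model = resnet50(weights=weights)
model.eval()

preprocess = weights.transforms()  # the resizing and normalisation the model expects

image = Image.open("street_scene.jpg").convert("RGB")  # placeholder image
batch = preprocess(image).unsqueeze(0)

with torch.no_grad():
    scores = model(batch).softmax(dim=1)

confidence, class_id = scores.max(dim=1)
label = weights.meta["categories"][class_id.item()]
print(f"The model thinks this is a '{label}' ({confidence.item():.0%} confident)")
```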

However, for Geospatial the biggest impact has been the combination of computer vision and feature extraction (known as Automated Feature Extraction, or AFE). This is where a computer finds and identifies a feature in an image or video (relatively easy) and then pinpoints (extracts) its physical location (the hard bit). This allows for the automated creation of maps. Despite all the hype, we are still only at the start of this journey. There are plenty of AFE systems out there (for example counting cars in car parks, or extracting building footprints and roads), but the data they produce is very rarely “map ready” and requires a huge amount of human validation before it can be used.
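
The “pinpoint its physical location” step essentially means converting a detection's pixel coordinates into real-world coordinates using the image's georeferencing. The sketch below does this with rasterio for a single detected pixel; the GeoTIFF filename and the pixel position are placeholders standing in for a real detector's output.

```python
# Sketch: convert a detection's pixel position into map coordinates.
# Assumes a georeferenced GeoTIFF; the filename and pixel location are placeholders.
import rasterio

with rasterio.open("aerial_tile.tif") as src:
    row, col = 1520, 2310     # imagine a detector found a car centred on this pixel
    x, y = src.xy(row, col)   # pixel -> coordinates in the image's CRS
    print(f"Detected feature at ({x:.2f}, {y:.2f}) in {src.crs}")
```

Even with that step automated, the detections still need the human validation described above before they can be treated as map-ready data.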

Computer Vision is used in cars for ADAS, in venues for automated criminal identification, and on self-service scanners in shops when you weigh your bananas.