Public Services, Powered by AI: How WhatsApp Is Streamlining City Reporting in Cape Town
Using AI and WhatsApp to lower barriers to public service reporting across Cape Town. The prototype enables faster, more inclusive, and more accurate service request logging.

Introduction
The prototype uses natural language processing and structured categorisation logic to interpret resident reports submitted via text or voice notes. It extracts key details, identifies the service category, and presents a clear summary for the user to verify before submission. The approach aims to make reporting more accessible while supporting faster, more accurate resource allocation within the City.
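To make the verification step concrete, here is a minimal sketch of how extracted details might be held and turned into a summary for the resident to check. The `DraftReport` fields and the message wording are assumptions for illustration, not the prototype's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DraftReport:
    """Hypothetical container for details extracted from a resident's message."""
    category: str
    description: str
    location: Optional[str] = None  # may be missing until the resident supplies it


def confirmation_summary(report: DraftReport) -> str:
    """Build the plain-text summary a resident would check before submitting."""
    location = report.location or "not provided"
    return (
        "Please confirm your report:\n"
        f"Category: {report.category}\n"
        f"Issue: {report.description}\n"
        f"Location: {location}\n"
        "Reply YES to submit or NO to make changes."
    )
```

Keeping the summary as a pure function of the extracted fields means the same text can be shown whether the report arrived as typed text or as a transcribed voice note.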
Context and Problem
Advances in AI present an opportunity to simplify the service reporting process. With WhatsApp already widely used across Cape Town, introducing an AI-supported flow can make reporting more convenient while improving the quality of information captured at the point of entry.

Solution Overview:

Additional information is collected using static blocks (without the use of AI):
Theory of Change
The table below outlines the expected activities, outputs, outcomes, and impact of the prototype.
Activities: Build the WhatsApp reporting system using AI, pilot with residents and customer relations staff, and refine alongside service teams.
Outputs: A WhatsApp system that allows residents to easily log reports via voice or text directly from their phone, confirm details, and submit categorised requests.
Outcomes: Easier submission process, less manual logging for staff, and higher quality report information.
Impact: More reports logged and resolved faster, with wider reach across the city.
1. AI Classification and Extraction
2. AI Categorisation: two approaches
LLM classifier
- A structured .txt file with consistent category formatting
- A small number of targeted examples
- A simplified prompt for reliable matching
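The three ingredients above can be sketched as follows. The file layout (`Category name | short description`, one per line), the function names, and the prompt wording are all assumptions made for illustration; the source does not specify the actual format.

```python
from typing import Dict, List, Tuple


def parse_categories(text: str) -> Dict[str, str]:
    """Parse lines of the assumed form 'Category name | short description'."""
    categories: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        name, _, description = line.partition("|")
        categories[name.strip()] = description.strip()
    return categories


def build_classifier_prompt(
    categories: Dict[str, str],
    examples: List[Tuple[str, str]],
    message: str,
) -> str:
    """Assemble a short prompt: instruction, category list, a few examples, the report."""
    lines = [
        "Match the resident's report to exactly one category below.",
        "Reply with the category name only.",
        "",
        "Categories:",
    ]
    for name, description in categories.items():
        lines.append(f"- {name}: {description}")
    lines.append("")
    lines.append("Examples:")
    for report, category in examples:
        lines.append(f'Report: "{report}" -> {category}')
    lines.append("")
    lines.append(f'Report: "{message}" ->')
    return "\n".join(lines)
```

Because the category list is read from a file rather than written into the prompt, categories can be added or reworded without touching the prompt logic.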
Clarification agent


3. Multilingual Support
4. Voice and Text Flexibility
5. User Confirmation
6. Integration with the City system
The final approach used smaller, focused AI components:
Prompt Experimentation
This approach resulted in several challenges, such as:
01
02
03

Additional experiments included:
- Instructing the assistant to summarise long user messages before classifying them, for example: "If the user described the issue in natural language (e.g., 'Someone dumped bricks on the pavement'), extract or summarise it clearly (e.g., 'Building materials dumped on pavement')."
- Testing different tones of voice, starting with a friendly, conversational tone and later shifting to a more concise one
- Adding multiple examples of issue classifications directly into the prompt to improve accuracy
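As a rough illustration of the summarise-before-classifying experiment, the sketch below chains two small prompts. The `complete` parameter stands for any text-in, text-out model call; it, the instruction strings, and the `fake_model` stub (including its "Illegal dumping" category) are hypothetical and not taken from the prototype.

```python
from typing import Callable, Dict

SUMMARISE_INSTRUCTION = (
    "Summarise the resident's message as one short, factual description "
    "of the issue. Example: 'Someone dumped bricks on the pavement' -> "
    "'Building materials dumped on pavement'."
)
CLASSIFY_INSTRUCTION = "Reply with the single best matching category name."


def summarise_then_classify(
    message: str, complete: Callable[[str], str]
) -> Dict[str, str]:
    """Run two small prompts in sequence: first summarise, then classify the summary."""
    summary = complete(f"{SUMMARISE_INSTRUCTION}\n\nMessage: {message}").strip()
    category = complete(f"{CLASSIFY_INSTRUCTION}\n\nIssue: {summary}").strip()
    return {"summary": summary, "category": category}


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call, used only to show the flow."""
    if prompt.startswith("Summarise"):
        return "Building materials dumped on pavement"
    return "Illegal dumping"  # hypothetical category name


result = summarise_then_classify("Someone dumped bricks on the pavement", fake_model)
```

Passing the model call in as a parameter keeps each step testable on its own, which suits the kind of prompt-by-prompt experimentation described here.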

Evaluation Approach
Testing With Synthetic Data

The process included:
- Generating synthetic complaints for each category in different tones
- Running these complaints through the categorisation prompts
- Comparing intended and predicted categories
- Manually reviewing errors to understand patterns
A similar approach to testing will be used for further iterations.
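The comparison step in this process could look something like the sketch below. It assumes test cases are stored as (complaint, intended category) pairs and that `classify` wraps whichever categorisation prompt is under test; both names are illustrative.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple


def evaluate(
    cases: List[Tuple[str, str]],
    classify: Callable[[str], str],
) -> Dict[str, object]:
    """Compare intended and predicted categories and collect errors for manual review."""
    errors = []             # (complaint, intended, predicted) for each miss
    confusions = Counter()  # how often each intended category was predicted as another
    for complaint, intended in cases:
        predicted = classify(complaint)
        if predicted != intended:
            errors.append((complaint, intended, predicted))
            confusions[(intended, predicted)] += 1
    accuracy = (len(cases) - len(errors)) / len(cases) if cases else 0.0
    return {"accuracy": accuracy, "errors": errors, "confusions": confusions}
```

Sorting `confusions` by count, for example with `confusions.most_common()`, gives a quick view of which category pairs are mixed up most often and is a natural starting point for the manual error review.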
Challenges
Hallucinations
Missing or repeated questions
Image classification loops
Location
Key Learnings
01.
Simplicity improves accuracy
Structured, concise prompts and a clean category file outperformed longer, complex approaches.
02.
Small, specialised prompts work best
Breaking tasks into smaller components reduced errors and improved control.
03.
Iterative testing is essential
Prompt behaviour changed with small adjustments, requiring ongoing refinement.
04.
Synthetic data strengthens reliability
Controlled testing identified weaknesses not visible through manual testing alone.
05.
Multilingual design must be intentional
Clear examples and tailored prompts improved recognition across languages.
06.
Voice is an important channel
Many residents rely on voice notes. Designing for both formats is necessary for inclusion.
07.
Shorter journeys improve completion
Simplified flows helped test groups reach the core value quickly.
08.
Quick prototypes improve alignment
09.
Human Capital Requirements
Establish a core AI team for prompt design and testing, supported by teams for CRM, category, and workflow alignment.

Conclusion:
Iterative prompt design, multilingual support, and synthetic evaluation played a critical role in developing a reliable model. The methodology is adaptable and can be applied by other cities aiming to strengthen public participation and modernise service reporting through accessible digital channels.











