Table Of Contents
- What Are Monzo Pots and Why Do They Matter?
- What is Topic Modeling in Financial Customer Analysis?
- How Monzo Applies Topic Modeling to Understand Pots Usage
- What Categories of Saving Goals Are Most Common at Monzo?
- How Does Monzo Identify Seasonal Trends in Saving Goals?
- What Insights Can Emojis Provide About User Saving Behavior?
- Can Topic Modeling Improve Product Features for Digital Banks?
- How Do Data Cleaning Steps Impact NLP Results in Banking?
- Comparing Biterm Topic Models with LDA on Short-Form Banking Texts
- Are Customer Savings Motivations Changing Over Time at Monzo?
- FAQs
- Conclusion: The Future of Savings with Topic Modeling
In the fast-evolving world of digital banking, understanding what drives customers to save is key to building products that truly resonate. Monzo, the UK-based digital bank, has taken an innovative approach to decoding customer behavior by applying topic modeling to saving goals. Using machine learning and text clustering techniques, it analyzes the free-text names customers give to its savings feature, called Pots, to uncover insights into user behavior. In this blog, we’ll explore how Monzo applies topic modeling to understand customer saving goals, the role of emojis in financial personalization, and actionable insights for fintechs looking to enhance their offerings.
What Are Monzo Pots and Why Do They Matter?
Pots are a unique savings feature that allows customers to set aside money from their main account for specific purposes. Whether it’s saving for a dream vacation, a house deposit, or monthly bills, customers can name their Pots using free text, including emojis. This flexibility creates a rich, yet complex, dataset of nearly 5 million unique Pot names, with 350,000 new Pots created monthly.
The challenge? These short, often quirky names—like “Holiday 🌴” or “Bills 💸”—are tough to analyze manually. That’s where topic modeling for Pots analysis comes in, enabling the bank to cluster these names into meaningful categories and gain user behavior insights. By understanding what customers are saving for, it can tailor product features to better meet their needs.
What is Topic Modeling in Financial Customer Analysis?
Topic modeling is a powerful machine learning technique used to identify patterns in unstructured text data. In the context of saving goals, it groups similar Pot names into themes based on recurring words, phrases, or emojis. Unlike traditional methods that rely on manual categorization, topic modeling is unsupervised, meaning it doesn’t require predefined labels. This makes it ideal for analyzing messy, free-text data like customer Pot names.
The bank applies a specialized form called biterm topic modeling (BTM), which is particularly effective for short texts. Unlike standard topic modeling (e.g., Latent Dirichlet Allocation or LDA), BTM focuses on word pairs across the entire Pot name, capturing relationships even in brief phrases. For example, “Holiday Italy” and “Trip Rome” might be clustered together under a “travel” theme.
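To make the word-pair idea concrete, here is a minimal sketch of how biterms could be extracted from tokenized Pot names and pooled across a corpus. The function name `biterms` and the toy Pot names are assumptions for illustration, not Monzo's code, and a full BTM would go on to fit topics over these pooled pairs.

```python
from collections import Counter
from itertools import combinations

def biterms(tokens):
    """Return every unordered word pair (biterm) in one short text."""
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]

# Toy, already-tokenized Pot names (made up for illustration).
pot_names = [
    ["holiday", "italy"],
    ["trip", "rome"],
    ["holiday", "rome", "trip"],
]

# BTM pools biterms over the whole corpus instead of modeling each
# document's sparse word counts separately.
corpus_biterms = Counter()
for tokens in pot_names:
    corpus_biterms.update(biterms(tokens))
```

Note that a single-word name such as `["savings"]` yields no biterms at all, which is why one-word Pots need separate handling.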
Why It Matters
Scalability: With millions of Pot names, manual analysis is impossible. Topic modeling automates the process, handling large datasets efficiently.
Granular Insights: It reveals nuanced saving motivations, from big life events to small daily expenses.
Customer-Centric Design: Understanding savings goals helps the bank create features that align with real user needs.
How Monzo Applies Topic Modeling to Understand Pots Usage
Monzo’s approach to text clustering in digital banking involves several key steps, each designed to transform raw Pot names into actionable insights. Let’s break it down:
1. Data Preparation: Cleaning the Chaos
Text data is notoriously messy, and Monzo’s Pot names are no exception. With 50% of Pots being single words (e.g., “Savings” or “Bills”) and 8% containing emojis, robust data cleaning is critical. Monzo’s process includes:
Lowercasing and Punctuation Removal: Standardizing text by converting to lowercase and stripping punctuation ensures consistency.
Lemmatization: Simplifying variations like “monthly” to “month” or “bday” to “birthday” reduces noise.
Emoji Translation: Converting emojis to text equivalents (e.g., 🏠 to “:house:”) makes them machine-readable while preserving meaning.
Best practices for cleaning financial text data for topic modeling include ensuring consistency in spelling, handling abbreviations, and accounting for cultural nuances in language. For instance, Monzo’s team addresses typos and slang to improve model accuracy.
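The cleaning steps above can be sketched with the standard library alone. The lookup tables below are tiny hypothetical stand-ins: a real pipeline would likely use a full emoji dictionary and a proper lemmatizer, and only the `:house:` alias comes from the text above.

```python
import string

# Hypothetical lookup tables, for illustration only.
EMOJI_TO_TEXT = {"🏠": ":house:", "🌴": ":palm_tree:", "💸": ":money_with_wings:"}
NORMALIZE = {"monthly": "month", "bday": "birthday"}

# Translation table that deletes ASCII punctuation.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)

def clean_pot_name(name):
    """Lowercase, strip punctuation, translate emojis, normalize word forms."""
    # Strip punctuation before emoji translation so the colons in the
    # emoji aliases are not removed.
    text = name.lower().translate(_PUNCT_TABLE)
    for emoji_char, alias in EMOJI_TO_TEXT.items():
        text = text.replace(emoji_char, f" {alias} ")
    return [NORMALIZE.get(token, token) for token in text.split()]
```

The order of operations matters here: removing punctuation after emoji translation would turn `:house:` into the plain word `house` and lose the signal that it came from an emoji.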
2. Biterm Topic Modeling: Clustering Short Texts
Given the brevity of Pot names, Monzo opts for biterm topic modeling over traditional LDA. BTM counts word pairs across the entire text, making it ideal for short-form data. The process involves:
Training the Model: Monzo used a sample of 800,000 unique Pot names to train the model, testing different numbers of topics to find the optimal configuration.
Coherence Scoring: The model calculates a coherence score to measure how well words within a topic align. Higher scores indicate better-defined categories.
Keyword and Emoji Lists: Post-modeling, Monzo extracts keywords, phrases, and emojis to assign Pots to one of 20 topics, such as “travel,” “life events,” or “generic saving.”
How can biterm topic modeling cluster short text like Pot names? By focusing on word-pair relationships pooled across the whole corpus, BTM captures context even in very short or emoji-heavy names, which supports accurate clustering.
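The article does not say which coherence measure was used, so the sketch below assumes a UMass-style score, one common choice: for each pair of a topic's top words, it takes the log of how often the two co-occur in a document relative to how often the earlier-ranked word appears. The function name and toy documents are hypothetical.

```python
import math
from itertools import combinations

def umass_coherence(top_words, documents):
    """UMass-style coherence for one topic's top words (closer to 0 is better).

    Assumes every top word occurs in at least one document.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents containing all the given words.
        return sum(1 for doc in doc_sets if all(w in doc for w in words))

    score = 0.0
    # combinations() keeps list order, so `earlier` is the higher-ranked word.
    for earlier, later in combinations(top_words, 2):
        score += math.log((doc_freq(earlier, later) + 1) / doc_freq(earlier))
    return score

# Toy tokenized Pot names (made up for illustration).
documents = [
    ["holiday", "italy"],
    ["holiday", "spain"],
    ["bills", "rent"],
    ["rent", "month"],
]
travel_score = umass_coherence(["holiday", "italy"], documents)
mixed_score = umass_coherence(["holiday", "rent"], documents)
```

On this toy corpus the travel word pair scores higher than the mixed pair, because "holiday" and "rent" never appear in the same Pot name. Comparing average scores across models trained with different topic counts is one way such a score could guide the choice of configuration.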
3. Handling Edge Cases
Single-word Pots like “Savings” lack context, making them tricky for topic modeling. Monzo overcomes this by creating keyword lists based on model outputs, allowing even one-word Pots to be categorized effectively. For example, “Savings” might align with the “generic saving” topic based on its prevalence in similar contexts.
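A keyword-list lookup of this kind could look roughly like the following. The keyword sets are invented for illustration (in practice they would be derived from the model's output), and `assign_topic` is a hypothetical helper, not Monzo's implementation.

```python
# Hypothetical keyword lists; real ones would come from the model's top words.
TOPIC_KEYWORDS = {
    "generic saving": {"savings", "rainy"},
    "travel": {"holiday", "trip", ":palm_tree:"},
    "household bills": {"bills", "rent"},
}

def assign_topic(tokens, topic_keywords=TOPIC_KEYWORDS, default="uncategorised"):
    """Pick the topic whose keyword list overlaps most with the Pot's tokens."""
    token_set = set(tokens)
    hits = {topic: len(keywords & token_set)
            for topic, keywords in topic_keywords.items()}
    best_topic, best_hits = max(hits.items(), key=lambda item: item[1])
    # Ties go to the topic listed first; no overlap falls back to the default.
    return best_topic if best_hits > 0 else default
```

Because the lookup needs only one matching token, it works for one-word names that produce no word pairs for the model to learn from.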
What Categories of Saving Goals Are Most Common at Monzo?
Monzo’s topic modeling revealed 20 distinct saving categories, covering a wide range of customer goals. Here are some key findings:
Generic Saving (30%): The largest category, including Pots named “Savings,” “Rainy Day,” or 💾. These reflect general financial caution or unspecified goals.
Travel (15%): Pots like “Holiday 🌴” or “Trip to Spain” highlight customers saving for vacations, with seasonal spikes in January and June.
Life Events: Pots for birthdays, weddings, or Christmas (🎁) peak toward year-end, driven by holiday gifting.
Household Bills: Pots named “Bills” or “Rent 💸” indicate practical, recurring expenses.
Pets: A smaller but charming category, with names like “Doggo 🐶” showing savings for pet-related costs.
What kind of data does Monzo use for Pots topic modeling? The model was trained on a sample of 800,000 unique Pot names, with a mix of single words, short phrases, and emojis, providing a rich source for analysis.
How Does Monzo Identify Seasonal Trends in Saving Goals?
One of the most compelling insights from Monzo’s analysis is the seasonal variation in saving behavior. By tracking Pot creation over time, it uncovered patterns that reflect real-world events:
Life Events (e.g., Christmas): Creation of “life events” Pots spikes in late fall, as customers save for holiday gifts (🎁). This aligns with broader retail trends, where holiday spending peaks in Q4.
Travel: Travel-related Pots surge in January, likely as customers plan to beat the post-holiday blues with vacation goals. A secondary spike in June reflects summer travel planning. During the pandemic, travel Pot creation dropped but has since shown recovery.
Insights from analyzing seasonal patterns in saving Pots at Monzo help the bank anticipate customer needs, such as offering timely reminders or tailored savings tools during peak seasons.
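Once each Pot has a topic, spotting seasonality can be as simple as counting creations per calendar month. The sketch below uses a handful of made-up records purely to show the mechanics. The dates, the record layout, and the helper names are all assumptions, not Monzo data.

```python
from collections import Counter
from datetime import date

# Made-up (creation_date, topic) records, for illustration only.
pots = [
    (date(2023, 1, 5), "travel"),
    (date(2023, 1, 20), "travel"),
    (date(2023, 6, 2), "travel"),
    (date(2023, 11, 15), "life events"),
    (date(2023, 11, 28), "life events"),
    (date(2023, 3, 9), "life events"),
]

def monthly_share(records, topic):
    """Fraction of a topic's Pots created in each calendar month (1-12)."""
    counts = Counter(created.month for created, t in records if t == topic)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {month: counts[month] / total for month in range(1, 13)}

def peak_month(records, topic):
    """Calendar month with the largest share of creations, or None."""
    share = monthly_share(records, topic)
    return max(share, key=share.get) if share else None
```

A production analysis would probably also normalize by total Pot creation in each month, so that overall customer growth is not mistaken for a seasonal spike.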
What Insights Can Emojis Provide About User Saving Behavior?
Emojis are more than just fun—they’re a window into customer intent. Monzo found that 8% of Pots include emojis, which add emotional and contextual depth to names. For example:
Life Events: 🎄 and 🎁 dominate, signaling Christmas or gifting goals.
Travel (Destination): Country flags (e.g., 🇮🇹 for Italy) appear in 58% of emoji-containing travel Pots, revealing specific destinations.
Pets: Emojis like 🐶 or 🐱 often reflect the type of pet customers are saving for.
Should banks analyze emoji usage in customer data? Absolutely. Emojis provide a layer of nuance that text alone can’t capture. For instance, “Holiday ⛷️” versus “Holiday 🌴” suggests different types of trips (skiing vs. beach), influencing savings timelines and goals. Solutions for emoji-based customer intent analysis in fintech include mapping emojis to text equivalents and integrating them into NLP pipelines.
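As one example of emoji-based intent analysis, destination flags can be decoded without any external library: a country flag emoji is a pair of Unicode regional indicator symbols that spell a two-letter country code. The function name below is hypothetical, and this is only a sketch of the idea rather than a complete emoji parser.

```python
REGIONAL_INDICATOR_START = 0x1F1E6  # regional indicator symbol letter A
REGIONAL_INDICATOR_END = 0x1F1FF    # regional indicator symbol letter Z

def flag_country_codes(name):
    """Return two-letter country codes for the flag emojis in a Pot name."""
    codes = []
    letters = []
    for char in name:
        code_point = ord(char)
        if REGIONAL_INDICATOR_START <= code_point <= REGIONAL_INDICATOR_END:
            # Map the regional indicator back to its ASCII letter.
            letters.append(chr(code_point - REGIONAL_INDICATOR_START + ord("A")))
            if len(letters) == 2:
                codes.append("".join(letters))
                letters = []
        else:
            # Anything else breaks an incomplete pair.
            letters = []
    return codes

italy_codes = flag_country_codes("Holiday 🇮🇹")
beach_codes = flag_country_codes("Holiday 🌴")
```

Subdivision flags such as England or Scotland are encoded differently (a black flag followed by tag characters), so this sketch would not detect them.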
Can Topic Modeling Improve Product Features for Digital Banks?
Monzo’s work demonstrates that NLP for financial product personalization can transform how banks design features. By understanding saving motivations, Monzo can:
Personalize Notifications: Send reminders tailored to specific goals, like “You’re close to your holiday fund 🌴!”
Enhance Features: Introduce goal-specific tools, such as budgeting templates for travel or bills.
Predict Trends: Use seasonal insights to offer timely promotions or savings challenges.
Can topic modeling improve product features for digital banks? Yes, by providing data-driven insights into customer needs, it enables banks to create more relevant, engaging products.
How Do Data Cleaning Steps Impact NLP Results in Banking?
Clean data is the backbone of effective NLP. Monzo’s rigorous cleaning process—lowercasing, lemmatization, and emoji translation—ensures the model captures meaningful patterns. Poor cleaning can lead to:
Missed Connections: Typos like “bday” vs. “birthday” could fragment clusters.
Inaccurate Categorization: Untranslated emojis might be ignored, losing valuable context.
Lower Coherence: Noisy data reduces topic coherence, making results less actionable.
Best practices for cleaning financial text data for topic modeling include automating typo correction, standardizing slang, and preserving emoji meaning. These steps directly affect the quality of the customer goal patterns that banking data science teams can extract.

Comparing Biterm Topic Models with LDA on Short-Form Banking Texts
While LDA is a popular choice for topic modeling, it struggles with short texts like Pot names due to limited context. BTM, as used by Monzo, excels because:
Word-Pair Focus: BTM analyzes word co-occurrences across the entire text, not just adjacent words.
Robust for Short Texts: It handles single-word or emoji-heavy Pots effectively.
Higher Coherence: Monzo’s testing showed BTM outperformed LDA in coherence scores for Pot names.
Is clustering effective for segmenting small-text financial data? Yes, especially with BTM, which is tailored for concise, unstructured data like financial product names.
Are Customer Savings Motivations Changing Over Time at Monzo?
Monzo’s analysis shows that saving motivations evolve with external factors:
Pandemic Impact: Travel Pots declined during travel restrictions but rebounded as restrictions eased.
Economic Shifts: Rising interest in “generic saving” may reflect economic uncertainty, as customers prioritize financial security.
Cultural Trends: Emojis like 🐶 for pets or 🌴 for travel highlight shifting priorities, with younger customers embracing expressive naming.
FAQs
What is topic modeling in financial customer analysis?
It’s an unsupervised machine learning technique that identifies themes in text data, like Monzo’s Pot names, to understand customer behavior.
What categories of saving goals are most common at Monzo?
Generic saving (30%), travel (15%), life events, household bills, and pets are among the top categories.
How does Monzo apply topic modeling to understand Pots usage?
By cleaning data, using biterm topic modeling, and creating keyword lists to categorize Pots.
How can biterm topic modeling cluster short text like Pot names?
It analyzes word pairs across the entire text, capturing context in brief phrases.
How do data cleaning steps impact NLP results in banking?
Cleaning ensures accurate clustering by standardizing text and preserving emoji meaning.
Can topic modeling improve product features for digital banks?
Yes, by uncovering customer needs for personalized features.
Should banks analyze emoji usage in customer data?
Yes, emojis add emotional and contextual depth to text analysis.
Are customer savings motivations changing over time at Monzo?
Yes, influenced by seasonal, economic, and cultural factors.
Conclusion: The Future of Savings with Topic Modeling
Monzo’s use of topic modeling for customer saving goals showcases the power of NLP in understanding customer behavior. By clustering Pot names into meaningful categories, Monzo gains insights into what motivates customers to save, whether it’s a sunny vacation, a furry friend, or holiday gifts. These insights drive smarter product decisions, from personalized notifications to seasonal savings tools. For fintechs, Monzo’s approach to clustering savings motivations offers a blueprint for leveraging NLP for financial product personalization. As customer needs evolve, topic modeling will remain a cornerstone of data-driven banking, helping create products that truly resonate.
Want to dive deeper into data-driven banking? Check out Monzo’s open roles in data science and analytics to join the revolution!