Enhancing User Expression: ML-Powered Vernacular Sticker Keyboard at Hike
As the lead of the Machine Learning team at Hike Limited, I led the development of an AI-driven vernacular sticker keyboard. The project set out to improve user expression by suggesting stickers from multilingual, code-mixed input such as Hinglish, Tamil-English, and other language combinations.
Project Overview
Our goal was to create a smart sticker suggestion system that could understand and respond to diverse linguistic inputs, while personalizing suggestions based on individual user preferences and interactions.
Technical Approach
Core Technologies
- Python for backend development and model training
- TensorFlow and TensorFlow Lite for model development and on-device inference
- Natural Language Processing (NLP) techniques for language understanding
- BigQuery for data storage and analysis
- Airflow for workflow orchestration
Key Features
- Multilingual Input Processing: Developed NLP models capable of understanding and interpreting mixed-language inputs.
- Contextual Sticker Suggestion: Created an AI model to suggest relevant stickers based on input text and context.
- On-Device Personalization: Implemented TensorFlow Lite models for on-device learning and personalization.
- Federated Learning: Developed a system for updating global models while maintaining user privacy.
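To make the suggestion flow concrete, here is a minimal, rule-based sketch of mapping code-mixed text to stickers. It is only a stand-in for the trained NLP models described above: the lexicon, the sticker identifiers, and the function names `normalize` and `suggest_stickers` are all hypothetical and chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical lexicon: romanized words from several languages mapped to one
# shared intent tag. A production system would learn this from data.
INTENT_LEXICON = {
    "hello": "greeting",
    "hi": "greeting",
    "namaste": "greeting",
    "vanakkam": "greeting",
    "thanks": "thanks",
    "shukriya": "thanks",
    "nandri": "thanks",
}

# Hypothetical sticker catalogue keyed by intent tag.
STICKERS_BY_INTENT = {
    "greeting": ["sticker_namaste_01", "sticker_wave_02"],
    "thanks": ["sticker_thanks_01"],
}


def normalize(token):
    """Lowercase and collapse any character repeated 3+ times to one."""
    return re.sub(r"(.)\1{2,}", r"\1", token.lower())


def suggest_stickers(text, limit=3):
    """Return up to `limit` sticker ids for the intents found in `text`."""
    # Extract runs of letters (Unicode-aware), then normalize each one.
    tokens = [normalize(t) for t in re.findall(r"[^\W\d_]+", text)]
    # Count how often each known intent appears in the message.
    intents = Counter(INTENT_LEXICON[t] for t in tokens if t in INTENT_LEXICON)
    suggestions = []
    for intent, _count in intents.most_common():
        suggestions.extend(STICKERS_BY_INTENT.get(intent, []))
    return suggestions[:limit]


if __name__ == "__main__":
    print(suggest_stickers("Namasteee bhai, shukriya!"))
```

The point of the sketch is the shape of the pipeline (normalize noisy spellings, map words from different languages to a shared intent, rank stickers by intent), not the lookup table itself.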
Implementation Challenges and Solutions
Challenge: Handling diverse linguistic combinations accurately. Solution: Trained models on a vast corpus of multilingual data and implemented advanced tokenization techniques.
Challenge: Ensuring real-time performance on mobile devices. Solution: Optimized models for mobile using TensorFlow Lite and implemented efficient caching mechanisms.
Challenge: Balancing personalization with user privacy. Solution: Implemented federated learning techniques, allowing model improvements without centralized data collection.
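The aggregation step at the heart of federated learning can be sketched as a weighted average of client model weights. This post does not say which aggregation rule the Hike system used, so the example below assumes plain federated averaging; the function name `federated_average` and the flat list representation of weights are hypothetical.

```python
def federated_average(client_updates):
    """Average client weights, weighted by each client's example count.

    `client_updates` is a list of (weights, num_examples) tuples, where
    `weights` is a flat list of floats of the same length for every client.
    Only these weight vectors leave the device, never the raw user text.
    """
    if not client_updates:
        raise ValueError("no client updates to aggregate")
    total_examples = sum(n for _, n in client_updates)
    if total_examples <= 0:
        raise ValueError("clients reported no training examples")

    dim = len(client_updates[0][0])
    averaged = [0.0] * dim
    for weights, num_examples in client_updates:
        if len(weights) != dim:
            raise ValueError("all clients must send weights of equal length")
        share = num_examples / total_examples
        for i, w in enumerate(weights):
            averaged[i] += w * share
    return averaged


if __name__ == "__main__":
    # Two hypothetical clients: the second trained on three times more data,
    # so its weights count three times as much in the global model.
    updates = [([1.0, 2.0], 1), ([3.0, 4.0], 3)]
    print(federated_average(updates))
```

Weighting by example count keeps a client with very little local data from pulling the global model as strongly as a heavy user.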
Development Process
- Data Collection and Analysis: Gathered and analyzed user interaction data using BigQuery to understand sticker usage patterns.
- Model Development: Iteratively developed and refined NLP and recommendation models using TensorFlow.
- On-Device Implementation: Optimized models for mobile devices using TensorFlow Lite.
- Federated Learning Setup: Designed and implemented a federated learning system for privacy-preserving model updates.
- Testing and Refinement: Conducted extensive A/B testing to optimize model performance and user satisfaction.
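As an illustration of the A/B testing step, the sketch below compares the sticker-send rate of a control group and a treatment group with a two-proportion z-test. The post does not describe the statistical method actually used, so this is one common choice rather than the team's procedure; the function name and the sample counts are hypothetical.

```python
import math


def two_proportion_z_test(sends_a, users_a, sends_b, users_b):
    """Return (z, two-sided p-value) for the difference in rates B minus A."""
    rate_a = sends_a / users_a
    rate_b = sends_b / users_b
    # Pooled rate under the null hypothesis that both groups behave the same.
    pooled = (sends_a + sends_b) / (users_a + users_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / users_a + 1 / users_b))
    if std_err == 0:
        # Both groups are all-zero or all-one: no measurable difference.
        return 0.0, 1.0
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Made-up counts: 100 of 1000 control users sent a suggested sticker,
    # versus 130 of 1000 users who saw the new suggestions.
    z, p = two_proportion_z_test(100, 1000, 130, 1000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference between the two groups is unlikely to be noise, which is the kind of evidence needed before rolling a new suggestion model out to everyone.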
Results and Impact
- Achieved a 40% increase in sticker usage across the platform.
- Improved sticker suggestion relevance by 60% compared to the previous system.
- Successfully handled inputs in over 10 different language combinations.
- Maintained user privacy while achieving continuous model improvements through federated learning.
Conclusion
The ML-powered vernacular sticker keyboard project at Hike exemplifies the potential of AI in enhancing user expression and engagement. By successfully integrating advanced NLP techniques, on-device learning, and federated learning, we created a system that not only understands diverse linguistic inputs but also personalizes the experience for each user.
This project showcases the power of combining cutting-edge ML technologies with a deep understanding of user needs and privacy concerns. As we continue to refine and expand this feature, it remains a cornerstone of Hike’s commitment to providing innovative, user-centric communication tools.