To increase user engagement and drive product growth, we conducted user research and gathered feedback to identify areas for improvement and, ultimately, to develop a new product.
I collaborated closely with the Product Director, 2 Product Managers, and 8 engineers to initiate the research process, generate ideas, validate solutions, and launch the product.
Successfully brought the product from concept to launch, generated initial interest, and secured additional funding through a Post-A venture round.
InfuseAI offers PrimeHub, an open-source MLOps platform that streamlines research for data scientists through easy dataset loading and resource management. To sustain growth, we conducted new user research to identify opportunities and improve our product.
Our original product, PrimeHub.
To help the team conduct user research and bring the sales team into the process, I developed an initial plan outlining the steps for launching research to gather customer needs.
The research process I created
To understand the pain points in the ML deployment process, I first helped the team create a survey. Then, in collaboration with the community manager, we distributed the survey across various data science communities. The survey also offered participants a $45 Amazon voucher for their time if they were willing to have a follow-up conversation with us about their feedback. The survey was shared in 23 communities and received 115 responses, resulting in 37 participants for further research.
A message that I sent to a well-known data scientist with 20k+ followers.
Social media posts
Fun fact #1
When reaching out to strangers on the internet, my approach is to be human and think about how I would engage in a conversation with them in person. I try to ask questions that will make the person feel comfortable and willing to respond. Since I have worked for several startups that had limited resources in the beginning, I have had ample experience in reaching out to strangers online. Additionally, I make sure to tailor my approach to the specific person and their interests in order to build a connection and increase the likelihood of them responding positively.
Research
After we talked with the data scientists, we realized that they care deeply about model monitoring and collaborating on it, rather than about the overall deployment process.
Challenge #1
This is a common practice among data scientists as it allows them to easily organize and track the different models they are working on, along with their corresponding performance metrics and parameters.
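To make this practice concrete, here is a minimal sketch of the spreadsheet-style tracking described above: each row records a model run with its parameters and metrics, serialized as CSV the way a shared Google Sheet would hold it. The model names, parameters, and numbers are illustrative, not from the actual research.

```python
import csv
import io

# Hypothetical example rows: one model run per row, with its
# hyperparameters and performance metrics (values are made up).
runs = [
    {"model": "xgboost-v1", "learning_rate": 0.1, "accuracy": 0.87},
    {"model": "xgboost-v2", "learning_rate": 0.05, "accuracy": 0.91},
    {"model": "logreg-baseline", "learning_rate": 0.01, "accuracy": 0.82},
]

# Serialize the runs to CSV, the same shape a shared spreadsheet would hold.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["model", "learning_rate", "accuracy"])
writer.writeheader()
writer.writerows(runs)

# Reading it back, a teammate can sort by a metric to find the best run.
buffer.seek(0)
rows = list(csv.DictReader(buffer))
best = max(rows, key=lambda r: float(r["accuracy"]))
print(best["model"])  # → xgboost-v2
```

This works well for a handful of runs, which is exactly why the approach is so common; the later challenges describe how it breaks down as the number of models grows.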
Challenge #2
Notion and Google Sheets are popular tools used by data scientists to collaborate on projects. Notion is often used for project management and documentation while Google Sheets is used to store and share data. This combination of tools allows data scientists to easily share and access information, streamline communication and improve collaboration.
Challenge #3
As the number of models increases, it can become difficult to maintain an organized and accurate spreadsheet. This can lead to errors and inconsistencies in the data, and it can also become increasingly time-consuming to update and manage the spreadsheet. This can result in data scientists spending more time on data organization and less time on actual data analysis, which can hinder the project's progress.
Ideation
Based on the pain points we collected from interviewing data scientists, we organized a series of workshops to generate solutions and brainstorm ideas. These workshops provided an opportunity for the team to collaborate and visualize our ideas, which helped to bring our concepts to life and gain a better understanding of how they could be implemented. By sharing our ideas with the team, we were able to gather feedback and make adjustments as needed to ensure that our solutions were effective and met the needs of the data scientists.
Part of the workshop I facilitated to help the team visualize the ideas
Design
After the workshop where we shared our sketches, I assisted the team in consolidating all of the ideas generated during the session. This involved analyzing the concepts that were presented and identifying common themes and ideas that could be combined to create a cohesive and comprehensive solution. By combining the ideas, we were able to further visualize the concept and develop a clear understanding of how it could be implemented in practice. This helped us to refine our approach and ensure that the solutions we proposed were effective and met the needs of the data scientists.
Collecting feedback on the design
Test
Designed with the modern data stack in mind, PipeRider supports a wide range of popular data sources, including Snowflake, BigQuery, Redshift, Postgres, SQLite, DuckDB, CSV, and Parquet.
Assumption workshop
Test with potential users
Define
After testing our initial concept and conducting further interviews with data scientists and machine learning practitioners, we realized there was a deeper pain point to address. Instead of offering yet another general model-monitoring platform, we needed to solve a specific problem they were facing. We took an in-depth look at their challenges, identified the areas where they needed the most support, and tailored our solution accordingly. This ultimately led to a more effective solution that addressed their specific pain points and provided greater value to data scientists and machine learning practitioners.
Challenge #1
The non-linear nature of data versioning makes it challenging to effectively manage and track dynamic, massive datasets using traditional version control tools. Instead of focusing on the actual content of the dataset, data scientists often rely on tracking changes to the dataset's metadata, such as training-serving skew, feature definition changes, implicit data dependency, and bias. However, this approach does not provide a comprehensive view of the entire ML pipeline and makes it difficult to reason about the correlation between changes made at different stages. This lack of a tool to manage complexity in these pipelines can be a major obstacle to building robust ML systems.
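To illustrate the metadata-tracking approach described above, here is a hypothetical sketch: instead of versioning the dataset's raw content, we fingerprint lightweight metadata (schema plus summary statistics) so that changes between pipeline runs can be detected cheaply. The function and field names are illustrative, not part of any actual tool.

```python
import hashlib
import json
import statistics

def dataset_metadata(rows):
    """Summarize a tabular dataset (list of dicts) into comparable metadata."""
    columns = sorted(rows[0].keys())
    numeric_stats = {}
    for col in columns:
        values = [r[col] for r in rows]
        if all(isinstance(v, (int, float)) for v in values):
            numeric_stats[col] = {
                "mean": round(statistics.mean(values), 4),
                "min": min(values),
                "max": max(values),
            }
    return {"schema": columns, "row_count": len(rows), "stats": numeric_stats}

def fingerprint(metadata):
    """Stable hash of the metadata, used to detect drift between runs."""
    blob = json.dumps(metadata, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()[:12]

# Two versions of the same dataset; v2 has a distribution shift.
v1 = [{"age": 34, "country": "TW"}, {"age": 29, "country": "US"}]
v2 = [{"age": 34, "country": "TW"}, {"age": 61, "country": "US"}]

changed = fingerprint(dataset_metadata(v1)) != fingerprint(dataset_metadata(v2))
print(changed)  # → True: the metadata fingerprint flags the change
```

A fingerprint like this catches that *something* changed, but, as the paragraph above notes, it says nothing about *why*, or how the change correlates with other stages of the pipeline.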
Challenge #2
The complex nature of ML pipelines reduces the reproducibility and resilience of your ML system:
- A subtle change in a feature definition can have a massive impact on the resulting model quality.
- An infrastructure failure can change the distribution of the data and become a modeling problem.
- Debugging such issues requires having the whole picture, which is expensive and hard.
Challenge #3
A real-time performance metric is crucial for evaluating the quality of an ML system. These metrics (e.g. the accuracy of a recommendation system) are domain-specific, and every problem requires a different way to acquire them. The faster you can get feedback from the real world, the faster you can iterate and improve your ML project's quality.
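As a hypothetical sketch of such a domain-specific, real-time metric: for a recommendation system, "accuracy" can be approximated as the click-through rate over a sliding window of the most recent feedback events. The class and numbers below are illustrative, not tied to any specific product.

```python
from collections import deque

class SlidingCTR:
    """Click-through rate over the N most recent feedback events."""

    def __init__(self, window=4):
        self.events = deque(maxlen=window)  # 1 = clicked, 0 = ignored

    def record(self, clicked):
        self.events.append(1 if clicked else 0)

    def ctr(self):
        return sum(self.events) / len(self.events) if self.events else 0.0

metric = SlidingCTR(window=4)
for clicked in [True, False, True, True, False]:  # live feedback stream
    metric.record(clicked)

# Only the last 4 events count: [False, True, True, False]
print(metric.ctr())  # → 0.5
```

The sliding window is what makes the metric "real-time": the score reflects only recent behavior, so quality regressions surface quickly instead of being averaged away.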
Design
PipeRider, a pipeline-wide change management tool, aims to solve this hard problem by building on the shoulders of these players and providing a better feedback loop for metadata changes.
Competitor screenshots and collections on Miro
Fun fact #2
This is another story, so I decided to skip this part... it took a full month of brainstorming sessions to come up with the brand, the idea, the product, and the design. In short, we hosted a branding workshop to decide on the brand, a couple of other workshops to discuss the problem, and a hackathon to come up with the solution.
Design
After successfully implementing the initial solution, we gained confidence in the idea and moved forward to develop the critical user journeys that were necessary to fully realize the potential of our solution. This involved identifying the key steps and actions that users would need to take in order to effectively utilize the system, and designing the user interface and user experience to support these journeys. By focusing on the critical user journeys, we were able to create a solution that was intuitive and easy to use, and that effectively addressed the needs of our target users.
The user goal for this step is to successfully create an account, understand the basic features of the platform and be able to navigate the interface.
The user goal for this step is to be able to create a new project, set the project's properties, and organize the data and models within the project.
The user goal for this step is to be able to customize and manage the settings of the projects, such as adding collaborators, setting permissions, and configuring integrations.
The user goal for this step is to be able to compare the performance of different models and experiments, and to identify the best-performing models.
The user goal for this step is to be able to view the history of changes made to the project, including the performance of different models over time, and to understand the impact of different changes on the project's performance.
The user goal for this step is to be able to manage the user's account settings, such as changing the password, updating the profile, and managing the email notifications.
Design
Before beginning the design phase, we also defined the information architecture (IA) to ensure that the design would cover everything within the critical user journey. This process involved organizing the content and functionality of the platform in a way that made it easy for users to find what they were looking for and complete the key actions they needed to take. This step was crucial in ensuring that the design would be intuitive and user-friendly, and that it would effectively support the critical user journeys identified earlier. By defining the IA together as a team, we were able to ensure that the design was comprehensive and met the needs of our target users.
Information architecture we made on Figjam
Test
During the prototype phase, we received feedback from customers on various aspects of the design, including the need for filters and the ability to connect multiple datasets on S3.
Challenge #1
By providing filters, data scientists can quickly narrow down the data and focus on what matters most for their analysis.
Challenge #2
By connecting multiple datasets, data scientists can easily combine data from different sources and gain a holistic view of the pipeline. This allows them to manage and track dynamic, massive datasets more effectively and make better use of the data to improve the performance of their models.
Design
User onboarding (Simplified)
UJ 2 - Create New Project
Timeline
Tracking and comparing experiments
Managing datasets
FYI
Here’s a Loom video describing the final design and how the final implementation looks (in progress).
Initial design of the content management page
The new design has helped us reach a higher CSAT score because customers can easily find the information they need and navigate the app with ease. The nav bar has been particularly well received, as it lets users track translation status at a glance. This has led to an increase in user engagement and satisfaction.
The new design also makes it easier for us to integrate more apps thanks to its scalable structure. As of 2023, we have released 7 more apps, including Typeform, Google Play, YouTube, and Marketo.
As the services that VdoTok provides involve complex connection data, presenting 50+ metrics on a web page was initially a big headache. To address this, I first categorized the data into different levels and categories, then came up with ideas to visualize the data without taking up too much space.
As the end users of this product are developers, I worked closely with the developers in our own company during the design process, which ensured regular feedback to improve the design.