The brief.
Annotation, management and more.
Black Sesame Technologies is an artificial intelligence company focused on image processing and visual perception algorithms. Their engineers need hundreds of thousands of images that are annotated with detailed information to train their models.
The task forces responsible for annotating images, both internal labeling teams and outsourced labelers, have been asking for a platform that encompasses the entire annotation process. At the same time, so that management can track the cost efficiency of each contracted company, workers must record their daily work time.
​
The labeling task force, which includes users with drastically different responsibilities, needs to maintain a seamless and productive workflow, all while accurately recording their work time.
Challenges and constraints
The biggest challenge I faced in this project was the language barrier. Both the team working on this platform and the users of the tool were located in China, and did not speak English.
​
Between the asynchronous communication caused by the time difference and the difficulty of communicating with the China-based team, gathering information and receiving feedback felt like an insurmountable wall at the start of the project.
​
With the help of a US-based teammate who acted as a translator, I persevered, learned to read a few Chinese words, and successfully drove the platform's design in a better direction.
The story so far...
Although the project had a bumpy beginning, I redesigned the workflow of the primary persona, the labeling worker, by balancing business requirements and user needs.
​
After building, revising, and translating a greyscale prototype, I ran usability tests with several labeling workers and iterated on the design based on their feedback.
​
Currently, I have prepared a high-fidelity prototype for usability testing, as well as additional research for the platform's secondary personas, the managers and reviewers. However, due to urgent project deadlines, participants are not available.
​
This case study will be updated when the project proceeds.
Confidentiality note
In order to protect the confidentiality of Black Sesame Technologies, dummy data has been used in this case study.
Demonstration of the user flow of one of ModelHug's tools, RegrPerf, and flexible graph views

01. DESIGN?

Joining Midway
When I first joined the team, which consisted of the supervisor, a front-end developer, a back-end developer, a data science intern, and several engineers, they were already partway through developing the features they had planned based on feedback from the engineers. Their general plan was to build the skeletal site first and do a UX/UI pass afterward.
​
Based on my early discussions with the team, my initial understanding was that ModelHug was a tool to quickly retrieve and display existing information for AI models the user either selected or uploaded.
To streamline the design and construction of new pages, I decided to prioritize establishing the website's visual language.
As I moved into prototyping, I was unable to secure participants for usability testing, so I continued to iterate on the page designs based on team and stakeholder feedback, while expanding the style guide to accommodate new components.
02. IDEATE


At the same time, I sketched an early layout redesign of several tools' pages - DAGView, RegrPerf, PerfEval, and YmlGen - based on an examination of the current site and my conversations with the engineers on the team. In this case study, I will be focusing on the two major tools, DAGView and RegrPerf.

Change of Plans
After a few pages had been designed and implementation had begun, the scope changed dramatically: Engineers wanted quick numbers and versatile visuals to analyze the incredible amount of data from their AI models, starting from the DAGView page.
​
At the time, ModelHug displayed results in a data table, but why would engineers use the smaller website table over Excel? As I worked to incorporate visualization into the website, I realized that simple displays just weren't going to cut it - so I suggested a dashboard.

03. RESEARCH AND (RE)DESIGN: DAGVIEW

Asking Better Questions
When I finally pitched the dashboard idea, I realized I had been asking the wrong questions and looking in the wrong direction. Up until that point, I had simply been reskinning the features they had already made, but I should have been looking deeper - looking at the reasons why they were building those features.
​
In order to dig deeper into the problem and see past the surface, I started asking what data they needed, why they needed it, and what decisions that data would inform.
Two Users, One Stop
Based on my discussions with the team, there were two primary users of the tool: the engineers, who needed to closely examine the nodes that made up a model, and the managers, who needed a bird's-eye view of the model's numbers and composition.
​
To accommodate both users in DAGView, I designed the model quick-look Overview tab, as well as the Weight Analysis tab, where the user could navigate and examine statistical data from each node within the model.
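ModelHug's internal data model isn't shown in this case study; purely to illustrate the two audiences, the information behind the tabs might be shaped something like this (all names are hypothetical):

```typescript
// Hypothetical shapes only, to illustrate the two audiences' needs.
interface NodeStats {
  name: string;        // node names in real models can be extremely long
  opType: string;      // e.g. "Conv2D", "BatchNorm"
  weightMean: number;  // statistics engineers inspect per node
  weightStdDev: number;
  weightMin: number;
  weightMax: number;
}

interface ModelOverview {
  modelName: string;
  version: string;
  nodeCount: number;   // the quick-look numbers managers scan
  nodes: NodeStats[];  // the detail engineers drill into
}
```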

"How many significant digits? As many as possible."
One of the biggest challenges in determining the card layout was that many of the numbers and names were incredibly long. To balance visibility with convenience, I designed a collapsing node list, so that users could browse nodes while viewing statistics, or expand the list to see the full node names.
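As a rough sketch of that collapsing list, assuming a React front end (the component and prop names are mine, not the team's):

```tsx
import { useState } from "react";

// Collapsed: a narrow list with ellipsized names, so statistics stay visible
// beside it. Expanded: a wider list that shows full node names.
function NodeList({ nodes, onSelect }: {
  nodes: { name: string }[];
  onSelect: (name: string) => void;
}) {
  const [expanded, setExpanded] = useState(false);

  return (
    <aside style={{ width: expanded ? 480 : 200 }}>
      <button onClick={() => setExpanded(!expanded)}>
        {expanded ? "Collapse" : "Expand"} node names
      </button>
      <ul>
        {nodes.map((node) => (
          // title shows the full name on hover in either state
          <li
            key={node.name}
            title={node.name}
            style={{
              whiteSpace: "nowrap",
              overflow: "hidden",
              textOverflow: expanded ? "clip" : "ellipsis",
              cursor: "pointer",
            }}
            onClick={() => onSelect(node.name)}
          >
            {node.name}
          </li>
        ))}
      </ul>
    </aside>
  );
}
```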
Demonstration of the DAGView dashboard's Overview and Weight Analysis tabs
04. RESEARCH AND (RE)DESIGN: REGRPERF

Scrapping and Restarting
Around the time I finished designing the new and improved DAGView, the team decided on a complete rework of the RegrPerf page; like DAGView, the original page had simply been outputting a data table that was less useful than Excel. However, unlike DAGView, RegrPerf needed to display information about anywhere from one to hundreds of models, all in one place.
​
Not only that, but depending on whether the user selected just one or multiple versions, the data displayed was completely different.
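One way to model that split, as a sketch rather than the team's actual implementation, is a discriminated union keyed on how many versions are selected:

```typescript
// Sketch only: ModelHug's real types aren't public. This models how the page
// shows completely different data for one selected version vs. several.
type RegrPerfView =
  | {
      kind: "single-version";
      version: string;
      // per-model performance over time within the one version
      metricsOverTime: { timestamp: string; value: number }[];
    }
  | {
      kind: "multi-version";
      versions: string[];
      // one summary value per model per version, for side-by-side comparison
      comparison: { model: string; valuesByVersion: Record<string, number> }[];
    };

// The discriminant makes the split explicit at render time.
function describe(view: RegrPerfView): string {
  return view.kind === "single-version"
    ? `Tracking ${view.version} over time`
    : `Comparing ${view.versions.length} versions`;
}
```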
Sometimes One, Sometimes Many
In order to better understand the role RegrPerf needed to play, I asked users how many models they were usually working on, what data they needed to compare and why, and what decisions came out of that data.
The engineers, who were usually responsible for one or several models, needed to track a model's performance across time as well as across development versions. On the other hand, the managers needed to be able to track performance and quickly spot potential issues for an entire version, which could easily include hundreds of models.
​
One user needed to view just a few models in detail, while the other needed to view many as quickly and effectively as possible. To accommodate both, I added a toggle that allowed the user to choose how many graphs displayed at once.
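A minimal sketch of that toggle, again assuming a React front end with hypothetical names:

```tsx
import React, { useState } from "react";

// Engineers pick fewer, larger graphs to inspect in detail; managers pick
// many small ones to scan an entire version at a glance.
const GRAPHS_PER_ROW = [1, 2, 4] as const;

function GraphGrid({ charts }: { charts: React.ReactNode[] }) {
  const [perRow, setPerRow] = useState<number>(2);

  return (
    <div>
      {GRAPHS_PER_ROW.map((n) => (
        <button key={n} onClick={() => setPerRow(n)}>
          {n} per row
        </button>
      ))}
      <div
        style={{
          display: "grid",
          gridTemplateColumns: `repeat(${perRow}, 1fr)`,
          gap: 16,
        }}
      >
        {charts}
      </div>
    </div>
  );
}
```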

Balancing Legibility with Size
One of the challenges I faced when designing for one of the datasets, Runtime, was that users needed to see up to twelve metrics at once for a single model, on one graph. While this was easy to accomplish on a large graph, it was challenging to balance information with legibility at smaller sizes, especially with single version line charts.
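One tactic that helps in situations like this, sketched here with the browser's standard Intl API, is compacting the numbers themselves so that labels stay short at small sizes:

```typescript
// Compact long metric values so labels stay legible on small charts.
// Intl.NumberFormat's "compact" notation is built into modern browsers.
const compact = new Intl.NumberFormat("en", {
  notation: "compact",
  maximumSignificantDigits: 3,
});

console.log(compact.format(1_284_901)); // "1.28M"
console.log(compact.format(0.004719)); // "0.00472"
```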

05. HANDOFF

Once three, now four of us
While I was working on the dashboard, I had also prepared the first set of page designs for handoff to the solo front-end developer and the intern, broken down into a roadmap with specific changes and references, per the developer's request.
​
When it came time to prepare the dashboard design for handoff, one more front end developer from the Wuhan branch had agreed to assist with the implementation.
​
However, the implementation of the website was noticeably different from my design. I asked myself, how did this happen? What was I missing?
"What pages?"
A few days later, it hit me. While discussing visual fixes with a teammate, they asked me what I was referring to when I said "style guide." I mentioned that it was a page within the same file as the prototype, and they said they couldn't find it.
​
Figma initially displays the page list as collapsed, so someone unfamiliar with the software may not notice it unless pointed out to them.
​
Due to the time difference, I wasn't able to directly hand off the design files to the developer - they weren't familiar with Figma and most likely never saw the style guide. Additionally, annotations on the prototype were sparse at best - it was only after seeing the implementation that I realized just how important it was to tell as well as show.
"Everyone is saying it's easy to use."
- Black Sesame data engineer
At the end of eight weeks, the first version of ModelHug was live with four tools: two minor and two major.
​
The two major tools, DAGView and RegrPerf, provide both big picture and granular data visualizations of the user's selected AI models.
​
Despite the bumpy start, ModelHug launched successfully, with overwhelmingly positive feedback on both the usability and the visual design.
LEARNINGS & NEXT STEPS
Takeaways
If there is no space for UX, pitch a tent and invite your teammates in.
- After being surrounded by design peers for so long, it was a huge shock to join a team with very little history with, or understanding of, UX.
- At first, I didn't know how to proceed, and ended up following along without pausing to dig deeper. I let the lack of usability testing slide, only to realize later that, because I hadn't explained what I meant by usability testing, my teammates thought I meant testing the live implementation of the design.
- However, once I started trusting myself more and speaking more to the rationale and purpose of my process, I was able to bring my team on board with a larger part of the design thinking process.

Tell and show.
- Since this was my first time working with developers, I wasn't sure what to expect or what was expected of me, and furiously searching for advice online could only help so much.
- It was only after seeing the first implementation of my designs that I really understood what I should have prepared alongside the prototype and style guide.
- Notes. I should have written so many more notes.

Think in responsive.
- Although ModelHug is a desktop-only web tool, the resolutions at which the site is viewed vary drastically depending on whether the user is on a smaller laptop or a larger monitor. When I first started my designs, I wasn't considering how the site would stretch and squish, especially on smaller laptops.
- As I learned more about using constraints in Figma, I was able to start incorporating responsive considerations as I designed the screens, but I wasn't always able to get to all of it - in those cases, I should have simply written out my intentions for the developers next to the mockup (see the sketch after this list).
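As a concrete sketch of that last point, one way to let a dashboard stretch and squish gracefully, assuming a CSS grid layout (the style object here is illustrative, not ModelHug's actual code):

```tsx
import React from "react";

// "Thinking in responsive" for a desktop-only tool: let chart cards reflow
// to the viewport instead of assuming one monitor width.
const dashboardGrid: React.CSSProperties = {
  display: "grid",
  // As many 320px-minimum columns as fit the window: one or two on a small
  // laptop, four or more on a large monitor, with no breakpoints to maintain.
  gridTemplateColumns: "repeat(auto-fit, minmax(320px, 1fr))",
  gap: 16,
};
```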
Next Steps
The first version of ModelHug is launched, but the work is never done:
​
Iterating based on user feedback
- Since I was unable to usability test the prototype, I will be collecting user feedback on the live tool to determine which features need improvement.

Designing new tools and expanding current ones
- In the first version, four tools are live, but more will be added as the need arises.
- In the next round of updates, I will be expanding one of the two minor tools to accommodate more user customization.

Refining the style guide
- As I progressed through the project and built more and more components, I learned a lot, and my design decisions became more consistent.
- Given the time, I would like to go back to the sections of the style guide I built early on and bring them in line with the designs I created later.