Vera — Work

Redesigning the backend tagging tool for an AI-powered retail platform — a study in why information architecture is the foundation of any AI solution.

Summary

Vera is a retail platform that helps shoppers discover new fashion trends and inspiration from their preferred retail store. It uses Artificial Intelligence (AI) to provide style recommendations. The platform consists of two main applications: an in-store kiosk and a tagging tool.

The tagging tool is a crucial component used to reconcile fashion trends with store inventory. To do this, the company developed an in-house AI solution.

But as in many projects, problems had to be identified, solutions studied, and products designed, developed, tested, and measured, with the process repeated until the result was satisfactory. Was this AI solution bulletproof? If not, what was the missing key?

I was the product designer and researcher for the tagging tool, which was designed for backend support. This was a six-month project, and I owned the redesign with a heavy emphasis on research methods.

Delimitation of this case study

Since the platform is not yet widely available, this case study only focuses on the research conducted for the tagging tool and the exploratory solutions. I have also added fictional content to maintain confidentiality.

The purpose of this case study is to walk you through the process that led me to discover the importance of information architecture in implementing an AI solution.

The original page included a short video of the redesigned tagging tool — not yet re-hosted in this archive.

Design approach

I tailored my approach to a service design method: research, then ideation, then prototyping.

Research

Preparatory research
Secondary research
Improvised performance measurement
Creating personas

Ideation

User journey map of status quo
User journey map of future state

Prototyping

Rapid prototyping

#1 Preparatory research

According to This Is Service Design Doing, preparatory research is your personal preparation before you start your actual research.

I started with informal, quick co-creative sessions with team members, colleagues, and some stakeholders. The purpose: to learn about the platform and its features. What was the inspiration? Who was it created for?

Broader topics included:

What does shopping feel like today?
How do consumers use social media?
What technology is used in the market?
Who are the competitors?

I learned the rationale of the retail stores that expressed interest in investing in this technology. As suspected, one of the driving forces was to increase sales by improving the customer’s in-store experience. The secondary objective was to use social media to support that effort.

I also learned about the industry, dynamics, key players, and interactions — all of which I carried into the next steps.

I closed preparatory research with an informal discussion and presentation to the team.

During this activity, I had the opportunity to see the existing tagging tool and to interview the developers who conceptualized it. One of the main concerns I noticed right away was the unstructured information architecture.

#2 Secondary research

To kick off the secondary research, I considered the findings from preparatory research — the apparel market (industry), shoppers (key players), social media (interactions/dynamics), and in-store experience (interactions/dynamics) — and created an outline of questions:

Apparel market: What is the global apparel market size? What are the projections for the US market? Which apparel is in demand?
Consumer behavior: What are their behaviors pre-purchase? What are their behaviors post-purchase?
Social media: How does it influence consumer behavior? What age demographic uses social media in purchasing decisions? What platform is most popular for clothing choices?
In-store experience: With the rise of mobile, does enhancing the in-store experience offer a good ROI? If stores are not going away completely, what improves store resilience? How do we make in-store shopping better?

I identified the sources and evaluated whether they were reliable. To close out the activity, I created a summary with a visual presentation. (I also researched the available technology, including AI, computer vision, and how machine learning models are trained.)

The results from preparatory and secondary research became reference points for creating a workflow and interactions that could make or break the UI. In both activities, I hypothesized that the lack of structure in the information architecture (IA) contributed significantly to inaccuracies in the machine learning training data. But this claim could not be substantiated without a quantitative study.

How do I measure the effectiveness of IA? I asked the following questions as a guide:

Is it clear?
Is it informative?
Is it usable?
Is it credible?

#3 Improvised performance measurement

Ideally, I wanted to conduct baseline testing to evaluate how users navigate the tagging tool. Unfortunately, there were constraints, chiefly the research budget and the geographic location of the people responsible for validating the images (the “taggers”).

So I created an improvised performance measurement. This is not exactly a standard user research method. Rather, it was a mathematical approach to evaluate the accuracy of the data set.

Steps taken:

I requested a complete data set of validated images from the developers. The tagging tool captured its data in JSON; I converted it to CSV and assessed accuracy by comparing the labels assigned by the computer vision algorithm with the labels entered manually by taggers, using spreadsheet formulas and summary statistics (mean, median, and mode).
I examined the web application — specifically the steps for validating and reporting images. I created tasks and scenarios to test the workflow, and I screen-recorded everything.
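
The first step above can be sketched in Python rather than a spreadsheet. This is only an illustration of the idea, not the analysis actually performed: the field names (image_id, category, cv_label, tagger_label) and the sample records are hypothetical, since the real export format is not shown here. It flattens JSON records to CSV, computes how often the computer vision label agrees with the tagger label per category, and summarizes those rates with mean, median, and mode.

```python
import csv
import io
import json
from statistics import mean, median, mode

# Hypothetical field names; the real tagging tool export may differ.
FIELDS = ["image_id", "category", "cv_label", "tagger_label"]

# Fictional sample records, for illustration only.
SAMPLE_JSON = """
[
  {"image_id": "img-001", "category": "tops", "cv_label": "blouse", "tagger_label": "blouse"},
  {"image_id": "img-002", "category": "tops", "cv_label": "tee", "tagger_label": "t-shirt"},
  {"image_id": "img-003", "category": "bottoms", "cv_label": "skirt", "tagger_label": "shorts"},
  {"image_id": "img-004", "category": "bottoms", "cv_label": "jeans", "tagger_label": "Jeans"},
  {"image_id": "img-005", "category": "dresses", "cv_label": "dress", "tagger_label": "dress"},
  {"image_id": "img-006", "category": "dresses", "cv_label": "dress", "tagger_label": "dress"}
]
"""


def json_to_csv(records, fieldnames):
    """Flatten a list of JSON objects into CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def agreement_by_category(rows):
    """Return the share of images per category where both labels agree."""
    totals = {}
    for row in rows:
        hits, count = totals.get(row["category"], (0, 0))
        # Compare case-insensitively so "Jeans" and "jeans" count as a match.
        match = row["cv_label"].strip().lower() == row["tagger_label"].strip().lower()
        totals[row["category"]] = (hits + int(match), count + 1)
    return {category: hits / count for category, (hits, count) in totals.items()}


def summarize(rates):
    """Summarize per-category agreement rates with mean, median, and mode."""
    values = list(rates.values())
    return {"mean": mean(values), "median": median(values), "mode": mode(values)}


records = json.loads(SAMPLE_JSON)
csv_text = json_to_csv(records, FIELDS)
rows = list(csv.DictReader(io.StringIO(csv_text)))
rates = agreement_by_category(rows)
summary = summarize(rates)
print(rates)
print(summary)
```

Exact string comparison is deliberately strict: "tee" versus "t-shirt" counts as a mismatch, which is the kind of non-standard labeling this measurement is meant to expose.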

High-level findings

The first activity surfaced mismatches caused by non-standard labels, missing categories, and most importantly, the lack of structure in the information architecture.

In the second activity, I observed:

Non-standard or non-conventional taxonomy and labels.
Multi-step, multi-page flows.
Unpredictable toggle behavior in Training Module and Social Task features.
Limited color selection.
Inability to identify other types of apparel.
Inability to detect layers of clothes.
Session time-out at 1 hour.

Recommendations

Design a better information architecture; consider using the existing Google taxonomy.
Use fashion industry-standard terminology.
Scraped images should be at least 640 × 480 pixels.
Initially, scrape images from the web.
Add breadcrumbs, progress indicators, signposts.
Add a zooming interface.
Add a fashion guide / How-To / Help page.
Auto-save sessions.
Allow milestone submissions.
Add visual cues and icons.

Reflections

IA is important in creating an outstanding user experience. In AI, it is crucial to build a solid IA structure with clear labels and a strategic use of categories.

One of the most difficult challenges I observed was that neither the taggers nor the developers were fashion-forward. As a result, the taggers would guess what the images showed.

On one occasion, a tagger classified a skort as shorts. Unfortunately, given the limited capabilities of the computer vision algorithm, it was auto-tagged as a skirt. But which is correct? How do you classify an object that could be either?

What about terminology for lengths or types of neckline? Would it be a good approach to rely on terminology alone? Even if the taggers turned to external resources such as Google Search, the terms they found would not match the tagging tool, because its terminology was not standard.

Fortunately, Google’s taxonomy is public. It would be useful to benchmark against it while creating specific user stories for outliers.
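
As a rough sketch of what benchmarking labels against a reference taxonomy could look like, the Python below maps tool labels to reference terms and sets aside the ones that do not fit, so they can become user stories. Everything here is an assumption for illustration: the reference terms and synonym map are invented placeholders, not entries from Google’s taxonomy, and the function names are hypothetical.

```python
# Placeholder reference terms; NOT copied from any published taxonomy.
REFERENCE_TERMS = {"shorts", "skirts", "skorts", "dresses"}

# Hypothetical map from non-standard tool labels to reference terms.
SYNONYMS = {
    "short": "shorts",
    "skirt": "skirts",
    "skort": "skorts",
    "frock": "dresses",
}


def normalize(label):
    """Return the reference term for a label, or None if it is an outlier."""
    key = label.strip().lower()
    if key in REFERENCE_TERMS:
        return key
    return SYNONYMS.get(key)


def audit_labels(labels):
    """Split labels into those that map to a reference term and outliers."""
    mapped = {}
    outliers = []
    for label in labels:
        term = normalize(label)
        if term is None:
            outliers.append(label)  # candidate for a dedicated user story
        else:
            mapped[label] = term
    return mapped, outliers


mapped, outliers = audit_labels(["Short", "skort", "Dresses", "mystery layer"])
print(mapped)
print(outliers)
```

Giving an ambiguous garment such as the skort its own reference term, instead of forcing it into shorts or skirts, is one possible way to handle items that could be either; whether that suits the real taxonomy is a design decision this sketch does not settle.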

Having said all that, I realized how critical IA is in AI. Every effort should go into improving it; otherwise, a weak IA undermines the overall experience.