We had to build a historical version of the original ad data analyzer, allowing customers to create the same custom pivot-table reports using historical data.
Quickly processing that amount of data was the biggest challenge. With the ad data analyzer MVP, between one and two million rows of data were being analyzed. With the historical ad data analyzer, that number jumped to 34 million: essentially an archive of Catalist's data going back more than a decade.
Given the development challenges of working with data at that scale, I needed to design a product that would ease the data burden while remaining usable for clients.
User research was vital to understanding client needs. I conducted user interviews and user testing with actual clients to isolate the areas we needed to change or improve from the ad data analyzer for the historical ad data analyzer.
I heard that clients might occasionally look at one year's data in isolation, but they often needed to compare data between two election cycles. I also learned that users wanted one filter beyond the four that were available, and that they preferred to apply it as a single extra step at the end of the process.
To help solve the big data challenge, I worked with our developers to design a few solutions that would not only improve the user experience but also cut down on the amount of data that needed to be processed. That improved load speed, and when a load might still be long, we warned users so that the experience met their expectations.
When I dug deeper into how clients were comparing historical data, I learned that they needed to compare full election cycles rather than specific dates. That allowed us to aggregate data for an entire year instead of keeping individual records for each day, greatly limiting the amount of data that needed to be processed.
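The aggregation idea above can be sketched in a few lines. This is a minimal illustration, not Catalist's actual pipeline: the column names (`date`, `market`, `impressions`, `spend`) and the use of pandas are assumptions for the example, and a calendar year stands in for an election cycle.

```python
import pandas as pd

# Hypothetical day-level ad records; the schema here is illustrative only.
daily = pd.DataFrame({
    "date": pd.to_datetime(["2020-01-15", "2020-06-01", "2022-03-10", "2022-11-01"]),
    "market": ["Denver", "Denver", "Denver", "Atlanta"],
    "impressions": [1000, 2500, 1800, 900],
    "spend": [50.0, 120.0, 90.0, 40.0],
})

# Collapse daily records into one row per cycle and market, so reports scan
# far fewer rows than the raw day-level archive.
by_cycle = (
    daily.assign(cycle=daily["date"].dt.year)
         .groupby(["cycle", "market"], as_index=False)[["impressions", "spend"]]
         .sum()
)
print(by_cycle)
```

Because each cycle-market pair becomes a single row, the pre-aggregated table grows with the number of markets and cycles rather than with the number of days, which is what made the 34-million-row archive tractable for interactive reports.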