A brief walkthrough of a Nike shoe products dashboard I built in Tableau.
The dataset I used is from Kaggle and focuses on Nike sneakers, with information on their reviews, ratings, original price, and sale price. It has the following columns (non-working and redundant columns omitted):
Product Name - Product listing title (ex: Nike Air Force 1)
Product ID - The SKU of the product (products can share the same product name but have different SKUs, for example due to colorway)
Listing Price - The original price of an item
Sale Price - The final sold price of an item
Description - The product description of an item
Rating - Rating score of a product ranging from 1-5 (0 if no review given)
Reviews - The total number of reviews a product has received
Step 1: Propose the Business Question
Starting with a business question ensures we manipulate the data for the right reasons, and not just because we can. In this dashboard, I wanted to focus on which shoe categories have the most reviews, the highest ratings, the most discounts, and so on.
What shoe categories have the most reviews / products? Which shoe category has the highest rating?
Which shoe categories typically go on discount the most?
Are customers generally satisfied with the shoes offered? Which categories could see some improvement?
Do product listing/description word count/character length impact reviews or ratings?
Step 2: Prepare & Clean the Data
The dataset has a few problems that need transforming and cleaning. For example, the listing and sale prices are inflated.
As you can see in the image above, the prices are multiplied by 100 (first listing price $16995 should actually be $169.95). This was an easy fix through a calculated field of [Listing Price]/100.
However, another issue was that when an item did not go on discount, the listing price would be $0 and the item would have its original listing price as the sale price.
I revised the calculated field so that when the listing price is $0 it falls back to the corrected sale price (ex: $74.95 instead of $7495 in the first sale price row), and otherwise uses the corrected listing price.
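The fallback described above can be sketched outside Tableau. This is a minimal Python sketch of the logic, not the actual Tableau calculated field; the function name `corrected_listing_price` is hypothetical, and the sale price in the first call is a made-up value for illustration.

```python
def corrected_listing_price(listing_price: int, sale_price: int) -> float:
    """Return the listing price in dollars.

    Raw prices in the dataset are stored multiplied by 100. When the raw
    listing price is 0 (the item was not discounted), the sale price column
    holds the original price, so we fall back to it.
    """
    raw = sale_price if listing_price == 0 else listing_price
    return raw / 100


# Discounted item: the listing price is present, so it is used directly.
# (The sale price 12000 here is an illustrative value, not from the dataset.)
print(corrected_listing_price(16995, 12000))

# Non-discounted item: listing price is 0, so the sale price is used instead.
print(corrected_listing_price(0, 7495))
```

The same division by 100 would apply to the sale price column on its own.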
The discount column included in the dataset was also all 0s, and did not accurately display the discount percentage. This was fixed with another calculated field.
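For readers who want to see the arithmetic, here is a rough Python sketch of a discount percentage. It assumes the usual definition, (listing price - sale price) / listing price, applied to the corrected dollar prices; the exact formula used in the Tableau field is not shown in this post, and the name `discount_pct` is hypothetical.

```python
def discount_pct(listing_price: float, sale_price: float) -> float:
    """Return the discount as a fraction of the listing price (0.25 = 25%).

    Assumes both prices are already corrected to dollars and that the
    listing price is the original, pre-discount price.
    """
    if listing_price <= 0:
        # No usable listing price, so treat the item as not discounted.
        return 0.0
    return (listing_price - sale_price) / listing_price


print(discount_pct(100.0, 75.0))   # a $100 shoe sold at $75
print(discount_pct(74.95, 74.95))  # sold at full price
```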
I wanted to also categorize each sneaker into a shoe type such as Basketball, Running, Casual, and more. I used ChatGPT to categorize the shoe product names and plugged it into Tableau as a calculated field.
This will allow us to have a more holistic view of each shoe category, answering the business questions listed above.
Another flag I wanted to include was whether an item had a discount or not, determined by whether the discount percentage field created earlier is greater than 0.
I also noticed that rating values from the dataset ranged from 0-5, with 0 representing an item not having any reviews. I wanted to bucket the ratings into customer satisfaction levels, which I did with another calculated field.
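As an illustration of this kind of bucketing, here is a small Python sketch. The labels follow the 1-5 scale from very dissatisfied to very satisfied, with 0 kept separate as "No Reviews". The cut-off points (nearest whole rating) and the function name `satisfaction_bucket` are my assumptions for the example and may differ from the Tableau field.

```python
def satisfaction_bucket(rating: float) -> str:
    """Map a 0-5 rating to a satisfaction label.

    A rating of 0 means the product has no reviews. The thresholds below
    assign each rating to its nearest whole number; they are assumed
    cut-offs, not taken from the dataset or the dashboard.
    """
    if rating == 0:
        return "No Reviews"
    if rating < 1.5:
        return "Very Dissatisfied"
    if rating < 2.5:
        return "Dissatisfied"
    if rating < 3.5:
        return "Neutral"
    if rating < 4.5:
        return "Satisfied"
    return "Very Satisfied"


for r in (0, 2.0, 3.3, 4.8):
    print(r, satisfaction_bucket(r))
```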
Lastly, I wanted to see if the length of a product listing or description would impact ratings or reviews, or vary across shoe types. This was done using the LEN function in Tableau, which returns the number of characters in a string.
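Character length and word count are related but not the same measure. The short Python sketch below shows the difference on a product name; the helper names are hypothetical, and splitting on whitespace is just one simple way to count words.

```python
def char_length(text: str) -> int:
    """Number of characters, including spaces."""
    return len(text)


def word_count(text: str) -> int:
    """Number of whitespace-separated words."""
    return len(text.split())


name = "Nike Air Force 1"
print(char_length(name))  # characters, spaces included
print(word_count(name))   # words
```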
The dataset should now be clean and categorized enough to build insights for the business questions mentioned above.
Step 3: Creating Worksheets & the Dashboard
This step is about creating clear visualizations that tell a story using the dataset to inform customers and stakeholders.
For the first question, "What shoe categories have the most reviews / products? Which shoe category has the highest rating?" I created a few bar charts that show reviews, average rating, and product count by shoe category.
As we can see above, casual sneakers have both the most products available and the most reviews. However, basketball shoes have the highest rating/product satisfaction on average.
For the next business question, "Which shoe categories typically go on discount the most?," I created a chart displaying the count of products either on discount or marked at full price by category, along with showing the % mix.
Running shoes have the highest % mix of items on discount, whereas skateboarding shoes have the highest % mix of items listed at original/full price.
The third business question, "Are customers generally satisfied with the shoes offered? Which categories could see some improvement?" can be answered with the Average Ratings by Shoe Category bar chart. With the rating logic of 1-5, from very dissatisfied to very satisfied, customers are generally neutral/satisfied with the shoes offered. The shoe category with the lowest average rating, however, is casual sneakers.
As a fun question for myself, I wanted to see if the length of a product's listing name or description could be tied to a more positive or negative rating. This last question, "Do product listing/description word count length impact reviews or ratings?" was explored through a bar graph showing the word count and its average rating.
Looking at the bar graph above, we can see that as word count increases or decreases, there is no significant increase or decrease in the average rating. This suggests there is no meaningful correlation between the two variables, so the chart was omitted from the final dashboard.
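One way to put a number on this kind of visual check is the Pearson correlation coefficient, where values near 0 suggest little linear relationship. The Python sketch below is not part of the Tableau workflow, and the inputs in the usage lines are toy numbers, not values from the dataset.

```python
from math import sqrt


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Toy data only: a perfectly increasing and a perfectly decreasing pair.
print(pearson_r([1, 2, 3], [2, 4, 6]))
print(pearson_r([1, 2, 3], [3, 2, 1]))
```

With the real data, `xs` would be the word counts and `ys` the ratings for products that have at least one review.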
Hope you enjoyed this analysis on Nike shoes! Leave a comment if you have any suggestions for improvement or other questions I could try to answer with this dataset.
Thanks for reading!
- Victor