Data Science Chronicles: The Automotive Analytics Adventure.

Lado Ok
9 min read · Apr 2, 2024


Use data-driven insights to optimise the platform: improve inventory management and pricing strategies to increase market share and drive revenue growth, which in turn boosts competitiveness and user experience.

Jump to:

  1. Key takeaways
  2. Notes for Improvements (NFI)
  3. Next up

Overview/Introduction/Abstract

I am documenting my data science journey.

Sitting in various positions, circumstances and situations has allowed this outcome to simmer and settle, so that there is now certainty in our direction.

I lost count of the numerous videos, accounts and reels offering advice on steps I need to take or types of projects to embark on. All helpful, but that wealth of knowledge contributes to confusion.

Python fundamentals were the newest concepts and thus, presented challenges and consumed noteworthy time — but I’ve overcome. The feeling when I finally grasped writing functions!

Though the insights gathered turned out juicy, I’ll admit that building the code held greater significance; wrestling with rigorous outputs from GPT, debugging errors and finding variations of writing the same thing are only parts of a thoroughly enjoyable journey worth describing. So, the code will be displayed alongside a brief description of what was done.
P.S. I mostly used Jupyter Notebook and saved to my Google Drive so I could access the work on Google Colab when working on my second device.

To overcome the challenges, I had several stand-ups and consultations with my advisors. We ruled to split the project and article into two parts: Part 1 covers the cleaning, visualisation and analysis, while Part 2 documents navigating through model development and evaluation. This ruling allowed the thorough fine-tuning of my machine learning skills.

Similar style stand-up.

My background in data analysis, stemming from my previous studies in mathematics at A-level and my business studies degree, sharpened my decision-making process.

The cars theme stuck because, of course, I am passionate about them. The question: “Why don’t EVs dominate the majority of auto-retailers’ inventory?” Aviation, housing and energy are topics to follow.

Key takeaways: The project enabled a thorough review of statistical techniques, followed by a dive into Python fundamentals and Machine Learning libraries, including pandas, NumPy, Seaborn and scikit-learn. All were instrumental for the data summarisation, cleaning, transformation and visualisation that are paramount to successful EDA.
The beguiling nature of the Beautiful Soup library has also inspired me to scrape the data for the upcoming modelling project.

Data Collection

The scraped dataset was obtained from the popular site Kaggle.

Imported and read into my Integrated Development Environment (IDE)
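The import step above can be sketched roughly as follows. This is a minimal sketch, not the notebook's actual code: the column names and the two sample rows are made up for illustration, and an in-memory string stands in for the downloaded Kaggle CSV.

```python
import io

import pandas as pd

# Stand-in for the downloaded file; in a notebook this would be a file path.
# Column names and values here are hypothetical.
raw_csv = io.StringIO(
    "brand,model,price_in_euro,fuel_type,transmission_type,year,mileage_in_km,color\n"
    "volkswagen,Golf,15990,Petrol,Manual,2018,62000,black\n"
    "bmw,320,27500,Diesel,Automatic,2020,41000,white\n"
)

df = pd.read_csv(raw_csv)

# First look at the size and column types before any cleaning.
print(df.shape)
print(df.dtypes)
```

Checking `shape` and `dtypes` straight after loading makes it easy to spot columns that were read as text when they should be numeric.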

The Feature Engineering process is critical in developing a Machine Learning model. The initial 15 columns were refined to those relevant for analysis and model development:

  1. Brand / Model — Manufacturer and specific model of the car
  2. Price in Euro
  3. Fuel Type
  4. Transmission
  5. Car Age
  6. Mileage
  7. Colour

Data Cleaning

The dataset was filled with all sorts. So, the cleaning process provided an excellent opportunity for me to apply and refine useful skills. A pivotal step, as the model will only perform as well as what you feed it.

Dropping Irrelevant columns
Changed to numeric values
Checking for duplicates and dropping the lot (200k+ rows, 10 cols)
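The three steps above (dropping columns, converting to numeric, removing duplicates) might look something like this in pandas. The column names and sample rows are assumptions for illustration only; the real dataset has far more rows.

```python
import pandas as pd

# Hypothetical raw rows: numbers arrive as text, and two rows are identical
# once the free-text column is removed.
df = pd.DataFrame(
    {
        "brand": ["audi", "audi", "ford", "opel"],
        "price_in_euro": ["21000", "21000", "9500", "n/a"],
        "mileage_in_km": ["45000", "45000", "120000", "80000"],
        "offer_description": ["a", "a", "b", "c"],  # not needed for analysis
    }
)

# 1. Drop columns that are irrelevant to the analysis.
df = df.drop(columns=["offer_description"])

# 2. Convert text columns to numbers; unparseable values become NaN.
for col in ["price_in_euro", "mileage_in_km"]:
    df[col] = pd.to_numeric(df[col], errors="coerce")

# 3. Count exact duplicate rows, then drop them.
n_duplicates = int(df.duplicated().sum())
df = df.drop_duplicates().reset_index(drop=True)

print(n_duplicates, df.shape)
```

Using `errors="coerce"` turns stray text into missing values, which can then be handled together with the other nulls in the next step.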

Generally, the more data we have, the better the performance of the models.

Null Values
Changed the format to suit analysis
Used a function to turn Semi-Auto into Auto in the Transmission column
Needed to remove random values from the Fuel Type column
Arranged the columns for orderly presentation using .iloc
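A rough sketch of the cleaning steps listed above, under stated assumptions: the category labels ("Semi-automatic", the list of valid fuel types), the column names and the helper `simplify_transmission` are all hypothetical stand-ins rather than the notebook's own code.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "price_in_euro": [21000.0, None, 9500.0, 30500.0],
        "transmission_type": ["Semi-automatic", "Manual", "Automatic", "Manual"],
        "fuel_type": ["Petrol", "Diesel", "07/2014", "Electric"],
        "brand": ["audi", "ford", "opel", "tesla"],
    }
)

# Drop rows with a missing price, since price is central to the analysis.
df = df.dropna(subset=["price_in_euro"])


def simplify_transmission(value: str) -> str:
    """Fold semi-automatic gearboxes into the automatic category."""
    return "Automatic" if value == "Semi-automatic" else value


df["transmission_type"] = df["transmission_type"].apply(simplify_transmission)

# Keep only rows whose fuel type is a recognised category (assumed list).
valid_fuels = ["Petrol", "Diesel", "Electric", "Hybrid"]
df = df[df["fuel_type"].isin(valid_fuels)]

# Reorder columns by position with .iloc: brand first, then the rest.
df = df.iloc[:, [3, 0, 2, 1]].reset_index(drop=True)

print(df)
```

Filtering against an explicit list of valid categories is one simple way to remove stray values such as dates that ended up in the Fuel Type column.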

Exploratory Data Analysis

This segment splits the insights into Inventory Management and Pricing Strategy. The findings also contribute to choosing a suitable algorithm during model building.

Pricing Strategy

Transmission & Fuel Type vs Pricing

Analysing transmission types guides pricing decisions and boosts competitiveness.

Figure 1a & 1b: Code used + boxplot illustrating Median Prices.
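The median comparison behind a boxplot like this can be computed directly. A minimal sketch, assuming hypothetical column names and made-up prices:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "transmission_type": [
            "Automatic", "Automatic", "Automatic",
            "Manual", "Manual", "Manual",
        ],
        "price_in_euro": [30000, 25000, 41000, 12000, 15000, 9000],
    }
)

# Median price per transmission type; the median is less sensitive than the
# mean to a handful of very expensive cars.
median_price = (
    df.groupby("transmission_type")["price_in_euro"]
    .median()
    .sort_values(ascending=False)
)

print(median_price)
```

With Seaborn installed, the matching plot would be along the lines of `sns.boxplot(data=df, x="transmission_type", y="price_in_euro")`.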

Insight

  • Automatic cars have a higher median price than manual cars, as expected based on demand.
  • Mau, 2023 details socio-economic factors that have contributed to high costs, such as supply chain interruptions. Electric Cars Scheme breaks down how the nature of the battery contributes to the high prices, while Reid, 2023 sheds light on how the lack of historical data for insurance companies contributes to inflated prices.

Price Distribution & Analysis

Identifying the most expensive and cheapest cars will guide pricing refinements and ensure competitive pricing to attract customers seeking range.

Figure 2: Code used in generating subplots
Figure 3a: Subplots generated. Figure 3b: Code used + histogram illustrating Price Dist.

Insights
- The subplots show the most and least expensive cars we have.
- The distribution is right-skewed: a few high-priced luxury cars, with the majority mid-priced and a noticeable peak around 15k–23k. NFI
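Right skew and the price extremes can also be checked numerically rather than only read off the histogram. A small sketch with made-up prices and hypothetical column names:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "model": ["Polo", "Golf", "A4", "C-Class", "911", "Huracan"],
        "price_in_euro": [9000, 16000, 19000, 23000, 95000, 210000],
    }
)

# Positive skewness indicates a long tail of expensive cars (right skew).
skewness = df["price_in_euro"].skew()

# In a right-skewed distribution the mean typically sits above the median.
mean_price = df["price_in_euro"].mean()
median_price = df["price_in_euro"].median()

# Most and least expensive listings.
most_expensive = df.nlargest(2, "price_in_euro")
least_expensive = df.nsmallest(2, "price_in_euro")

print(round(skewness, 2), mean_price, median_price)
```

A mean far above the median, as in this toy data, is the same pattern the histogram shows: a few luxury cars pulling the average upwards.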

Inventory Management

Warehouse Dominance — Optimising inventory by understanding brands that occupy large and smaller shares. The goal is to ensure a diverse and appealing selection for customers of all spec.

Figure 4: Code used to derive insights
Figure 5a: Insights Derived. Figure 5b: Table obtained online showing Popular Brands in Germany

Insights: The top 4 in our graph (Figure 5a) closely resemble the data released by the KBA (Figure 5b), compiled by a German-based writer. Positively, 10 of the remaining 11 most popular brands are in the top 20 best sellers in Germany. Lamborghini and Aston Martin are anomalies: high value plus a niche market means decent stock is permitted. The others are simply not very popular. NFI
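Brand shares of the kind compared against the KBA table can be derived with a single `value_counts` call. A minimal sketch on made-up listings (the brand mix here is illustrative, not the dataset's):

```python
import pandas as pd

brands = pd.Series(
    ["volkswagen"] * 5 + ["mercedes-benz"] * 3 + ["audi"] + ["lamborghini"],
    name="brand",
)

# Share of listings per brand, as a percentage of the whole inventory.
brand_share = brands.value_counts(normalize=True).mul(100).round(1)

# The largest brands by share, ready to compare with an external ranking.
top_brands = brand_share.head(2)

print(brand_share)
```

Working in percentages rather than raw counts makes the comparison with published market figures more direct, since the two sources differ in size.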

Fuel Type Analysis

Understanding distribution helps tailor marketing strategies and attract customers with different preferences.

Figure 6: Code for insights
Figure 7: Until December 2023, Germany had EV incentives.

Insight: The number is arguably appropriate based on the current EV market (Buesemann, 2023). Reuters reports the abrupt termination of Germany’s electric vehicle subsidy programme as of December 2023. The Volvo CEO remains “bullish on the very strong plug-in electric hybrid line-up”, with an observation of “people entering electrification through plug-in electric hybrids” before the full migration. Furthermore, in “Germany, Europe’s biggest car market, EV sales are set to slump 14% this year” (Lindeberg, 2024).
Petrol-powered internal combustion engines rightly account for most of the vehicles in the inventory, strangely followed by diesel cars. NFI

Electric Vehicle Market & Diesel Consideration

The presence of popular EVs guides inventory composition and expansion, and reflects the market. What percentage of the inventory is populated by EVs, and which popular models do we have? The answers will also aid sustainability initiatives.

Figure 8: Code used for insights
Figure 9a: Bar chart generated. Figure 9b: Best Selling EV’s (Source: KBA)

Insight: A decent spread of our top 20 BEVs (by count) is represented in the KBA's published top 20 best-selling electric car models for full-year 2023. NFI
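Both questions (what share of the inventory is electric, and which EV models are most common) can be answered with a boolean mask and a grouped count. A sketch assuming hypothetical column names and an "Electric" label for battery-electric cars:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "brand": ["tesla", "tesla", "volkswagen", "volkswagen",
                  "bmw", "ford", "opel", "audi"],
        "model": ["Model 3", "Model 3", "ID.3", "Golf",
                  "320", "Focus", "Corsa", "A4"],
        "fuel_type": ["Electric", "Electric", "Electric", "Petrol",
                      "Diesel", "Petrol", "Petrol", "Diesel"],
    }
)

is_ev = df["fuel_type"] == "Electric"

# Percentage of the inventory that is electric (mean of a boolean mask).
ev_share_pct = round(is_ev.mean() * 100, 1)

# Most common EV models by number of listings (top 20 at most).
top_ev_models = (
    df[is_ev]
    .groupby(["brand", "model"])
    .size()
    .sort_values(ascending=False)
    .head(20)
)

print(ev_share_pct)
print(top_ev_models)
```

The resulting ranking can then be compared model by model with an external best-seller list.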

Figure 10: 35% of the inventory is diesel which is insane.

Insight: The decline in diesel vehicle market share from 17.8% to 17.1% is driven by rulings such as the one that allows cities to “ban vehicles with older and polluting diesel engines”. Sales of this type still increased in 2023, meaning people are still resorting to the option as a fallback from EVs. NFI

Transmission Distribution

Knowing the percentage distribution aids inventory management and helps optimise user experience.

Figure 11: Code used to derive Insights
Figure 12: Bar chart showing Transmission ratio on the platform

Insights: Automatic cars are now ubiquitous.
Ford reports that the share of vehicles sold “equipped with automatic transmission has more than tripled, from 10 percent of Ford Europe sales to 31%” as of Jan 2020. Additionally, between 2014 and 2019, a 19% increase in automatic vehicles was recorded as “all of the best-selling cars were launched with an automatic transmission option.” Driving comfort and lower fuel consumption are factors credited for the preference. NFI

Car Age Analysis

Will inform inventory refreshment strategies and highlight market demand for durable models.

Figure 13: Code used + Histogram illustrating Distribution of Cars by age.

Insights: Car Age is normally distributed, heavily populated around the 1–5 year range and dipping around the 10-year mark. The market shows a preference for older cars based on confirmed durability. The Federal Motor Transport Authority (KBA) shares that the average age of vehicles on the market is 10 years (Heymann, 2023). NFI
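To see where the inventory is concentrated by age, car age can be derived from the registration year and grouped into bands. This is a sketch under assumptions: the `year` column, the reference year and the band edges are all illustrative choices, not taken from the notebook.

```python
import pandas as pd

REFERENCE_YEAR = 2023  # assumed snapshot year of the listings

df = pd.DataFrame({"year": [2022, 2021, 2019, 2018, 2013, 2008, 2000]})

# Car age in years, derived from the registration year.
df["car_age"] = REFERENCE_YEAR - df["year"]

# Bucket ages into bands to see where the stock is concentrated.
age_bands = pd.cut(
    df["car_age"],
    bins=[0, 5, 10, 15, 100],
    labels=["0-5", "6-10", "11-15", "16+"],
    include_lowest=True,
)
band_counts = age_bands.value_counts().sort_index()

print(band_counts)
```

Band counts like these give a quick numeric cross-check on what the histogram suggests about the age profile.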

Car Age vs Price Relationship

Can enhance value proposition communication to customers.

Figure 14: Code and scatterplot showing relationship between Car Age and Price.

Insights: A negative correlation suggests newer cars command higher prices. However, variability among cars of the same age suggests other factors also influence pricing.
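Both observations can be quantified: the direction of the relationship with a correlation coefficient, and the variability with the price range inside each age group. A minimal sketch on made-up data with hypothetical column names:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "car_age": [1, 1, 3, 3, 8, 8, 12, 12],
        "price_in_euro": [42000, 30000, 36000, 21000, 14000, 19000, 7000, 9000],
    }
)

# Pearson correlation: a negative value means price falls as age rises.
age_price_corr = df["car_age"].corr(df["price_in_euro"])

# Price range among cars of the same age; a wide range hints that factors
# other than age are also driving the price.
price_spread = df.groupby("car_age")["price_in_euro"].agg(
    lambda prices: prices.max() - prices.min()
)

print(round(age_price_corr, 2))
print(price_spread)
```

Note that Pearson correlation only captures the linear part of the relationship, so it is a summary of the scatterplot rather than a replacement for it.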

Mileage Distribution

Figure 15: Code used + Histogram displaying mileage distribution.

Insight: The majority fall within an irregular shape, with some outliers above 150k. These may be very old cars, possibly vintage.
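The share of listings above the 150k mark, and a compact summary of the distribution, can be computed as below. The mileage values and the column name are hypothetical; only the 150k threshold comes from the insight above.

```python
import pandas as pd

mileage = pd.Series(
    [12000, 45000, 60000, 88000, 110000, 149000, 180000, 260000],
    name="mileage",
)

THRESHOLD = 150_000  # cut-off mentioned in the insight above

# Flag high-mileage listings and measure how much of the stock they are.
high_mileage = mileage[mileage > THRESHOLD]
high_mileage_pct = round(len(high_mileage) / len(mileage) * 100, 1)

# Quartiles summarise the shape of the histogram in three numbers.
quartiles = mileage.quantile([0.25, 0.5, 0.75])

print(high_mileage_pct)
print(quartiles)
```

Cross-referencing the flagged rows with car age would show whether the high-mileage outliers really are the oldest cars.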

Notes for Improvements

Optimize Inventory:

  • Focus on optimizing the inventory to match market demand. Ensure a healthy balance by managing the stock levels of popular and less popular brands.
  • Competitive pricing for popular brands (with options for mark-up open to exploration)
  • Potential discounts and extra marketing for the less common to attract a wider audience.
  • Price cars strategically based on age to align with market demand.

Transmission Mix:

  • Capitalize on potential revenue from automatic cars, which generally command higher prices; consider promoting their sale.
  • Maintain a balance to meet the demands of customers preferring manual transmission.

Fuel Type Focus:

  • Emphasize higher priced options to boost overall revenue.
  • Introduce marketing strategies to increase the visibility of electric and hybrid vehicles and educate customers on the benefits of electromobility.
  • Pod Point gathered that the UK is seeing EV users “save around £528 through the year”, which can be promoted in sales.
Figure 13:

Electric Car Marketing:

  • Year on Year average growth of 3.33% since 2018 suggests trend towards electromobility.
  • Socio-economic tensions have had pronounced effects, causing price inflation of raw materials. Infrastructure constraints, lengthy charging times and insufficient driving range are some reasons the absolute switch is not yet feasible.
  • Encourage a long-term investment perspective, suggesting patience and strategic planning. Would you take your money out of a hot stock due to a momentary market downturn?

Diesel Car Disposal: It is not yet time to dispose, but it is important to explore ways to sell in volume, possibly by offering special promotions or collaborating with dealerships that specialize in diesel cars.

Utilize the insights from the distribution of price, mileage, and age to implement dynamic pricing strategies.
It is not yet time to overturn the inventory in favour of EVs; it is important that we work in line with market trajectories and ensure we stock what the market shows to be the dominant forces.

Next…

Contemplating the implementation of a price prediction model prompted further consideration, as I struggled to fathom its practical utility within our context. From this emerged the idea of a dynamic pricing model, coming up next in the series: Model Development and Evaluation.
