AWS for Industries

Empowering ML Teams with Amazon Q Automotive Knowledge

A common challenge in implementing industry-specific predictive use cases is the gap between data scientists’ technical expertise and their domain knowledge. Conversely, people with extensive business experience in a specific domain might lack proficiency in AI tools and technologies. This gap hinders organizations’ ability to envision and develop machine learning (ML) models that drive business value. This blog post discusses how an AI-powered assistant can bridge the knowledge gap for data scientists and serve as a resource, throughout the ML development lifecycle, for ML developers who lack a business operations perspective.

The blog Battery Digital Twin: The Future of Battery Intelligence establishes the need for domain-specific ML model development and delves into the data collection strategy for electric vehicles. Continuously monitoring and analyzing battery data makes it possible to predict potential failures before they escalate into critical issues such as thermal runaway, accelerated degradation, or lithium plating. However, running advanced analytics or building an ML model that predicts failures from the collected data requires deep domain expertise.

In this blog, we’ll use publicly available example datasets and schemas to demonstrate how Amazon Q can help identify potential ML models associated with them. We provide three distinct datasets, logged from electric vehicles or battery cyclers, as input to Amazon Q Developer. Amazon Q Developer is an AI-powered assistant for software development that goes beyond coding to tasks such as troubleshooting and creating data engineering pipelines, helping developers and data scientists with their work.

Setup

You can configure Amazon Q Developer either within JupyterLab or in your local development environment.

JupyterLab

Configure the JupyterLab notebook with Amazon Q by following the steps in Set up Amazon Q Developer for your users. Users will then have access to Amazon Q within the console, as shown below:

Figure 1 – Amazon Q in JupyterLab Environment

Local Environment

Install Amazon Q for command line. This post assumes you already have the AWS Command Line Interface (AWS CLI) set up; if not, you can also use Amazon Q to help install and configure the AWS CLI.

Figure 2 – Amazon Q for command line

Note: Because LLM output is non-deterministic, the responses and commands you receive may differ slightly from those shown in this blog post. For a fresh start that helps ensure previous input does not influence the response, clear the Amazon Q context with the clear command:

q chat

/context clear --global

Scenario 1: Understand Fleet Telemetry Data

We will use the publicly available data schema of Fleet Telemetry Data. The schema assigns each attribute to a category, so let us explore the potential predictions for the Powertrain category.

>What are two critical failure prediction models for Powertrain that can be created based on the data schema available at https://developer.tesla.com/docs/fleet-api/fleet-telemetry/available-data?

Amazon Q will request permission to execute a few shell commands. Because LLM responses are non-deterministic, you may need to run these commands manually:

curl -s https://developer.tesla.com/docs/fleet-api/fleet-telemetry/available-data

curl -s https://raw.githubusercontent.com/teslamotors/fleet-telemetry/main/protos/vehicle_data.proto
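If you run the curl commands manually, it can help to skim the signal names yourself before asking Amazon Q about them. The following is a minimal sketch, assuming the downloaded schema is a protobuf file that lists its signals as members of an enum. The enum name `Field` and the sample text are assumptions for illustration, not a copy of the real file.

```python
import re

# Hypothetical excerpt for illustration only. The real vehicle_data.proto
# is much longer and its names may differ from these placeholders.
SAMPLE_PROTO = """
enum Field {
  Unknown = 0;
  ExampleSignalA = 1;  // placeholder name
  ExampleSignalB = 2;
}

enum OtherEnum {
  OtherUnknown = 0;
}
"""


def enum_members(proto_text: str, enum_name: str) -> list[str]:
    """Return the member names of one protobuf enum, in declaration order."""
    # Find the body of `enum <name> { ... }`; enum bodies contain no nested braces.
    pattern = r"\benum\s+" + re.escape(enum_name) + r"\s*\{(.*?)\}"
    block = re.search(pattern, proto_text, re.DOTALL)
    if block is None:
        return []
    # Drop `//` line comments so they cannot be mistaken for members.
    body = re.sub(r"//[^\n]*", "", block.group(1))
    # Each member has the form `Name = <number>`.
    return re.findall(r"^\s*([A-Za-z_]\w*)\s*=\s*-?\d+", body, re.MULTILINE)


if __name__ == "__main__":
    for name in enum_members(SAMPLE_PROTO, "Field"):
        print(name)
```

To try it on the real schema, save the second curl command's output to a file and pass the file contents to `enum_members` in place of `SAMPLE_PROTO`, adjusting the enum name if it differs.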

Amazon Q interprets the data model and proposes two potential failure prediction models:

Figure 3 – Amazon Q output for Fleet Telemetry Data

We can further ask Amazon Q about a suitable algorithm or the critical data fields.

> What ML Algorithm would be suitable for 'Drive Inverter Thermal Failure Prediction Model'

> Which data fields would be critical to increase accuracy or F1 score?

> How do we validate the ground truth of 'Drive Inverter Thermal Failure Prediction Model'?

You can continue conversing with Amazon Q and gathering the necessary input to create an ML model. In this example, we began with a data model as input to determine the specific domain use case, the required data fields, suitable ML algorithms, and so on.
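Since one of the prompts above asks about accuracy versus F1 score, here is a small worked sketch of why the distinction matters for failure prediction, where failures are rare. The labels are made-up illustrative values (1 = failure, 0 = healthy), not output from any model discussed in this post.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the ground truth."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Made-up ground truth: 2 failures in 10 observations.
Y_TRUE = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
# A model that catches one failure, misses one, and raises one false alarm.
Y_MODEL = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]
# A trivial "model" that never predicts a failure.
Y_NEVER = [0] * 10

if __name__ == "__main__":
    for label, pred in (("model", Y_MODEL), ("never-fail", Y_NEVER)):
        print(label, accuracy(Y_TRUE, pred), precision_recall_f1(Y_TRUE, pred))
```

In this toy case both predictors reach the same accuracy (0.8), yet the one that never predicts a failure has an F1 score of 0, which is why F1 is often the more informative metric when failures are rare.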

Scenario 2: Are we Hallucinating?

Let’s check Amazon Q’s output against a known reference to find out whether it is hallucinating. AWS has published Guidance for Electric Vehicle (EV) Battery Health Prediction along with sample data. We will use this sample data as input.

> Explain the data schema and two crucial predictions that can be derived from the dataset. https://raw.githubusercontent.com/aws-solutions-library-samples/guidance-for-electric-vehicle-battery-health-prediction-on-aws/refs/heads/main/source/demo/raw_dataset.csv

It will request to execute a few shell commands.

Amazon Q interprets the data model in the CSV file and highlights that the data has 100,377 rows covering approximately 115 unique batteries throughout their entire lifecycle, with some batteries lasting over 2,000 cycles. Here is Amazon Q’s output:

Figure 4 – Amazon Q output of EV Battery Health Data

The CSV file header uses short names: ‘IR, QC, QD, Tavg, Tmin, cycle_no, cycle_life, battery_name, chargetime, battery_index’. Even so, Amazon Q understood the columns and described them accurately:

Figure 5 – Amazon Q output of EV Battery schema
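One way to check figures such as the row count and the number of unique batteries is to recompute them directly from the file. The sketch below uses only the Python standard library and the `battery_name` and `cycle_no` columns from the header above. The sample rows are made up for illustration, and the assumption that `cycle_no` parses as a number is untested against the real file.

```python
import csv
import io

# Made-up rows for illustration only; the values are not from the real dataset.
SAMPLE_CSV = (
    "battery_name,cycle_no,QD\n"
    "cell_a,1,1.07\n"
    "cell_a,2,1.06\n"
    "cell_a,3,1.05\n"
    "cell_b,1,1.08\n"
    "cell_b,2,1.07\n"
)


def summarize_battery_csv(lines):
    """Count rows, distinct batteries, and the highest cycle number seen."""
    rows = 0
    max_cycle = {}  # battery_name -> highest cycle_no observed
    for record in csv.DictReader(lines):
        rows += 1
        name = record["battery_name"]
        # Assumption: cycle_no may be written as "3" or "3.0".
        cycle = int(float(record["cycle_no"]))
        max_cycle[name] = max(cycle, max_cycle.get(name, 0))
    return {
        "rows": rows,
        "batteries": len(max_cycle),
        "longest_life": max(max_cycle.values(), default=0),
    }


if __name__ == "__main__":
    print(summarize_battery_csv(io.StringIO(SAMPLE_CSV)))
```

To run it on the downloaded file, open `raw_dataset.csv` with `open(path, newline="")` and pass the file object to `summarize_battery_csv`, then compare the counts with the numbers Amazon Q reported.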

Two prediction models suggested by Amazon Q from this data set are:

Figure 6 – Amazon Q output of ML model suggestion

Amazon Q’s output aligns with the purpose of the AWS guidance, which is Battery State of Health (SoH) prediction. Therefore, we can conclude that Amazon Q is not hallucinating in this case.

Scenario 3: Understand Binary File Format

So far, we’ve used data in a human-readable format, such as CSV. NASA’s Prognostics Center of Excellence (PCoE) published an accelerated life cycle dataset for Li-ion batteries. This dataset is in binary format and is derived from experiments on Li-ion batteries conducted at various temperatures.

q chat 'download and unzip https://phm-datasets.s3.amazonaws.com/NASA/5.+Battery+Data+Set.zip into /tmp'

Amazon Q will ask for permission to run a few commands, such as those necessary to download and unzip the files.

Figure 7 – Amazon Q output of NASA data download

> extract /tmp/5. Battery Data Set/1. BatteryAgingARC-FY08Q4.zip

We clear the context to make sure Amazon Q is not deriving its knowledge from the Readme file.

>/context clear --global

Let’s use Amazon Q to understand the .mat binary files. Adjust the file path if required.

> What are the different data attributes we have in /tmp/5. Battery Data Set/B0005.mat

Amazon Q will start executing commands to understand the structure of the B0005 file, trying different approaches along the way. It reported that the “file contains data from 616 cycles of battery operation, tracking the battery’s performance and degradation over its lifetime” and highlighted the data attributes shown below:

Figure 8 – Amazon Q output of NASA data attributes
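If you want to cross-check the attributes Amazon Q reports, you can walk the file's structure yourself. The sketch below is a generic structure printer that works on any nested Python data. The optional `describe_mat_file` helper assumes SciPy is installed and that the .mat file is in a format `scipy.io.loadmat` can read. The sample structure and its field names are placeholders, not the real layout of B0005.mat.

```python
def describe(obj, name="root", depth=0, max_depth=4):
    """Return indented lines summarizing a nested structure."""
    pad = "  " * depth
    fields = getattr(obj, "_fieldnames", None)  # set on SciPy MATLAB structs
    if isinstance(obj, dict):
        # Skip loadmat bookkeeping keys such as __header__.
        children = [(k, v) for k, v in obj.items() if not str(k).startswith("__")]
        lines = [f"{pad}{name}: dict (keys={len(children)})"]
    elif fields is not None:
        children = [(f, getattr(obj, f)) for f in fields]
        lines = [f"{pad}{name}: struct (fields={len(children)})"]
    elif isinstance(obj, (list, tuple)) or (
        getattr(obj, "dtype", None) == object and getattr(obj, "ndim", 0) >= 1
    ):
        # For lists and object arrays, describe only the first element.
        children = [("[0]", obj[0])] if len(obj) else []
        lines = [f"{pad}{name}: sequence of {len(obj)} items (showing first)"]
    else:
        shape = getattr(obj, "shape", None)
        suffix = f" shape={shape}" if shape else ""
        children = []
        lines = [f"{pad}{name}: {type(obj).__name__}{suffix}"]
    if depth < max_depth:
        for child_name, child in children:
            lines.extend(describe(child, child_name, depth + 1, max_depth))
    return lines


def describe_mat_file(path):
    """Load a .mat file with SciPy (imported lazily) and describe it."""
    from scipy.io import loadmat  # requires SciPy to be installed

    data = loadmat(path, squeeze_me=True, struct_as_record=False)
    return describe(data, name=path)


# Placeholder structure for illustration; names are hypothetical.
SAMPLE = {
    "__header__": "ignored",
    "cell": {"cycle": [{"kind": "discharge", "values": [1.0, 2.0]}, {"kind": "charge"}]},
}

if __name__ == "__main__":
    print("\n".join(describe(SAMPLE)))
```

Calling `describe_mat_file` with the path to the extracted B0005.mat should print a comparable outline, which you can compare against the attributes listed in Figure 8.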

We continue our conversation with Amazon Q:

> Identify two crucial prediction models that can be developed using this dataset.

Amazon Q highlighted two ML models, along with the key features needed to build each prediction model.

Figure 9 – Amazon Q output of ML model recommendations

Amazon Q also suggested that the “dataset can be used for the prediction of both remaining charge (for a given discharge cycle) and remaining useful life (RUL).”
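To make the RUL idea concrete, the sketch below shows one common way to derive RUL training labels from a per-cycle capacity series. It is an illustrative assumption rather than the method Amazon Q or the dataset authors use: end of life is taken as the first cycle whose capacity falls below a fraction of the initial capacity, and both the 0.8 fraction and the capacity values are hypothetical.

```python
def rul_labels(capacities, eol_fraction=0.8):
    """Return remaining useful life (in cycles) for each cycle.

    End of life is the first cycle whose capacity drops below
    eol_fraction * initial capacity. The 0.8 default is an assumed
    threshold for illustration, not a value taken from the dataset.
    Returns None for every cycle if end of life is never reached.
    """
    if not capacities:
        return []
    threshold = eol_fraction * capacities[0]
    eol = next((i for i, c in enumerate(capacities) if c < threshold), None)
    if eol is None:
        return [None] * len(capacities)
    # Cycles at or after end of life have zero remaining life.
    return [max(eol - i, 0) for i in range(len(capacities))]


# Made-up discharge capacities (Ah) per cycle, for illustration only.
CAPACITIES = [2.0, 1.9, 1.8, 1.7, 1.5, 1.4]

if __name__ == "__main__":
    print(rul_labels(CAPACITIES))
```

Labels like these can serve as the regression target for an RUL model, with per-cycle measurements as the input features.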

As discussed in this blog, Amazon Q, an AI-powered assistant accessible to data scientists within their tool chain, can help decipher the meaning of datasets. This empowers them to make more informed decisions and drive innovation in the automotive industry.

Conclusion

As we’ve explored throughout this blog, the traditional divide between technical expertise and domain knowledge no longer needs to be a barrier for organizations seeking to harness the power of machine learning. Amazon Q offers customers a bridge: it helps data scientists develop domain-specific insights more rapidly while empowering domain experts to engage meaningfully with ML technologies.

Our exploration of electric vehicle and battery datasets shows Amazon Q’s ability to help customers transform raw information into actionable ML strategies by identifying patterns, suggesting appropriate models, and providing contextual guidance that would typically require automotive expertise. This capability democratizes ML development, helping organizations accelerate their innovation cycles and extract business value more efficiently. Amazon Q empowers teams to push boundaries and deliver more sophisticated solutions. This relationship between human expertise and AI assistance represents the future of innovation.

Supercharge your CLI with Amazon Q Developer.

Amit Kumar

Amit is a World-Wide Tech Strategy Lead of Sustainability & Electric Vehicles (EV) at AWS (Amazon Web Services), serving as an industry leader. He works closely with automakers globally to accelerate their transition toward electrification. Amit focuses on three key pillars of electrification: battery technology, charging infrastructure, and EV-specific vehicle experiences. Amit helps customers reimagine connected vehicle capabilities for electric vehicles while demonstrating how data serves as the foundation for sustainable mobility.