While tools like Excel, SQL, Python, and R are staples in a data analyst’s toolkit, large language models (LLMs) like ChatGPT, Claude and Gemini are becoming increasingly popular. In this article, we’ll explore some of the best prompts you can use to put AI to work on everything from cleaning data to presenting insights.

Should you use ChatGPT, Claude, Gemini or something else?

While ChatGPT is one of the most popular AI assistants, it is far from the only chat AI application available now. Each is powered by a different LLM, and each has its own strengths and weaknesses.

We’ve written a more in-depth article about LLMs here. At the time of writing, we tend to use Claude and ChatGPT, but the AI space is moving so fast that today’s best could be tomorrow’s worst.

Throughout this article we’ll refer to ChatGPT, Claude and Gemini interchangeably, and the prompts can be used on any of these platforms.

We recommend testing your prompts across different platforms and using the one that gives you the best result for your specific task today.

ChatGPT prompts for collecting and extracting data

One of the first steps in any data analysis project is getting your hands on the right data. ChatGPT can help you strategize and automate parts of the data collection process.

Example prompts:

Can you write a Python script to extract data from the following sources: {{source 1}}, {{source 2}}, and compile it into a single CSV file with headers: {{header 1}}, {{header 2}}?

What are the best practices for collecting high-quality data from online surveys and what tools would you recommend for this purpose?
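To give a sense of what the first prompt above might return, here is a minimal sketch using only Python’s standard library. It assumes each source is already available as CSV text; the function name combine_csv and the sample columns are hypothetical, and real sources such as APIs, web pages or databases would need their own loading step.

```python
import csv
import io


def combine_csv(sources, headers):
    """Merge several CSV texts into one CSV text that keeps only `headers`.

    `sources` is a list of CSV strings (an assumption for this sketch).
    Columns not listed in `headers` are dropped; missing ones are left blank.
    """
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=headers, lineterminator="\n")
    writer.writeheader()
    for text in sources:
        for row in csv.DictReader(io.StringIO(text)):
            # Pick only the requested columns, whatever order the source used.
            writer.writerow({h: row.get(h, "") for h in headers})
    return out.getvalue()


# Two hypothetical sources with different column orders and an extra column.
source_a = "id,name\n1,Ann\n"
source_b = "name,id,extra\nBob,2,x\n"
print(combine_csv([source_a, source_b], ["id", "name"]))
```

Treat any script the AI returns the same way: run it on a small sample first and check that the headers and row counts match what you expect.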

ChatGPT prompts for data cleaning and preprocessing 

Cleaning and preparing data is often the most time-consuming part of a data analyst’s job. Use ChatGPT to help identify issues and automate preprocessing tasks.

Example prompts:

Can you provide a Python script to handle missing values in this dataset {{dataset_name}}?

How can I clean and preprocess a dataset in Python where dates are inconsistently formatted and some entries contain null values in the ‘price’ column?

Write a Python function that automatically detects and replaces outliers in the dataset with the median value of their respective columns.
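As an illustration of the kind of answer the missing-value and outlier prompts above can produce, here is a small sketch for a single numeric column. It is one possible approach, not the only one: it assumes missing entries are represented as None, flags outliers with the common 1.5 × IQR rule, and replaces both with the column median. The function name clean_column and the sample prices are made up for the example.

```python
import statistics


def clean_column(values, k=1.5):
    """Replace missing values (None) and IQR outliers with the column median."""
    present = [v for v in values if v is not None]
    median = statistics.median(present)
    # quantiles(n=4) returns the three quartile cut points: Q1, Q2, Q3.
    q1, _, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [
        median if v is None or v < low or v > high else v
        for v in values
    ]


prices = [10, 12, None, 11, 13, 12, 11, 100]
print(clean_column(prices))  # [10, 12, 12, 11, 13, 12, 11, 12]
```

Whether an extreme value is an error or a genuine observation depends on your data, so it is worth asking the AI to explain the rule it chose before you apply it.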

ChatGPT prompts for data exploration and analysis

Exploratory analysis is critical for uncovering patterns and insights in your data. ChatGPT can suggest analytical approaches and help interpret results.

Example prompts:

What statistical techniques and visualizations should I use to explore the relationships between age, income, and spending habits in a customer dataset?

How do I perform a chi-square test in Python to determine if there is a statistically significant association between gender and product preference?

Generate a summary of key statistical metrics for {{dataset_name}}.

Plot a correlation matrix for the following variables in my dataset: {{list_of_variables}}.
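For the chi-square prompt above, it helps to know what the test actually computes so you can sanity-check the AI’s answer. The sketch below calculates the Pearson chi-square statistic and degrees of freedom by hand for a contingency table. The counts are invented for illustration, and the function name chi_square_statistic is hypothetical. In practice you would likely use a statistics library to get the p-value; note that SciPy’s chi2_contingency applies Yates’ continuity correction to 2×2 tables by default, so its statistic will differ from the uncorrected one here.

```python
def chi_square_statistic(observed):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (count - expected) ** 2 / expected
    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, dof


# Hypothetical counts: rows are gender, columns are product preference.
table = [[30, 10], [20, 40]]
stat, dof = chi_square_statistic(table)
print(round(stat, 2), dof)  # 16.67 1
```

With one degree of freedom, the critical value at the 0.05 significance level is 3.841, so a statistic of 16.67 would indicate a statistically significant association in this made-up example.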

ChatGPT prompts for data visualization and reporting

Communicating data insights effectively is a key skill for analysts. Get ChatGPT’s input on visualization techniques and even have it generate template code for common chart types.

Example prompts:

Can you guide me through the steps to create an interactive dashboard in Tableau for sales data that allows users to filter by region and product type?

What are the key elements to include in a data-driven report aimed at non-technical stakeholders to make the case for increased IT budget?

Create a script to generate a line chart showing trends over time for {{metric}} in {{software_package}}.

What are some effective visualization techniques for representing high-dimensional data?

What type of visualization would best represent the evolution of market share among competitors over time? Can you help me create this using Python’s Matplotlib library?

I need to create a dashboard for monthly sales performance. What key metrics should I include and how can I set it up using Tableau?
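As a rough idea of what the market share prompt above could lead to, the sketch below converts raw sales per competitor into percentage shares and draws them as a stacked area chart with Matplotlib. The competitor names, sales figures, and function names are all hypothetical, and a stacked area chart is only one reasonable choice for this kind of data. Matplotlib is imported inside the plotting function so the share calculation can be used on its own.

```python
def market_share(sales):
    """Convert {competitor: [sales per period]} into percentage shares."""
    periods = len(next(iter(sales.values())))
    totals = [sum(series[i] for series in sales.values()) for i in range(periods)]
    return {
        name: [100 * value / total for value, total in zip(series, totals)]
        for name, series in sales.items()
    }


def plot_market_share(years, shares):
    """Draw a stacked area chart of market share over time."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    fig, ax = plt.subplots()
    ax.stackplot(years, *shares.values(), labels=list(shares.keys()))
    ax.set_xlabel("Year")
    ax.set_ylabel("Market share (%)")
    ax.legend(loc="upper left")
    return fig


# Hypothetical sales figures for two competitors over two years.
sales = {"A": [20, 30], "B": [60, 30]}
shares = market_share(sales)
print(shares)  # {'A': [25.0, 50.0], 'B': [75.0, 50.0]}
# fig = plot_market_share([2022, 2023], shares)  # uncomment to draw the chart
```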

ChatGPT prompts for statistical modeling and machine learning

Building predictive models is an advanced skill that takes data analysis to the next level. Use ChatGPT to prototype approaches and debug your modeling workflows.

Example prompts:

Based on this summary of my dataset: {{dataset_summary}}, which machine learning model would you recommend and why?

What metrics should I use to evaluate a regression model in Python and how do I interpret these metrics to assess model performance?

What machine learning model would you recommend for predicting customer churn based on user activity and demographic data?

Can you explain how to evaluate the performance of a logistic regression model in predicting binary outcomes?
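When you ask about evaluating a binary classifier such as logistic regression, the answer will usually mention accuracy, precision, recall and F1. The sketch below computes these four metrics by hand from true and predicted labels so you can see how they relate. The labels are invented, and the function name classification_metrics is hypothetical; a machine learning library would normally provide equivalent functions.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical churn labels: 1 = churned, 0 = stayed.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(classification_metrics(y_true, y_pred))
```

In this example there are three true positives, three true negatives, one false positive and one false negative, so all four metrics come out at 0.75. On imbalanced data such as churn, precision and recall are usually more informative than accuracy alone.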

ChatGPT prompts for data storytelling and communication

Numbers alone are not enough – you need to weave them into a compelling narrative. Have ChatGPT help brainstorm storylines and talking points to get your message across.

Example prompts:

How can I use data visualization to tell a compelling story about the impact of recent marketing campaigns on sales growth?

What are the best strategies for presenting complex data analyses to a board of directors to ensure understanding and engagement?

How can I effectively communicate the findings from a complex data analysis to non-technical stakeholders?

What are some compelling ways to present a case study on the impact of a digital marketing campaign?

ChatGPT prompts for data management and governance

Keeping data organized, secure, and compliant is a critical responsibility for analysts. Use ChatGPT to draft documentation and explore best practices.

Example prompts:

What are the key components of a data governance framework? Describe the purpose of each and how they support data management.

Outline a template for documenting a dataset, including metadata, schema, lineage, and access controls. Provide examples for each section.

How can data privacy and security risks be mitigated when working with sensitive customer information? Discuss technical and policy considerations.

What strategies should I implement for data governance to ensure compliance with GDPR in my data analyses?

How can I establish a secure and scalable data architecture that allows for both real-time and batch data processing?

ChatGPT prompts for business domain knowledge

To be effective, analysts need to deeply understand the business domain they work in. Tap into ChatGPT’s broad knowledge to get up to speed quickly and explore industry-specific use cases.

Example prompts:

What are some common data analysis use cases and challenges in the retail industry? Provide examples of how analytics can drive business value.

I’m new to analyzing data in the healthcare domain. What are some key metrics and data sources I should be familiar with?

Describe a framework for scoping and structuring a data analysis project from a business perspective. How do you align on objectives and deliverables with stakeholders?

How can data analysis reveal opportunities for cost reduction in manufacturing processes?

What are the key data points I should analyze to understand customer satisfaction in the hospitality industry?

How to integrate AI into your data analysis workflow

Use the website for ChatGPT, Claude or Gemini

You can use ChatGPT, Claude or Gemini directly through their websites. You’ll typically have to create an account, after which you get instant access to a free version of the provider’s LLM; if you want a more advanced LLM, you’ll have to pay. For example, at the time of writing OpenAI offers GPT-3.5 for free, but you can pay to upgrade to GPT-4.

We’ve found this option is best for individuals who only need one platform and work on their own.

If that’s you, then you can also use PromptDrive.ai as your prompt management tool. We have an unlimited free plan for individual users and a free Chrome extension that lets you access your entire prompt library without switching between tabs.

But if you want to save money, use different LLMs and collaborate on prompts and chat responses, then you might benefit from using PromptDrive.

Use PromptDrive.ai

PromptDrive.ai is a platform designed specifically for teams to collaborate on ChatGPT, Claude and Gemini prompts and workflows. It’s particularly useful for businesses or organizations where multiple team members need to work together on projects involving one or more LLMs.

An additional benefit is that you’ll be able to take advantage of AI for a fraction of the cost of using their native websites. You do need to add your own API keys, but the process of getting one is straightforward, and we have guides to help you get started.

Final word

As you can see, AI tools like ChatGPT, Claude, and Gemini can be versatile and valuable aids for a wide range of data analysis tasks and responsibilities.

From collecting and preprocessing data to building models, visualizing insights, and communicating with stakeholders, prompts like the ones we’ve explored here can help you analyze data more efficiently.

The key is to think creatively about how ChatGPT’s natural language understanding and generation capabilities can be applied to your specific data challenges. 

Don’t be afraid to experiment and iterate – the prompts shared here are just a starting point. With a little imagination and fine-tuning, you can adapt them to all sorts of data analysis scenarios.

So give them a try, and see how ChatGPT can help streamline your workflows, spark new insights, and take your data analysis skills to the next level. Happy analyzing!


