Automate time series plot with ChatGPT

Visualize any time series data in a minute using this GPT prompt.

Visualize any time series data in a minute using this GPT prompt.

Poom Wettayakorn

Nov 28, 2023

Nov 28, 2023

ai

👋 Hi there, welcome to Datascale: SQL+Data modeling. We help create database diagrams and map out data models from existing schema and SQL queries 🌱. Find us on LinkedIn 🎉

✅ This is GPT trick will get you visualize any time series data in a minute! It works for any dataset with dates and numbers.

📌 I’ll put the prompt at the end of this post, save this for later.

The results here we have detailed chart with moving average, highlighted weekends, custom calculated fields, and handling of missing dates.

How it works:

✍️ This is one of an example of how we should structure the prompt with clear instructions to achieve perfect results every time!

When we perform prompt engineering, you should test the changes systematically. It’s tweaking few variables at the time.

Before finalizing this prompt, I asked GPT for slightly more analysis on top of a previous result. If it performed worse, I edited the prompt util it gets right answer and then proceeded to further steps.

A trick I use:

After finalizing the conversation with GPT, ask GPT to summarize the instructions into one prompt. This helps streamline the process and we can keep the prompt as a template for later use.

In summary

✍️ Crafting the perfect prompt is an art. Here’s how to get it right every time:

  1. Start with clear, structured instructions.

  2. Test systematically — tweak a few changes at a time.

  3. Build on what works, refine what doesn’t.

Here’s the prompt:



Bonus!

This is the code generated by ChatGPT with this prompt.

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np

# Load the dataset into a pandas DataFrame
file_path = '/mnt/data/06_logins_per_day.csv'
data = pd.read_csv(file_path)

# Convert the 'date' column to datetime format
data['date'] = pd.to_datetime(data['date'])

# Create a complete date range from the minimum to the maximum date in the data
date_range = pd.date_range(start=data['date'].min(), end=data['date'].max())

# Reindex the DataFrame to the full date range, filling missing dates with zero logins
data = data.set_index('date').reindex(date_range, fill_value=0).reset_index()
data.rename(columns={'index': 'date'}, inplace=True)

# Convert the 'logins' column back to integers if necessary
data['logins'] = data['logins'].astype(int)

# Calculate the 7-day moving average of the logins
data['7d_moving_avg'] = data['logins'].rolling(window=7).mean()

# Prepare for plotting
plt.figure(figsize=(15, 8))

# Plot daily logins using blue line chart with markers
plt.plot(data['date'], data['logins'], label='Daily Logins', color='blue', marker='o')

# Plot the 7-day moving average with an orange dashed line
plt.plot(data['date'], data['7d_moving_avg'], label='7-Day Moving Average', color='orange', linestyle='--')

# Highlight weekends with a specific background color
for i in range(len(data)):
    if data['date'][i].weekday() >= 5:  # Saturday and Sunday
        plt.axvspan(data['date'][i], data['date'][i] + pd.Timedelta(days=1), facecolor='grey', alpha=0.3)

# Annotate each data point with its corresponding login value, and adjust text y-position to above the line
for i, txt in enumerate(data['logins']):
    plt.annotate(txt, (data['date'][i], data['logins'][i]), textcoords="offset points", xytext=(0,10), ha='center')

# Custom legend indicating the number of days above and below the moving average
above_avg = np.sum(data['logins'] > data['7d_moving_avg'])
below_avg = np.sum(data['logins'] < data['7d_moving_avg'])
plt.legend(title=f"Above Avg: {above_avg} days\nBelow Avg: {below_avg} days")

# Format the x-axis to show the date with the day of the week
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d (%a)'))
plt.gca().xaxis.set_major_locator(mdates.DayLocator(interval=1))
plt.xticks(rotation=45)

# Add a title, axis labels, and a grid to the plot
plt.title('Daily Logins and 7-Day Moving Average')
plt.xlabel('Date')
plt.ylabel('Number of Logins')
plt.grid(True)

# Use a tight layout for the plot to ensure that labels and titles fit well within the display
plt.tight_layout()

# Display the plot
plt.show()

_ Happy hacking ~ 🦾✨

Contents

Section

Get a clear view of your SQL dependencies

Datascale helps reverse engineer data models from existing schema and SQL queries

Learn more

Where data gets modeled

Where data gets modeled

Where data gets modeled