A Super-Fast and Sharp Way to Send E-mails from an ETL Pipeline Python Code Using Gmail API — Starting with Setting the Credentials all the way to Code

Flávio Brito
Aug 9, 2021
Photo by Carl Heyerdahl on Unsplash

Introduction

Gmail is a great e-mail service from Google, but using it to its full capacity requires some knowledge of how to configure the service’s API and enable authentication via OAuth2. These steps are well documented here:

https://developers.google.com/gmail/api/quickstart/python

A good practice in an ETL pipeline is to avoid authenticating with a username and password.

I’ll take a two-step approach: first, I’ll demonstrate how to set up the credentials; then, how to use a library that encapsulates the complexity of the API code and does the job in a few lines of code.

Enabling Gmail API

To develop a solution using the Gmail API, your GCP user must have permission to enable the API service before you start.

Check in the console: https://console.cloud.google.com/

If you don’t have permissions you will receive a message like this:

Ask your GCP administrator to grant your user permission to enable the API and to create the credentials.

The Gmail service has its own API. To enable it, go to the APIs & Services menu and click on API Library.

There you will find an ENABLE button. If the API is already enabled, you will see the following picture:

Creating Credentials

This step creates a new OAuth 2.0 Client ID to be used in our code. To do so, click the + CREATE CREDENTIALS link at the top of the page and select OAuth Client ID.

Now that the Gmail API is enabled, you need to configure the scope and the OAuth application type. Select Web application as the application type; in our case, we will use it in a pipeline.

You can choose a name for the OAuth 2.0 client. Under Authorized redirect URIs, add:

https://developers.google.com/oauthplayground

Copy and save the credentials, then go to the URL above and follow the steps that appear on the right side of the page. Download the JSON credential file from the link at the top of the page.

OAuth 2.0 Configuration

Click the gear icon at the top right, check Use your own OAuth credentials, and fill in the OAuth Client ID and the OAuth Client secret.

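In the step Select & Authorize APIs, select Gmail API v1 and click the scope to authorize (SMTP access, which is used later in this article, requires https://mail.google.com/).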

Then click the “Authorize APIs” button.

There is no need to add domain verification and page usage.
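The values collected so far (the client ID, the client secret, and the refresh token that the playground returns once the authorization code is exchanged for tokens) can be gathered into a single JSON file. The sketch below is only an illustration: it assumes the key names that yagmail writes when it creates such a file interactively (email_address, google_client_id, google_client_secret, google_refresh_token), so check them against your installed version. The function name write_oauth2_file and all values passed to it are hypothetical.

```python
import json
from pathlib import Path


def write_oauth2_file(path, email_address, client_id, client_secret, refresh_token):
    """Write a credentials JSON file in the layout yagmail is assumed to read."""
    info = {
        # Key names are an assumption based on the file yagmail writes itself.
        "email_address": email_address,
        "google_client_id": client_id.strip(),
        "google_client_secret": client_secret.strip(),
        "google_refresh_token": refresh_token.strip(),
    }
    target = Path(path)
    # Create the parent folder (for example app/apis) if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(info, indent=2))
    return target
```

Because this file contains secrets, it should be kept out of version control.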

Now is the time to talk more about how we can simplify the sender code.

SMTP Client Library

There are several interesting SMTP client libraries, but the one that caught my attention was yagmail. It is easy to install and use. In an ETL process, we need a solution that can use Gmail OAuth2 credentials and accepts attachments. Touché: yagmail does both elegantly.

Add yagmail to the requirements.txt file. I am using version 0.14.256:

yagmail==0.14.256

If you want, you can install it on Google Colab to test. In a cell, you can run:

!pip install yagmail

If you are using Google Colab, you need to upload the JSON credential file; otherwise, yagmail will prompt for the user’s e-mail address. If that happens, you may see odd behaviour, and there is no guarantee that the e-mail will be sent.
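Since an interactive prompt is unwelcome in a pipeline, one option is to check the credential file before building the client. This is a minimal sketch, not part of yagmail: the helper name missing_oauth2_keys is hypothetical, and the list of expected keys is an assumption taken from the file yagmail writes on its own, so verify it against your installed version.

```python
import json
from pathlib import Path

# Assumed key names; verify them against the yagmail version you use.
EXPECTED_KEYS = (
    "email_address",
    "google_client_id",
    "google_client_secret",
    "google_refresh_token",
)


def missing_oauth2_keys(path):
    """Return the expected keys that are absent or empty in the credential file.

    A missing or unreadable file is reported as missing every key.
    """
    file = Path(path)
    if not file.is_file():
        return list(EXPECTED_KEYS)
    try:
        data = json.loads(file.read_text())
    except json.JSONDecodeError:
        return list(EXPECTED_KEYS)
    if not isinstance(data, dict):
        return list(EXPECTED_KEYS)
    return [key for key in EXPECTED_KEYS if not data.get(key)]
```

If the returned list is not empty, the pipeline can stop with a clear error message instead of waiting for input that will never arrive.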

Moving on, the main page of the library has a small but important demo that gives an idea of basic use:

https://pypi.org/project/yagmail/

Implementation

We need an implementation that keeps the code sharp and DRY. To do so, let's create a file in a common code folder that we can call whenever we want.

gmail_client.py

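import yagmail


class Gmail(yagmail.SMTP):
    """yagmail SMTP client authenticated with an OAuth2 credentials file."""

    def __init__(self):
        # Authenticate with the JSON credentials file instead of user and password.
        super().__init__(oauth2_file="app/apis/gmail_credentials.json")

    def send_mail(self, toaddr, subject, message, attachment_files=None):
        # Map the pipeline's argument names onto yagmail's send().
        self.send(to=toaddr, subject=subject, contents=message, attachments=attachment_files)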

In this code, I created a Gmail class that inherits from yagmail.SMTP. Instead of a user and password, I pass the name of a JSON credentials file to the oauth2_file parameter; in this case, the file is stored in the project’s app/apis folder.
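A relative path such as app/apis/gmail_credentials.json depends on the directory the pipeline is started from. As a hedged sketch, the path could be read from an environment variable with the original path as a fallback. The variable name GMAIL_OAUTH2_FILE and the helper credentials_path are hypothetical and not part of yagmail.

```python
import os
from pathlib import Path

# Default taken from the class above; the environment variable name is hypothetical.
DEFAULT_CREDENTIALS = Path("app/apis/gmail_credentials.json")


def credentials_path(environ=None):
    """Return the path from GMAIL_OAUTH2_FILE if set, else the default path."""
    environ = os.environ if environ is None else environ
    override = environ.get("GMAIL_OAUTH2_FILE")
    return Path(override) if override else DEFAULT_CREDENTIALS
```

Inside the class, the constructor call would then become super().__init__(oauth2_file=str(credentials_path())).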

You can use yagmail’s send method directly. First, create an instance of Gmail:

mail = Gmail()

Then call the send method:

mail.send("TARGET_EMAIL", "SUBJECT TEXT", "BODY TEXT", attachments="./report_.pdf")

In this case, there are some important parameters to consider. If you don't need to add attachments, remove that parameter from the method call.

mail.send("TARGET_EMAIL",
          "SUBJECT TEXT",
          "BODY TEXT",
          attachments="FILE_PATH_NAME")

In my case, however, I was replacing a send_mail method used by all ETL pipelines. This is a nice example of how to do it if you need to:

def send_mail(self, toaddr, subject, message, attachment_files=None):
    self.send(to=toaddr, subject=subject, contents=message, attachments=attachment_files)
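If a failed e-mail should not abort the whole pipeline, the call can be wrapped so that errors are logged rather than raised. The following is a minimal sketch under that assumption; try_send_mail is a hypothetical helper, and client stands for any object that exposes a send_mail method with the signature shown above.

```python
import logging

logger = logging.getLogger(__name__)


def try_send_mail(client, toaddr, subject, message, attachment_files=None):
    """Call client.send_mail and return True on success, False on any error."""
    try:
        client.send_mail(toaddr, subject, message, attachment_files)
    except Exception:
        # Log the full traceback so the failure is visible in the pipeline log.
        logger.exception("Sending e-mail to %s failed", toaddr)
        return False
    return True
```

Whether swallowing the exception is appropriate depends on the pipeline; if the e-mail is the pipeline's main deliverable, letting the error propagate may be the better choice.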

Now you need to import the class into the pipeline code as:

from app.data_integration.gmail_client import Gmail

In this case, the folder of the pipeline that invokes the Gmail client is:

app/data_integration/

and in this folder we created gmail_client.py, as described previously.

Because we only want to send the e-mail if the PDF file to be attached was generated, we check whether the variable is empty; if it is not, we call the send_mail method.

if pdf_file_name:
    gmail = Gmail()
    gmail.send_mail(email_receiver, email_subject, email_contents, [pdf_file_name])
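The check above only tells us that the variable holds a name, not that the file is actually on disk. A slightly stricter variant, sketched here with a hypothetical helper called existing_attachments, keeps only the paths that point to real files.

```python
from pathlib import Path


def existing_attachments(paths):
    """Keep only the non-empty paths that point to files on disk."""
    return [str(path) for path in paths if path and Path(path).is_file()]
```

The result can be tested the same way as before: send the e-mail only if existing_attachments([pdf_file_name]) returns a non-empty list, and pass that list as the attachments.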

The code below demonstrates how to dynamically replace parameters in the e-mail contents. The parameters are written between @@ symbols (a way to avoid mysterious characters) in text that comes from a SQL result, where

  • @@start-date@@ will be replaced by custom_start_date
  • @@end-date@@ will be replaced by custom_end_date

This happens, for example, if csv_created is set (a variable whose value is different from None when a CSV was created successfully).

import logging

if csv_created:
    gmail = Gmail()
    email_contents = email_contents.replace('@@start-date@@', custom_start_date).replace('@@end-date@@', custom_end_date)
    gmail.send_mail(email_receiver, email_subject, email_contents, [pdf_file_name, csv_file_name])
    logging.info(f"Email for {customer_name} sent to {str(email_receiver)}")

A logging message is written to the ETL pipeline log, recording each recipient for whom the send_mail method was called.
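The chained replace calls used for the @@ parameters work well for two values. If the number of parameters grows, the same idea can be generalized with a small helper. This is only a sketch: fill_placeholders is a hypothetical name, and the choice to leave unknown parameters untouched is an assumption, not something the pipeline above does.

```python
import re

# Matches parameters such as @@start-date@@ and captures the name in between.
PLACEHOLDER = re.compile(r"@@([\w-]+)@@")


def fill_placeholders(template, values):
    """Replace each @@name@@ with values[name]; leave unknown names unchanged."""

    def substitute(match):
        name = match.group(1)
        return str(values[name]) if name in values else match.group(0)

    return PLACEHOLDER.sub(substitute, template)
```

With it, the replacement step becomes a single call such as fill_placeholders(email_contents, {"start-date": custom_start_date, "end-date": custom_end_date}).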

Summary

In this article, we focused on how to set up, implement, and use the Gmail API service from an ETL pipeline with a few lines of code, and on how this can improve your productivity.

Please feel free to add me on LinkedIn or follow me on Twitter.
