About the Project

Context

During my internship at Deloitte as an Analytics Consulting Intern, I worked on a cloud migration and data modernization project on Google Cloud Platform (GCP) for a logistics conglomerate. One of my tasks was to trigger Data Fusion pipelines remotely using Workflows. I was then responsible for adding automatic authentication to the workflow and engineering parallel execution to optimize its run time.

Note: Data Fusion and Workflows are GCP services.

When I worked on this task, I couldn't find much guidance: the services were fairly new at the time, and there were barely any resources online. I therefore hope this blog post can serve as a reference for anyone trying to accomplish similar tasks.

Result

  • I cut the run time by ~70% by implementing parallel execution.

  • I built a functional workflow to trigger Data Fusion pipelines remotely in 1.5 days, with no prior knowledge of either service.

  • I implemented automatic authentication in the workflow.
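The parallel-execution idea behind the ~70% speed-up can be sketched in Python. This is a minimal illustration, not the project's actual implementation (which ran inside a GCP Workflow): the `trigger_pipeline` function and pipeline names below are hypothetical stand-ins for authenticated calls to the Data Fusion API, and `ThreadPoolExecutor` stands in for Workflows' parallel branches. The point is that overlapping independent pipeline triggers makes total latency roughly that of the slowest call instead of the sum of all calls.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def trigger_pipeline(name: str) -> str:
    """Hypothetical stand-in for starting one Data Fusion pipeline.

    In a real setup this would be an authenticated HTTP request to the
    pipeline's start endpoint; here we just simulate the call's latency.
    """
    time.sleep(0.1)
    return f"{name}: started"

pipelines = ["ingest_orders", "ingest_shipments", "ingest_inventory"]

# Sequential: total latency is the sum of the individual calls.
start = time.perf_counter()
sequential = [trigger_pipeline(p) for p in pipelines]
seq_elapsed = time.perf_counter() - start

# Parallel: calls overlap, so total latency is roughly one call's worth.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=len(pipelines)) as pool:
    parallel = list(pool.map(trigger_pipeline, pipelines))
par_elapsed = time.perf_counter() - start

assert sequential == parallel  # same results, less wall-clock time
```

With three independent 0.1 s calls, the sequential version takes about 0.3 s while the parallel version takes about 0.1 s, which mirrors the scale of speed-up reported above.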

Goals

My goals in this post are to:

  • Document the techniques I used, for educational purposes

  • Create a resource for those who want to accomplish similar tasks on GCP
