GCP Data Engineering - Round 2!

It's been two years since my last post on the Professional Data Engineer certification, and it was time for renewal. I successfully renewed the certification exactly two years later.

I wanted to update my recommendations based on what I used.

Linux Academy

Google Cloud documentation for the services:

BigQuery, Dataflow, Bigtable, Pub/Sub, Composer and other big data services

Solution approach

Migrating Apache Spark to Dataproc

Building your data lake

Data Lifecycle

  • When to use Dataflow vs Dataproc, Bigtable vs Spanner vs Datastore, ML APIs vs AutoML, Composer vs Kubeflow, Transfer Service vs Transfer Appliance, Pub/Sub vs Kafka
  • IAM Permissions for all the services
  • Background - Hadoop and its components

Wishing you the best of luck with your certification.