It's been two years since my last post on the Professional Data Engineer certification, and it was time for renewal. I successfully renewed the certification exactly two years later.
I wanted to update some of my recommendations based on what I used.
Linux Academy
Google Cloud Documentation for the services
BigQuery, Dataflow, Bigtable, Pub/Sub, Composer, and other big data services
Solution approach
- Migrating Apache Spark to Dataproc
- When to use Dataflow vs Dataproc, Bigtable vs Spanner vs Datastore, ML APIs vs AutoML, Composer vs Kubeflow, Transfer Service vs Transfer Appliance, Pub/Sub vs Kafka
- IAM permissions for all the services
- Background - Hadoop and its components
Wishing you the best of luck with your certification.
