Data Engineers reading this probably know that Cloud SQL and Cloud Data Fusion are part of Google's Data and Analytics offerings on Google Cloud Platform (GCP). So it should be simple to connect the two into a data pipeline, right? Yes, if you don't mind using public interfaces; no, if both Data Fusion and Cloud SQL are meant to be internal services.
Here are some simple guidelines to ensure that the private versions of Cloud SQL and Data Fusion can work together.
Deploy Cloud SQL and Data Fusion in the same network
The first requirement is that Cloud SQL and Data Fusion reside in the same VPC network. Both services use Google's Private Services Access model, which "peers" services running in Google-managed VPCs to ours. A nuance of this kind of peering is that it does not support transitive connections: two peers cannot reach each other across our VPC. The situation gets worse if we are also peering two of our own VPCs together. So, keep both services on the same VPC (hint: it can also be a Shared VPC).
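As a sanity check — assuming the `gcloud` CLI and a network named `my-vpc` (a placeholder) — you can list the peerings on your VPC to confirm that both services connect into the same network:

```shell
# List Private Services Access connections (Cloud SQL appears here
# via servicenetworking.googleapis.com)
gcloud services vpc-peerings list --network=my-vpc

# List all VPC network peerings on the network; the Data Fusion
# tenant-project peering should appear in this output
gcloud compute networks peerings list --network=my-vpc
```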
Utilize a proxy or TCP forwarding between Cloud SQL and Data Fusion
To allow two Private Services Access services to speak to each other, each must believe the connection originates from within our VPC. In other words, there needs to be an intermediary between the services so that GCP sees the connections as coming from our VPC rather than as transitive peering. Here are the two methods that worked for us:
- Deploy a Cloud SQL Proxy (either in docker or non-docker form)
- Deploy a VM with iptables forwarding
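For the first option, a minimal sketch of running the Cloud SQL Auth Proxy in Docker might look like the following. The image tag, project, region, and instance names are placeholders, and the v1 proxy flags shown here differ from the newer v2 proxy CLI — check the proxy documentation for your version:

```shell
# Run the (v1) Cloud SQL Auth Proxy, listening on all interfaces so the
# Data Fusion range can reach it, and connecting to Cloud SQL over its
# private IP
docker run -d --name cloud-sql-proxy \
  -p 0.0.0.0:3306:3306 \
  gcr.io/cloudsql-docker/gce-proxy:1.33.2 \
  /cloud_sql_proxy \
  -instances=my-project:my-region:my-instance=tcp:0.0.0.0:3306 \
  -ip_address_types=PRIVATE
```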
Both solutions create the desired effect: connections to Cloud SQL from Data Fusion originate from a VM in our VPC, because the VM acts as a proxy. For example, TCP forwarding can be configured with iptables as follows:

```shell
# Forward traffic received by the VM on port 3306 to Cloud SQL on port 3306
sudo iptables -t nat -A PREROUTING -p tcp --dport 3306 -j DNAT --to-destination <CloudSQL IP>
# Allow the forwarded traffic through the FORWARD chain
sudo iptables -A FORWARD -p tcp -d <CloudSQL IP> --dport 3306 -j ACCEPT
# Masquerade the traffic so it appears to come from the VM's own IP
sudo iptables -t nat -A POSTROUTING -j MASQUERADE
```
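Note that DNAT/FORWARD rules only take effect if the VM is allowed to forward packets at all. Assuming a Linux VM on GCE, that means enabling kernel IP forwarding (and, depending on your setup, the VM may also need to be created with `--can-ip-forward`) — a sketch:

```shell
# Enable kernel IP forwarding (required for the FORWARD chain to pass traffic)
sudo sysctl -w net.ipv4.ip_forward=1
# Persist the setting across reboots (filename is arbitrary)
echo 'net.ipv4.ip_forward = 1' | sudo tee /etc/sysctl.d/99-forward.conf
```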
Ensure that firewall rules allow the Proxy to communicate
This is a step we find many implementers miss. After setting up the proxy as described above, the GCP firewall needs to be configured to allow traffic from the correct sources, on the ports used to communicate with the proxy instance.
In our testing, we found that we could not change ports for some of the drivers in Data Fusion. So, depending on the driver, you may need to allow port 3306 from the allocated internal IP range for Data Fusion, which is defined in the Private Service Connection tab of your VPC (or of the VPC host project if you are using a Shared VPC). This range is commonly named cdf-<data fusion instance name>. If you are using a Data Fusion driver that supports changing ports, allow the selected port instead.
In short, create the following ingress and egress firewall rules:
- Allow ingress traffic from your Cloud Data Fusion range to talk to the proxy VM
- Allow egress traffic from your proxy VM to talk to Cloud SQL’s internal IP range
Note: don’t forget to add other rules depending on other use-cases.
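As a sketch, the two rules above might be created with `gcloud` like this. The rule names, `my-vpc` network, ranges, and the `sql-proxy` network tag are all placeholders — substitute your own values:

```shell
# Ingress: allow the Data Fusion allocated range to reach the proxy VM on 3306
gcloud compute firewall-rules create allow-cdf-to-proxy \
  --network=my-vpc --direction=INGRESS --action=ALLOW \
  --rules=tcp:3306 \
  --source-ranges=<Data Fusion allocated range> \
  --target-tags=sql-proxy

# Egress: allow the proxy VM to reach Cloud SQL's internal range on 3306
gcloud compute firewall-rules create allow-proxy-to-cloudsql \
  --network=my-vpc --direction=EGRESS --action=ALLOW \
  --rules=tcp:3306 \
  --destination-ranges=<Cloud SQL internal range> \
  --target-tags=sql-proxy
```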
You can allow private Data Fusion instances to talk to private Cloud SQL instances; it just isn't as straightforward as turning them on. Having a proxy in the middle also brings other considerations if the traffic volume is high. For example, do you need High Availability? Ensure that the proxy is scaled according to the expected volume, and remember to install the Cloud Monitoring and Logging agents on the proxy so that you get a full end-to-end view of your Cloud SQL and Data Fusion pipeline!