Guide: Using Snowflake with Databox
HOW TO
-
Snowflake connections are available on the Professional and Performer plans. Request a free trial of Snowflake by following
these steps.
How to prepare your Snowflake database
We'll name this new database "DATABOX_TEST"
Click Finish to save the new database. This database will now appear in our list, and the current role, SYSADMIN in our case, will be the database owner.
Roles can be changed in the upper right corner of Snowflake. Learn more about Snowflake security best practices here.
Next, we'll create a schema for our database. To do this, we'll click on the database name and will select Schemas > Create
We'll name this new schema "TEST_SCHEMA"
We'll create a simple table with 3 columns for our example, and will name it "SAMPLE_DATA"
Click on the table name
Next, we'll click on the Warehouses icon and will select Create to create a new warehouse
We'll name this new warehouse "WAREHOUSE_X_SMALL." We've selected X-Small as the Size, 5 minutes as the Auto Suspend time, and we checked the Auto Resume checkbox.
Click Finish to save the new warehouse. This warehouse will now appear in our list.
Now it's time to load a sample dataset. We've prepared a CSV file with the following content:
2017-11-03,John Waters,3849.23
2017-11-10,John Waters,105.27
2017-10-05,Jennifer Tracy,124.02
2017-03-06,Milla Marx,932.99
Click Load Table and choose the warehouse we created earlier
Click Next and choose the appropriate CSV file. Click + and define a file format
Select the CSV
Click Load to load the data in the table
The database is now set up, data is loaded and the warehouse is ready to go.
Finally, we need to add the Databox IP to the Snowflake Account so queries can be run from Databox. Click the drop-down arrow in the upper right corner of the Account and Switch Role to "ACCOUNTADMIN"
Click on Account > Policies and select Create to create a new Network Policy
Enter "DATABOX" as the Name and Databox's public IP (52.4.198.118) in the Allowed IP Addresses field
This new Network Policy will now appear in the main window, and remote access from Databox is set up.
How to connect Snowflake to Databox
Once our Snowflake Account is properly prepared, we'll connect it to Databox.
To connect Snowflake to Databox, go to Data Sources > + New connection. Type Snowflake in the search bar and click on the green Connect button.
This will open the Activation window, which is where you will be prompted to enter your connection data and authorize the Databox app.
The account is the first part in the URL, before the .snowflakecomputing.com, so enter just that, without the .snowflakecomputing.com part.
If the default Role will be used, leave the Role field empty. If a specific Role is needed to access your Snowflake data warehouse and/ or your table schema, enter it in the Role field to ensure the connection is set up correctly.
When connecting a Snowflake database to Databox, the Proxy authentication is being used. Learn more about how Proxy authentication works here. The Authenticator field is not mandatory and can be left empty.
Click the green Activate button to complete the connection process. If Snowflake was successfully connected, the Activation window will close shortly, and you'll see a Connected notice.
If you're running into connection issues, test your remote access to Snowflake using their command-line client called SnowSQL. Learn more here.
How to use the Metric Builder for Snowflake
The Metric Builder for Snowflake allows you to create Custom Metrics using data from your Database to use in Databoards, Alerts, Goals, etc. Learn more about the Metric Builder for Snowflake here.
Additional Information
- Query results are limited to 10,000 rows. If you expect your query to return over 10,000 rows or if you're not sure how many rows will be returned, be sure to LIMIT the query. If you need more than 10,000 rows in order to successfully report on your Snowflake data in Databox, feel free to submit your request on our Roadmap here.
- The query must contain a column named "Date." If you need to, use AS in your SELECT statement to satisfy this requirement
- One column should always contain the Metric Value
- "Null" values will not be recognized. Use a COALESCE function when needed. Learn more here.