Guide: Using Snowflake with Databox
Plot and view your SQL query results from a Snowflake data warehouse right on your mobile, big screen or computer browser.
For those of you not yet in the know, Snowflake is a fully-relational SQL data warehouse built for the cloud on Amazon Web Services (AWS). Whether your data is structured or semi-structured (JSON, Avro, XML), Snowflake provides complete relational database support and has comprehensive support for the SQL language. As a turn-key cloud service, it requires no administration and is secure by design. It provides support for ETL (Extract, Transform, Load) and BI (Business Intelligence) tools (like Databox) and developers can use it to build applications.
IN THIS ARTICLE
Because both of our systems are in the cloud and our connection to your Snowflake database stays live, you can use Snowflake and Databox together to monitor your data in real-time on any device, create views of your data for different stakeholders within your company and even be alerted when certain numbers reach certain thresholds.
1. Create a Snowflake Account
For example’s sake, our database will be very simple.
Firstly, visit your unique Snowflake URL and login. Note the Account name, it’s the first part of the URL, before “.snowflakecomputing.com”, you’ll need it later.
Next, click on the Databases icon and then Create... a new database, it’ll be called DATABOX_TEST:
Click Finish and that’s it. The database will now appear in your list, and the current Role (SYSADMIN in our case) will be the owner:
Roles can be switched in the upper right corner. (While security best practices are beyond the scope of this document and covered well in Snowflake’s documentation, I do recommend you create a new role and user, and give it only the access it absolutely needs and make it read-only.)
Let’s create a schema for this DB. Click on DB name and then select Schemas tab, and Create… a new schema:
Now, for example purposes, we’ll create a simple table with only three columns:
Next, click on the table name and our screen should look like this:
With database/schema and table created, we now need the compute part, which, in Snowflake terms is called Warehouse. Click on the Warehouses icon and Create… a warehouse:
We’ve entered a name, and chosen a Size, the tiniest X-Small size is enough for this example’s needs. It will be auto suspended if idle and Auto resumed when needed.
Let’s load a sample dataset. Prepare a CSV file with content:
Click on Load Table and choose the warehouse you’ve created earlier:
Click Next and choose the CSV file, click “+” and define a file format:
Click Load and the data should be imported in the table:
The Database is now up, data is loaded and the Warehouse compute power to drive the queries is also ready.
The only task left to do on the Snowflake is to the add Databox IP to the account, so queries can be run from the Databox side. Click the down arrow in the upper right corner and Switch role to ACCOUNTADMIN. A new icon called Account now appears:
Click on Account / Policies, then Create… a new policy and enter the Databox IP 188.8.131.52:
Policy DATABOX now appears in the main window. Remote access should now work inside of Databox.
The Snowflake part of this tutorial is now complete. We have a database populated with a single table in a custom schema and the warehouse is selected and running when needed. We’ve also added the remote access IP so we can pull the data to Databox. Now we’ll move to Databox, connect to Snowflake via the Snowflake connector and write a simple query to ensure the data loads correctly.
2. Connect the Snowflake Data Source to Databox
Our database is now ready! The next step is to connect it and test that it’s returning the data we need for our visualizations.
Log in to Databox at https://app.databox.com/, click on the ‘Data Manager’ tab, then select the New connection button:
A screen with many connector will be shown. Filter it by typing Snowflake in the Search box on the right. Snowflake connector appears, which at the time of this writing is still in beta:
Hover over the connector tile and click Connect. The connect window will appear:
Fill the form with the data from Snowflake like I did above for our example. You will need to modify Account, User and Password with your unique data.
Account is the first part in the URL, before the .snowflakecomputing.com, so enter just that, without the .snowflakecomputing.com part.
If the default role will be used, just leave the Role field empty. If a specific role is needed to access your Snowflake data warehouse and/or your table.schema, enter it so that the connect works properly.
For Warehouse, write the warehouse name you created earlier.
Database Name.Schema should be populated with Database Name and Schema, separated only with a dot.
Click Activate and after a few seconds, the text “Successfully connected” should appear in a popup. If all went well, we can now test a simple query on the minimal table data we’ve created earlier. If you’re running into connections issues, I recommend testing your remote access to Snowflake using their command line client called SnowSQL or contact support, so we can help you.
Now, your Snowflake datasource isconnected and Databox will be able to communicate with your database.
3. Running a custom query
Next, I will show you how to write a custom query. Databox will run this query and display the data returned in a tabular format. There are a couple of things you need to ensure:
- The query returns a maximum of 1000 rows, LIMIT the query if you’re not sure.
- The query must contain a column named Date. If you need to, use AS in your SQL SELECT statement to satisfy this requirement.
Select Data Manager, then Query Builder, and click the New Query button:
Select your newly connected Snowflake datasource in the dropdown and enter your query now:
Click Run query and the results will appear at the bottom of the window:
Congratulations, you have now pulled data from your Snowflake database into Databox.
Save this custom query. Once you save your query, data will be fetched once an hour. In the last step, we will visualize the data using the Databox Designer.
3. Visualize Your Data with Databox Designer
Now that the database is connected, we will use the Designer to query, visualize and display the data in a format that’s most appropriate and useful for your needs:
Let’s go to the Databoards tab.
- Choose an existing Datacard or create a new one.
- Choose the Datablocks icon on the left
- For this example, we will display the data in a tabular format. To do this, drag & drop the Table block onto your Datacard.
Next, click on the newly created block on your Databoard and enter the properties on the right so it looks like this:
We have added our Snowflake data source, then we’ve chosen the Untitled Custom Query metric and added additional This Month and This Year intervals in Data ranges.
Our table now looks like this:
As you can see from the query we ran earlier, John Waters should have a larger value. Because the Table uses the Last value by default, we’ll need to modify this to SUM in Advanced settings.
Click on the little gear icon above the properties
The Advanced settings window is shown:
Choose the Data Range for This Year. Modify the Aggregation field to show SUM of all values for this year. Data in Data preview window below will update:
Sum for This Year interval is now correct. Let’s Save the modifications. Data in Databoard will update shortly to reflect our changes:
Well done! Your database is now connected to Databox, queries can be executed and then displayed on your mobile, big screen or your computer.
There’s much, much more you can do with your data in Databox now. You can cdd more custom queries, add blocks, explore different types of visualizations, set alerts, set and track goal performance, query your data in Slack, visualize your Snowflake data next to other datasources, etc, etc.
Ready to try it for yourself? Signup for free today at https://databox.com/signup and let us know how it went for you.
Remember: we’re always glad to help if you run into any obstacles!