This course introduces essential concepts and tasks using the Snowflake command-line client, SnowSQL.
Creating the required Snowflake objects (databases, tables, etc.) for storing and querying data.

1. Introduction to Snowflake

This section covers the key concepts and architecture of Snowflake.

1.1 Key Concepts of Snowflake

Snowflake's Data Cloud is built on a modern data platform delivered as Software-as-a-Service (SaaS). Snowflake offers data storage, processing, and analytic solutions that are faster, easier to use, and more flexible than traditional systems. The Snowflake data platform is not built on any existing database or "big data" software platform such as Hadoop. Instead, Snowflake combines a completely new SQL query engine with an innovative cloud-native architecture. Snowflake gives users all of the capabilities of an enterprise analytic database, along with several additional special features and capabilities.

Data Platform as a Cloud Service

Snowflake is a true software-as-a-service offering. More specifically:
Snowflake is built entirely on cloud infrastructure. All components of Snowflake's service (other than the optional command-line clients, drivers, and connectors) run on public cloud infrastructure. Snowflake's compute needs are met by virtual compute instances, and data is persisted in a storage service. Snowflake cannot run on private cloud environments (hosted or on-premises), and it is not a software package that a user installs: Snowflake handles all software installation and updates.

1.2 Architecture of Snowflake

Snowflake's architecture is a hybrid of traditional shared-disk and shared-nothing database architectures. Like shared-disk architectures, Snowflake uses a central data repository for persisted data, accessible from all compute nodes in the platform. Like shared-nothing architectures, however, Snowflake processes queries using MPP (massively parallel processing) compute clusters, where each node in the cluster stores a portion of the entire data set locally. This approach combines the data-management simplicity of a shared-disk architecture with the performance and scale-out benefits of a shared-nothing architecture.
The three major layers that make up Snowflake's unique architecture are:

- Database Storage
- Query Processing
- Cloud Services
Database Storage

When you load data into Snowflake, it is reorganized into an internally optimized, compressed, columnar format. Snowflake stores this optimized data in cloud storage. Snowflake manages all aspects of how the data is stored: file size, structure, compression, metadata, statistics, and other features. The data objects stored by Snowflake are not directly visible or accessible to customers; they are accessible only through SQL query operations run through Snowflake.

Query Processing

Query execution happens in the processing layer. Snowflake processes queries using "virtual warehouses". Each virtual warehouse is an MPP compute cluster composed of multiple compute nodes provided by a cloud provider and allocated by Snowflake. Each virtual warehouse is an independent compute cluster that shares no resources with other virtual warehouses, so the performance of one virtual warehouse has no impact on the performance of the others.

Cloud Services

The cloud services layer is a collection of services that coordinate activities across Snowflake. These services tie together all of Snowflake's components to process user requests, from login to query dispatch. The cloud services layer runs on compute instances that Snowflake provisions from the cloud provider. This layer manages services such as authentication, infrastructure management, metadata management, query parsing and optimization, and access control.
1.3 Connecting to Snowflake

Snowflake supports multiple ways of connecting to the service, including:

- The web-based user interface
- Command-line clients (e.g. SnowSQL)
- ODBC and JDBC drivers, usable from other applications
- Native connectors (e.g. for Python and Spark)
- Third-party connectors for applications such as ETL and BI tools
2. Supported Cloud Platforms

Snowflake is a Software-as-a-Service (SaaS) offering that is entirely hosted on cloud infrastructure. This means that all three layers of Snowflake's architecture (storage, compute, and cloud services) are deployed and managed on a single cloud platform. A Snowflake account can be hosted on any of the following cloud platforms:

- Amazon Web Services (AWS)
- Google Cloud Platform (GCP)
- Microsoft Azure
On each cloud platform, credit and data storage unit prices differ by region. More information on pricing for a given region and platform can be found on the pricing page on the Snowflake website.

Data Loading

Regardless of the cloud platform that hosts your Snowflake account, Snowflake can load data from files staged in any of the following locations:

- Internal (i.e. Snowflake) stages
- Amazon S3
- Google Cloud Storage
- Microsoft Azure blob storage
Snowflake supports both bulk (batch) data loading and continuous data loading (Snowpipe). Snowflake also lets you unload data from tables into any of the staging locations listed above.

HITRUST CSF Certification

This certification strengthens Snowflake's security posture for regulatory compliance and risk management, and it applies to Business Critical and higher Snowflake editions. The HITRUST CSF certification is supported in a set of AWS and Azure regions (listed in the Snowflake documentation) and will be supported in any new AWS and Azure regions. HITRUST CSF is not currently supported by Snowflake on Google Cloud Platform.
3. Snowflake Editions

Snowflake comes in several editions, allowing you to tailor your usage to your organization's specific needs. Each edition builds on the previous one by adding edition-specific features and/or higher levels of service. It is also easy to switch editions as your company's demands evolve.

3.1 Standard Edition

Standard Edition is the entry-level offering, providing full, unlimited access to all of Snowflake's standard features. It strikes a good balance between features, support, and cost.

3.2 Enterprise Edition

Enterprise Edition includes all of the features and services of Standard Edition, plus additional features designed for the needs of large-scale enterprises and organizations.

3.3 Business Critical Edition

Business Critical Edition, formerly known as Enterprise for Sensitive Data (ESD), offers even higher levels of data protection to support the needs of organizations with extremely sensitive data, particularly PHI data that must comply with HIPAA and HITRUST CSF regulations.
3.4 Virtual Private Snowflake (VPS)

Virtual Private Snowflake offers the highest level of security for organizations with the strictest requirements, such as financial institutions and other large enterprises that collect, analyze, and share highly sensitive data. It includes all of the features and services of Business Critical Edition, but runs in a completely separate Snowflake environment, isolated from all other Snowflake accounts (i.e. VPS accounts do not share any resources with accounts outside the VPS).

4. Prerequisites

To load and query data, you need a database, a table, and a virtual warehouse. Creating these Snowflake objects requires a Snowflake user with a role that has the necessary access control privileges. In addition, SnowSQL is required to execute the SQL statements in the tutorial. Finally, the tutorial requires CSV files containing sample data to load.
4.1 User and Permissions Requirements

Your Snowflake user must have a role that has been granted the privileges needed to create the database, table, and virtual warehouse used in this tutorial. If you do not yet have a Snowflake user, or if your user does not have the right role, contact one of your account or security administrators (users with the ACCOUNTADMIN or SECURITYADMIN role).

4.2 Installation of SnowSQL

The SnowSQL installer can be downloaded from the Snowflake Client Repository; no authentication is required. This version of the SnowSQL installer supports automatic patch updates. The "Installing SnowSQL" topic in the Snowflake documentation has more detailed instructions. To set up SnowSQL, follow these steps:

1. Open a terminal window.
2. Run curl to download the SnowSQL installer.

   AWS endpoint:

   $ curl -O https://sfc-repo.snowflakecomputing.com/snowsql/bootstrap/1.2/windows_x86_64/snowsql-1.2.17-windows_x86_64.msi

   Microsoft Azure endpoint:

   $ curl -O https://sfc-repo.azure.snowflakecomputing.com/snowsql/bootstrap/1.2/windows_x86_64/snowsql-1.2.17-windows_x86_64.msi

3. Run the downloaded installer (on Windows, open the .msi file).
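The endpoints above are for the Windows installer; on Linux the download follows the same pattern from the Snowflake Client Repository. A sketch, assuming the same 1.2.17 version (verify the current file name in the repository before running):

```shell
# Download the Linux SnowSQL installer from the AWS endpoint
# (file name pattern is an assumption based on the Windows example above).
curl -O https://sfc-repo.snowflakecomputing.com/snowsql/bootstrap/1.2/linux_x86_64/snowsql-1.2.17-linux_x86_64.bash

# Run the installer script
bash snowsql-1.2.17-linux_x86_64.bash
```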
4.3 Loading the Sample Data Files

Download the sample data files. Right-click the name of the archive file, getting-started.zip, and save the link/file to your local file system. The sample files can be unpacked anywhere, but we recommend using the location referenced in the tutorial examples. For Windows OS: C:\temp

The sample files contain five rows of dummy employee data in CSV format, with the comma (,) character as the field delimiter. Example record:

Althea,Featherstone,,"8172 Browning Street, Apt B",Calatrava,7/12/2017

5. Implementation Steps for Working with Snowflake

The following are the steps for working with Snowflake:
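Before staging the files, it can help to see how a CSV parser interprets the example record above: the third field (email) is empty, and the quoted address contains an embedded comma. A quick check in Python (illustrative, not part of the tutorial itself):

```python
import csv
import io

# The example record from the sample data file.
record = 'Althea,Featherstone,,"8172 Browning Street, Apt B",Calatrava,7/12/2017'

# Parse the record the way a CSV loader would.
row = next(csv.reader(io.StringIO(record)))
print(row)
# The quoted address survives as a single field, and the empty email
# field is preserved, giving six fields in total.
```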
Step 1: Log into SnowSQL

To log in after installing SnowSQL, follow these steps:
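As a sketch of the login step (the placeholder values are yours to fill in; SnowSQL prompts for your password after you run the command):

```shell
# Start SnowSQL with your account identifier and user name.
# <account_identifier> and <user_name> are placeholders, not real values.
snowsql -a <account_identifier> -u <user_name>
```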
Step 2: Create the Snowflake Objects

Before you can load data, you need a database and a table. This tutorial loads data into a table in the sf_tuts database. In addition, loading and querying data requires a virtual warehouse, which provides the necessary compute resources. If you have your own warehouse, you can use it; if not, this section includes a SQL command for creating an X-Small warehouse. Once you have finished the tutorial, you can drop these objects to remove them from your account.

Database Creation

Using the CREATE DATABASE command, create the sf_tuts database:

create or replace database sf_tuts;
Verify the database and schema currently in use for your session:

select current_database(), current_schema();

Table Creation

Using the CREATE TABLE command, create a table named emp_basic in sf_tuts.public:

create or replace table emp_basic (
    first_name string,
    last_name string,
    email string,
    streetaddress string,
    city string,
    start_date date
);

The number of columns in the table, their positions, and their data types correspond to the fields in the sample CSV data files that you will stage in the next step of this tutorial.

Virtual Warehouse Creation

Using the CREATE WAREHOUSE command, create an X-Small warehouse named sf_tuts_wh:

create or replace warehouse sf_tuts_wh with
    warehouse_size = 'X-SMALL'
    auto_suspend = 180
    auto_resume = true
    initially_suspended = true;
Also, note that the warehouse is now in use for your session. This information is displayed in your SnowSQL command prompt, and it can also be viewed using the context function:

select current_warehouse();

Step 3: Stage the Data Files

Snowflake can load data from files that have been staged internally (in Snowflake) or externally (in Amazon S3, Google Cloud Storage, or Microsoft Azure). If you already keep data files in one of these cloud storage services, loading from an external stage is convenient. In this tutorial, we upload (stage) the sample data files (downloaded in Prerequisites) to an internal table stage. The command used to stage files is PUT.

Staging the Files

Use PUT to upload the local data files to the table stage for the emp_basic table that you created. Because it refers to files in your local environment, the command is OS-specific. For Windows OS:

PUT file://C:\temp\employees0*.csv @sf_tuts.public.%emp_basic;

Let's take a closer look at the command:
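The tutorial shows the Windows form of the command. On Linux or macOS, the same upload uses a Unix-style file URI; a sketch, assuming the sample files were unpacked to /tmp:

```sql
-- Linux/macOS form of the PUT command (the /tmp path is an assumption;
-- adjust it to wherever you unpacked the sample files).
put file:///tmp/employees0*.csv @sf_tuts.public.%emp_basic;
```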
The output of the command lists the files that were staged. The TARGET_COMPRESSION column indicates that the PUT command compresses files with gzip by default.

Listing the Staged Files (Optional)

Using the LIST command, you can see the list of files you successfully staged:

LIST @sf_tuts.public.%emp_basic;

Step 4: Copy Data into the Target Table

To load the staged data into the target table, run COPY INTO. This command requires an active, running warehouse, which you created earlier in this tutorial. If you do not already have a warehouse, you will need to create one.

copy into emp_basic
  from @%emp_basic
  file_format = (type = csv field_optionally_enclosed_by = '"')
  pattern = '.*employees0[1-5].csv.gz'
  on_error = 'skip_file';
The COPY command also provides an option for validating files before you load them. See the COPY INTO topic and the other data loading tutorials for additional error checking and validation instructions. Snowflake returns the results of the load.

Step 5: Query the Loaded Data

You can query the data loaded in the emp_basic table using standard SQL, along with any supported functions and operators. You can also manipulate the data with standard DML commands, such as updating the loaded data or inserting additional rows.

Query All Data

Return all rows and columns in the table:

select * from emp_basic;

Insert Additional Data Rows

In addition to loading data from staged files, you can insert rows directly into a table with the INSERT DML command. For example, to insert two additional rows into the table:

insert into emp_basic values
    ('Clementine','Adamou','cadamou@sf_tuts.com','10510 Sachs Road','Klenak','2017-9-22'),
    ('Marlowe','De Anesy','madamouc@sf_tuts.com','36768 Northfield Plaza','Fangshan','2017-1-26');

Query Rows Based on Email Address

Using the LIKE operator, return a list of email addresses with United Kingdom top-level domains:

select email from emp_basic where email like '%.uk';

Query Rows Based on Start Date

To calculate when some employee benefits might begin, add 90 days to employee start dates using the DATEADD function. Filter the list to employees whose start date is on or before January 1, 2017:

select first_name, last_name, dateadd('day',90,start_date) from emp_basic where start_date <= '2017-01-01';

Step 6: Summary and Clean Up

Congratulations! You have completed this tutorial. Take a few minutes to review a short summary of its key points. You may also want to clean up by dropping the objects you created during the tutorial. See the other topics in the Snowflake Documentation if you want to learn more.

Summary of this Tutorial and Key Points

In summary, data loading is performed in two steps:

Step 1: Stage the files containing the data to be loaded.
The files can be staged internally (in Snowflake) or externally.

Step 2: Copy the data from the staged files into a target table. This step requires a running warehouse. You must also have an existing table into which the data from the files will be loaded.

When loading CSV files, keep the following points in mind:
The number, positions, and data types of the fields must match the columns of the target table; otherwise the records will not be loaded.

Tutorial Clean Up (Optional)

To return your system to its state before you began the tutorial, run the following DROP statements:

drop database if exists sf_tuts;
drop warehouse if exists sf_tuts_wh;
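As a cross-check on the DATEADD arithmetic used in Step 5, the same 90-day offset can be reproduced with ordinary date arithmetic. A minimal Python sketch, using the start date from the example CSV record (7/12/2017, in month/day/year format):

```python
from datetime import date, timedelta

# Start date from the example record: 7/12/2017 (month/day/year).
start_date = date(2017, 7, 12)

# Equivalent of Snowflake's dateadd('day', 90, start_date).
benefits_start = start_date + timedelta(days=90)
print(benefits_start)  # 2017-10-10
```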
Frequently Asked Questions

How do you create a Snowflake account?

Register for a trial account. During account configuration, choose an edition, then choose the cloud provider you want to use, and finally choose a region close to where your data is. That's it: behind the scenes, Snowflake sets up your account on the cloud provider of your choice.

Which cloud infrastructure provider became newly available as a platform for Snowflake accounts in 2020?

Google Cloud Platform. Snowflake brought its cloud-based data warehouse and analytics service to Google's cloud platform.
What are the names of the three Snowflake editions offered when signing up for a trial account?

When signing up for a trial account, you can choose between three Snowflake editions: Standard, Enterprise, and Business Critical. If you are a beginner, you can start with Standard; otherwise, choose the edition that matches your business requirements.
Which cloud infrastructure providers were available as platforms for Snowflake accounts in 2019?

A Snowflake account can be hosted on any of the following cloud platforms: Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure.