Time Travel in Snowflake

 

Snowflake time travel: snowflake time travel enables accessing historical data (i.e. Data that has been changed or deleted) at any point within a defined period.

·       It serves as a powerful tool for performing the following table.

·       Restoring data- related objects (tables, schemas, and database)that might have been accidentally or intentionally deleted.

·       duplicate and backing up data from key points in the past.

·       Analysing data usage/manipulation over specified period of time.

·       Time Travel Retention period is (1 to 90 days)

TIME TRAVEL ALLOWED :SELECT..AT|BEFORE….

                                                CLONE….AT|BEFORE….

                                                  UNDROP……..

CLONE:

·       Creates a copy of an existing object in the system. This command is primarily used for creating zero-copy clones of databases, schemas, and tables; however, it can also be used to quickly/easily create clones of other schema objects (i.e. external stages, file formats, and sequences).

·       The command is a variation of the object-specific CREATE <object> commands with the addition of the CLONE keyword.

SYNTAX TO CREATE CLONE:

·       Create or replace {database|schema|tables|stream} [if not exists]<object_name>clone<source_object_name>

[ {AT|BEFORE} ({TIMESTAMO=><timestamp>|OFFSET=><time_difference>|STATEMENT=><id})]

 

Using Time Travel, you can perform the following actions within a defined period of time:

·       query data in the past that has since been updated or deleted.

·       create clone of entire tables, schemas, and databases or before specific points in the past.

·       Restore tables, schemas, databases that have been dropped.

·       Once the defined period of time has elapsed, the data is moved into snowflake fail-safe and these actions can no longer be performed.

 

 

 

TIME TRAVEL SQL EXTENSION:

·       TO support time travel ,the following sql extensions have been implemented:

·       AT|BEFORE clause which can be specified in select statements and create…. clone commands (immediately after the object name).

·       To pinpoint the exact historical data, we have to use   TIMESTAM along with the AT\BEFORE extension.

·       OFFSET(time difference in seconds or minutes from the present time ex:1min,60 sec).

·       STATEMENT/QUERY ID(before delete the statement we can revert the data ,so that we can use queryid)

·       UNDROP: undrop command for tables, schemas, and databases

If we delete accidentally or intentionally we can revert back the table, schema and databases we can use undrop command.

SNOWFLAEK DATA LIFECYCLE WITH TIME TRAVEL AS SHOWN BELOW



 

Method-1:By using OFFSET key:

Here we can call database &schema as shown below.

Database and Schema Creation:

use database my_first_db;

use schema public;


Here we can create a table in the name of XYZ as shown below.

Table Creation:

create or replace table xyz(

  id number,

fullname varchar,

gender varchar

);



Here we can insert the records first 3 records I have to insert at a time later I have inserted the record 4 as shown below

Inserting Records:

insert into xyz(id,fullname,gender)

values(1,'abc','male'),

(2,'bcd','female')

insert into xyz (id,fullname,gender)

values(3,'aaa','f');

insert into xyz (id,fullname,gender)

values(4,'aaa','m')

select * from xyz;


 




 

Here I have to perform OFFSET for 1min before I want to fetch the data which is present in xyz table as shown below

By Using OFFSET :

select * from xyz at(offset => -60+1);




Method-2:By using TIMESTAMP:

step-1:
create database Timetravel;
use schema public;

Step-2:Table Creation
create or replace table data(
  id number,
name varchar,
emailid varchar
);

Step-3:
By selecting creating file format on snowflake console and name it nd select header =1 as shown below

Step-4:
Select load data into snowflake console then select the csv file in local as shown below

Step-5:Then,select the file format type ,which we given in the step 3 ,then load the file into it

Step-6:Here we can see the how many rows have in a file and how many rows are loaded as  shown below           


Step-7:Here the file has been loaded  as shown below                                                                                                                    

Step-8:Here we can fetch the loaded data into it  by using select query          select * from data;                                                                                                                                             





Step-9:select the current timestamp.
select current_timestamp;



Step-10:Here we can perform update command ,by using update command we can update the email id column as shown below.
syntax:
UPDATE DATA SET EMAILID='ABC@123';

Step-11:select timestamp here.
syntax: select current_timestamp;


Step-12:Here we can fetch the distinct emailid ,previous step we have to update the emailid so that we can check data has been updated or not.

Showcase the after update command and by using timestamp we can check 
SELECT DISTINCT emailid 
FROM data AT
(TIMESTAMP => '2022-01-24 05:30:30.740'::timestamp_tz);

Step-13:Here we can check the entire table with update column as shown below 

Showcase the data by using update statement execute and run the timestamp after updating query runs as shown below

SELECT * 
FROM data AT
(TIMESTAMP => '2022-01-24 05:30:30.740'::timestamp_tz);


Step-14:Here we can find the original table with timestamp after loading the data and select command runs.
Here we can revert the original data ,after updating the emailid ,by using timestamp we can getback original data as  shown below
SELECT *
FROM data AT
(TIMESTAMP => '2022-01-24 05:29:38.126'::timestamp_tz);


Method-3 :BY USING QUERY ID
Step-1:
Table Creation:
create or replace table data(
  id number,
name varchar,
emailid varchar
);

Step-2:load the file into snowflake console 


Step-3:
reterive the records which are loaded as shown below.     
syntax:
select * from data;
                                     


Step-4:copying the query id which we got the data in previous step.



Step-5 :
Delete the records(accidentally or intentionally)
syntax:
delete table data;

Step-6:Showcase the table present or not ,after performing delete operation
Syntax:
select*from data


Step-7:Here we can perform the deleted data ,we can revert by using queryid as shown below.
Syntax:
SELECT * FROM DATA BEFORE (STATEMENT => '01a1d843-0000-6b2e-0001-7e0a00016a36');


Step-8:Here is the output for original records as shown below
Syntax:
select * from data;




Data Retention Period:  A key component of snowflake Time Travel is the data retention period.

·       when data in a table is modified, including deletion of data or dropping an object containing data, snowflake preserves the state of the data before the update.

·       The data retention period specifies the number of days for which this historical data is preserved and, therefore, time travel operations (SELECT, CREATE…CLONE…UNDROP) can be performed on the data.

·       The standard retention period is 1 day (24 hours) and is automatically enables for all snowflake accounts.

·       for snowflake standard edition, the retention period can be set to 0(or unset back to the default of 1 day) at the account and object level.


For snowflake enterprise Edition (and higher):

·       for transient databases, schemas and tables, the retention period can be set to 0(or unset back to the default of 1 day).the same is also true for temporary table transient table is set to 0.

·       for Permanent databases, schemas and tables, the retention period can be set to any value from 0 up to 90 days.

·       When the retention period ends for an object, the historical data is moved into snowflake fail-safe.

Enabling and Disabling Time Travel:

·       No tasks are required to enable time travel.it is automatically enabled with the standard 1-day retention period.

·       Time travel cannot be disabled for an account; however, it can be disabled for individual databases, schemas, and table by specifying DATA_RETENSION_TIME_IN_DAYS with a value of 0 for the object.

·       Users must have ACCOUNTADMIN Role can set DATA_RETENSION_TIME_IN_DAYS to 0 at the account level, which means that all databases (and subsequently all schemas and tables) created in the account have no retention period by default.

·       This default can be overridden at any time for any database, schema or table.

·       As a general rule, snowflake recommend maintaining a value of (at least) 1 day for any given object.


Comments

Popular posts from this blog

Snowflake Data Unloading into Json Format

Loading JSON data from Amazon S3 into Snowflake Using Copy Command