>> Change data capture in SQL Server has been a very popular feature for a very long time. In this episode, learn what it is, and learn what's new in Azure SQL Managed Instance and Azure SQL Database. This week on Data Exposed.
[MUSIC]
Hi. I'm Anna Hoffman, and welcome to this episode of Data Exposed. Today, I'm joined by Mara Steiu, a Program Manager on the Azure SQL team. Mara, thanks so much for joining us today. Can you start by telling us a little bit about what you do?
>> Hi Anna. Thanks for having me. I'm a Program Manager on the Azure SQL team at Microsoft, and I'm focused on data replication technologies, such as change data capture, change tracking, and SQL Data Sync.
>> Awesome. Well, it seems like you're the exact right person for us to have on the show to talk more about change data capture. Now, this is something that a lot of SQL people might be familiar with, but some people might also be new to this topic. I'm fairly new to CDC, or change data capture. I'd also like to understand how it differs between SQL Server, Azure SQL Managed Instance, and Azure SQL Database. Mara, can you take us through what CDC is and how it works?
>> Sure. At a very high level, CDC tracks and records the DML changes (inserts, updates, and deletes) that happen on your tables. More specifically, the way it works is that first you have to enable CDC on your database, and then you enable CDC on each table that you want to track for DML changes. Once you do that, for each source table that you enable, you have an associated CDC table created, in which you'll be able to see the data changes. CDC consists of two main processes. There is a capture process, which takes the changes once they get to the transaction log and puts them in the CDC table. There is also a cleanup process, which cleans up the CDC tables based on a retention policy. Lastly, you have some query functions that consume the data changes from the CDC table.
You were asking me how CDC differs across Azure SQL MI, SQL Server, and Azure SQL DB. In SQL Server and on Azure SQL MI, these two capture and cleanup processes run as SQL Server Agent jobs. However, [inaudible] in Azure SQL databases. In Azure SQL databases, these jobs are replaced by a CDC scheduler. The scheduler runs cleanup and capture automatically. However, customers still have the option to manually execute scan or cleanup, but you can safely assume that it is being done in the background by the scheduler. Apart from the scheduler, there is not much that differs functionally for customers across the different deployment types that we have for CDC.
>> That makes sense.
>> Yeah.
>> It's mostly like we are taking the same exact technology, but we had to add something to take the place of SQL Server Agent, since SQL Server Agent isn't available in Azure SQL Database. It makes sense; it's the same technology apart from that. Mara, I have another question for you. What are some use cases you're seeing people use CDC for?
>> Yeah. We see people using CDC for several purposes. However, some of the most common scenarios that we've seen are around auditing purposes and analytics, so taking data from the change tables to run audits and analytics. Then we see many customers sending changes to other downstream subscribers, or sending them from the OLTP system to the data lake or data store. Here we see, for instance, Azure Data Factory being used a lot to move change data from the change tables to other consumers. Lastly, we also see event-based programming scenarios: having your applications instantly respond to the changes in your data. One example of these might be dynamic pricing. With dynamic product pricing, you price based on changes in your inventory. You can achieve that. These would be some of the key scenarios that we've seen CDC used for.
>> Awesome. I especially like that last one.
Because instead of having to poll or whatever, I could just instantaneously react to changes in my data. That's pretty cool. Now, if I'm a customer and I'm like, oh, this is awesome, I want to start using it, how do I go about enabling or disabling it?
>> Yeah. That's a great question. As I was mentioning, first you have to enable CDC at the database level. Here, you run this procedure that I show you here, and we'll also do a demo later on to see how it works in practice. Once you enable CDC at the database level, these five tables that you can see here get created in the system tables on your database, so it's important to monitor your space utilization closely, because all these CDC artifacts are being stored in the same source database. Once you enable CDC at the database level, the CDC schema obviously gets created as well. You can see in these tables what the captured columns are, what the change tables are that were enabled for CDC, and what the DDL history is. Here you can see all the DDL changes that have happened since the capture instance was created. However, it's worth noting that once you make a DDL change on your source table, that does not automatically get reflected in your change table. You have to recreate your capture instance. That's something we constantly see customers struggling with. Then you have the index columns and the log sequence number time mapping. This is all the data you can see in these tables. You also have to enable CDC at the table level. Once you do that, you can see the change table being created and the cdc_jobs table. This is for Azure SQL databases only. I should mention that in SQL Server, sorry for interrupting, you also have to make sure that SQL Server Agent is running for these jobs to start once you enable CDC at the table level.
>> Got it.
This seems pretty straightforward, and I like the caveat again, calling out SQL Server Agent versus what the CDC scheduler is going to do. Now Mara, during this, you mentioned a few things to be aware of. Is there anything else that people should be aware of when they enable this in Azure SQL Database?
>> Yes. We always get a lot of questions around performance considerations for CDC in general, and it's very hard to offer specific benchmarks or to predict performance, just because it depends so much on every user's workload. However, there are some key common factors that users should be aware of in general. You should be aware of the number of CDC-enabled tables, the frequency of changes in those tracked tables, and the space available in your source database, because, as I was mentioning, all the CDC artifacts are stored in the same source database, so it's good to be aware of that before you start using CDC. You should also be aware of whether you are using an elastic pool or a single database. In an elastic pool, you must also be aware of how many databases are enabled for CDC, apart from the number of tables, just because those databases share common resources such as disk space. The more databases you enable within that pool, the higher the chance of reaching the limit of your elastic pool disk space, for example. You should be aware of these differences, and in general, we advise customers to look out for CPU usage, memory, and log throughput.
>> Awesome, very cool. This is good to know, and great tips for people getting started. Speaking of getting started and taking a look at this thing, I understand you prepared something to show us, actually seeing CDC in action.
>> Yes. I prepared a short demo on how CDC works on an Azure SQL database. This is a general-purpose database. Before we get started, something else that I wanted to mention is that CDC in Azure SQL databases works on tiers higher than Standard 3.
For instance, if you have a Basic tier database in Azure SQL DB, unfortunately, you can't use CDC. That's one more thing to be aware of before you get started with CDC in Azure SQL DB. Let's get started with the demo. Here I have a simple database with some customers. It's only one customer so far, myself, and I want to enable CDC on this database. As you can see, if I refresh here and go into the system tables, the system tables get created, the ones that we just talked about, along with the CDC schema. This takes only a moment, and you can see the list of captured columns. Once you enable CDC at the table level, you can select which columns to track. The change tables: these are obviously empty as of now; I'm just showing how they look. DDL history, index columns, and log sequence number time mapping. Then we're going to go to our source table that we want to enable for CDC. As I was mentioning, it only has one single customer as of now, and I want to enable CDC at the table level as well. Here you have some additional parameters, which I'm not using right now. You can enable net changes tracking; by default you get all changes, but you can also support net changes when you enable tables, and you can select the columns that you want to track. Right now, you can see the change table has been created once we enable CDC at the table level, and this change table is empty, as expected. They all show up here, the change table and the CDC jobs. The cdc_jobs table shows you the default parameters, which you can change, like the retention policy that I was mentioning, threshold, and so on. Now let's insert some data into this table. I'm adding Anna as a new customer, and you can see the scheduler has automatically, very quickly, detected this DML change, so Anna has already been added to the change table. However, you can still run scan manually, as you can see here.
Scan still works manually, as it used to on SQL Server and Azure SQL MI. Here we can change the retention for the cleanup policy. Let's say we want to clean up the change table, for instance, every 30 seconds. We're going to do this, and if we go back to the cdc_jobs table, which got created after we enabled the table, you can see that the retention has been changed to 30 seconds. The customer can configure this, which is very convenient. Let's say now we're done and we want to disable CDC. I'm looking at all the system tables, and here I can see all my tables and CDC artifacts. If I forget which ones are enabled for CDC (in my case, I only have one, so it's quite easy), I can easily see here is_tracked_by_cdc: the 7th one, which is the customer table. I disable the table, and then I disable the database for CDC. Once you do that, all the CDC artifacts are automatically removed and, as expected, they go away. That's pretty much it. That's a very simple demo of how CDC works in Azure SQL databases. Here you can see the system tables section is empty.
>> Cool. Mara, this is a great demo. I love how you took us through the whole story of how do I enable this at the database level and at the table level, and how do I see it in action. I didn't even know I was going to be a part of this demo; it's cool to get added to your customers table. As we close out, Mara, do you have any tips for our viewers, whether they're just getting started or they're very familiar with this? Any tips on how they get the most out of this, or [inaudible], or anything else?
>> Definitely. As I was saying, we just very recently released the public preview of CDC in Azure SQL databases, so we have a bi-weekly blog series that you can keep track of at this link. As I was mentioning, one of the most common scenarios for using CDC is to send data changes to other downstream consumers.
In this series of blogs, we're going to show you every other week how to use Azure Data Factory and other cool technologies in integration with CDC to send your change data to other consumers. Also, here are some links for CDC in general in SQL Server, Azure SQL MI, and Azure SQL DB: how you can enable and disable it, how it works with other features, how to administer and monitor it, [inaudible], and so on.
>> Awesome, thanks so much, Mara. For our viewers, we're going to put all of those links in the description, so be sure to check the description and check out these resources. Thanks again, Mara, so much for coming on the show. We really appreciate it. I learned a lot, and I think our viewers learned a lot too. For our viewers, if you like this video, go ahead and give it a like, leave us a comment, and let us know what you're doing with CDC or what you learned in this video. We hope to see you next time on Data Exposed.
[MUSIC]
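
For viewers following along, the steps walked through in the demo can be sketched in T-SQL roughly as follows. This is a minimal sketch, not the exact script shown on screen: the table name dbo.Customers and the capture instance name dbo_Customers (the default schema_table form) are assumptions for illustration, and parameter values should be checked against the current CDC documentation before use.

```sql
-- Assumption: run inside the Azure SQL database you want to track.
-- dbo.Customers is a hypothetical source table used only for illustration.

-- 1. Enable CDC at the database level (creates the cdc schema and system tables).
EXEC sys.sp_cdc_enable_db;

-- 2. Enable CDC on the source table.
--    @role_name = NULL means access to change data is not gated by a role.
EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'Customers',
    @role_name     = NULL;

-- 3. Make a DML change, then look at the change table.
--    With default naming the change table is cdc.<capture_instance>_CT.
SELECT * FROM cdc.dbo_Customers_CT;

-- 4. Optionally run a capture scan manually.
EXEC sys.sp_cdc_scan;

-- 5. Change the cleanup retention, then inspect the job settings.
--    Note: @retention is documented in minutes.
EXEC sys.sp_cdc_change_job
    @job_type  = N'cleanup',
    @retention = 30;
SELECT * FROM cdc.cdc_jobs;

-- 6. Check which tables are tracked before disabling.
SELECT name, is_tracked_by_cdc FROM sys.tables;

-- 7. Disable CDC on the table, then on the database (removes the CDC artifacts).
EXEC sys.sp_cdc_disable_table
    @source_schema    = N'dbo',
    @source_name      = N'Customers',
    @capture_instance = N'dbo_Customers';
EXEC sys.sp_cdc_disable_db;
```

In SQL Server and Azure SQL Managed Instance the same procedures apply, but capture and cleanup run as SQL Server Agent jobs rather than through the scheduler, as discussed in the episode.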
