WEBVTT 00:00.000 --> 00:16.000 All right, everybody, we're going to get started. 00:16.000 --> 00:18.000 Please be quiet if you want to talk. 00:18.000 --> 00:20.000 Do it outside. 00:20.000 --> 00:24.000 Be respectful of the speakers, please. 00:24.000 --> 00:29.000 Hello, everybody, please be quiet because Charlie's going to give us a run-down 00:29.000 --> 00:34.000 of what happens under the hood when you actually execute an SQL select statement. 00:34.000 --> 00:35.000 Hey, the way. 00:35.000 --> 00:38.000 Thank you. 00:38.000 --> 00:39.000 Hey, everybody. 00:39.000 --> 00:43.000 It's just one better or the other one better. 00:43.000 --> 00:44.000 The other one? 00:44.000 --> 00:46.000 Okay. 00:46.000 --> 00:48.000 Okay, it's similar to that. 00:48.000 --> 00:49.000 I'm possible, right? 00:49.000 --> 00:53.000 So they think that they's going to gamble with your wish. 00:53.000 --> 00:56.000 The planer is going to do some bats. 00:56.000 --> 00:58.000 It's going to get some informations. 00:58.000 --> 01:05.000 It's going to get some coins and going to bat with your wish. 01:05.000 --> 01:10.000 And then the executor at the end is going to pay for the bill. 01:10.000 --> 01:13.000 Will it pay a smaller or a larger bill? 01:13.000 --> 01:14.000 How does that work? 01:14.000 --> 01:17.000 This is what we're going to see today. 01:17.000 --> 01:20.000 So this is our example bat. 01:20.000 --> 01:23.000 We're going to run this select. 01:23.000 --> 01:30.000 If you see we have four tables, three joints, two filters, and one soft limit. 01:30.000 --> 01:35.000 It's not the very simplistic select, but it's not the very complex select either. 01:35.000 --> 01:40.000 So it's something that you can use to see how that's going to play together, right? 01:40.000 --> 01:46.000 So when you send this to the database, this was supposed to be an animation. 01:46.000 --> 01:48.000 See? 01:48.000 --> 01:50.000 Okay, let's run again. 01:50.000 --> 01:56.000 So when you send this to the database, it goes to the pass, then to the analyzer, 01:56.000 --> 01:59.000 re-writer, planer, executor, and then the result. 01:59.000 --> 02:05.000 So help each because you're going to see each of them going to do here. 02:05.000 --> 02:08.000 So everything starts from the pass. 02:08.000 --> 02:14.000 So the pass is the guy that's going to get your query and see if everything is in order. 02:14.000 --> 02:16.000 It doesn't care if the table exists now. 02:16.000 --> 02:19.000 It doesn't care if the columns exist or not. 02:19.000 --> 02:24.000 But it cares if it has an order, just like when you speak an English. 02:24.000 --> 02:28.000 When you say, I would like to, you say, I would like to. 02:28.000 --> 02:31.000 You don't say, like would I do to do something, right? 02:31.000 --> 02:33.000 So it has order. 02:33.000 --> 02:39.000 It's need to say if you write front, not for, or M, F, or R. 02:39.000 --> 02:41.000 So that's what I'm going to do. 02:41.000 --> 02:44.000 And then it's going to create a tree. 02:45.000 --> 02:48.000 Just like the one that we see here, this is the pass in three. 02:48.000 --> 02:55.000 So this pass in three, going to follow our curve along the way, because it's going to be re-use it layer. 02:55.000 --> 03:00.000 This is the starting of the internal language of the database. 03:00.000 --> 03:04.000 We're going to use force with an example here, but that happens in almost all the database. 03:04.000 --> 03:07.000 At least the ones that I know. 03:07.000 --> 03:12.000 So after that, we need to do a semantics analysis. 03:12.000 --> 03:16.000 Okay, now we really care if they will exist or not. 03:16.000 --> 03:19.000 So we need to check if those tables exist. 03:19.000 --> 03:22.000 We need to check if the comparison that we're doing there, 03:22.000 --> 03:27.000 we're using the right operator for the correct times. 03:27.000 --> 03:33.000 Sometimes you want to do a select where something was equals something else. 03:33.000 --> 03:35.000 But we're comparing different types. 03:35.000 --> 03:39.000 We're comparing an integer with a date or something else. 03:39.000 --> 03:44.000 That depending on your database and the syntax is it's not allowed. 03:44.000 --> 03:53.000 Then you need to do either change the types or you need to do an in place change for them. 03:53.000 --> 03:57.000 So this is what this guy is going to do. 03:57.000 --> 04:00.000 So and then it create an audit tree. 04:00.000 --> 04:06.000 If we use the tree that we sent before, a pass tree, it's going to create a query tree. 04:06.000 --> 04:12.000 So these are those where you could rewrite some inserts inside of the bills. 04:12.000 --> 04:18.000 So these are all of those things that are going to be to change it here. 04:18.000 --> 04:24.000 All right, now we have most of the things that we need to execute. 04:24.000 --> 04:26.000 We have a plan now. 04:26.000 --> 04:33.000 So but before we execute, we need to decide how we're going to execute. 04:33.000 --> 04:37.000 We know which are the tables we want to select. 04:37.000 --> 04:43.000 We know which are the roles we want to select, the kind and no because we have the referrals. 04:43.000 --> 04:47.000 And you know the max number of roles we want to get. 04:47.000 --> 04:49.000 Remember we have a limit 100 here. 04:49.000 --> 04:54.000 So we want to cap our result when it gets to 100 roles. 04:54.000 --> 04:55.000 We don't want more. 04:55.000 --> 04:58.000 We may get less, but we don't want more of those ones. 04:58.000 --> 05:01.000 So this is the information that the database has. 05:01.000 --> 05:04.000 But now it needs to make a few of decisions. 05:04.000 --> 05:07.000 So the first decision that it needs to do. 05:07.000 --> 05:10.000 Here is how to scan. 05:10.000 --> 05:12.000 Then how to join. 05:12.000 --> 05:15.000 And the order and if you do some part of the list. 05:15.000 --> 05:19.000 And to do those things to make those decisions. 05:19.000 --> 05:23.000 This is where come the odds for the our batch. 05:23.000 --> 05:26.000 The decisions that are made by based on course. 05:26.000 --> 05:28.000 Remember we want to pay the lesser bill. 05:28.000 --> 05:29.000 The lesser value. 05:29.000 --> 05:32.000 The lesser money to run our select. 05:32.000 --> 05:34.000 But how do we calculate that course? 05:34.000 --> 05:36.000 How do they calculate that course? 05:36.000 --> 05:41.000 So we have some unities that the database is going to use. 05:41.000 --> 05:46.000 And some of them, how do we get the data? 05:46.000 --> 05:48.000 Do we get the data sequentially? 05:48.000 --> 05:54.000 Or do we get the data in a random way inside of the disk? 05:54.000 --> 05:56.000 Why does it matter? 05:56.000 --> 06:00.000 So in all days when a user to use is speeding this. 06:00.000 --> 06:05.000 Get in the data sequentially is a lot faster than randomly. 06:05.000 --> 06:07.000 Because the speed in the disk. 06:07.000 --> 06:10.000 The head of the disk for one place or another took a lot of time. 06:10.000 --> 06:12.000 And if you install your course risk well right now. 06:12.000 --> 06:14.000 And you go for the configuration. 06:14.000 --> 06:16.000 You see this parameter here. 06:16.000 --> 06:18.000 Random page cost equals four. 06:18.000 --> 06:23.000 It's telling the database that if you need to take a random page. 06:23.000 --> 06:29.000 It's going to cost four times a stake in one sequential page. 06:29.000 --> 06:32.000 Or give it, why does it matter? 06:32.000 --> 06:37.000 Well, it matters because your index is basically random access on the database. 06:37.000 --> 06:42.000 And your table, full table scan is sequential access on the database. 06:42.000 --> 06:46.000 So if you have a table and in that table have an index. 06:46.000 --> 06:51.000 And if the number of roles compared that's read from the index on the table is similar 06:51.000 --> 06:56.000 or four times lesser the database is not going to use your index. 06:56.000 --> 06:58.000 It's going to do a full table scan. 06:58.000 --> 07:02.000 And that's why sometimes you have this really nice index here in my table. 07:02.000 --> 07:05.000 And if it is a lack, it's not using my index. 07:05.000 --> 07:07.000 Because it's expensive to the database. 07:07.000 --> 07:10.000 That's one of the reasons, but they only one. 07:10.000 --> 07:12.000 So we have all the parameters as well. 07:12.000 --> 07:15.000 So we have the CPU couple costs. 07:15.000 --> 07:20.000 That is how much going to cost to the database to process each role that you're going to select. 07:20.000 --> 07:25.000 So when you have these outdated statistics, those numbers might be wrong. 07:25.000 --> 07:29.000 And then you might be having full table scans instead of index scans. 07:29.000 --> 07:34.000 Not because it's cheaper, but because the polar things are believes it's cheaper. 07:34.000 --> 07:36.000 Based on the statistics, it gets. 07:36.000 --> 07:37.000 Right? 07:37.000 --> 07:39.000 We need to be careful on that one. 07:39.000 --> 07:43.000 So this is having an example here of a planning. 07:43.000 --> 07:48.000 This is what you get when you do an x-playing on post-grace quo. 07:48.000 --> 07:52.000 It's going to try to see, based on what you have, all the cost. 07:52.000 --> 07:54.000 What are the cost for my sequential scan? 07:54.000 --> 07:57.000 They index this scan, and so on and so forth. 07:57.000 --> 08:04.000 So we have the information that we put here and how it used on the explain. 08:04.000 --> 08:05.000 Right? 08:05.000 --> 08:08.000 And an example. 08:08.000 --> 08:11.000 And if you are familiar with post-grace quo, 08:11.000 --> 08:14.000 we have all the database they do the same as well. 08:14.000 --> 08:16.000 We have a few different types of explain. 08:16.000 --> 08:18.000 We have the explain and explain analyze. 08:18.000 --> 08:22.000 The explain is what the planer believes it's going to do. 08:22.000 --> 08:28.000 And the analyze it actually runs and shows what exactly it does. 08:28.000 --> 08:31.000 And this is the build that you're going to pay. 08:31.000 --> 08:34.000 Not the build that we believe we're going to pay. 08:34.000 --> 08:38.000 So on this way, we have to look at different things. 08:38.000 --> 08:40.000 Okay, sequential scan. 08:40.000 --> 08:44.000 Why or how we usually run sequential scans? 08:44.000 --> 08:45.000 Well, there's a matter. 08:45.000 --> 08:51.000 If you have a table that is too small or if you're going to get a lot more data from the table, 08:51.000 --> 08:56.000 compared to the total amount of the table, it's going to do a full table scan. 08:56.000 --> 08:58.000 Because it's cheaper. 08:58.000 --> 09:06.000 Remember, when we do a index scan, it needs to go to the index which is a random search. 09:06.000 --> 09:11.000 Again, the information then is still need to get the data from the table. 09:11.000 --> 09:14.000 Unless we do what we call an index only scan. 09:14.000 --> 09:18.000 If all the information you want is on the index, that's fine. 09:18.000 --> 09:19.000 There's a need to go to the table. 09:19.000 --> 09:24.000 For example, if you want, if you have an index that has the names. 09:24.000 --> 09:27.000 And they won't only to get the name, so all that information.