Author: Hoang Anh Tuan
Step into my SQL Data Engineering portfolio—an anthology of projects unveiling my finesse in sculpting data landscapes. From intricate models to streamlined processes, witness how I leverage SQL to empower insights and drive transformative solutions in the dynamic world of data engineering
To highlight my proficiency in SQL commands, I have actively engaged in numerous projects that require extensive use of SQL. The primary coding platform for my SQL work is SQL Server, where I have accumulated several years of experience. This section of my portfolio serves as a comprehensive representation of my expertise, covering all aspects of SQL that I have mastered over the years.
Outlined below are the particulars of the pivotal project I have developed:
Note: Taking advantage of the SQL Server's capability to import and open .bak files, I strategically used this feature by employing the AdventureWorks 2022 Database to create a significant project. This project serves as a comprehensive showcase, including a range of SQL commands from fundamental to advanced levels, covering intricate aspects such as Dynamic SQL and user functions.
Technology: SQL - SQL Server
Description: This initiative utilizes the AdventureWorks 2022 Database as the foundational data source for knowledge acquisition. It is compartmentalized into 20 distinct sections, each designed to cover specific SQL concepts. Within each section, there are 50 SQL requirements, each articulated in a concise format ranging from a single line to a maximum of 50 lines, depending on the complexity of the task. The complexity of requirements intensifies progressively, ensuring comprehensive coverage from foundational to advanced levels. Each requirement is uniquely numbered from 1 to 1000, incorporating distinctive elements to prevent replication. The following is an exhaustive list detailing each segment of this comprehensive project, along with their respective specifics.
- URL: Part 1 - Getting used to beginner level of SQL commands
- Description: This section serves as a foundation for the entire project, focusing on elementary SQL commands like SELECT, WHERE, LIKE, and ORDER BY. Challenges here are concise, typically solvable within five lines of code. While predominantly straightforward, a few tasks introduce beginners to slightly more complex queries involving multiple subqueries and/or CTEs, offering a gentle transition to more intricate SQL operations.
- Length: 50 Problems - 314 Lines
- Main Knowledge: TOP, Subqueries, CTE, Temporary Tables, DATEPART, DATEDIFF, GROUP BY, Aggregate Functions
- URL: Part 2 - Diving Deeper into Data Manipulation
- Description: Part 2 marks an elevation in SQL proficiency, delving into an intermediate level of database manipulation. Explore a diverse array of challenges featuring a blend of SQL commands, including GROUP BY, JOIN, HAVING, and complex conditional queries. Tasks vary in complexity, from analyzing data distributions to manipulating multiple datasets through joins and aggregations. This segment aims to solidify foundational knowledge while introducing nuanced SQL operations, laying the groundwork for advanced data querying and analysis.
- Length: 50 Problems - 392 Lines
- Main Knowledge: Subqueries, CTE, Multi-CTE, Temporary tables, DATEPART, DATEDIFF, ROUND, CAST AS Data-type
- URL: Part 3 - SQL Metamorphosis: Crafting with REPLACE, CAST, and NTILE
- Description: Part 3 delves into intermediate SQL operations, building on foundational concepts. It explores various functionalities like REPLACE, Common Table Expressions (CTEs), and conditional operations using CASE/WHEN statements. The tasks in this section involve grouping data using NTILE, conditional aggregations, and creating calculated columns based on specific conditions. These exercises challenge learners with tasks like analyzing credit ratings, implementing conditional logic for grouping data, and performing segmented computations. While not overly complex, this section presents a bridge between basic and more intricate SQL operations, offering learners a chance to expand their skills within a comfortable yet engaging environment.
- Length: 50 Problems - 556 Lines
- Main Knowledge: CTE, Multi-CTE, CAST, CASE/WHEN, NTILE, NEWID (Shuffle Random Order), REPLACE
- URL: Part 4 - SQL Shapeshifters: Introduction to Data Ranking and Splitting
- Description: This set of exercises delves into multifaceted queries leveraging a combination of SQL features. It involves filtering and extracting specific data sets based on various conditions using different SQL techniques like JOIN, EXISTS, and complex calculations involving date functions, mathematical operations, and conditional statements. Tasks range from utilizing JOIN operations to derive specific information about employees, employing EXISTS for complex filtering, to utilizing PIVOT for aggregating data based on different criteria. The use of CTEs and window functions like RANK(), PARTITION BY, and NTILE() adds complexity by manipulating data partitions and performing ranking operations.
- Length: 50 Problems - 738 Lines
- Main Knowledge: CTE, Multi-CTE, Subqueries, EXISTS, PIVOT, CASE/WHEN, RANK, PARTITION BY, NTILE
- URL: Part 5 - Introduction to Functions: Learning about Table-Returned Functions
- Description: Part 5 introduces the concept of SQL functions returning tables. These functions are designed to retrieve specific sets of data from the employee database based on various input parameters. They employ a range of SQL techniques such as string manipulation, numeric comparisons, and conditional filtering to extract information. The functions are diverse, tackling different aspects of data retrieval. These functions offer a deeper understanding of how to use SQL functions to create tailored queries, leveraging parameters to extract precise datasets. This set of exercises demonstrates the versatility of SQL functions in isolating targeted information from complex databases.
- Length: 50 Problems - 957 Lines
- Main Knowledge: FUNCTIONS Return Tables
- URL: Part 6 - Dynamic SQL and Stored Procedures
- Description: In this section, the focus shifts towards leveraging the power of dynamic SQL, stored procedures, and variable manipulation within SQL Server. The exercises delve into the realm of dynamic SQL, allowing for the creation and execution of SQL statements at runtime. Participants will engage with stored procedures, a powerful feature that encapsulates SQL logic for reuse and efficient management. The exercises showcase the usage of DECLARE and SET to handle variables within SQL Server procedures. These variables allow for the storage and manipulation of values, enabling dynamic control over queries and stored procedure behavior. Additionally, the concept of EXEC (Execute) is introduced, demonstrating its use to run dynamic SQL statements or execute stored procedures within other procedures. The tasks within this part emphasize the importance and utility of dynamic SQL in constructing adaptable and flexible queries, especially in scenarios where SQL statements need to be generated based on varying conditions or user inputs.
- Length: 50 Problems - 761 Lines
- Main Knowledge: Dynamic SQL, Stored Procedures, DECLARE, SET Variables, EXEC
- URL: Part 7 - Advanced Procedures and Function Problems
- Description: Part 7 explores SQL functions' power to return tables or specific values and delves into dynamic SQL concepts. It introduces COALESCE for handling NULL values effectively and demonstrates TRY_CAST's safe data type conversion. Additionally, it covers CROSS APPLY, enabling the combination of rows from multiple tables. These exercises deepen understanding of functions' versatility, NULL value management, safe type conversions, and advanced querying techniques like CROSS APPLY.
- Length: 50 Problems - 1051 Lines
- Main Knowledge: FUNCTIONS Return Tables/Values, Dynamic SQL, COALESCE, TRY_CAST, CROSS APPLY
- URL: Part 8 - The Art of SQL Logic: Looping Techniques and Conditional Structures
- Description: While loops and basic logics were barely mentioned in the previous parts, part 8 is dedicated to helping learners grasp logic and develop a problem-solving mindset about loops and conditions. This section shows all the knowledge required to understand the concepts of WHILE and IF, but it does so with a smaller focus on tables and data than the previous sections. Despite the fact that many challenges were developed uniquely and that task requirements may have been similar, solving them required a great deal of critical thinking. Furthermore, the BIT, also known as "Booleans" in other programming languages, which are either 1 (True) or 0 (False), also caused some of the very first problems with this section. In combination with the complex use of Subqueries and 'WHERE EXISTS,' this section enhances the problem solving mechanism.
- Length: 50 Problems - 1079 Lines
- Main Knowledge: Subqueries, Loops using WHILE, Conditions using IF, WHERE EXISTS, BIT