Welcome to this installment of the “So You want to be an Azure Solutions Architect Expert” series. This is the fifth accompanying article, and it will focus on some key objectives that are part of the AZ-303 and AZ-304 exams.
To attend the session on Friday, July 24, 2020, please access the link for the Cloud Lunch and Learn website.
These sessions are also recorded and posted on the Cloud Lunch and Learn YouTube channel.
In this session and article, I will be focusing on Data and Storage in Azure. Three fundamental components make up the cloud infrastructure (or any infrastructure): compute, networking, and storage. The previous sessions covered compute (both IaaS and PaaS) and networking (as part of the security session). This article and session will focus on the Storage services and the various ways to store and access your data.
The key topics that I will be covering are likely to be heavily tested on the exams, so the goal is to give you a strong foundation that you can build upon as you prepare.
The topics that I will discuss on Friday include the following:
Data evolution and types of data: Before you begin to create resources in Azure, or in any network infrastructure, it is important to understand how data is going to be stored and used. As more and more devices become connected, the amount of data created grows exponentially. The pace associated with Moore’s law has accelerated, and organizations are turning to cloud services to expand their compute capabilities and process these large amounts of data. Data engineering is becoming a highly sought-after role in the industry, so if you are a Data Engineer or DBA, I recommend looking into the Data Engineer Associate certification path, which takes a much deeper look at the use of data. For the Azure Solutions Architect Expert exams, you will need to understand the types of data and the various storage options and their uses within Azure.
Storage types: It is important to understand the different types of data and which Azure services you would use to store each of them. Data can be placed in three categories: structured, semi-structured, and unstructured. Structured data is usually stored in a relational (SQL) database, and semi-structured data in a NoSQL database such as Cosmos DB. Unstructured data falls into the object storage category, which in Azure means Blob storage.
Azure Storage: https://docs.microsoft.com/en-us/azure/?product=storage The foundation of storage within Azure is the Azure Storage account. Within a General Purpose v2 storage account (the default option, and the one Azure recommends), there are four types of storage: Containers (Blobs), File Shares, Tables, and Queues. As you prepare for the Solutions Architect exams, it is important to understand the uses of each of these, with a particular emphasis on Blobs and Files. https://docs.microsoft.com/en-us/azure/storage/common/storage-introduction?toc=%2fazure%2fstorage%2fblobs%2ftoc.json
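To make the "four types of storage in one account" idea concrete, here is a minimal sketch showing that each service in a storage account is reached through its own endpoint under the account name. It only builds the default public-cloud URLs. The account name `mystorageacct` is a made-up placeholder, and accounts that use custom domains or sovereign clouds would have different suffixes.

```python
# Default public-cloud endpoint pattern for the four services
# that live inside a single Azure Storage account.
SERVICES = ("blob", "file", "table", "queue")


def storage_endpoints(account_name: str) -> dict:
    """Return the default HTTPS endpoint for each storage service."""
    return {
        service: f"https://{account_name}.{service}.core.windows.net"
        for service in SERVICES
    }


if __name__ == "__main__":
    # "mystorageacct" is a hypothetical account name used for illustration.
    for service, url in storage_endpoints("mystorageacct").items():
        print(f"{service:>5}: {url}")
```

Seeing the endpoints side by side is a handy reminder that Blobs, Files, Tables, and Queues share one account (and its replication and security settings) while being addressed separately.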
During the session, we will also discuss how to enable and use replication, soft delete, and data lifecycle management to protect against data loss and adhere to retention requirements.
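For the lifecycle management piece, it may help to see roughly what a policy looks like. The sketch below builds a single-rule policy as a Python dictionary in the JSON shape that Azure Storage lifecycle management accepts, moving block blobs to the Cool tier, then to Archive, and finally deleting them. The rule name, the `logs/app` prefix, and the day thresholds are all assumptions chosen for illustration, not values from the session.

```python
import json


def build_lifecycle_policy(prefix, cool_after_days, archive_after_days, delete_after_days):
    """Build a one-rule lifecycle management policy as a plain dict."""
    # Tiering only makes sense if each step happens later than the previous one.
    if not 0 <= cool_after_days < archive_after_days < delete_after_days:
        raise ValueError("expected 0 <= cool < archive < delete (in days)")
    return {
        "rules": [
            {
                "name": "tierAndExpire",  # hypothetical rule name
                "enabled": True,
                "type": "Lifecycle",
                "definition": {
                    "filters": {
                        "blobTypes": ["blockBlob"],
                        # Prefix starts with the container name.
                        "prefixMatch": [prefix],
                    },
                    "actions": {
                        "baseBlob": {
                            "tierToCool": {
                                "daysAfterModificationGreaterThan": cool_after_days
                            },
                            "tierToArchive": {
                                "daysAfterModificationGreaterThan": archive_after_days
                            },
                            "delete": {
                                "daysAfterModificationGreaterThan": delete_after_days
                            },
                        }
                    },
                },
            }
        ]
    }


if __name__ == "__main__":
    # Illustrative thresholds only: cool at 30 days, archive at 90, delete at 365.
    policy = build_lifecycle_policy("logs/app", 30, 90, 365)
    print(json.dumps(policy, indent=2))
```

The resulting JSON is the kind of document you would review in the portal's lifecycle management code view. Real retention periods should come from your organization's requirements.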
Relational and Non-Relational Databases: https://docs.microsoft.com/en-us/azure/?product=databases In the PaaS session on July 10, 2020, we discussed two of the key database services that should be a focus of your preparation: SQL Database and Cosmos DB. A tip here is to understand the different options for deploying a SQL database within Azure. These options can be found at the link above and here: https://docs.microsoft.com/en-us/azure/azure-sql/. There are also scenarios where Cosmos DB can be used with the SQL API to provide a globally distributed database, which is a key point. Know the different APIs and consistency levels that can be used with Cosmos DB: https://docs.microsoft.com/en-us/azure/cosmos-db/. As you move to the AZ-301/304 exam, you will also want to understand how and when to use Azure Database Migration Service.
Big Data Options: https://docs.microsoft.com/en-us/azure/?product=analytics There are many other options within Azure for the ingestion and transformation of large datasets. Understanding these at a high level, including how and where they may be used, can help you on the exam. The exams do not go deep into these services, but you may find some questions on them, particularly Data Lake or SQL Data Warehouse (Azure Synapse Analytics). I have provided links to the documentation for these big data services below:
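As a study aid for the Cosmos DB APIs and consistency levels, the sketch below lists the five consistency levels from weakest to strongest, along with the APIs available at the time of writing. The helper `can_relax_to` is a hypothetical name. It reflects my understanding that a request can only relax (weaken) the account's default consistency, so treat it as a memory aid, not as SDK behavior.

```python
from enum import IntEnum


class Consistency(IntEnum):
    """Cosmos DB consistency levels, ordered from weakest to strongest."""
    EVENTUAL = 1
    CONSISTENT_PREFIX = 2
    SESSION = 3
    BOUNDED_STALENESS = 4
    STRONG = 5


# APIs offered by Cosmos DB at the time of this article.
COSMOS_APIS = ("SQL (Core)", "MongoDB", "Cassandra", "Gremlin", "Table")


def can_relax_to(account_default: Consistency, requested: Consistency) -> bool:
    """Return True if `requested` is the same as or weaker than the default.

    Hypothetical helper: it models the rule that a request may relax,
    but not strengthen, the account-level default consistency.
    """
    return requested <= account_default


if __name__ == "__main__":
    strongest_first = sorted(Consistency, reverse=True)
    print(" > ".join(level.name for level in strongest_first))
    print(can_relax_to(Consistency.SESSION, Consistency.EVENTUAL))
    print(can_relax_to(Consistency.SESSION, Consistency.STRONG))
```

Ordering the levels this way makes the trade-off easy to remember: moving toward Strong buys stricter read guarantees at the cost of latency and availability, while moving toward Eventual does the opposite.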
Azure Data Lake Storage: https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-introduction
Azure Synapse Analytics (formerly SQL Data Warehouse): https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/
Azure Databricks: https://docs.microsoft.com/en-us/azure/databricks/
Choosing a data storage option: Once you are comfortable with the various data storage options within Azure, you need to choose the best option for the environment and workloads that you are deploying. What characteristics and key features do you require? How will you ingest the data, how will you use it, and how will you query it? Finally, what are you doing to secure your data?
Asking these questions will guide you on the path to choosing the proper storage option and service within Azure.
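The questions above can be turned into a rough decision helper. The sketch below is only a set of rules of thumb based on the services mentioned in this article. The `Workload` fields and the `suggest_storage` function are hypothetical names, and the rules are a simplification, not official Microsoft guidance.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Answers to a few of the questions asked above (hypothetical fields)."""
    shape: str  # "structured", "semi-structured", or "unstructured"
    needs_file_share: bool = False           # must be mounted as a file share
    needs_global_distribution: bool = False  # readers and writers worldwide
    large_scale_analytics: bool = False      # big data ingestion and querying


def suggest_storage(workload: Workload) -> str:
    """Return a candidate Azure service using simple rules of thumb."""
    if workload.needs_file_share:
        return "Azure Files"
    if workload.shape == "structured":
        if workload.large_scale_analytics:
            return "Azure Synapse Analytics"
        return "Azure SQL Database"
    if workload.shape == "semi-structured":
        if workload.needs_global_distribution:
            return "Azure Cosmos DB"
        return "Azure Cosmos DB or Table storage"
    if workload.shape == "unstructured":
        if workload.large_scale_analytics:
            return "Azure Data Lake Storage"
        return "Azure Blob storage"
    raise ValueError(f"unknown data shape: {workload.shape!r}")


if __name__ == "__main__":
    print(suggest_storage(Workload(shape="unstructured")))
    print(suggest_storage(Workload(shape="semi-structured",
                                   needs_global_distribution=True)))
```

A real decision would also weigh security, cost, and retention requirements, which this sketch deliberately leaves out.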
I will cover these topics at a high level in my Cloud Lunch and Learn session on Friday, July 24, 2020. The recording will be available on the Cloud Lunch and Learn YouTube channel.
Additional links to assist in initial preparation include:
Cloud Lunch and Learn github
Build5Nines self-assessment tools: https://build5nines.com/free-oss-exam-self-assessment-tool/
Azure Greg posts Microsoft Docs links for every exam objective here: Gregor Suttie github
Pixel Robots is a helpful source of information, especially for AKS: https://pixelrobots.co.uk/