Litware, Inc. is a manufacturing company that has offices throughout North America. The analytics team at Litware contains data engineers, analytics engineers, data analysts, and data scientists.
Existing Environment
Fabric Environment
Litware has been using a Microsoft Power BI tenant for three years. Litware has NOT enabled any Fabric capacities and features.
Available Data
Litware has data that must be analyzed as shown in the following table.
(Image)
The Product data contains a single table and the following columns.
The customer satisfaction data contains the following tables:
- Survey
- Question
- Response
For each survey submitted, the following occurs:
- One row is added to the Survey table.
- One row is added to the Response table for each question in the survey.
The Question table contains the text of each survey question. The third question in each survey response is an overall satisfaction score. Customers can submit a survey after each purchase.
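The row-creation rules above can be illustrated with a minimal in-memory sketch. The class name `SurveyStore`, the column names (`survey_id`, `submitted_on`, `question_number`, `answer`), and the sample answers are assumptions made for illustration only; the case study does not list the actual columns of these tables.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class SurveyStore:
    """Hypothetical in-memory stand-in for the Survey and Response tables."""
    survey_rows: list = field(default_factory=list)
    response_rows: list = field(default_factory=list)

    def submit(self, survey_id: int, submitted_on: date, answers: list) -> None:
        # One row is added to the Survey table per submitted survey.
        self.survey_rows.append({"survey_id": survey_id, "submitted_on": submitted_on})
        # One row is added to the Response table for each question in the survey.
        for question_number, answer in enumerate(answers, start=1):
            self.response_rows.append(
                {"survey_id": survey_id, "question_number": question_number, "answer": answer}
            )

    def overall_satisfaction(self, survey_id: int):
        # The third question in each survey response is the overall satisfaction score.
        for row in self.response_rows:
            if row["survey_id"] == survey_id and row["question_number"] == 3:
                return row["answer"]
        return None


# Illustrative data only: a survey with four questions.
store = SurveyStore()
store.submit(survey_id=1, submitted_on=date(2024, 3, 15), answers=["Yes", "Online", 4, "Fast delivery"])
```

Because the overall score lives in the Response table, any report that averages satisfaction has to filter responses to the third question rather than read a column on the Survey table.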
User Problems
The analytics team has large volumes of data, some of which is semi-structured. The team wants to use Fabric to create a new data store.
Product data is often classified into three pricing groups: high, medium, and low. This logic is implemented in several databases and semantic models, but the logic does NOT always match across implementations.
Requirements
Planned Changes
Litware plans to enable Fabric features in the existing tenant. The analytics team will create a new data store as a proof of concept (PoC). The remaining Litware users will only get access to the Fabric features once the PoC is complete. The PoC will be completed by using a Fabric trial capacity.
The following three workspaces will be created:
- AnalyticsPOC: Will contain the data store, semantic models, reports, pipelines, dataflows, and notebooks used to populate the data store.
- DataEngPOC: Will contain all the pipelines, dataflows, and notebooks used to populate OneLake.
- DataSciPOC: Will contain all the notebooks and reports created by the data scientists.
The following will be created in the AnalyticsPOC workspace:
- A data store (type to be decided)
- A custom semantic model
- A default semantic model
- Interactive reports
The data engineers will create data pipelines to load data to OneLake either hourly or daily depending on the data source. The analytics engineers will create processes to ingest, transform, and load the data to the data store in the AnalyticsPOC workspace daily. Whenever possible, the data engineers will use low-code tools for data ingestion. The choice of which data cleansing and transformation tools to use will be at the data engineers' discretion.
All the semantic models and reports in the AnalyticsPOC workspace will use the data store as the sole data source.
Technical Requirements
The data store must support the following:
- Read access by using T-SQL or Python
- Semi-structured and unstructured data
- Row-level security (RLS) for users executing T-SQL queries
Files loaded by the data engineers to OneLake will be stored in the Parquet format and will meet Delta Lake specifications.
Data will be loaded without transformation in one area of the AnalyticsPOC data store. The data will then be cleansed, merged, and transformed into a dimensional model.
The data load process must ensure that the raw and cleansed data is updated completely before populating the dimensional model.
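The ordering constraint above amounts to running the stages strictly in sequence and never starting the dimensional load if an earlier stage fails. The sketch below shows only that control flow. The function `run_daily_load` and its three callable parameters are hypothetical names, and it is not tied to any particular Fabric orchestration tool.

```python
def run_daily_load(load_raw, load_cleansed, load_dimensional):
    """Run the three load stages in order (hypothetical orchestration sketch).

    Each argument is a callable that returns only after its stage has
    finished. An exception raised by an earlier stage propagates, so the
    dimensional model is never populated from partially updated data.
    """
    load_raw()          # raw data is loaded without transformation
    load_cleansed()     # data is cleansed and merged
    load_dimensional()  # runs only after both earlier stages completed
```

In a pipeline tool, the same idea is usually expressed as an "on success" dependency between activities rather than as sequential function calls.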
The dimensional model must contain a date dimension. There is no existing data source for the date dimension. The Litware fiscal year matches the calendar year. The date dimension must always contain dates from 2010 through the end of the current year.
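Since no source exists for the date dimension, its rows have to be generated. The following is a minimal sketch of that generation, assuming the range starts on January 1, 2010 and assuming a hypothetical set of columns (`date_key`, `fiscal_year`, and so on); the case study specifies only the date range and that the fiscal year equals the calendar year.

```python
from datetime import date, timedelta


def build_date_dimension(today: date) -> list:
    """Return one row per day from 2010-01-01 through December 31 of today's year."""
    start = date(2010, 1, 1)          # assumed first day of the required range
    end = date(today.year, 12, 31)    # end of the current year
    rows = []
    current = start
    while current <= end:
        rows.append({
            "date_key": int(current.strftime("%Y%m%d")),  # hypothetical surrogate key
            "date": current,
            "year": current.year,
            "fiscal_year": current.year,  # fiscal year matches the calendar year
            "quarter": (current.month - 1) // 3 + 1,
            "month": current.month,
            "day": current.day,
        })
        current += timedelta(days=1)
    return rows


date_dim = build_date_dimension(date.today())
```

Deriving the end date from the current date on every run, instead of hard-coding it, is what keeps the dimension valid "always": the table extends itself when a new year begins.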
The product pricing group logic must be maintained by the analytics engineers in a single location. The pricing group data must be made available in the data store for T-SQL queries and in the default semantic model. The following logic must be used:
- List prices that are less than or equal to 50 are in the low pricing group.
- List prices that are greater than 50 and less than or equal to 1,000 are in the medium pricing group.
- List prices that are greater than 1,000 are in the high pricing group.
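The three rules above reduce to two inclusive upper bounds. As a quick check of the boundaries, here is a minimal Python sketch; the function name `pricing_group` is hypothetical, and in the scenario itself this logic would be defined once in the data store rather than in application code.

```python
def pricing_group(list_price: float) -> str:
    """Classify a list price into the low, medium, or high pricing group."""
    if list_price <= 50:
        return "low"      # less than or equal to 50
    if list_price <= 1000:
        return "medium"   # greater than 50 and less than or equal to 1,000
    return "high"         # greater than 1,000
```

The boundary values are the cases that tend to diverge between implementations: 50 belongs to the low group and 1,000 belongs to the medium group.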
Security Requirements
Only Fabric administrators and the analytics team must be able to see the Fabric items created as part of the PoC.
Litware identifies the following security requirements for the Fabric items in the AnalyticsPOC workspace:
- Fabric administrators will be the workspace administrators.
- The data engineers must be able to read from and write to the data store. No access must be granted to datasets or reports.
- The analytics engineers must be able to read from, write to, and create schemas in the data store. They also must be able to create and share semantic models with the data analysts and view and modify all reports in the workspace.
- The data scientists must be able to read from the data store, but not write to it. They will access the data by using a Spark notebook.
- The data analysts must have read access to only the dimensional model objects in the data store. They also must have access to create Power BI reports by using the semantic models created by the analytics engineers.
- The date dimension must be available to all users of the data store.
- The principle of least privilege must be followed.
Both the default and custom semantic models must include only tables or views from the dimensional model in the data store.
Litware already has the following Microsoft Entra security groups:
- FabricAdmins: Fabric administrators
- AnalyticsTeam: All the members of the analytics team
- DataAnalysts: The data analysts on the analytics team
- DataScientists: The data scientists on the analytics team
- DataEngineers: The data engineers on the analytics team
- AnalyticsEngineers: The analytics engineers on the analytics team
Report Requirements
The data analysts must create a customer satisfaction report that meets the following requirements:
- Enables a user to select a product to filter customer survey responses to only those who have purchased that product.
- Displays the average overall satisfaction score of all the surveys submitted during the last 12 months up to a selected date.
- Shows data as soon as the data is updated in the data store.
- Ensures that the report and the semantic model only contain data from the current and previous year.
- Ensures that the report respects any table-level security specified in the source data store.
- Minimizes the execution time of report queries.
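The "last 12 months up to a selected date" requirement describes a trailing window. The sketch below shows the arithmetic in plain Python, not as a report measure. It assumes the window excludes the day exactly 12 months before the selected date and includes the selected date itself; the case study does not state the boundary handling. The names `months_before` and `average_overall_satisfaction` are hypothetical.

```python
import calendar
from datetime import date


def months_before(d: date, months: int) -> date:
    """Return the date `months` months before `d`, clamping the day to the month's length."""
    total = d.year * 12 + (d.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def average_overall_satisfaction(scores, selected_date: date):
    """Average the overall scores submitted in the 12 months up to selected_date.

    `scores` is an iterable of (submitted_on, overall_score) pairs.
    Assumed window: window_start < submitted_on <= selected_date.
    Returns None when no survey falls inside the window.
    """
    window_start = months_before(selected_date, 12)
    in_window = [score for submitted_on, score in scores
                 if window_start < submitted_on <= selected_date]
    if not in_window:
        return None
    return sum(in_window) / len(in_window)
```

The window needs up to 12 months of history before the selected date, which is consistent with the requirement that the model holds the current and previous year: selected dates in the current year are always fully covered.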
Question 1
You need to assign permissions for the data store in the AnalyticsPOC workspace. The solution must meet the security requirements.
Which additional permissions should you assign when you share the data store? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
(Image)