Skip to content

Microsoft 365

Microsoft 365 is a core EDP source system for cloud productivity, collaboration, licensing, usage, and service metadata. It should be documented separately from on-premises Active Directory because it uses Microsoft Graph and cloud service APIs, has different identifiers, and exposes different operational data.

Overview

Microsoft 365 can provide EDP with cloud-side identity, group, license, collaboration, usage, and service information.

EDP should use Microsoft 365 data to support:

  • Cloud identity and account visibility
  • License assignment and utilization reporting
  • Group and Microsoft 365 group visibility
  • Teams and channel visibility
  • Collaboration governance
  • User activity and adoption reporting
  • Cross-system identity correlation
  • Onboarding and offboarding validation
  • Access review and membership analysis
  • Operational dashboards for cloud service usage

Ownership

RoleResponsibility
Source system ownerOwns Microsoft 365 tenant operations, licensing, access policies, and service configuration
Technical ownerSupports app registrations, Graph permissions, admin consent, and connector access
Data stewardDefines license, group, team, channel, and usage reporting meaning for EDP consumers
Platform ownerRuns the connector, raw landing, transformations, quality checks, and downstream models

Data Domains

Initial Microsoft 365 data domains:

  • Users
  • Groups and Microsoft 365 groups
  • Group memberships and ownership
  • Teams
  • Channels
  • License SKUs
  • User license assignments
  • Service plans
  • Usage reports
  • Tenant and service metadata

Later domains may include:

  • SharePoint sites
  • OneDrive usage
  • Exchange mailbox metadata
  • Planner metadata
  • Audit and activity reports where licensing and permissions allow
  • Administrative role assignments

Connector Options

Connector OptionWhen It FitsNotes
Microsoft Graph APIPreferred production pattern for most Microsoft 365 entitiesSupports users, groups, Teams, licenses, reports, and many cloud resources
Microsoft Graph reports APIPreferred for Microsoft 365 usage and adoption reportsUseful for usage reporting and service adoption dashboards
PowerShell-assisted exportUseful when an administrative module exposes needed data more easilyShould still produce governed files or API results into raw landing
Admin center exportUseful as an interim or fallback approachShould be treated as a temporary bridge, not a long-term connector pattern
Manual seedAcceptable only for early testing or static reference dataShould not be used for production operational reporting

Prebuilt Package Status

Status: prebuilt planned.

A reusable Microsoft 365 connector package should include:

  • Containerized Graph connector runtime
  • Configuration schema for tenant, endpoints, object sets, schedules, and report periods
  • Required Graph permission documentation
  • App registration and admin consent setup guide
  • Connection test
  • Microsoft Graph throttling and retry handling
  • Airflow DAG template
  • Raw landing tables for users, groups, memberships, Teams, channels, licenses, and usage reports
  • dbt source definitions
  • ODS identity, account, group, team, channel, license, and usage mappings
  • Data Vault hub, link, and satellite templates
  • Data quality tests
  • Operational runbook

Required Access

The connector should use an approved Microsoft Entra app registration or managed identity pattern where available. Permissions should be application permissions or delegated permissions based on the deployment model, using least privilege wherever practical.

Document these items before production use:

  • Tenant identifier
  • App registration or managed identity
  • Required Graph permissions
  • Admin consent status
  • Certificate or secret rotation pattern
  • Object types and endpoints to collect
  • Report periods and usage report requirements
  • Throttling and retry behavior
  • Network egress requirements
  • Approval from the Microsoft 365 tenant owner

Permission requirements vary by endpoint. Keep the exact permission list with the connector package and review it whenever Microsoft Graph endpoints or requested data domains change.

Raw Landing Design

Recommended raw tables:

Raw TablePurpose
raw.m365_usersMicrosoft 365 or Entra user payloads and source metadata
raw.m365_groupsGroup and Microsoft 365 group payloads
raw.m365_group_membershipsGroup membership and ownership edges
raw.m365_teamsTeam payloads and metadata
raw.m365_channelsTeam channel payloads and metadata
raw.m365_subscribed_skusTenant license SKU inventory
raw.m365_user_license_detailsUser license and service plan assignments
raw.m365_usage_reportsUsage and adoption report payloads
raw.m365_collection_runsConnector run metadata, counts, timing, and status

Each raw record should include:

  • Tenant identifier
  • Source endpoint or report name
  • Source object identifier
  • Source object type
  • Report period where applicable
  • Extraction timestamp
  • Ingestion run identifier
  • Connector version
  • Raw payload

ODS Mapping

Recommended ODS entities:

  • ods.identity
  • ods.account
  • ods.group
  • ods.group_membership
  • ods.team
  • ods.team_channel
  • ods.license_sku
  • ods.license_assignment
  • ods.service_plan
  • ods.usage_metric
  • ods.source_record

Common mapping rules:

  • Use stable Microsoft cloud object identifiers for source identity.
  • Preserve user principal name and mail attributes as descriptive attributes because they can change.
  • Correlate cloud accounts to Active Directory accounts through immutable identifiers, synchronization attributes, UPN, mail, or approved matching rules.
  • Model direct group and team memberships first.
  • Add ownership and nested membership expansions as derived models with clear lineage.
  • Keep license assignments separate from license utilization.
  • Treat usage reports as measured observations with report periods, not as current-state identity attributes.

Data Vault Mapping

Recommended Data Vault structures:

  • Hub: identity
  • Hub: cloud account
  • Hub: group
  • Hub: team
  • Hub: channel
  • Hub: license SKU
  • Hub: service plan
  • Link: identity to cloud account
  • Link: cloud account to group
  • Link: cloud account to team
  • Link: team to channel
  • Link: cloud account to license SKU
  • Link: license SKU to service plan
  • Satellites: descriptive attributes, assignment history, usage observations, source metadata, and classification history

The Data Vault should preserve historical changes in license assignments, group memberships, team memberships, service plan status, and usage observations when collected over time.

Data Mart Outputs

Initial Microsoft 365 marts:

  • License assignment mart
  • License utilization mart
  • Microsoft 365 group mart
  • Teams and channel visibility mart
  • User cloud account mart
  • Inactive or stale cloud account mart
  • Collaboration governance mart
  • Identity lifecycle reconciliation mart
  • Usage and adoption mart

These marts should support operational dashboards, access reviews, licensing decisions, collaboration governance, and identity lifecycle validation.

Quality Checks

Recommended checks:

  • Collection run completed successfully.
  • User, group, team, channel, license, and usage counts are within expected ranges.
  • Source object identifiers are present and unique within each object type.
  • Membership references resolve to known users, groups, teams, or service principals where possible.
  • License assignments reference known SKUs and service plans.
  • Usage report periods are complete and not duplicated.
  • Expected report periods arrive on schedule.
  • Graph throttling responses are handled with backoff and retry behavior.
  • Freshness meets the expected schedule.

Refresh Schedule

Start with daily collection for tenant inventory, users, groups, Teams, channels, memberships, and license assignments.

Usage reports may have different availability windows and reporting periods. Store report period metadata and avoid treating usage reports as immediate real-time signals.

Use more frequent collection when operational needs require closer tracking, such as license assignment changes, onboarding, offboarding, or membership review.

Operational Runbook

The Microsoft 365 connector runbook should include:

  • How to validate app registration or managed identity access
  • How to confirm Graph permissions and admin consent
  • How to run a connection test
  • How to run a manual collection
  • How to review collection counts
  • How to handle Graph throttling and Retry-After responses
  • How to handle permission failures
  • How to rotate certificates or secrets
  • How to backfill report periods
  • How to disable the connector safely

Known Limitations

Microsoft 365 APIs may throttle requests. Connector logic should respect throttling responses and use backoff behavior.

Some reports are aggregated, delayed, anonymized, licensed, or limited by tenant settings. Document report behavior for each data domain.

Microsoft 365 and Active Directory may contain related but different identity records. Correlation rules should be explicit and auditable.

Permission requirements can change as endpoints and data domains expand. Keep permissions reviewed and documented with the connector package.

References