Re-imagining Semantic Search Inside Power BI

The Hidden Cost of “Simple” Search Apps

Many teams we talk to already use Azure AI Search. It’s a powerful service for making text and documents searchable with semantic and vector search.

But here’s the pattern we see over and over:

  • A new web app is built (maybe Streamlit, maybe a custom React app).
  • It’s hosted on Azure App Service or VMs.
  • It duplicates authentication, hosting, monitoring, DevOps pipelines…
  • And at the end of the day, users just get a search box + results table.

The business value is real, but the delivery is complex and costly for what it achieves.

Imagine if your users could:

  • Type a search query,
  • Get semantic results with highlights and summaries,
  • And see them right inside the Power BI dashboards they already use every day.

No new app.
No separate portal.
No extra infra to maintain.

Just a familiar search box in Power BI, powered by Azure AI Search + AI summarization.

Also note that this approach requires no PowerApps or Power Automate — it runs entirely within Power BI.
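To make this concrete, here is a minimal sketch of the kind of semantic query the report issues against Azure AI Search. The service name, index name, fields, and semantic configuration are placeholders, and in Power BI the same request is typically made through Power Query (Web.Contents) rather than Python.

import requests

# Placeholder values: substitute your own search service, index, key, and semantic configuration.
SERVICE = "https://<your-search-service>.search.windows.net"
INDEX = "documents-index"
API_KEY = "<query-key>"

def semantic_search(query, top=10, skip=0):
    """Run a semantic query and return captions/answers suitable for a report table."""
    url = f"{SERVICE}/indexes/{INDEX}/docs/search?api-version=2023-11-01"
    body = {
        "search": query,
        "queryType": "semantic",
        "semanticConfiguration": "default",  # assumed configuration name
        "captions": "extractive",            # highlighted snippets
        "answers": "extractive",             # short extractive answers
        "top": top,
        "skip": skip,                        # paging support
    }
    resp = requests.post(url, json=body, headers={"api-key": API_KEY})
    resp.raise_for_status()
    return resp.json()["value"]

for doc in semantic_search("refund delays for premium customers"):
    captions = doc.get("@search.captions") or [{}]
    print(doc.get("id"), "-", captions[0].get("text", ""))

The highlights and extractive answers returned here are what surface as summaries alongside the native Power BI filters, paging, and export options described in the next section.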

What It Includes

The Power BI report provides:

  • Free text input for users to type search queries.
  • Native Power BI filtering (date, region, product, etc.) alongside semantic search results.
  • Paging support to navigate through large result sets.
  • Export options to Excel, CSV, PDF, and more.
  • Multiple report pages to query across different indexes, documents or datasets.
  • Power BI native authentication to the report.

High-Impact Use Cases

Organizations like yours can quickly benefit from this approach for scenarios such as:

  • Customer Service: Mine complaint text for themes (refunds, delivery delays, product defects).
  • Compliance & Legal: Surface contract clauses or policy excerpts directly in dashboards.
  • Ops & IT: Search across incident logs and root cause notes.
  • HR & Internal Comms: Make policies instantly discoverable by employees.

All without building another app that IT must support.

Why It Matters

  • Cost savings: no Azure App Service, no custom UI hosting, no redundant auth.
  • User adoption: everyone already knows Power BI. No training needed.
  • Speed: what used to take weeks of app dev can now be delivered in days.

Where This Approach Fits Best

This design is intentionally simple and focused. It excels when you need:

  • A single query to retrieve relevant results.
  • Clear insights (summaries, highlights, tags) displayed directly in Power BI.
  • Seamless integration with existing dashboards and metrics.

It’s built for search + analytics, not for chatbot-style experiences.

So, if your scenario requires things like:

  • Conversational Q&A with follow-up questions,
  • Multi-turn history or context retention,
  • Uploading and reasoning over new documents during query,

then those needs are better served by a different architecture.

Think of this as “one search → one set of insights → shown in Power BI” — fast, clean, and highly effective for dashboards.

Your Next Step Toward Smarter Search

The exciting part here isn't exotic technology. It's that the approach is far simpler than many expect, and that simplicity makes it faster, cheaper, and easier to adopt.

If you’re juggling multiple initiatives, this approach is lightweight and ideal for proving value before scaling further. You don’t need to build a full-blown application — instead, you can get something off the ground in days, not weeks. The sooner you see it running on your own data, the faster you’ll recognize its value.

Bring us your use case, and we'll show you results fast (POC or implementation). You can skip the app dev overhead and light up search inside Power BI directly. You'll be surprised how simple (and cost-effective) it can be.

Automating Backup, Retention, And Restoration For Lakehouse In Microsoft Fabric

Data resilience is a cornerstone of modern analytics platforms. In Microsoft Fabric, maintaining backups and implementing automated policies for retention and restoration can elevate data management.

While Fabric is a robust platform, its disaster recovery (DR) capability is not designed to address operational issues like data refresh failures or accidental deletions, which makes an automated approach necessary to bridge the gap and ensure operational continuity.

Effective backup, retention, and restoration strategies are essential to maintaining a reliable data platform, particularly in scenarios involving refresh failures or data corruption.

Note: This is not a substitute for the disaster recovery features of Microsoft Fabric, but a complementary approach to enhance resilience, streamline restoration processes, and minimize downtime through automation and proactive configurations.

Here’s an overview of setting up, configuring, and automating these processes while addressing challenges and their solutions.

Setting Up Backup and Retention Policies

Microsoft Fabric’s Lakehouse and OneLake provide unique capabilities for handling data. Backing up data involves:

  • Daily Incremental Backups: Ensuring minimal data loss by creating daily snapshots.
  • Retention Policy Configuration: Establishing tiers like daily, weekly, monthly, and yearly retention to balance storage costs and compliance.
  • Automation with Notebooks: Using Fabric notebooks to schedule backups and enforce retention policies, such as retaining the last 7 daily backups or 6 monthly backups and cleaning up obsolete ones.

Automation Highlights:

  1. Backup Creation: Scheduled scripts create snapshots at specific intervals. For example, Spark jobs can efficiently copy data using APIs like mssparkutils (see the notebook sketch after this list).
  2. Retention Enforcement: A policy-driven approach automatically removes outdated backups while preserving critical ones for auditing or recovery.
  3. Logging and Monitoring: Every backup, cleanup, and restoration action is logged to ensure transparency and auditability.
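As a minimal sketch of the backup-creation and retention steps above, a Fabric notebook cell could look like the following. The Lakehouse paths, folder naming, and retention count are assumptions to adapt to your workspace.

from datetime import datetime
from notebookutils import mssparkutils  # available by default in Fabric notebooks

# Assumed OneLake paths; replace with your own Lakehouse ABFS paths.
SOURCE_TABLES = "abfss://<workspace>@onelake.dfs.fabric.microsoft.com/<lakehouse>.Lakehouse/Tables"
BACKUP_ROOT = "abfss://<workspace>@onelake.dfs.fabric.microsoft.com/<backup_lakehouse>.Lakehouse/Files/backups"
DAILY_RETENTION = 7  # keep the last 7 daily snapshots

def create_daily_backup():
    # Copy all tables into a date-stamped snapshot folder (True = recursive copy).
    snapshot = f"{BACKUP_ROOT}/daily_{datetime.utcnow():%Y%m%d}"
    mssparkutils.fs.cp(SOURCE_TABLES, snapshot, True)
    print(f"Backup created: {snapshot}")

def enforce_retention():
    # Delete daily snapshots beyond the retention window, oldest first, logging each removal.
    snapshots = sorted(f.path for f in mssparkutils.fs.ls(BACKUP_ROOT) if f.name.startswith("daily_"))
    for old in snapshots[:-DAILY_RETENTION]:
        mssparkutils.fs.rm(old, True)
        print(f"Removed expired backup: {old}")

create_daily_backup()
enforce_retention()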

Restoration: Recovering from Data Loss

Fabric allows for full or selective restoration of data from backups. Restoration tasks involve:

  • Restoring the entire Lakehouse or specific tables from a backup.
  • Using structured logs to identify and resolve errors during the restoration process.
  • Minimizing downtime by enabling rapid data recovery with scripts or automation tools.

Why Automate Backup and DR in Microsoft Fabric?

Automation mitigates risks and improves efficiency:

  • Data Integrity: Automated backups ensure all critical data is consistently safeguarded.
  • Operational Continuity: Quick restoration scripts minimize business downtime.
  • Cost Optimization: Automating cleanup eliminates outdated backups, reducing unnecessary storage expenses.
  • Scalability: Structured policies can accommodate growing datasets without additional manual effort.

Conclusion

While Microsoft Fabric is a promising data platform, addressing data corruption and accidental deletion challenges requires a proactive and automated approach. By leveraging our automation for backup, retention, cleanup, and restoration, organizations can safeguard data and ensure business continuity, delivering significant value for the business.

RAG in the Real World: Why Scalable AI Needs More Than Just Retrieval and Prompts

Executive Summary

Retrieval-Augmented Generation (RAG) has quickly become one of the most hyped terms in enterprise AI. It is typically showcased over a handful of PDFs, which makes it look easy and simple. But operating at enterprise scale, with lakhs (hundreds of thousands) of records, low-latency retrieval, strict accuracy, and predictable costs, is hard. Real limits show up in four places: embeddings, vector indexes (e.g., Azure AI Search), retrieval/filters, and LLMs themselves. You'll need hybrid architectures, careful schema/ops, observability, and strict cost controls to get beyond prototypes.

Embedding Challenges at Scale

Cost & volume. Embedding large datasets quickly runs into scale issues. Even a moderately sized corpus—each record carrying a few hundred to thousands of tokens—translates into tens of millions of tokens overall. The embedding phase alone can run into significant dollar costs per cycle, and this expense only grows as data is refreshed or re-processed.

Throughput & reliability. APIs enforce token and requests-per-minute (RPM) limits. Production pipelines need the following (a minimal sketch follows this list):

  • Token-aware dynamic batching
  • Retry with backoff + resume checkpoints
  • Audit logs for failed batches
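A minimal sketch of token-aware batching with retry and backoff is shown below. The batch limit is illustrative, count_tokens stands in for a real tokenizer, and embed_batch is a placeholder for whatever embedding API you call (Azure OpenAI or otherwise).

import time

MAX_TOKENS_PER_BATCH = 8000  # illustrative; align with your provider's quota
MAX_RETRIES = 5

def count_tokens(text):
    # Stand-in for a real tokenizer (e.g., tiktoken) matched to your embedding model.
    return len(text.split())

def embed_batch(texts):
    # Placeholder for the real embedding API call; returns one vector per input text.
    return [[0.0, 0.0, 0.0] for _ in texts]

def flush(batch):
    # Retry transient failures (429s, timeouts) with exponential backoff before giving up.
    for attempt in range(MAX_RETRIES):
        try:
            return list(zip(batch, embed_batch([r["text"] for r in batch])))
        except Exception as exc:
            wait = 2 ** attempt
            print(f"Batch failed ({exc}); retrying in {wait}s")
            time.sleep(wait)
    raise RuntimeError("Batch permanently failed; record it in the audit log and resume later")

def embed_corpus(records):
    # Pack records into token-aware batches and yield (record, embedding) pairs.
    batch, batch_tokens = [], 0
    for rec in records:
        tokens = count_tokens(rec["text"])
        if batch and batch_tokens + tokens > MAX_TOKENS_PER_BATCH:
            yield from flush(batch)
            batch, batch_tokens = [], 0
        batch.append(rec)
        batch_tokens += tokens
    if batch:
        yield from flush(batch)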

Chunking trade-offs.

  • Too fine → semantic context is lost
  • Too coarse → token bloat + noisy matches

Use header/paragraph-aware chunking with small overlaps; expect to tune per source type. A rough sketch of such a chunker follows.
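The sketch below treats blank lines as paragraph boundaries; the size and overlap values are illustrative and usually need tuning per source type.

def chunk_paragraphs(text, max_chars=1500, overlap=200):
    # Split on blank lines, pack paragraphs into chunks, and carry a small overlap across boundaries.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        if current and len(current) + len(para) > max_chars:
            chunks.append(current)
            current = current[-overlap:]  # the overlap preserves context across the boundary
        current = (current + "\n\n" + para).strip()
    if current:
        chunks.append(current)
    return chunks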

Domain relevance gaps. General-purpose embeddings miss subtle, domain-specific meaning (biomedical, legal, financial). Dimensionality isn’t the cure; domain representation is. Without specialization, recall will feel “lexical” rather than truly semantic.

Vector Databases: Strengths… and Constraints for RAG

Immutable schema. Once an index is created, fields can’t be changed. Adding a new filterable/tag field → recreate the index.

Full reloads. Schema tweaks or chunking updates often require re-embedding and re-indexing everything—expensive and time-consuming (parallelism will still hit API quotas).

Operational sprawl. Multiple teams/use cases → multiple indexes → fragmented pipelines and higher latency. Unlike a DB with views/joins, AI Search pushes you toward rigid, static definitions.
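For example, adding a new filterable field means defining a fresh index and reloading it rather than altering the old one in place. A minimal sketch via the REST API is shown below; the service name, index name, fields, and api-version are placeholders, and vector field configuration is omitted for brevity.

import requests

SERVICE = "https://<your-search-service>.search.windows.net"
API_KEY = "<admin-key>"

# The extra "region" filterable field cannot be added to the existing index in place,
# so a new index is defined and every document must be re-indexed into it.
index_definition = {
    "name": "documents-index-v2",
    "fields": [
        {"name": "id", "type": "Edm.String", "key": True},
        {"name": "content", "type": "Edm.String", "searchable": True},
        {"name": "region", "type": "Edm.String", "filterable": True, "facetable": True},
    ],
}

resp = requests.put(
    f"{SERVICE}/indexes/{index_definition['name']}?api-version=2023-11-01",
    json=index_definition,
    headers={"api-key": API_KEY},
)
resp.raise_for_status()
# ...then re-run the embedding and upload pipeline against documents-index-v2.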

(Reference: Microsoft docs on Azure AI Search vector search and integrated vectorization.)

Retrieval & Filtering Limits

Shallow top-K. Even at top-K, relevant items can fall just outside the cut-off. In regulated domains, a single miss matters.

Context window pressure. As a best practice, send only the required fields to the model, then join externally on the predicted answers using key identifiers like an ID to finish the answer. (In our scenario we sent 1-2 columns, the ID and description, out of the 25 columns in the table.) A minimal sketch of this pattern follows.
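In the sketch below (placeholder service, index, field, and file names), only the ID and description travel to the LLM, and the predicted IDs are joined back to the full table outside the prompt.

import pandas as pd
import requests

SERVICE = "https://<your-search-service>.search.windows.net"
INDEX = "records-index"
API_KEY = "<query-key>"

def retrieve_slim(query, top=50):
    # Ask the index for just the key and description columns -- enough for the LLM to reason over.
    body = {"search": query, "select": "id,description", "top": top}
    resp = requests.post(
        f"{SERVICE}/indexes/{INDEX}/docs/search?api-version=2023-11-01",
        json=body,
        headers={"api-key": API_KEY},
    )
    resp.raise_for_status()
    return pd.DataFrame(resp.json()["value"])

full_table = pd.read_parquet("records.parquet")  # placeholder for the table with all 25 columns
slim_hits = retrieve_slim("late shipment penalties")
predicted_ids = ["R-101", "R-205"]  # illustrative IDs extracted from the LLM's answer
answer_rows = full_table[full_table["id"].isin(predicted_ids)]  # the external join completes the answer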

Filter logic ceiling. Basic metadata filters work, but you’ll miss nested conditions, dynamic role-based filters, and cross-field joins that are trivial in SQL.

LLM Limitations You Will Hit

Input limits. Even with large-context models (450k+ tokens), we observed that certain records, particularly those in lower relevance ranges, may be implicitly skipped during reasoning. This behaviour is non-deterministic and poses serious challenges for enterprise-grade tasks where every data point is critical.

Output limits. Many models cap useful output (e.g., 2k–16k tokens). This affects multi-record responses, structured summaries, and complex decision-making outputs, leading to truncated or incomplete responses, particularly when returning JSON, tables, or lists.

Latency & variance. Complex prompts over hundreds of records can take minutes. Stochastic ranking creates run-to-run differences—tough for enterprise SLAs.

Concurrency & quotas. Enterprises face quota exhaustion due to shared token pools; concurrent usage by multiple users or batch agents can quickly consume limits. Smaller organizations with access to 8K-input / 2K-output token models (e.g., LLaMA, Mistral) face even tighter ceilings, making RAG challenging beyond pilot projects.

Productionisation, Observability — and Evaluation That Actually Matters

While most RAG demos end at a “right-looking” answer, real deployments must be observable, traceable, resilient, and continuously evaluated. GenAI pipelines often underinvest in these layers, especially with multiple async retrievals and LLM hops.

Operational pain points

  • Sparse/unstructured LLM logs → hard to reproduce issues or inspect reasoning paths.
  • Thin vector/AI Search telemetry → silent filter failures or low-recall cases go unnoticed.
  • Latency tracing across hybrids (SQL + vector + LLM) is messy without end-to-end spans.
  • Failure isolation is non-trivial: embed vs ranker vs LLM vs truncation?

What good observability looks like

  • End-to-end tracing (embed → index → retrieve → rerank → prompt → output).
  • Structured logs for: retrieval sets & scores, prompt/response token usage, cost, latency, confidence, and final joins.
  • Quality gates and alerts on recall@K, latency budgets, cost per query, and hallucination/citation signals.

Evaluation: beyond generic metrics

  • Partner with domain experts to define what “good” means. Automatic scores alone aren’t enough in regulated or domain-heavy settings.
  • Build a domain ground-truth set (gold + “acceptable variants”) curated by SMEs; refresh it quarterly.
  • Establish a human-in-the-loop process: double-blind SME review on sampled traffic; escalate low-confidence or low-evidence answers by policy.
  • Maintain an error taxonomy (missed retrieval, wrong join, truncation, unsupported query, hallucination) with severity labels; track trends over time.
  • Run canaries/A-B tests in prod; compare quality, latency, cost, and SME acceptance before full rollout.
  • Log evaluation metadata (query id, versioned index, chunking config, model/runtime version) so you can pinpoint regressions.

In practice (our pattern)

  • We co-defined a gold set with SMEs and require evidence-backed answers; low-evidence responses are auto-routed to review.
  • We track recall@K + citation coverage for retrieval, and field-level precision/recall for extraction tasks, alongside cost/latency dashboards.

Bottom line: Without observability and domain-grounded evaluation, production RAG stays brittle and opaque. With them, you get a system you can debug, trust, and scale, not just a demo that looks good once.
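As a concrete illustration of one such quality gate, a minimal recall@K check against an SME-curated gold set might look like the sketch below; the data structures are illustrative.

def recall_at_k(retrieved_ids, relevant_ids, k):
    # Fraction of SME-labelled relevant documents that appear in the top-k retrieved set.
    if not relevant_ids:
        return None  # skip queries that have no gold labels yet
    hits = len(set(retrieved_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)

# gold_set: query id -> SME-curated relevant document ids (illustrative values)
gold_set = {"q1": ["d3", "d7"], "q2": ["d1"]}
# retrieval_log: query id -> ranked document ids returned by the pipeline
retrieval_log = {"q1": ["d7", "d2", "d3", "d9"], "q2": ["d4", "d5", "d9"]}

scores = [recall_at_k(retrieval_log[q], rel, k=3) for q, rel in gold_set.items()]
print("mean recall@3:", sum(scores) / len(scores))  # alert when this drops below the agreed budget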

Conclusion: Beyond the Buzzwords

While Retrieval-Augmented Generation (RAG) has gained mainstream attention as the future of enterprise AI, real-world adoption reveals a wide gap between expectation and execution. From embedding inconsistencies and rigid vector schema constraints to LLM context bottlenecks and high operational latency, the challenges compound rapidly at production scale. Even in enterprise environments with access to high-token models, limitations around completeness, determinism, and runtime stability remain unresolved.

This doesn’t mean RAG is fundamentally flawed—it’s a powerful paradigm when paired with the right retrieval tuning, agent orchestration, hybrid pipelines, and system-level observability. But as engineers and architects, it’s time we shift the conversation from aspirational posts to grounded, production-aware designs. Until embedding models evolve to be truly domain-specific, vector systems allow dynamic schemas, and LLMs deliver predictable performance at scale, RAG should be treated not as a plug-and-play solution—but as a custom-engineered pipeline with domain, data, and budget constraints at its core

This blog represents the collective strength of our AI team—where collaboration, innovation, and expertise come together to create meaningful insights. It is a testament to the value SNP delivers through its AI practice, showcasing how we help our customers turn possibilities into impact.

Modernize your On-premises SQL Server Infrastructure by Utilizing Azure and Azure Data Studio

Data estates are becoming increasingly heterogeneous as data grows exponentially and spreads across data centers, edge devices, and multiple public clouds. In addition to the complexity of managing data across different environments, the lack of a unified view of all the assets, security and governance presents an additional challenge.

Leveraging the cloud for your SQL infrastructure has many benefits: cost reduction, improved productivity, and accelerated insights and decision-making can make a measurable impact on an organization's competitiveness, particularly in uncertain times. Meanwhile, infrastructure, servers, networking, and so on are all maintained by the cloud provider by default.

With SQL Server 2008 and 2012 reaching their end of life, it is advisable to upgrade them or migrate them to Azure cloud services. Modernizing any version of SQL Server to Azure brings many added benefits, including:

  • Azure PaaS provides 99.99% availability
  • Azure IaaS provides 99.95% availability
  • Extended security updates for 2008, 2012 servers
  • Backing up SQL Server running in Azure VMs is made easy with Azure Backup, a stream-based, specialized solution. The solution aligns with Azure Backup’s long-term retention, zero infrastructure backup, and central management features.

Tools leveraged

For modernizing the SQL infrastructure, SNP leveraged a variety of tools from Microsoft, such as the following.

  • The Azure Database Migration Service has been used since the beginning to modernize on-premises SQL servers. Using this tool, you can migrate your data, schema, and objects from multiple sources to Azure at scale, while simplifying, guiding, and automating the process.
  • Azure Data Studio is one of the newest tools for modernizing SQL infrastructure with an extension of Azure SQL Migration. It’s designed for data professionals who run SQL Server and Azure databases on-premises and in multi cloud environments.

Potential reference architecture diagram

Let’s take a closer look at the architecture, what components are involved and what is being done in Azure Data Studio to migrate or modernize the on-premises SQL infrastructure.

There are three main components involved in an Azure Data Studio migration or modernization: the source SQL Server (the on-premises SQL Server to be modernized or migrated), the destination server (the Azure SQL VM to which the on-premises SQL Server will be moved), and the staging layer (a storage account or network share folder) for the backup files. Backup files are a major component of the modernization.

Azure Data Studio and the Azure SQL Migration extension primarily rely on backup files: they use a full backup of the database as well as transactional log backups. Another important component is the staging layer, where the backup files are stored.

Azure Data Studio uses a network share folder, an Azure storage container, or an Azure file share. Backup files must be placed in a specific structure: as shown in the architecture below, the backup files for each database must be placed in their own folder or container.

As part of the migration to Azure, Azure Data Studio, along with the Azure SQL Migration extension, uses Azure Database Migration Service (DMS) as the core technology behind the scenes. DMS is integrated with Azure Data Factory, which runs a pipeline at regular intervals to copy backup files from the on-premises network share folder to Azure and restore them on the target, or to restore them directly if they are already in storage containers.

When the backup files are in a network share folder, Azure Data Studio uses a self-hosted integration runtime to establish a connection between on-premises and Azure. After the connection has been established, Azure Data Studio begins the modernization process leveraging Azure DMS.

Initially, the full backup and subsequent transactional log backup files of each database are placed in the specified database folder or container. If the backup files are in a network share folder, Azure Data Studio copies them to an Azure storage container and then restores them to the target Azure SQL VM or Azure SQL Managed Instance. If the backup files are already in a storage account, Azure Data Studio restores them directly from there to the Azure target.

Following the completion of the last log restoration on the target Azure SQL database, we need to cut over the database and bring it online on the target. The databases are placed in Restoring mode while the backup files are being restored, which means they cannot be accessed until the cutover has been completed.

Your next steps

If you like what you have read so far, let’s move forward together with confidence. We are here to help at every step. Contact SNP’s migration experts.

Modernize and Migrate your SQL Server Workloads with Azure

Modernizing and migrating SQL Server is just one of the many reasons why a company might want to migrate its data. Other common reasons may include mergers, hardware upgrades or moving to the cloud. In most cases, however, data migrations are associated with downtime, data loss, operational disruptions, and compatibility problems.

With SNP Technologies Inc., these concerns are alleviated, and the migration process is simplified. We help businesses migrate complete workloads seamlessly through real-time, byte-level replication and orchestration. For enhanced agility with little to no downtime, we can migrate data and systems between physical, virtual and cloud-based platforms.

When it comes to the modernization of SQL Server, we can migrate and upgrade your workload simultaneously. Production sources can be lower versions of SQL Server that are then upgraded to newer versions, for example, SQL 2008, 2008R2, and 2012 can be moved to a newer version of Windows and SQL or to Azure.

 

Some key benefits of modernizing or migrating your SQL workloads include:

  • Built-in high availability and disaster recovery for Azure SQL PaaS, with 99.99% availability
  • Automatic backups for Azure SQL PaaS services
  • High availability of 99.95% for Azure IaaS
  • The ability to leverage Azure automated backups or Azure Backup for SQL Server on Azure VMs

Listed below are the various steps SNP follows to migrate an on-premises SQL Server to Azure PaaS or IaaS:

  • Assessment to determine the most appropriate target and its Azure sizing.
  • A performance assessment will be conducted before the migration to determine potential issues with the modernization.
  • A performance assessment will be conducted post-migration to determine if there is any impact on performance.
  • Migration to the designated target.

 

As part of our modernization process, we utilize a variety of tools and services that Microsoft provides, described below.

Assessment with Azure Migrate to determine the most appropriate target and its Azure sizing:

Azure Migrate is a service in Azure that uses the Azure SQL assessment to assess the customer's on-premises SQL infrastructure. In Azure Migrate, all objects on the SQL Server are analyzed against the target (whether it's Azure SQL Database, Azure SQL Managed Instance, or SQL Server on Azure VM), and the target is calculated by considering performance parameters such as IOPS, CPU, memory, and cost, along with the appropriate Azure size. Following the assessment, SNP gets a better idea of what needs to be migrated, while the assessment report recommends the most appropriate migration solution.

This assessment generates four types of reports:

  • Recommended type: This compares all the available options and gives us the best fit. If the SQL Server is ready for all targets, it recommends the best fit considering factors like performance, cost, etc.
  • Recommendation of instances to Azure SQL MI: This indicates whether the SQL Server is ready for Managed Instance. If it is, a target recommendation size is provided; if there are issues with SQL MI, the report lists them along with their corresponding recommendations.
  • Recommendation of instances to Azure SQL VM: This assesses each individual instance and provides a suitable configuration specific to that instance.
  • Recommendation of servers to SQL Server on Azure VM: If the server is ready to move to SQL Server on Azure, it gives us the appropriate recommendation.

Our assessment checks for any post-migration performance impacts with Microsoft's Data Migration Assistant

To prepare for modernizing our SQL infrastructure to Azure, we need to know which objects will be impacted post-migration so we can plan the steps to take afterwards. A second assessment is performed using the Microsoft Data Migration Assistant (DMA) tool to identify all the objects that will be impacted after migration. The DMA categorizes these objects into five categories (four when modernizing to SQL Server on Azure VM).

Some key factors considered at this stage include:

  1. Breaking changes: These are changes that will impact the performance of a particular object. Following a migration, we will need to ensure that breaking changes are addressed.
  2. Behavior changes: These are changes that may impact query performance and should be addressed for optimal results.
  3. Informational issues: This information helps us identify issues that might affect the workload post-migration.
  4. Deprecated features: These are features that are going to be deprecated.
  5. Migration blockers: These are objects that will block the migration; they must either be removed prior to migration or changed as per the business requirements.

Note: Migration blockers are specific to the Modernization of SQL Server to Azure SQL PaaS

 

Modernization using Azure Data Studio:

Once we have an Azure target along with the Azure size and a list of affected objects, we can move on to modernization, where we migrate our SQL infrastructure to the Azure target. In this phase, the SQL infrastructure is modernized using a tool called Azure Data Studio, which uses an extension called Azure SQL Migration, leveraging the Azure Database Migration Service (Azure DMS).

In Azure Data Studio, you can perform a modernization of the SQL Server infrastructure using native SQL backups (the latest full backup as well as the transactional log backups taken since that backup). In this method, backup files of the SQL Server databases are copied and restored on the target. Using Azure Data Studio, we can automate the backup and restore process. All we must do is manually place the backup files into a shared network folder or Azure storage container so that the tool recognizes the backups and restores them automatically.

Post Migration:

Upon completion of modernization, all objects impacted by the modernization should be resolved for optimal performance. DMA provides information regarding all impacted objects and offers recommendations on how to address them.

Your Next Steps:

If you like what you’ve read so far, let’s move forward together with confidence. We’re here to help at every step. Contact SNP’s migration experts here

 

 

Azure Arc enabled Kubernetes for Hybrid Cloud Management — Manage Everything and Anywhere

Azure Arc-enabled Kubernetes extends Azure’s management capabilities to Kubernetes clusters running anywhere, whether in public clouds or on-premises data centers. This integration allows customers to leverage Azure features such as Azure Policy, GitOps, Azure Monitor, Microsoft Defender, Azure RBAC, and Azure Machine Learning.

Key features of Azure Arc-enabled Kubernetes include:

  1. Centralized Management: Attach and configure Kubernetes clusters from diverse environments in Azure, facilitating a unified management experience.
  2. Governance and Configuration: Apply governance policies and configurations across all clusters to ensure compliance and consistency.
  3. Integrated DevOps: Streamline DevOps practices with integrated tools that enhance collaboration and deployment efficiency.
  4. Inventory and Organization: Organize clusters through inventory, grouping, and tagging for better visibility and management.
  5. Modern Application Deployment: Enable the deployment of modern applications at scale across any environment.

In this blog, we will follow a step-by-step approach and learn how to:

1. Connect Kubernetes clusters running outside of Azure

2. GitOps – to define applications and cluster configuration in source control

3. Azure Policy for Kubernetes

4. Azure Monitor for containers

 

1. Connect Kubernetes clusters

Prerequisites

  • Azure account with an active subscription.
  • Identity – User or service principal
  • Latest Azure CLI
  • Extensions – connectedk8s and k8sconfiguration
  • An up-and-running Kubernetes cluster
  • Resource providers – Microsoft.Kubernetes, Microsoft.KubernetesConfiguration, Microsoft.ExtendedLocation

Create a Resource Group

Create a resource group using the command below, choosing your desired location. Azure Arc for Kubernetes supports most Azure regions; use the Azure products by region page to check the supported regions.

* az group create --name AzureArcRes -l EastUS -o table

For example: az group create --name AzureArcK8sTest --location EastUS --output table

Connect to the cluster with admin access and attach it with Azure Arc

We use the az connectedk8s connect CLI extension to attach our Kubernetes clusters to Azure Arc.

This command verifies connectivity to the Kubernetes cluster via the kube-config file ("~/.kube/config"), deploys the Azure Arc agents to the cluster in the "azure-arc" namespace, and installs Helm v3 to the .azure folder.

For this demonstration we connect and attach AWS Elastic Kubernetes Service (EKS) and Google Cloud Kubernetes Engine (GKE). Below, we step through the commands used to connect and attach each cluster.

 

AWS – EKS

* aws eks --region <Region> update-kubeconfig --name <ClusterName>

* kubectl get nodes


* az connectedk8s connect --name <ClusterName> --resource-group AzureArcRes


GCloud – GKE

* gcloud container clusters get-credentials <ClusterName> --zone <ZONE> --project <ProjectID>

* kubectl get no

* az connectedk8s connect --name <ClusterName> --resource-group AzureArcRes


Verify Connected Clusters

* az connectedk8s list -g AzureArcRes -o table


 

2. Using GitOps to define applications & clusters

We use the connected GKE cluster for our example to deploy a simple application.

Create a configuration to deploy an application to the Kubernetes cluster.
We use the "k8sconfiguration" extension to link our connected cluster to an example Git repository provided by SNP.

* export KUBECONFIG=~/.kube/gke-config

* az k8sconfiguration create \
  --name app-config \
  --cluster-name <ClusterName> --resource-group <YOUR_RG_NAME> \
  --operator-instance-name app-config --operator-namespace cluster-config \
  --repository-url https://github.com/gousiya573-snp/SourceCode/tree/master/Application \
  --scope cluster --cluster-type connectedClusters

Check to see that the namespaces, deployments, and resources have been created:

* kubectl get ns --show-labels

We can see that the cluster-config namespace has been created.

Azure Arc enabled Kubernetes

* kubectl get po,svc

The flux operator has been deployed to the cluster-config namespace, as directed by our sourceControlConfig. The application deployed successfully: we can see the pods are Running and the Service LoadBalancer IP has also been created.

Azure Arc enabled Kubernetes

Access the EXTERNAL-IP to see the output page:

Azure Arc enabled Kubernetes

Please Note:

Supported repository-url Parameters for Public & Private repos:

* Public GitHub Repo   –  http://github.com/username/repo  (or) git://github.com/username/repo

* Private GitHub Repo –  https://github.com/username/repo (or) git@github.com:username/repo

* For private repos, flux generates an SSH key and logs the public key as shown below:

Azure Arc enabled Kubernetes


3. Azure Policy for Kubernetes

Use Azure Policy to enforce that each Microsoft.Kubernetes/connectedclusters resource or Git-Ops enabled Microsoft.ContainerService/managedClusters resource has specific Microsoft.KubernetesConfiguration/sourceControlConfigurations applied on it.

Assign Policy:

To create the policy, navigate to the Azure portal and open Policy; in the Authoring section, select Definitions. Click Initiative definition to create the policy, search for gitops in the available definitions, and click the Deploy GitOps to Kubernetes clusters policy to add it. Select the subscription under Definition locations, and give the policy assignment a name and description.

Choose Kubernetes in the existing Category list and scroll down to fill in the configuration-related details of the application.

Azure Arc

Select the policy definition, click the Assign option above, and set the scope for the assignment. The scope can be at the Azure resource group or subscription level; then complete the other basic steps (assignment name, exclusions, remediation, etc.).

Click on Parameters and provide names for the Configuration resource, Operator instance, and Operator namespace; set the Operator scope to cluster level or namespace. The Operator type is Flux. Provide your application's GitHub repo URL (public or private) in the Repository Url field. Additionally, pass the operator parameters such as "--git-branch=master --git-path=manifests --git-user=your-username --git-readonly=false". Finally, click Save and confirm that the policy with the given name appears under Assignments.

Once the assignment is created the Policy engine will identify all connectedCluster or managedCluster resources that are located within the scope and will apply the sourceControlConfiguration on them.

Azure Arc

--git-readonly=false enables CI/CD for the repo and creates automatic releases for commits.

 

Azure Arc enabled Kubernetes

 

Verify a Policy Assignment

Go to the Azure portal and click on the connected cluster resource to check the compliance status. Compliant means the config-agent was able to successfully configure the cluster and deploy flux without error.

Azure Arc enabled Kubernetes

We can see the policy assignment that we created above, and the Compliance state should be Compliant.

Azure Arc

4. Azure Monitor for Containers

Azure Monitor for containers provides a rich monitoring experience for Azure Kubernetes Service (AKS) and AKS Engine clusters. It can be enabled for one or more existing deployments of Arc-enabled Kubernetes clusters using the Azure CLI, the Azure portal, or Azure Resource Manager.

Create an Azure Log Analytics workspace or use an existing one to configure the insights and logs. Use the command below to install the extension and configure it to report to the Log Analytics workspace.

* az k8s-extension create --name azuremonitor-containers --cluster-name <cluster-name> --resource-group <resource-group> --cluster-type connectedClusters --extension-type Microsoft.AzureMonitor.Containers --configuration-settings logAnalyticsWorkspaceResourceID=<armResourceIdOfExistingWorkspace>

It takes about 10 to 15 minutes to get the health metrics, logs, and insights for the cluster. You can check the status of the extension in the Azure portal or through the CLI; the extension status should show as "Installed".


We can also scrape and analyze Prometheus metrics from our cluster.

Clean Up Resources

To delete an extension:

* az k8s-extension delete --name azuremonitor-containers --cluster-type connectedClusters --cluster-name <cluster-name> --resource-group <resource-group-name>

To delete a configuration:

* az k8sconfiguration delete --name '<config name>' -g '<resource group name>' --cluster-name '<cluster name>' --cluster-type connectedClusters

To disconnect a connected cluster:

* az connectedk8s delete --name <cluster-name> --resource-group <resource-group-name>

 

Conclusion:

This blog provides an overview of Azure Arc-enabled Kubernetes, highlighting how SNP assists its customers in setting up Kubernetes clusters with Azure Arc for scalable deployment. It emphasizes the benefits of Azure Arc in managing Kubernetes environments effectively.

SNP offers subscription services to accelerate your Kubernetes journey, enabling the installation of production-grade Kubernetes both on-premises and in Microsoft Azure. For more information or to get assistance from SNP specialists, you can reach out through the provided contact options. Contact SNP specialists here.

Accelerate Innovation Across Hybrid & Multicloud Environments with Azure Arc

With the growing trend of multicloud and edge computing, organizations are increasingly finding themselves managing a diverse array of applications, data centers, and hosting environments. This heterogeneity presents significant challenges in managing, governing, and securing IT resources. To address these complexities, organizations need a robust solution that enables them to centrally inventory, organize, and enforce control policies across their entire IT estate, regardless of location.

SNP leverages Azure Arc and a hybrid approach to empower its customers to effectively manage resources deployed in both Azure and on-premises environments through a unified control plane. With Azure Arc, organizations can simplify their infrastructure management, making it easier to accelerate migration decisions driven by policies while ensuring compliance with regulatory requirements.

Microsoft Azure enables management of a variety of services deployed externally, including:

  • Windows and Linux servers: These can run on bare metal, virtual machines (VMs), or public cloud IaaS environments.
  • Kubernetes clusters: Organizations can manage their containerized applications seamlessly across different environments.
  • Data services: Azure Arc supports data services based on SQL Azure and PostgreSQL Hyperscale, allowing for consistent data management practices.
  • Microservices applications: Applications packaged and deployed as microservices running on Kubernetes can be easily monitored and managed through Azure Arc.

 

Hybrid Unified Management & How it Benefits your Business

Azure Arc involves deploying an agent on servers or on Kubernetes clusters for resources to be projected on the Azure Resource Manager. Once the initial connectivity is done, Arc extends governance controls such as Azure Policy and Azure role based access controls across a hybrid infrastructure. With Azure governance controls, we can have consistency across environments which helps enhance productivity and mitigate risks.

Some key benefits of Azure Arc include:

  • Azure Arc enabled solutions can easily expand into a Hybrid-cloud architecture as they are designed to run virtually anywhere.
  • Azure Arc data includes technical and descriptive details, along with compliance and security policies.
  • Enterprises can use Azure Security Center to ensure compliance of all resources registered with Azure Arc, irrespective of where they are deployed. They can quickly patch the operating systems running in VMs as soon as a vulnerability is found. Policies can be defined once and automatically applied to all resources across Azure, the data center, and even VMs running in other cloud platforms.
  • All the resources registered with Azure Arc send their logs to the central, cloud-based Azure Monitor. This is a comprehensive approach to deriving insights for highly distributed and disparate infrastructure environments.
  • Leveraging Azure Automation, routine to advanced maintenance operations across public, hybrid, or multi-cloud environments can be performed effortlessly.

 

Azure services that support management and governance of other cloud platforms include:

  • Azure Active Directory
  • Azure Monitor
  • Azure Policy
  • Azure Log Analytics
  • Azure Security Center/Defender
  • Azure Sentinel

 

Unified Kubernetes Management

With AKS and Kubernetes, Azure Arc provides the ability to deploy and configure Kubernetes applications in a consistent manner across all environments, adopting modern DevOps techniques. This offers:

Flexibility

  • Container platform of your choice with out-of-the-box support for most Cloud native applications.
  • Used across Dev, Test and Production Kubernetes clusters in your environment.

Management

  • Inventory, organise and tag Kubernetes clusters.
  • Deploy apps and configuration as code using GitOps.
  • Monitor and Manage at scale with policy-based deployment.

Governance and security

  • Built in Kubernetes Gatekeeper policies.
  • Apply consistent security configuration at scale.
  • Consistent cluster extensions for Monitor, Policy, Security, and other agents

Role-based access control

  • Central IT based at-scale operations.
  • Management by workload owner based on access privileges.

Leveraging GitOps

  • Azure Arc also lets us organize, view, and configure all clusters in Azure (like Azure Arc enabled servers) uniformly, with GitOps (Zero touch configuration).
  • In GitOps, the configurations are declared and stored in a Git repo, and Arc agents running on the cluster continuously monitor this repo for updates or changes and automatically pull down these changes to the cluster.
  • We can use cloud-native tools and practices with GitOps to apply configuration and app deployments to one or more clusters at scale.

 

Azure Arc Enabled Data Services

Azure Arc makes it possible to run Azure data services on-premises, at the edge, and in third-party clouds, using Kubernetes on hardware of our choice.

Arc can bring cloud elasticity on-premises so you can optimize performance of your data workloads with the ability to dynamically scale, without application downtime. By connecting to Azure, one can see all data services running on-premises alongside those running in Azure through a single pane of glass, using familiar tools like Azure Portal, Azure Data Studio and Azure CLI.

Azure Arc enabled data services can run Azure PostgreSQL or SQL Managed Instance in any supported Kubernetes environment in AWS or GCP, just the way they would run in an on-prem environment.

With Azure Arc, organizations can achieve the following overall business objectives for their hybrid architectures:

  • Standardization of operations and procedures
  • Organization of resources
  • Regulatory Compliance and Security
  • Cost Management
  • Business Continuity and Disaster Management

 

For more on how you can revolutionize the management and development of your hybrid environments with Azure Arc, contact SNP here.

SNP’s Managed Detect & Response Services Powered by Microsoft Sentinel & Defenders (MXDR)

SNP's Managed Detection and Response (MDR) for Microsoft Sentinel service brings integrations with Microsoft services like Microsoft Defender (MXDR), threat intelligence, and the customer's hybrid and multi-cloud infrastructure to monitor, detect, and respond to threats quickly. With our managed security operations team, SNP's threat detection experts help identify, investigate, and provide high-fidelity detection through ML-based threat modelling for your hybrid and multicloud infrastructure.

SNP’s MXDR Services Entitlements:

SNP’s Managed services security framework brings the capability of centralized security assessment for managing your on-premises or cloud infrastructure, where we offer:

 

Leveraging SNP’s security model below, we help our customers:

  • Build their infrastructure and applications with cloud-native protection throughout their cloud application lifecycle.
  • With defined workflows, customers get the ease of separating duties in entitlements management to protect against governance and compliance challenges.
  • Data security is prioritized to protect sensitive data from different data sources to the point of consumption.
  • With Azure Sentinel, we consolidate and automate telemetry across attack surfaces while orchestrating workflows and processes to speed up response and recovery.

 

SNP’s Managed Extended Detection & Response (MXDR) Approach:

Our 6-step incident response approach helps our customers maintain, detect, respond, notify, investigate, and remediate cyberthreats as shown below:

 

For more on SNP’s Managed Detect & Response Services Powered by Microsoft Sentinel & Defenders (MXDR), contact our security experts here.

Bring your Data Securely to the Cloud by Implementing Column Level security, Row Level Security & Dynamic Data Masking with Azure Synapse Analytics

Azure Synapse Analytics from Microsoft is a limitless analytics service that brings together data integration, enterprise data warehousing, and big data analytics. SNP helps its customers migrate their legacy data warehouse solutions to Azure Synapse Analytics to gain the benefits of an end-to-end analytics platform that provides high availability, security, speed, scalability, cost savings, and industry-leading performance for enterprise data warehousing workloads.

A common business scenario we cover:

As organizations scale, data grows exponentially. And with the workforce working remotely, data protection is one of the primary concerns of organizations around the world today. There are several high-level security best practices that every enterprise should adopt to protect their data from unauthorized access. Here are our recommendations to help you prevent unauthorized data access.

The SNP solution:

With Azure Synapse Analytics, SNP provides its customers enhanced security with column level security, row-level security & dynamic data masking.

Below is an example of sample table data used to implement column-level security, row-level security, and dynamic data masking for your data.

Revenue table:

Azure Synapse Security

Codes:

Step 1: Create users

create user [CEO] without login;

create user [US Analyst] without login;

create user [WS Analyst] without login;

 

Column Level Security

A column-level security feature in Azure Synapse simplifies the design and coding of security in applications. It ensures column-level security by restricting column access to protect sensitive data.

In this scenario, we will be working with two users. The first one is the CEO, who needs access to all company data. The second one is an Analyst based in the United States, who does not have access to the confidential Revenue column in the Revenue table.

Follow this lab one step at a time to see how column-level security removes the US Analyst's access to the Revenue column.

 

Step 2: Verify the existence of the "CEO" and "US Analyst" users in the data warehouse.

SELECT Name as [User1] FROM sys.sysusers WHERE name = N'CEO';

SELECT Name as [User2] FROM sys.sysusers WHERE name = N'US Analyst';

 

Step 3: Now let us enforce column-level security for the US Analyst.

The revenue table in the warehouse has information like Analyst, CampaignName, Region, State, City, RevenueTarget, and Revenue. The Revenue generated from every campaign is classified and should be hidden from US Analysts.

REVOKE SELECT ON dbo.Revenue FROM [US Analyst];

GRANT SELECT ON dbo.Revenue([Analyst], [CampaignName], [Region], [State], [City], [RevenueTarget]) TO [US Analyst];

The security feature has now been enforced: running the following query as the 'US Analyst' user results in an error, because the US Analyst does not have access to the Revenue column, while the query succeeds when the Revenue column is not included.

Row Level Security

Row-level Security (RLS) in Azure Synapse enables us to use group membership to control access to rows in a table. Azure Synapse applies the access restriction every time data access is attempted from any user.

In this scenario, the Revenue table has two analysts, US Analyst and WS Analyst. Each analyst has jurisdiction over a specific region (for example, the US Analyst covers the South East region), and an analyst should only see data from their own region. In the Revenue table, there is an Analyst column that we can use to filter data to a specific analyst value.

SELECT DISTINCT Analyst, Region FROM dbo.Revenue order by Analyst ;

Review any existing security predicates in the database

SELECT * FROM sys.security_predicates

 

Step 1:

Create a new Schema to hold the security predicate, then define the predicate function. It returns 1 (or True) when a row should be returned in the parent query.

CREATE SCHEMA Security

GO

CREATE FUNCTION Security.fn_securitypredicate(@Analyst AS sysname)

RETURNS TABLE

WITH SCHEMABINDING

AS

RETURN SELECT 1 AS fn_securitypredicate_result

WHERE @Analyst = USER_NAME() OR USER_NAME() = 'CEO'

GO

Step 2:

Now we define a security policy that adds the filter predicate to the Revenue table. This will filter rows based on the user's login name.

CREATE SECURITY POLICY SalesFilter 

ADD FILTER PREDICATE Security.fn_securitypredicate(Analyst)

ON dbo.Revenue

WITH (STATE = ON);

Grant SELECT permissions on the Revenue table.

GRANT SELECT ON dbo.Revenue TO CEO, [US Analyst], [WS Analyst];

 

Step 3:

Let us now test the filtering predicate by selecting data from the Revenue table as the 'US Analyst' user.

As we can see, the query has returned rows only for the US Analyst login; row-level security is working.

Azure Synapse Security

Azure Synapse Security

Dynamic Data Masking

Dynamic data masking helps prevent unauthorized access to sensitive data by enabling customers to designate how much of the sensitive data to reveal with minimal impact on the application layer. DDM can be configured on designated database fields to hide sensitive data in the result sets of queries. With DDM the data in the database is not changed. Dynamic data masking is easy to use with existing applications since masking rules are applied in the query results. Many applications can mask sensitive data without modifying existing queries.

In this scenario, we have identified some sensitive information in the Customer table. The customer would like us to obfuscate the CreditCard and Email columns of the Customer table from data analysts.

Let us take the below customer table:

We confirmed that no masking is enabled as of now:

Azure Synapse Security

Let us now configure masking for the credit card and email information.

Step 1:

Now let us mask the ‘CreditCard’ and ‘Email’ Column of the ‘Customer’ table.

ALTER TABLE dbo.Customer 

ALTER COLUMN [CreditCard] ADD MASKED WITH (FUNCTION = 'partial(0,"XXXX-XXXX-XXXX-",4)');

GO

ALTER TABLE dbo.Customer

ALTER COLUMN Email ADD MASKED WITH (FUNCTION = 'email()');

GO

 

Now, the results show masking enabled for data:

Executing the query as user 'US Analyst', the data in both columns is now masked:

Unmask data:

Azure Synapse Security

Conclusion:

From the above samples, SNP has shown how column level security, row level security & dynamic data masking can be implemented in different business scenarios. Contact SNP Technologies for more information.

Top 5 FAQs on Operationalizing ML Workflow using Azure Machine Learning

Enterprises today are adopting artificial intelligence (AI) at a rapid pace to stay ahead of their competition, deliver innovation, improve customer experiences, and grow revenue. However, the challenge with such integrations is that the development, deployment, and monitoring of these models differ from the traditional software development lifecycle that many enterprises are already accustomed to.

Leveraging AI and machine learning applications, SNP helps bridge the gap between the existing state and the ideal state of how things should function in a machine learning lifecycle to achieve scalability, operational efficiency, and governance.

SNP has put together a list of the top 5 challenges enterprises face in the machine learning lifecycle and how SNP leverages Azure Machine Learning to help your business overcome them.

Q1. How much investment is needed on hardware for data scientists to run complex deep learning algorithms?

By leveraging an Azure Machine Learning workspace, data scientists can use the same hardware virtually at a fraction of the price. The best part about these virtual compute resources is that businesses are billed based on the resources consumed during active hours, thereby reducing the chances of unnecessary billing.

Q2: How can data scientists manage redundancy when it comes to training segments and rewriting existing or new training scripts that involve collaboration among multiple data scientists?

With Azure data pipelines, data scientists can create a model training pipeline consisting of multiple loosely coupled segments that are reusable in other training pipelines. Data pipelines also allow multiple data scientists to collaborate on different segments of the training pipeline simultaneously, and later combine their segments to form a consolidated pipeline.

Q3. A successful machine learning lifecycle involves a data scientist finding the best-performing model through multiple iterative processes. Each process involves manual versioning, which results in inaccuracies during deployment and auditing. So how can data scientists best manage version control?

The Azure Machine Learning workspace can prove to be a very useful tool in such cases. It tracks the performance metrics and functional metrics of each run to provide the user with a visual interface on model performance during training. It can also be leveraged to register models developed in the Azure Machine Learning workspace, or models developed on your local machines, for versioning. Versioning done through the Azure Machine Learning workspace makes the deployment process simpler and faster.
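For example, registering a trained model so that every run produces a new, auditable version might look like the sketch below, using the v1 azureml-core SDK; the workspace config, file path, and names are placeholders, and the newer azure-ai-ml SDK offers an equivalent flow.

from azureml.core import Workspace, Model

ws = Workspace.from_config()  # reads the config.json downloaded from the workspace

# Each call registers a new version under the same model name,
# so deployments can pin to a specific, auditable version.
model = Model.register(
    workspace=ws,
    model_path="outputs/churn_model.pkl",  # local file or run output (placeholder)
    model_name="churn-classifier",
    tags={"framework": "sklearn", "stage": "experiment"},
)
print(model.name, model.version)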

Q4. One of the biggest challenges while integrating the machine learning model with an existing application is the tedious deployment process which involves extensive manual effort. So how can data scientists simplify the packaging and model deployment process?

Using Azure Machine Learning, data scientists and app developers can easily deploy Machine Learning models almost anywhere. Machine Learning models can be deployed as a standalone endpoint or embedded into an existing app or service or to Azure IoT Edge devices.

Q5. How can data scientists automate the machine learning process?

A data scientist's job is not complete once the machine learning model is integrated into the app or service and deployed successfully. It has to be closely monitored in a production environment to check its performance, and it must be re-trained and re-deployed once there is a sufficient quantity of new training data or when there are data discrepancies (when actual data is very different from the data your model was trained on and is affecting your model's performance).

Azure Machine Learning can be used to trigger a re-deployment when your Git repository has a new code check-in. Azure Machine Learning can also be used to create a re-training pipeline to take new training data as input to make an updated model. Additionally, Azure Machine Learning provides alerts and log analytics to monitor and govern the containers used for deployment with a drag-drop graphical user interface to simplify the model development phase.

Start building today!

SNP is excited to bring you machine learning and AI capabilities to help you accelerate your machine learning lifecycle, from new productivity experiences that make machine learning accessible to all skill levels, to robust MLOps and enterprise-grade security, built on an open and trusted platform helping you drive business transformation with AI. Contact SNP here.