Re-imagining Semantic Search Inside Power BI

The Hidden Cost of “Simple” Search Apps

Many teams we talk to already use Azure AI Search. It’s a powerful service for making text and documents searchable with semantic and vector search.

But here’s the pattern we see over and over:

  • A new web app is built (maybe Streamlit, maybe a custom React app).
  • It’s hosted on Azure App Service or VMs.
  • It duplicates authentication, hosting, monitoring, DevOps pipelines…
  • And at the end of the day, users just get a search box + results table.

The business value is real, but the delivery is complex and costly for what it achieves.

Imagine if your users could:

  • Type a search query,
  • Get semantic results with highlights and summaries,
  • And see them right inside the Power BI dashboards they already use every day.

No new app.
No separate portal.
No extra infra to maintain.

Just a familiar search box in Power BI, powered by Azure AI Search + AI summarization.

Also note that this approach requires no PowerApps or Power Automate — it runs entirely within Power BI.
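To make this concrete, here is a minimal sketch of the kind of semantic query the report issues against Azure AI Search. The service name, index name, fields, and semantic configuration are placeholders, and in Power BI the same request is typically made through Power Query (Web.Contents) rather than Python.

import requests

# Placeholder values: substitute your own search service, index, key, and semantic configuration.
SERVICE = "https://<your-search-service>.search.windows.net"
INDEX = "documents-index"
API_KEY = "<query-key>"

def semantic_search(query, top=10, skip=0):
    """Run a semantic query and return captions/answers suitable for a report table."""
    url = f"{SERVICE}/indexes/{INDEX}/docs/search?api-version=2023-11-01"
    body = {
        "search": query,
        "queryType": "semantic",
        "semanticConfiguration": "default",  # assumed configuration name
        "captions": "extractive",            # highlighted snippets
        "answers": "extractive",             # short extractive answers
        "top": top,
        "skip": skip,                        # paging support
    }
    resp = requests.post(url, json=body, headers={"api-key": API_KEY})
    resp.raise_for_status()
    return resp.json()["value"]

for doc in semantic_search("refund delays for premium customers"):
    captions = doc.get("@search.captions") or [{}]
    print(doc.get("id"), "-", captions[0].get("text", ""))

The highlights and extractive answers returned here are what surface as summaries alongside the native Power BI filters, paging, and export options described in the next section.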

What It Includes

The Power BI report provides:

  • Free text input for users to type search queries.
  • Native Power BI filtering (date, region, product, etc.) alongside semantic search results.
  • Paging support to navigate through large result sets.
  • Export options to Excel, CSV, PDF, and more.
  • Multiple report pages to query across different indexes, documents or datasets.
  • Power BI native authentication to the report.

High-Impact Use Cases

Organizations like yours can quickly benefit from this approach for scenarios such as:

  • Customer Service: Mine complaint text for themes (refunds, delivery delays, product defects).
  • Compliance & Legal: Surface contract clauses or policy excerpts directly in dashboards.
  • Ops & IT: Search across incident logs and root cause notes.
  • HR & Internal Comms: Make policies instantly discoverable by employees.

All without building another app that IT must support.

Why It Matters

  • Cost savings: no Azure App Service, no custom UI hosting, no redundant auth.
  • User adoption: everyone already knows Power BI. No training needed.
  • Speed: what used to take weeks of app dev can now be delivered in days.

Where This Approach Fits Best

This design is intentionally simple and focused. It excels when you need:

  • A single query to retrieve relevant results.
  • Clear insights (summaries, highlights, tags) displayed directly in Power BI.
  • Seamless integration with existing dashboards and metrics.

It’s built for search + analytics, not for chatbot-style experiences.

So, if your scenario requires things like:

  • Conversational Q&A with follow-up questions,
  • Multi-turn history or context retention,
  • Uploading and reasoning over new documents during query,

then those needs are better served by a different architecture.

Think of this as “one search → one set of insights → shown in Power BI” — fast, clean, and highly effective for dashboards.

Your Next Step Toward Smarter Search

The exciting part here isn't exotic technology. It's that the approach is far simpler than many expect, and that simplicity makes it faster, cheaper, and easier to adopt.

If you’re juggling multiple initiatives, this approach is lightweight and ideal for proving value before scaling further. You don’t need to build a full-blown application — instead, you can get something off the ground in days, not weeks. The sooner you see it running on your own data, the faster you’ll recognize its value.

Bring us your use case, and we'll show you results fast (POC or implementation). You can skip the app dev overhead and light up search inside Power BI directly. You'll be surprised how simple (and cost-effective) it can be.

Automating Backup, Retention, And Restoration For Lakehouse In Microsoft Fabric

Data resilience is a cornerstone of modern analytics platforms. In Microsoft Fabric, maintaining backups and implementing automated policies for retention and restoration can elevate data management.

While Fabric is a robust platform, its disaster recovery (DR) capability is not designed to address operational issues like data refresh failures or accidental deletions, which makes an automated approach necessary to bridge the gap and ensure operational continuity.

Effective backup, retention, and restoration strategies are essential to maintaining a reliable data platform, particularly in scenarios involving refresh failures or data corruption.

Note: This is not a substitute for the disaster recovery features of Microsoft Fabric, but a complementary approach to enhance resilience, streamline restoration processes, and minimize downtime through automation and proactive configurations.

Here’s an overview of setting up, configuring, and automating these processes while addressing challenges and their solutions.

Setting Up Backup and Retention Policies

Microsoft Fabric’s Lakehouse and OneLake provide unique capabilities for handling data. Backing up data involves:

  • Daily Incremental Backups: Ensuring minimal data loss by creating daily snapshots.
  • Retention Policy Configuration: Establishing tiers like daily, weekly, monthly, and yearly retention to balance storage costs and compliance.
  • Automation with Notebooks: Using Fabric notebooks to schedule backups and enforce retention policies, such as retaining the last 7 daily backups or 6 monthly backups and cleaning up obsolete ones.

Automation Highlights:

  1. Backup Creation: Scheduled scripts create snapshots at specific intervals. For example, Spark jobs can efficiently copy data using APIs like mssparkutils (see the notebook sketch after this list).
  2. Retention Enforcement: A policy-driven approach automatically removes outdated backups while preserving critical ones for auditing or recovery.
  3. Logging and Monitoring: Every backup, cleanup, and restoration action is logged to ensure transparency and auditability.
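As a minimal sketch of the backup-creation and retention steps above, a Fabric notebook cell could look like the following. The Lakehouse paths, folder naming, and retention count are assumptions to adapt to your workspace.

from datetime import datetime
from notebookutils import mssparkutils  # available by default in Fabric notebooks

# Assumed OneLake paths; replace with your own Lakehouse ABFS paths.
SOURCE_TABLES = "abfss://<workspace>@onelake.dfs.fabric.microsoft.com/<lakehouse>.Lakehouse/Tables"
BACKUP_ROOT = "abfss://<workspace>@onelake.dfs.fabric.microsoft.com/<backup_lakehouse>.Lakehouse/Files/backups"
DAILY_RETENTION = 7  # keep the last 7 daily snapshots

def create_daily_backup():
    # Copy all tables into a date-stamped snapshot folder (True = recursive copy).
    snapshot = f"{BACKUP_ROOT}/daily_{datetime.utcnow():%Y%m%d}"
    mssparkutils.fs.cp(SOURCE_TABLES, snapshot, True)
    print(f"Backup created: {snapshot}")

def enforce_retention():
    # Delete daily snapshots beyond the retention window, oldest first, logging each removal.
    snapshots = sorted(f.path for f in mssparkutils.fs.ls(BACKUP_ROOT) if f.name.startswith("daily_"))
    for old in snapshots[:-DAILY_RETENTION]:
        mssparkutils.fs.rm(old, True)
        print(f"Removed expired backup: {old}")

create_daily_backup()
enforce_retention()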

Restoration: Recovering from Data Loss

Fabric allows for full or selective restoration of data from backups. Restoration tasks involve:

  • Restoring the entire Lakehouse or specific tables from a backup.
  • Using structured logs to identify and resolve errors during the restoration process.
  • Minimizing downtime by enabling rapid data recovery with scripts or automation tools.

Why Automate Backup and DR in Microsoft Fabric?

Automation mitigates risks and improves efficiency:

  • Data Integrity: Automated backups ensure all critical data is consistently safeguarded.
  • Operational Continuity: Quick restoration scripts minimize business downtime.
  • Cost Optimization: Automating cleanup eliminates outdated backups, reducing unnecessary storage expenses.
  • Scalability: Structured policies can accommodate growing datasets without additional manual effort.

Conclusion

While Microsoft Fabric is a promising data platform, addressing data corruption and accidental deletion challenges requires a proactive and automated approach. By leveraging our automation for backup, retention, cleanup, and restoration, organizations can safeguard data and ensure business continuity, delivering significant value for the business.

RAG in the Real World: Why Scalable AI Needs More Than Just Retrieval and Prompts

Executive Summary

Retrieval-Augmented Generation (RAG) has quickly become one of the most hyped terms in enterprise AI. It is typically showcased over a handful of PDFs, which makes it look easy and simple. But operating at enterprise scale, with lakhs (hundreds of thousands) of records, low-latency retrieval, strict accuracy, and predictable costs, is hard. Real limits show up in four places: embeddings, vector indexes (e.g., Azure AI Search), retrieval/filters, and LLMs themselves. You'll need hybrid architectures, careful schema/ops, observability, and strict cost controls to get beyond prototypes.

Embedding Challenges at Scale

Cost & volume. Embedding large datasets quickly runs into scale issues. Even a moderately sized corpus—each record carrying a few hundred to thousands of tokens—translates into tens of millions of tokens overall. The embedding phase alone can run into significant dollar costs per cycle, and this expense only grows as data is refreshed or re-processed.

Throughput & reliability. APIs enforce token and requests-per-minute (RPM) limits. Production pipelines need the following (a minimal sketch follows this list):

  • Token-aware dynamic batching
  • Retry with backoff + resume checkpoints
  • Audit logs for failed batches
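A minimal sketch of token-aware batching with retry and backoff is shown below. The batch limit is illustrative, count_tokens stands in for a real tokenizer, and embed_batch is a placeholder for whatever embedding API you call (Azure OpenAI or otherwise).

import time

MAX_TOKENS_PER_BATCH = 8000  # illustrative; align with your provider's quota
MAX_RETRIES = 5

def count_tokens(text):
    # Stand-in for a real tokenizer (e.g., tiktoken) matched to your embedding model.
    return len(text.split())

def embed_batch(texts):
    # Placeholder for the real embedding API call; returns one vector per input text.
    return [[0.0, 0.0, 0.0] for _ in texts]

def flush(batch):
    # Retry transient failures (429s, timeouts) with exponential backoff before giving up.
    for attempt in range(MAX_RETRIES):
        try:
            return list(zip(batch, embed_batch([r["text"] for r in batch])))
        except Exception as exc:
            wait = 2 ** attempt
            print(f"Batch failed ({exc}); retrying in {wait}s")
            time.sleep(wait)
    raise RuntimeError("Batch permanently failed; record it in the audit log and resume later")

def embed_corpus(records):
    # Pack records into token-aware batches and yield (record, embedding) pairs.
    batch, batch_tokens = [], 0
    for rec in records:
        tokens = count_tokens(rec["text"])
        if batch and batch_tokens + tokens > MAX_TOKENS_PER_BATCH:
            yield from flush(batch)
            batch, batch_tokens = [], 0
        batch.append(rec)
        batch_tokens += tokens
    if batch:
        yield from flush(batch)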

Chunking trade-offs.

  • Too fine → semantic context is lost
  • Too coarse → token bloat + noisy matches

Use header/paragraph-aware chunking with small overlaps; expect to tune per source type. A rough sketch of such a chunker follows.
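The sketch below treats blank lines as paragraph boundaries; the size and overlap values are illustrative and usually need tuning per source type.

def chunk_paragraphs(text, max_chars=1500, overlap=200):
    # Split on blank lines, pack paragraphs into chunks, and carry a small overlap across boundaries.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        if current and len(current) + len(para) > max_chars:
            chunks.append(current)
            current = current[-overlap:]  # the overlap preserves context across the boundary
        current = (current + "\n\n" + para).strip()
    if current:
        chunks.append(current)
    return chunks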

Domain relevance gaps. General-purpose embeddings miss subtle, domain-specific meaning (biomedical, legal, financial). Dimensionality isn’t the cure; domain representation is. Without specialization, recall will feel “lexical” rather than truly semantic.

Vector Databases: Strengths… and Constraints for RAG

Immutable schema. Once an index is created, fields can’t be changed. Adding a new filterable/tag field → recreate the index.

Full reloads. Schema tweaks or chunking updates often require re-embedding and re-indexing everything—expensive and time-consuming (parallelism will still hit API quotas).

Operational sprawl. Multiple teams/use cases → multiple indexes → fragmented pipelines and higher latency. Unlike a DB with views/joins, AI Search pushes you toward rigid, static definitions.
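For example, adding a new filterable field means defining a fresh index and reloading it rather than altering the old one in place. A minimal sketch via the REST API is shown below; the service name, index name, fields, and api-version are placeholders, and vector field configuration is omitted for brevity.

import requests

SERVICE = "https://<your-search-service>.search.windows.net"
API_KEY = "<admin-key>"

# The extra "region" filterable field cannot be added to the existing index in place,
# so a new index is defined and every document must be re-indexed into it.
index_definition = {
    "name": "documents-index-v2",
    "fields": [
        {"name": "id", "type": "Edm.String", "key": True},
        {"name": "content", "type": "Edm.String", "searchable": True},
        {"name": "region", "type": "Edm.String", "filterable": True, "facetable": True},
    ],
}

resp = requests.put(
    f"{SERVICE}/indexes/{index_definition['name']}?api-version=2023-11-01",
    json=index_definition,
    headers={"api-key": API_KEY},
)
resp.raise_for_status()
# ...then re-run the embedding and upload pipeline against documents-index-v2.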

(Reference: Microsoft docs on Azure AI Search vector search and integrated vectorization.)

Retrieval & Filtering Limits

Shallow top-K. Even at top-K, relevant items can fall just outside the cut-off. In regulated domains, a single miss matters.

Context window pressure. As a best practice, send only the required fields to the model, then join externally on the predicted answers using key identifiers like an ID to finish the answer. (In our scenario we sent 1-2 columns, the ID and description, out of the 25 columns in the table.) A minimal sketch of this pattern follows.
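In the sketch below (placeholder service, index, field, and file names), only the ID and description travel to the LLM, and the predicted IDs are joined back to the full table outside the prompt.

import pandas as pd
import requests

SERVICE = "https://<your-search-service>.search.windows.net"
INDEX = "records-index"
API_KEY = "<query-key>"

def retrieve_slim(query, top=50):
    # Ask the index for just the key and description columns -- enough for the LLM to reason over.
    body = {"search": query, "select": "id,description", "top": top}
    resp = requests.post(
        f"{SERVICE}/indexes/{INDEX}/docs/search?api-version=2023-11-01",
        json=body,
        headers={"api-key": API_KEY},
    )
    resp.raise_for_status()
    return pd.DataFrame(resp.json()["value"])

full_table = pd.read_parquet("records.parquet")  # placeholder for the table with all 25 columns
slim_hits = retrieve_slim("late shipment penalties")
predicted_ids = ["R-101", "R-205"]  # illustrative IDs extracted from the LLM's answer
answer_rows = full_table[full_table["id"].isin(predicted_ids)]  # the external join completes the answer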

Filter logic ceiling. Basic metadata filters work, but you’ll miss nested conditions, dynamic role-based filters, and cross-field joins that are trivial in SQL.

LLM Limitations You Will Hit

Input limits. Even with large-context models (450k+ tokens), we observed that certain records, particularly those in lower relevance ranges, may be implicitly skipped during reasoning. This behaviour is non-deterministic and poses serious challenges for enterprise-grade tasks where every data point is critical.

Output limits. Many models cap useful output (e.g., 2k–16k tokens). This affects multi-record responses, structured summaries, and complex decision-making outputs, leading to truncated or incomplete responses, particularly when returning JSON, tables, or lists.

Latency & variance. Complex prompts over hundreds of records can take minutes. Stochastic ranking creates run-to-run differences—tough for enterprise SLAs.

Concurrency & quotas. Enterprises face quota exhaustion due to shared token pools; concurrent usage by multiple users or batch agents can quickly consume limits. Smaller organizations with access to 8K-input / 2K-output token models (e.g., LLaMA, Mistral) face even tighter ceilings, making RAG challenging beyond pilot projects.

Productionisation, Observability — and Evaluation That Actually Matters

While most RAG demos end at a “right-looking” answer, real deployments must be observable, traceable, resilient, and continuously evaluated. GenAI pipelines often underinvest in these layers, especially with multiple async retrievals and LLM hops.

Operational pain points

  • Sparse/unstructured LLM logs → hard to reproduce issues or inspect reasoning paths.
  • Thin vector/AI Search telemetry → silent filter failures or low-recall cases go unnoticed.
  • Latency tracing across hybrids (SQL + vector + LLM) is messy without end-to-end spans.
  • Failure isolation is non-trivial: embed vs ranker vs LLM vs truncation?

What good observability looks like

  • End-to-end tracing (embed → index → retrieve → rerank → prompt → output).
  • Structured logs for: retrieval sets & scores, prompt/response token usage, cost, latency, confidence, and final joins.
  • Quality gates and alerts on recall@K, latency budgets, cost per query, and hallucination/citation signals.

Evaluation: beyond generic metrics

  • Partner with domain experts to define what “good” means. Automatic scores alone aren’t enough in regulated or domain-heavy settings.
  • Build a domain ground-truth set (gold + “acceptable variants”) curated by SMEs; refresh it quarterly.
  • Establish a human-in-the-loop process: double-blind SME review on sampled traffic; escalate low-confidence or low-evidence answers by policy.
  • Maintain an error taxonomy (missed retrieval, wrong join, truncation, unsupported query, hallucination) with severity labels; track trends over time.
  • Run canaries/A-B tests in prod; compare quality, latency, cost, and SME acceptance before full rollout.
  • Log evaluation metadata (query id, versioned index, chunking config, model/runtime version) so you can pinpoint regressions.

In practice (our pattern)

  • We co-defined a gold set with SMEs and require evidence-backed answers; low-evidence responses are auto-routed to review.
  • We track recall@K + citation coverage for retrieval, and field-level precision/recall for extraction tasks, alongside cost/latency dashboards.

Bottom line: Without observability and domain-grounded evaluation, production RAG stays brittle and opaque. With them, you get a system you can debug, trust, and scale, not just a demo that looks good once.
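As a concrete illustration of one such quality gate, a minimal recall@K check against an SME-curated gold set might look like the sketch below; the data structures are illustrative.

def recall_at_k(retrieved_ids, relevant_ids, k):
    # Fraction of SME-labelled relevant documents that appear in the top-k retrieved set.
    if not relevant_ids:
        return None  # skip queries that have no gold labels yet
    hits = len(set(retrieved_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)

# gold_set: query id -> SME-curated relevant document ids (illustrative values)
gold_set = {"q1": ["d3", "d7"], "q2": ["d1"]}
# retrieval_log: query id -> ranked document ids returned by the pipeline
retrieval_log = {"q1": ["d7", "d2", "d3", "d9"], "q2": ["d4", "d5", "d9"]}

scores = [recall_at_k(retrieval_log[q], rel, k=3) for q, rel in gold_set.items()]
print("mean recall@3:", sum(scores) / len(scores))  # alert when this drops below the agreed budget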

Conclusion: Beyond the Buzzwords

While Retrieval-Augmented Generation (RAG) has gained mainstream attention as the future of enterprise AI, real-world adoption reveals a wide gap between expectation and execution. From embedding inconsistencies and rigid vector schema constraints to LLM context bottlenecks and high operational latency, the challenges compound rapidly at production scale. Even in enterprise environments with access to high-token models, limitations around completeness, determinism, and runtime stability remain unresolved.

This doesn’t mean RAG is fundamentally flawed—it’s a powerful paradigm when paired with the right retrieval tuning, agent orchestration, hybrid pipelines, and system-level observability. But as engineers and architects, it’s time we shift the conversation from aspirational posts to grounded, production-aware designs. Until embedding models evolve to be truly domain-specific, vector systems allow dynamic schemas, and LLMs deliver predictable performance at scale, RAG should be treated not as a plug-and-play solution—but as a custom-engineered pipeline with domain, data, and budget constraints at its core

This blog represents the collective strength of our AI team—where collaboration, innovation, and expertise come together to create meaningful insights. It is a testament to the value SNP delivers through its AI practice, showcasing how we help our customers turn possibilities into impact.

Modernize your On-premises SQL Server Infrastructure by Utilizing Azure and Azure Data Studio

Data estates are becoming increasingly heterogeneous as data grows exponentially and spreads across data centers, edge devices, and multiple public clouds. In addition to the complexity of managing data across different environments, the lack of a unified view of all the assets, security and governance presents an additional challenge.

Leveraging the cloud for your SQL infrastructure has many benefits: cost reduction, improved productivity, and accelerated insights and decision-making can make a measurable impact on an organization's competitiveness, particularly in uncertain times. Meanwhile, infrastructure, servers, networking, and so on are all maintained by the cloud provider by default.

With SQL Server 2008 and 2012 reaching their end of life, it is advisable to upgrade them or migrate them to Azure cloud services. Modernizing any version of SQL Server to Azure brings many added benefits, including:

  • Azure PaaS provides 99.99% availability
  • Azure IaaS provides 99.95% availability
  • Extended security updates for 2008, 2012 servers
  • Backing up SQL Server running in Azure VMs is made easy with Azure Backup, a stream-based, specialized solution. The solution aligns with Azure Backup’s long-term retention, zero infrastructure backup, and central management features.

Tools leveraged

For modernizing the SQL infrastructure, SNP leveraged a variety of tools from Microsoft, such as the following.

  • The Azure Database Migration Service has been used since the beginning to modernize on-premises SQL servers. Using this tool, you can migrate your data, schema, and objects from multiple sources to Azure at scale, while simplifying, guiding, and automating the process.
  • Azure Data Studio is one of the newest tools for modernizing SQL infrastructure with an extension of Azure SQL Migration. It’s designed for data professionals who run SQL Server and Azure databases on-premises and in multi cloud environments.

Potential reference architecture diagram

Let’s take a closer look at the architecture, what components are involved and what is being done in Azure Data Studio to migrate or modernize the on-premises SQL infrastructure.

There are three main components involved in an Azure Data Studio migration or modernization: the source SQL Server (the on-premises SQL Server to be modernized or migrated), the destination server (the Azure SQL VM to which the on-premises SQL Server will be moved), and the staging layer (a storage account or network share folder) for the backup files. Backup files are a major component of the modernization.

Azure Data Studio and the Azure SQL Migration extension primarily rely on backup files: they use a full backup of the database as well as transactional log backups. Another important component is the staging layer, where the backup files are stored.

Azure Data Studio uses a network share folder, an Azure storage container, or an Azure file share. Backup files must be placed in a specific structure: as shown in the architecture below, the backup files for each database must be placed in their own folder or container.

As part of the migration to Azure, Azure Data Studio, along with the Azure SQL Migration extension, uses Azure Database Migration Service (DMS) as the core technology behind the scenes. DMS is integrated with Azure Data Factory, which runs a pipeline at regular intervals to copy backup files from the on-premises network share folder to Azure and restore them on the target, or to restore them directly if they are already in storage containers.

When the backup files are in a network share folder, Azure Data Studio uses a self-hosted integration runtime to establish a connection between on-premises and Azure. After the connection has been established, Azure Data Studio begins the modernization process leveraging Azure DMS.

Initially, the full backup and subsequent transactional log backup files of each database are placed in the specified database folder or container. If the backup files are in a network share folder, Azure Data Studio copies them to an Azure storage container and then restores them to the target Azure SQL VM or Azure SQL Managed Instance. If the backup files are already in a storage account, Azure Data Studio restores them directly from there to the Azure target.

Following the completion of the last log restoration on the target Azure SQL database, we need to cut over the database and bring it online on the target. The databases are placed in Restoring mode while the backup files are being restored, which means they cannot be accessed until the cutover has been completed.

Your next steps

If you like what you have read so far, let’s move forward together with confidence. We are here to help at every step. Contact SNP’s migration experts.

Modernize and Migrate your SQL Server Workloads with Azure

Modernizing and migrating SQL Server is just one of the many reasons why a company might want to migrate its data. Other common reasons may include mergers, hardware upgrades or moving to the cloud. In most cases, however, data migrations are associated with downtime, data loss, operational disruptions, and compatibility problems.

With SNP Technologies Inc., these concerns are alleviated, and the migration process is simplified. We help businesses migrate complete workloads seamlessly through real-time, byte-level replication and orchestration. For enhanced agility with little to no downtime, we can migrate data and systems between physical, virtual and cloud-based platforms.

When it comes to the modernization of SQL Server, we can migrate and upgrade your workload simultaneously. Production sources can be lower versions of SQL Server that are then upgraded to newer versions, for example, SQL 2008, 2008R2, and 2012 can be moved to a newer version of Windows and SQL or to Azure.

 

Some key benefits of modernizing or migrating your SQL workloads include:

  • Built-in high availability and disaster recovery for Azure SQL PaaS, with 99.99% availability
  • Automatic backups for Azure SQL PaaS services
  • High availability of 99.95% for Azure IaaS
  • The ability to leverage Azure automated backups or Azure Backup for SQL Server on Azure VMs

Listed below are the various steps SNP follows to migrate an on-premises SQL Server to Azure PaaS or IaaS:

  • Assessment to determine the most appropriate target and its Azure sizing.
  • A performance assessment will be conducted before the migration to determine potential issues with the modernization.
  • A performance assessment will be conducted post-migration to determine if there is any impact on performance.
  • Migration to the designated target.

 

As part of our modernization process, we utilize a variety of tools and services that Microsoft provides, described below.

Assessment with Azure Migrate to determine the most appropriate target and its Azure sizing:

Azure Migrate is a service in Azure that uses the Azure SQL assessment to assess the customer's on-premises SQL infrastructure. In Azure Migrate, all objects on the SQL Server are analyzed against the target (whether it's Azure SQL Database, Azure SQL Managed Instance, or SQL Server on Azure VM), and the target is calculated by considering performance parameters such as IOPS, CPU, memory, and cost, along with the appropriate Azure size. Following the assessment, SNP gets a better idea of what needs to be migrated, while the assessment report recommends the most appropriate migration solution.

This assessment generates four types of reports:

  • Recommended type: This compares all the available options and gives us the best fit. If the SQL Server is ready for all targets, it recommends the best fit considering factors like performance, cost, etc.
  • Recommendation of instances to Azure SQL MI: This indicates whether the SQL Server is ready for Managed Instance. If it is, a target recommendation size is provided; if there are issues with SQL MI, the report lists them along with their corresponding recommendations.
  • Recommendation of instances to Azure SQL VM: This assesses each individual instance and provides a suitable configuration specific to that instance.
  • Recommendation of servers to SQL Server on Azure VM: If the server is ready to move to SQL Server on Azure, it gives us the appropriate recommendation.

Our assessment checks for any post-migration performance impacts with Microsoft's Data Migration Assistant

To prepare for modernizing our SQL infrastructure to Azure, we need to know which objects will be impacted post-migration so we can plan the steps to take afterwards. A second assessment is performed using the Microsoft Data Migration Assistant (DMA) tool to identify all the objects that will be impacted after migration. The DMA categorizes these objects into five categories (four when modernizing to SQL Server on Azure VM).

Some key factors considered at this stage include:

  1. Breaking changes: These are changes that will impact the performance of a particular object. Following a migration, we will need to ensure that breaking changes are addressed.
  2. Behavior changes: These are changes that may impact query performance and should be addressed for optimal results.
  3. Informational issues: This information helps us identify issues that might affect the workload post-migration.
  4. Deprecated features: These are features that are going to be deprecated.
  5. Migration blockers: These are objects that will block the migration; they must either be removed prior to migration or changed as per the business requirements.

Note: Migration blockers are specific to the Modernization of SQL Server to Azure SQL PaaS

 

Modernization using Azure Data Studio:

Once we have an Azure target along with the Azure size and a list of affected objects, we can move on to modernization, where we migrate our SQL infrastructure to the Azure target. In this phase, the SQL infrastructure is modernized using a tool called Azure Data Studio, which uses an extension called Azure SQL Migration, leveraging the Azure Database Migration Service (Azure DMS).

In Azure Data Studio, you can perform a modernization of the SQL Server infrastructure using native SQL backups (the latest full backup as well as the transactional log backups taken since that backup). In this method, backup files of the SQL Server databases are copied and restored on the target. Using Azure Data Studio, we can automate the backup and restore process. All we must do is manually place the backup files into a shared network folder or Azure storage container so that the tool recognizes the backups and restores them automatically.

Post Migration:

Upon completion of modernization, all objects impacted by the modernization should be resolved for optimal performance. DMA provides information regarding all impacted objects and offers recommendations on how to address them.

Your Next Steps:

If you like what you’ve read so far, let’s move forward together with confidence. We’re here to help at every step. Contact SNP’s migration experts here

 

 

Azure Arc enabled Kubernetes for Hybrid Cloud Management — Manage Everything and Anywhere

Azure Arc-enabled Kubernetes extends Azure’s management capabilities to Kubernetes clusters running anywhere, whether in public clouds or on-premises data centers. This integration allows customers to leverage Azure features such as Azure Policy, GitOps, Azure Monitor, Microsoft Defender, Azure RBAC, and Azure Machine Learning.

Key features of Azure Arc-enabled Kubernetes include:

  1. Centralized Management: Attach and configure Kubernetes clusters from diverse environments in Azure, facilitating a unified management experience.
  2. Governance and Configuration: Apply governance policies and configurations across all clusters to ensure compliance and consistency.
  3. Integrated DevOps: Streamline DevOps practices with integrated tools that enhance collaboration and deployment efficiency.
  4. Inventory and Organization: Organize clusters through inventory, grouping, and tagging for better visibility and management.
  5. Modern Application Deployment: Enable the deployment of modern applications at scale across any environment.

In this blog, we will follow a step-by-step approach and learn how to:

1. Connect Kubernetes clusters running outside of Azure

2. GitOps – to define applications and cluster configuration in source control

3. Azure Policy for Kubernetes

4. Azure Monitor for containers

 

1. Connect Kubernetes clusters

Prerequisites

  • Azure account with an active subscription.
  • Identity – User or service principal
  • Latest Azure CLI
  • Extensions – connectedk8s and k8sconfiguration
  • An up-and-running Kubernetes cluster
  • Resource providers – Microsoft.Kubernetes, Microsoft.KubernetesConfiguration, Microsoft.ExtendedLocation

Create a Resource Group

Create a resource group using the command below, choosing your desired location. Azure Arc for Kubernetes supports most Azure regions; use the Azure products by region page to check the supported regions.

* az group create --name AzureArcRes -l EastUS -o table

For example: az group create --name AzureArcK8sTest --location EastUS --output table

Connect to the cluster with admin access and attach it with Azure Arc

We use the az connectedk8s connect CLI extension to attach our Kubernetes clusters to Azure Arc.

This command verifies connectivity to the Kubernetes cluster via the kube-config file ("~/.kube/config"), deploys the Azure Arc agents to the cluster in the "azure-arc" namespace, and installs Helm v3 to the .azure folder.

For this demonstration we connect and attach AWS Elastic Kubernetes Service (EKS) and Google Cloud Kubernetes Engine (GKE). Below, we step through the commands used to connect and attach each cluster.

 

AWS – EKS

* aws eks --region <Region> update-kubeconfig --name <ClusterName>

* kubectl get nodes


* az connectedk8s connect --name <ClusterName> --resource-group AzureArcRes


GCloud – GKE

* gcloud container clusters get-credentials <ClusterName> --zone <ZONE> --project <ProjectID>

* kubectl get no

* az connectedk8s connect --name <ClusterName> --resource-group AzureArcRes


Verify Connected Clusters

* az connectedk8s list -g AzureArcRes -o table


 

2. Using GitOps to define applications & clusters

We use the connected GKE cluster for our example to deploy a simple application.

Create a configuration to deploy an application to the Kubernetes cluster.
We use the "k8sconfiguration" extension to link our connected cluster to an example Git repository provided by SNP.

* export KUBECONFIG=~/.kube/gke-config

* az k8sconfiguration create \
  --name app-config \
  --cluster-name <ClusterName> --resource-group <YOUR_RG_NAME> \
  --operator-instance-name app-config --operator-namespace cluster-config \
  --repository-url https://github.com/gousiya573-snp/SourceCode/tree/master/Application \
  --scope cluster --cluster-type connectedClusters

Check to see that the namespaces, deployments, and resources have been created:

* kubectl get ns --show-labels

We can see that the cluster-config namespace has been created.

Azure Arc enabled Kubernetes

* kubectl get po,svc

The flux operator has been deployed to the cluster-config namespace, as directed by our sourceControlConfig. The application deployed successfully: we can see the pods are Running and the Service LoadBalancer IP has also been created.

Azure Arc enabled Kubernetes

Access the EXTERNAL-IP to see the output page:

Azure Arc enabled Kubernetes

Please Note:

Supported repository-url Parameters for Public & Private repos:

* Public GitHub Repo   –  http://github.com/username/repo  (or) git://github.com/username/repo

* Private GitHub Repo –  https://github.com/username/repo (or) git@github.com:username/repo

* For private repos, flux generates an SSH key and logs the public key as shown below:

Azure Arc enabled Kubernetes


3. Azure Policy for Kubernetes

Use Azure Policy to enforce that each Microsoft.Kubernetes/connectedclusters resource or Git-Ops enabled Microsoft.ContainerService/managedClusters resource has specific Microsoft.KubernetesConfiguration/sourceControlConfigurations applied on it.

Assign Policy:

To create the policy, navigate to the Azure portal and open Policy; in the Authoring section, select Definitions. Click Initiative definition to create the policy, search for gitops in the available definitions, and click the Deploy GitOps to Kubernetes clusters policy to add it. Select the subscription under Definition locations, and give the policy assignment a name and description.

Choose Kubernetes in the existing Category list and scroll down to fill in the configuration-related details of the application.

Azure Arc

Select the policy definition, click the Assign option above, and set the scope for the assignment. The scope can be at the Azure resource group or subscription level; then complete the other basic steps (assignment name, exclusions, remediation, etc.).

Click on Parameters and provide names for the Configuration resource, Operator instance, and Operator namespace; set the Operator scope to cluster level or namespace. The Operator type is Flux. Provide your application's GitHub repo URL (public or private) in the Repository Url field. Additionally, pass the operator parameters such as "--git-branch=master --git-path=manifests --git-user=your-username --git-readonly=false". Finally, click Save and confirm that the policy with the given name appears under Assignments.

Once the assignment is created the Policy engine will identify all connectedCluster or managedCluster resources that are located within the scope and will apply the sourceControlConfiguration on them.

Azure Arc

--git-readonly=false enables CI/CD for the repo and creates automatic releases for commits.

 

Azure Arc enabled Kubernetes

 

Verify a Policy Assignment

Go to the Azure portal and click on the connected cluster resource to check the compliance status. Compliant means the config-agent was able to successfully configure the cluster and deploy flux without error.

Azure Arc enabled Kubernetes

We can see the policy assignment that we created above, and the Compliance state should be Compliant.

Azure Arc

4. Azure Monitor for Containers

Azure Monitor for containers provides a rich monitoring experience for Azure Kubernetes Service (AKS) and AKS Engine clusters. It can be enabled for one or more existing deployments of Arc-enabled Kubernetes clusters using the Azure CLI, the Azure portal, or Azure Resource Manager.

Create an Azure Log Analytics workspace or use an existing one to configure the insights and logs. Use the command below to install the extension and configure it to report to the Log Analytics workspace.

* az k8s-extension create --name azuremonitor-containers --cluster-name <cluster-name> --resource-group <resource-group> --cluster-type connectedClusters --extension-type Microsoft.AzureMonitor.Containers --configuration-settings logAnalyticsWorkspaceResourceID=<armResourceIdOfExistingWorkspace>

It takes about 10 to 15 minutes to get the health metrics, logs, and insights for the cluster. You can check the status of the extension in the Azure portal or through the CLI; the extension status should show as "Installed".


We can also scrape and analyze Prometheus metrics from our cluster.

Clean Up Resources

To delete an extension:

* az k8s-extension delete --name azuremonitor-containers --cluster-type connectedClusters --cluster-name <cluster-name> --resource-group <resource-group-name>

To delete a configuration:

* az k8sconfiguration delete --name '<config name>' -g '<resource group name>' --cluster-name '<cluster name>' --cluster-type connectedClusters

To disconnect a connected cluster:

* az connectedk8s delete --name <cluster-name> --resource-group <resource-group-name>

 

Conclusion:

This blog provides an overview of Azure Arc-enabled Kubernetes, highlighting how SNP assists its customers in setting up Kubernetes clusters with Azure Arc for scalable deployment. It emphasizes the benefits of Azure Arc in managing Kubernetes environments effectively.

SNP offers subscription services to accelerate your Kubernetes journey, enabling the installation of production-grade Kubernetes both on-premises and in Microsoft Azure. For more information or to get assistance from SNP specialists, you can reach out through the provided contact options. Contact SNP specialists here.

Accelerate Innovation Across Hybrid & Multicloud Environments with Azure Arc

With the growing trend of multicloud and edge computing, organizations are increasingly finding themselves managing a diverse array of applications, data centers, and hosting environments. This heterogeneity presents significant challenges in managing, governing, and securing IT resources. To address these complexities, organizations need a robust solution that enables them to centrally inventory, organize, and enforce control policies across their entire IT estate, regardless of location.

SNP leverages Azure Arc and a hybrid approach to empower its customers to effectively manage resources deployed in both Azure and on-premises environments through a unified control plane. With Azure Arc, organizations can simplify their infrastructure management, making it easier to accelerate migration decisions driven by policies while ensuring compliance with regulatory requirements.

Microsoft Azure enables management of a variety of services deployed externally, including:

  • Windows and Linux servers: These can run on bare metal, virtual machines (VMs), or public cloud IaaS environments.
  • Kubernetes clusters: Organizations can manage their containerized applications seamlessly across different environments.
  • Data services: Azure Arc supports data services based on SQL Azure and PostgreSQL Hyperscale, allowing for consistent data management practices.
  • Microservices applications: Applications packaged and deployed as microservices running on Kubernetes can be easily monitored and managed through Azure Arc.

 

Hybrid Unified Management & How it Benefits your Business

Azure Arc involves deploying an agent on servers or on Kubernetes clusters for resources to be projected on the Azure Resource Manager. Once the initial connectivity is done, Arc extends governance controls such as Azure Policy and Azure role based access controls across a hybrid infrastructure. With Azure governance controls, we can have consistency across environments which helps enhance productivity and mitigate risks.

Some key benefits of Azure Arc include:

  • Azure Arc enabled solutions can easily expand into a Hybrid-cloud architecture as they are designed to run virtually anywhere.
  • Azure Arc data includes technical and descriptive details, along with compliance and security policies.
  • Enterprises can use Azure Security Center to ensure compliance of all resources registered with Azure Arc, irrespective of where they are deployed. They can quickly patch the operating systems running in VMs as soon as a vulnerability is found. Policies can be defined once and automatically applied to all resources across Azure, the data center, and even VMs running in other cloud platforms.
  • All the resources registered with Azure Arc send their logs to the central, cloud-based Azure Monitor. This is a comprehensive approach to deriving insights for highly distributed and disparate infrastructure environments.
  • Leveraging Azure Automation, routine to advanced maintenance operations across public, hybrid, or multi-cloud environments can be performed effortlessly.

 

Azure services that support management and governance of other cloud platforms include:

  • Azure Active Directory
  • Azure Monitor
  • Azure Policy
  • Azure Log Analytics
  • Azure Security Center/Defender
  • Azure Sentinel

 

Unified Kubernetes Management

With AKS and Kubernetes, Azure Arc provides the ability to deploy and configure Kubernetes applications in a consistent manner across all environments, adopting modern DevOps techniques. This offers:

Flexibility

  • Container platform of your choice with out-of-the-box support for most Cloud native applications.
  • Used across Dev, Test and Production Kubernetes clusters in your environment.

Management

  • Inventory, organise and tag Kubernetes clusters.
  • Deploy apps and configuration as code using GitOps.
  • Monitor and Manage at scale with policy-based deployment.

Governance and security

  • Built in Kubernetes Gatekeeper policies.
  • Apply consistent security configuration at scale.
  • Consistent cluster extensions for Monitor, Policy, Security, and other agents

Role-based access control

  • Central IT based at-scale operations.
  • Management by workload owner based on access privileges.

Leveraging GitOps

  • Azure Arc also lets us organize, view, and configure all clusters in Azure (like Azure Arc enabled servers) uniformly, with GitOps (Zero touch configuration).
  • In GitOps, the configurations are declared and stored in a Git repo, and Arc agents running on the cluster continuously monitor this repo for updates or changes and automatically pull down these changes to the cluster.
  • We can use cloud-native tools and practices with GitOps to apply configuration and app deployments to one or more clusters at scale.

 

Azure Arc Enabled Data Services

Azure Arc makes it possible to run Azure data services on-premises, at the edge, and in third-party clouds, using Kubernetes on hardware of our choice.

Arc can bring cloud elasticity on-premises so you can optimize performance of your data workloads with the ability to dynamically scale, without application downtime. By connecting to Azure, one can see all data services running on-premises alongside those running in Azure through a single pane of glass, using familiar tools like Azure Portal, Azure Data Studio and Azure CLI.

Azure Arc enabled data services can run Azure PostgreSQL or SQL Managed Instance in any supported Kubernetes environment in AWS or GCP, just the way they would run in an on-prem environment.

With Azure Arc, organizations can achieve the following overall business objectives for their hybrid architectures:

  • Standardization of operations and procedures
  • Organization of resources
  • Regulatory Compliance and Security
  • Cost Management
  • Business Continuity and Disaster Management

 

For more on how you can revolutionize the management and development of your hybrid environments with Azure Arc, contact SNP here.

SNP’s Managed Detect & Response Services Powered by Microsoft Sentinel & Defenders (MXDR)

SNP's Managed Detection and Response (MDR) for Microsoft Sentinel service brings integrations with Microsoft services like Microsoft Defender (MXDR), threat intelligence, and the customer's hybrid and multi-cloud infrastructure to monitor, detect, and respond to threats quickly. With our managed security operations team, SNP's threat detection experts help identify, investigate, and provide high-fidelity detection through ML-based threat modelling for your hybrid and multicloud infrastructure.

SNP’s MXDR Services Entitlements:

SNP’s Managed services security framework brings the capability of centralized security assessment for managing your on-premises or cloud infrastructure, where we offer:

 

Leveraging SNP’s security model below, we help our customers:

  • Build their infrastructure and applications with cloud-native protection throughout their cloud application lifecycle.
  • With defined workflows, customers get the ease of separating duties in entitlements management to protect against governance and compliance challenges.
  • Data security is prioritized to protect sensitive data from different data sources to the point of consumption.
  • With Azure Sentinel, we consolidate and automate telemetry across attack surfaces while orchestrating workflows and processes to speed up response and recovery.

 

SNP’s Managed Extended Detection & Response (MXDR) Approach:

Our 6-step incident response approach helps our customers maintain, detect, respond, notify, investigate, and remediate cyberthreats as shown below:

 

For more on SNP’s Managed Detect & Response Services Powered by Microsoft Sentinel & Defenders (MXDR), contact our security experts here.

Bring your Data Securely to the Cloud by Implementing Column Level security, Row Level Security & Dynamic Data Masking with Azure Synapse Analytics

Azure Synapse Analytics from Microsoft is a limitless analytics service that brings together data integration, enterprise data warehousing, and big data analytics. SNP helps its customers migrate their legacy data warehouse solutions to Azure Synapse Analytics to gain the benefits of an end-to-end analytics platform that provides high availability, security, speed, scalability, cost savings, and industry-leading performance for enterprise data warehousing workloads.

A common business scenario we cover:

As organizations scale, data grows exponentially. And with the workforce working remotely, data protection is one of the primary concerns of organizations around the world today. There are several high-level security best practices that every enterprise should adopt to protect their data from unauthorized access. Here are our recommendations to help you prevent unauthorized data access.

The SNP solution:

With Azure Synapse Analytics, SNP provides its customers enhanced security with column level security, row-level security & dynamic data masking.

Below is an example of sample table data used to implement column-level security, row-level security, and dynamic data masking for your data.

Revenue table:

Azure Synapse Security

Codes:

Step 1: Create users

create user [CEO] without login;

create user [US Analyst] without login;

create user [WS Analyst] without login;

 

Column Level Security

A column-level security feature in Azure Synapse simplifies the design and coding of security in applications. It ensures column-level security by restricting column access to protect sensitive data.

In this scenario, we will be working with two users. The first one is the CEO, who needs access to all company data. The second one is an Analyst based in the United States, who does not have access to the confidential Revenue column in the Revenue table.

Follow this lab one step at a time to see how column-level security removes the US Analyst's access to the Revenue column.

 

Step 2: Verify the existence of the "CEO" and "US Analyst" users in the data warehouse.

SELECT Name as [User1] FROM sys.sysusers WHERE name = N'CEO';

SELECT Name as [User2] FROM sys.sysusers WHERE name = N'US Analyst';

 

Step 3: Now let us enforce column-level security for the US Analyst.

The revenue table in the warehouse has information like Analyst, CampaignName, Region, State, City, RevenueTarget, and Revenue. The Revenue generated from every campaign is classified and should be hidden from US Analysts.

REVOKE SELECT ON dbo.Revenue FROM [US Analyst];

GRANT SELECT ON dbo.Revenue([Analyst], [CampaignName], [Region], [State], [City], [RevenueTarget]) TO [US Analyst];

The security feature has now been enforced: running the following query as the 'US Analyst' user results in an error, because the US Analyst does not have access to the Revenue column, while the query succeeds when the Revenue column is not included.

Row Level Security

Row-level Security (RLS) in Azure Synapse enables us to use group membership to control access to rows in a table. Azure Synapse applies the access restriction every time data access is attempted from any user.

In this scenario, the Revenue table has two analysts, US Analyst and WS Analyst. Each analyst has jurisdiction over a specific region (for example, the US Analyst covers the South East region), and an analyst should only see data from their own region. In the Revenue table, there is an Analyst column that we can use to filter data to a specific analyst value.

SELECT DISTINCT Analyst, Region FROM dbo.Revenue order by Analyst ;

Review any existing security predicates in the database

SELECT * FROM sys.security_predicates

 

Step 1:

Create a new Schema to hold the security predicate, then define the predicate function. It returns 1 (or True) when a row should be returned in the parent query.

CREATE SCHEMA Security

GO

CREATE FUNCTION Security.fn_securitypredicate(@Analyst AS sysname)

RETURNS TABLE

WITH SCHEMABINDING

AS

RETURN SELECT 1 AS fn_securitypredicate_result

WHERE @Analyst = USER_NAME() OR USER_NAME() = 'CEO'

GO

Step 2:

Now we define a security policy that adds the filter predicate to the Revenue table. This will filter rows based on the user's login name.

CREATE SECURITY POLICY SalesFilter 

ADD FILTER PREDICATE Security.fn_securitypredicate(Analyst)

ON dbo.Revenue

WITH (STATE = ON);

Grant SELECT permissions on the Revenue table.

GRANT SELECT ON dbo.Revenue TO CEO, [US Analyst], [WS Analyst];

 

Step 3:

Let us now test the filtering predicate by selecting data from the Revenue table as the 'US Analyst' user.

As we can see, the query has returned rows only for the US Analyst login; row-level security is working.

Azure Synapse Security

Azure Synapse Security

Dynamic Data Masking

Dynamic data masking helps prevent unauthorized access to sensitive data by enabling customers to designate how much of the sensitive data to reveal with minimal impact on the application layer. DDM can be configured on designated database fields to hide sensitive data in the result sets of queries. With DDM the data in the database is not changed. Dynamic data masking is easy to use with existing applications since masking rules are applied in the query results. Many applications can mask sensitive data without modifying existing queries.

In this scenario, we have identified some sensitive information in the Customer table. The customer would like us to obfuscate the CreditCard and Email columns of the Customer table from data analysts.

Let us take the below customer table:

We confirmed that no masking is enabled as of now:

Azure Synapse Security

Let us now configure masking for the credit card and email information.

Step 1:

Now let us mask the ‘CreditCard’ and ‘Email’ Column of the ‘Customer’ table.

ALTER TABLE dbo.Customer 

ALTER COLUMN [CreditCard] ADD MASKED WITH (FUNCTION = 'partial(0,"XXXX-XXXX-XXXX-",4)');

GO

ALTER TABLE dbo.Customer

ALTER COLUMN Email ADD MASKED WITH (FUNCTION = 'email()');

GO

 

Now, the results show masking enabled for data:

Executing the query as user 'US Analyst', the data in both columns is now masked:

Unmask data:

Azure Synapse Security

Conclusion:

From the above samples, SNP has shown how column level security, row level security & dynamic data masking can be implemented in different business scenarios. Contact SNP Technologies for more information.

Top 5 FAQs on Operationalizing ML Workflow using Azure Machine Learning

Enterprises today are adopting artificial intelligence (AI) at a rapid pace to stay ahead of their competition, deliver innovation, improve customer experiences, and grow revenue. However, the challenge with such integrations is that the development, deployment, and monitoring of these models differ from the traditional software development lifecycle that many enterprises are already accustomed to.

Leveraging AI and machine learning applications, SNP helps bridge the gap between the existing state and the ideal state of how things should function in a machine learning lifecycle to achieve scalability, operational efficiency, and governance.

SNP has put together a list of the top 5 challenges enterprises face in the machine learning lifecycle and how SNP leverages Azure Machine Learning to help your business overcome them.

Q1. How much investment is needed on hardware for data scientists to run complex deep learning algorithms?

By leveraging an Azure Machine Learning workspace, data scientists can use the same hardware virtually at a fraction of the price. The best part about these virtual compute resources is that businesses are billed based on the resources consumed during active hours, thereby reducing the chances of unnecessary billing.

Q2: How can data scientists manage redundancy when it comes to training segments and rewriting existing or new training scripts that involve collaboration among multiple data scientists?

With Azure data pipelines, data scientists can create a model training pipeline consisting of multiple loosely coupled segments that are reusable in other training pipelines. Data pipelines also allow multiple data scientists to collaborate on different segments of the training pipeline simultaneously, and later combine their segments to form a consolidated pipeline.

Q3. A successful machine learning lifecycle involves a data scientist finding the best-performing model through multiple iterative processes. Each process involves manual versioning, which results in inaccuracies during deployment and auditing. So how can data scientists best manage version control?

The Azure Machine Learning workspace can prove to be a very useful tool in such cases. It tracks the performance metrics and functional metrics of each run to provide the user with a visual interface on model performance during training. It can also be leveraged to register models developed in the Azure Machine Learning workspace, or models developed on your local machines, for versioning. Versioning done through the Azure Machine Learning workspace makes the deployment process simpler and faster.
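For example, registering a trained model so that every run produces a new, auditable version might look like the sketch below, using the v1 azureml-core SDK; the workspace config, file path, and names are placeholders, and the newer azure-ai-ml SDK offers an equivalent flow.

from azureml.core import Workspace, Model

ws = Workspace.from_config()  # reads the config.json downloaded from the workspace

# Each call registers a new version under the same model name,
# so deployments can pin to a specific, auditable version.
model = Model.register(
    workspace=ws,
    model_path="outputs/churn_model.pkl",  # local file or run output (placeholder)
    model_name="churn-classifier",
    tags={"framework": "sklearn", "stage": "experiment"},
)
print(model.name, model.version)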

Q4. One of the biggest challenges while integrating the machine learning model with an existing application is the tedious deployment process which involves extensive manual effort. So how can data scientists simplify the packaging and model deployment process?

Using Azure Machine Learning, data scientists and app developers can easily deploy Machine Learning models almost anywhere. Machine Learning models can be deployed as a standalone endpoint or embedded into an existing app or service or to Azure IoT Edge devices.

Q5. How can data scientists automate the machine learning process?

A data scientist's job is not complete once the machine learning model is integrated into the app or service and deployed successfully. It has to be closely monitored in a production environment to check its performance, and it must be re-trained and re-deployed once there is a sufficient quantity of new training data or when there are data discrepancies (when actual data is very different from the data your model was trained on and is affecting your model's performance).

Azure Machine Learning can be used to trigger a re-deployment when your Git repository has a new code check-in. Azure Machine Learning can also be used to create a re-training pipeline to take new training data as input to make an updated model. Additionally, Azure Machine Learning provides alerts and log analytics to monitor and govern the containers used for deployment with a drag-drop graphical user interface to simplify the model development phase.

Start building today!

SNP is excited to bring you machine learning and AI capabilities to help you accelerate your machine learning lifecycle, from new productivity experiences that make machine learning accessible to all skill levels, to robust MLOps and enterprise-grade security, built on an open and trusted platform helping you drive business transformation with AI. Contact SNP here.