How do you handle updates in streaming Apache Iceberg workloads? 🧊 Two options: 1. Copy-on-Write: rewrite every data file that contains an updated row. Queries stay fast, but write latency is high. 2. Merge-on-Read: write a delete file and merge it with the data at read time. Write latency stays low, but queries slow down as delete files accumulate. Most streaming workloads choose MoR for the write-latency benefits, but that means committing to more aggressive compaction and snapshot expiration to keep delete files in check. See what each strategy means for write latency, reads, and compaction → https://lnkd.in/dG8BUQvw
Apache Iceberg Streaming Workload Updates: Copy-on-Write vs Merge-on-Read
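If you go with MoR, the write behavior can be set per table via Iceberg's write-mode properties. A minimal sketch in Spark SQL, assuming a hypothetical format-version-2 table named db.events:

```sql
-- Route deletes, updates, and merges through delete files (MoR)
-- instead of rewriting data files on every change (CoW).
ALTER TABLE db.events SET TBLPROPERTIES (
  'write.delete.mode' = 'merge-on-read',
  'write.update.mode' = 'merge-on-read',
  'write.merge.mode'  = 'merge-on-read'
);
```

Each mode can also be set to 'copy-on-write' independently, so a table can, for example, take fast MoR deletes while keeping CoW merges.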
-
88% lower latency. 11.1x higher throughput. 📈 This was the result when Anyscale replaced their standard HTTP proxy layer with HAProxy and gRPC, dramatically reducing overhead in the request path. HAProxy was designed for this: eliminating network bottlenecks and letting your most demanding workloads operate at peak performance. ⚡ Huge thanks to the Anyscale team for sharing this beautifully engineered result and trusting HAProxy to push the limits of LLM inference! Check out their full post and blog article ⤵️
Announcing two core improvements to Ray Serve that deliver up to 11x higher peak throughput and 88% lower latency under high concurrency. • HAProxy integration - Replaces the Python-based HTTP proxy, removing a critical bottleneck in request handling. This enables near-linear throughput scaling with increasing replica counts, especially for LLM serving workloads. • Direct gRPC communication between replicas - Reduces serialization and routing overhead for deployment-to-deployment calls. Measured gains include up to 1.5× higher throughput for unary requests and 2.4× for streaming workloads. Full details in blog: https://lnkd.in/gvjZPfBF
-
“Can you walk through what happens when you stream a Netflix movie?” Most candidates jump straight to load balancers, S3, or EC2. One candidate started with DNS. Because without DNS, nothing else happens. Here’s how she explained it: calm, clear, and correct: 1️⃣ User types netflix.com 2️⃣ The DNS resolver checks whether it has the IP cached 3️⃣ If not, it queries a root name server 4️⃣ Then the .com TLD name server 5️⃣ Then netflix.com’s authoritative name server 6️⃣ Finally, it returns an IP, typically of a nearby edge server From there, the movie can be streamed. No confusion. No fluff. Just a confident breakdown.
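The root → TLD → authoritative walk can be sketched as a toy model. This is an illustration only (real resolvers speak DNS over UDP/TCP with TTLs, referral records, and DNSSEC), and the server names and IP here are made up:

```python
# Toy model of iterative DNS resolution. The three dicts stand in for the
# root server, the .com TLD server, and the authoritative name server.
ROOT = {"com": "tld-com"}                               # root: TLD referrals
TLD = {"tld-com": {"netflix.com": "ns-netflix"}}        # .com: NS referrals
AUTH = {"ns-netflix": {"netflix.com": "198.51.100.7"}}  # authoritative answer

def resolve(name: str, cache: dict) -> str:
    """Follow the root -> TLD -> authoritative chain, caching the answer."""
    if name in cache:                    # step 2: answer already cached?
        return cache[name]
    tld = name.rsplit(".", 1)[-1]        # step 3: ask root who owns the TLD
    tld_server = ROOT[tld]
    auth_server = TLD[tld_server][name]  # step 4: ask TLD for the name server
    ip = AUTH[auth_server][name]         # step 5: ask the authoritative server
    cache[name] = ip                     # remember it for next time
    return ip

cache = {}
ip_first = resolve("netflix.com", cache)   # walks the full chain
ip_cached = resolve("netflix.com", cache)  # served straight from the cache
```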
-
Everyone says load balancing is a solved problem. It is not. Your server is drowning. Traffic is spiking. You add two more servers. Now all three are running but one is still getting all the requests. Here is what actually matters: → Round Robin feels safe until one request takes 10 seconds and buries a server while two others sit idle. Use Least Connections instead. → L4 balancers are fast but blind. L7 balancers read headers, URLs and cookies, which unlocks A/B testing and canary deploys but adds latency. → Sticky sessions feel convenient. They quietly destroy scalability. Stateless backends and Redis are the real fix. → The load balancer itself can die. You run two of them, one on standby, ready to swap instantly. Google built their own from scratch because nothing off the shelf was fast enough. Netflix routes 200 million users this way. It is one of those problems that looks solved until you actually have to scale it. #SystemDesign #SoftwareEngineering #Backend #DistributedSystems
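The round-robin-vs-least-connections point is easy to see in a few lines. A minimal sketch (the server names and in-flight counts are invented for illustration):

```python
import itertools

def round_robin(servers):
    """Cycle through servers in order, blind to how busy each one is."""
    return itertools.cycle(servers)

def least_connections(active: dict) -> str:
    """Pick the server with the fewest in-flight requests."""
    return min(active, key=active.get)

# One slow request has buried s1 while s2 sits idle.
active = {"s1": 7, "s2": 0, "s3": 2}

rr = round_robin(list(active))
rr_pick = next(rr)               # round robin still hands s1 more work
lc_pick = least_connections(active)  # least connections routes to idle s2
active[lc_pick] += 1             # track the new in-flight request
```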
-
What happens when "millions" of users hit a website at the same time? Does the server crash? Does the app slow down? Not if there’s a "Load Balancer" working behind the scenes. --- Think of a load balancer like a "smart traffic controller" 🔹 It receives all incoming requests 🔹 Distributes them across multiple servers 🔹 Sends traffic only to *healthy* servers 🔹 Automatically avoids servers that are down --- Why do we need it? ✔ Handles high traffic smoothly ✔ Improves speed & performance ✔ Prevents server overload ✔ Ensures high availability (no downtime) ✔ Removes single point of failure --- Real life examples: When you use apps like Netflix, Amazon, or Instagram… You’re not talking to just one server. You’re talking to a "system of servers" managed by a load balancer. --- Without load balancers: More users = More crashes With load balancers: More users = Better scaling --- #SystemDesign #LoadBalancer #Backend #Scalability #WebDevelopment #TechExplained #SoftwareEngineering
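The "sends traffic only to healthy servers" behavior can be sketched as a tiny dispatcher. A toy model, not a real balancer; the server names and failure counts are invented:

```python
def healthy(servers: list, failures: dict, max_failures: int = 3) -> list:
    """Keep only servers below the failure threshold in the pool."""
    return [s for s in servers if failures.get(s, 0) < max_failures]

def dispatch(request_id: int, servers: list, failures: dict) -> str:
    """Round-robin over the healthy pool; skip servers marked as down."""
    pool = healthy(servers, failures)
    if not pool:
        raise RuntimeError("no healthy backends")
    return pool[request_id % len(pool)]

servers = ["s1", "s2", "s3"]
failures = {"s2": 3}          # s2 has failed its health checks
first = dispatch(0, servers, failures)   # traffic flows to s1
second = dispatch(1, servers, failures)  # then s3, never the dead s2
```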
-
Ever wondered how Netflix handles 250M users without crashing? The secret: Load Balancers 👇 🔹 What it does Sits between users and servers, distributing traffic so no single server gets overwhelmed. 🔹 The 5 Algorithms 🔄 Round Robin — Rotates requests equally 📊 Least Connections — Picks the least busy server 🔑 IP Hash — Same user, same server (sticky sessions) ⚖️ Weighted — Stronger servers get more traffic ⏱️ Least Response Time — Routes to the fastest 🔹 Health Checks A typical config pings servers every 5s; after 3 consecutive failures, the server is removed from the pool. That's why when a server dies, your users don't notice 🎯 🔹 Key Benefits ⚡ Scalability — Add servers as traffic grows 🛡️ High Availability — Auto-reroute on failure 🚀 Low Latency — Smart routing 🔒 SSL Termination — Handles HTTPS 🔹 Popular Tools NGINX · HAProxy · AWS ELB · Cloudflare · Traefik #systemdesign #loadbalancer #nginx #devops #backend #softwareengineering #scalability #architecture #eraaztech
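The IP Hash algorithm above is just "hash the client address, mod the pool size". A minimal sketch (md5 is used here only because it is stable across runs, unlike Python's built-in string hash; real balancers often use consistent hashing so that adding a server reshuffles fewer clients):

```python
import hashlib

def ip_hash(client_ip: str, servers: list) -> str:
    """Map the same client IP to the same server every time (stickiness)."""
    digest = int(hashlib.md5(client_ip.encode()).hexdigest(), 16)
    return servers[digest % len(servers)]

servers = ["s1", "s2", "s3"]
pick = ip_hash("203.0.113.9", servers)
# The same client always lands on the same backend:
assert pick == ip_hash("203.0.113.9", servers)
```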
-
🚀 Kubernetes in 30 Days — Day 24 Resource Requests & Limits — Control Your App ⚙️ What if one Pod uses all CPU? 🤔 Other apps will crash ❌ 👉 Kubernetes solves this using Requests & Limits Requests = Minimum resources needed Limits = Maximum resources allowed Example: CPU: request: 200m limit: 500m Memory: request: 256Mi limit: 512Mi Why it matters: ✔ Prevents resource starvation ✔ Ensures fair usage ✔ Avoids crashes Simple way to remember: Request = Guaranteed Limit = Maximum Tomorrow (Day 25): Liveness & Readiness Probes 🔥 📖 Detailed explanation + real scenarios: Medium article 👇 https://lnkd.in/gYZp_FVr Follow the Kubernetes in 30 Days Series 🚀 #Kubernetes #DevOps #K8s #Performance #Cloud
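The numbers in the post map onto a Pod spec like this. A minimal sketch; the Pod name and image are placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:        # guaranteed: the scheduler reserves this much
        cpu: "200m"
        memory: "256Mi"
      limits:          # maximum: the container is throttled/killed beyond this
        cpu: "500m"
        memory: "512Mi"
```

Exceeding the CPU limit throttles the container; exceeding the memory limit gets it OOM-killed, which is why memory limits deserve extra headroom.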
-
You probably interacted with a data center dozens of times today without realizing it. Sending emails. Streaming videos. Uploading files. Running business tools. Every one of those actions travels through physical infrastructure built to process and move data around the world in milliseconds. Behind every digital experience is a massive system working quietly in the background. #datacenterlife #internetinfrastructure #techsystems #servers #cloudtechnology #rackfinity #digitalworld #techbehindthescenes
-
great take on MCP vs CLI. I wrote my own MCPs as well, thinking they were the future. one of them existed just so Claude could tell the current date without constantly hallucinating. bear in mind, that was around 6 months ago. prehistoric times in AI. I learned a few things, but now it feels like a waste of time. p.s. I love gwscli, I'm using it every day. I have to try the playwright CLI instead of the MCP.
MCPs feel like the future they cost 30x more than the alternative nobody talks about i built 12 MCP servers before i realized CLIs do the same thing for free here's what happened: i had playwright MCP, supabase MCP, github MCP, slack MCP... 12 servers running. each one loading its full schema into context every single turn. burning tokens just sitting there then i read a reddit thread: "switched from MCPs to CLIs, never going back." 633 upvotes. and google built their entire workspace CLI (gwscli) specifically because MCPs are too expensive for agents the math: MCP loads the full tool schema every turn. CLI takes a command and returns output. 30x difference per operation according to FuturMinds benchmark my tier 1 stack now: - gwscli (google workspace, rust, agent-first) - gh (github CLI) - playwright CLI (not the MCP) - jq + yq (JSON/YAML) - yt-dlp (youtube) - pandoc (docs conversion) when to still use MCP: - single-service integration (apollo, attio) - bidirectional streaming - tool discovery matters when to use CLI: - everything else the CLI-Anything movement is real. 30x cheaper. faster. more secure. and google validated it link to the reddit thread and full CLI stack in comments
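The per-turn cost gap is simple multiplication. An illustrative model only: the token counts below are made-up round numbers chosen to land on the post's claimed 30x, not measurements from any benchmark:

```python
# MCP: every connected server's full tool schema rides along in context
# on every turn, whether or not its tools are called.
schema_tokens_per_mcp = 1_500    # hypothetical schema size per server
mcp_servers = 12
mcp_tokens_per_turn = schema_tokens_per_mcp * mcp_servers  # 18,000 per turn

# CLI: the agent emits a short command and reads short output on demand.
cli_tokens_per_turn = 600        # hypothetical command + output

ratio = mcp_tokens_per_turn // cli_tokens_per_turn
print(ratio)  # -> 30 with these assumed numbers
```

The point of the model is that MCP overhead scales with the number of connected servers on every turn, while CLI overhead scales only with the calls actually made.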