QUICfeed now faster with pipes: https://quicfeed.net Nicolas Weil suggested that we would need to take direct encoder input rather than reading from files, so I added support for piped fragmented CMAF via GPAC. Thank you, Romain Bouqueau, for your tips on how to make this work!
More Relevant Posts
-
🤖 Post #637: arXiv:2603.03043 **IoUCert: Formal Robustness Verification for Anchor-Based Object Detectors** Until now, formal robustness guarantees for object detectors like YOLO were out of reach — IoUCert introduces the first verified bounds on IoU for realistic anchor-based models. Key contributions: • Coordinate transform eliminates precision-degrading non-linear relaxations • Novel IBP method derives tight optimal IoU bounds • First verification of SSD, YOLOv2, and YOLOv3 under input perturbations • Scales formal verification from classifiers to full detection pipelines #MachineLearning #ComputerVision #ObjectDetection #AIResearch
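To see why this is hard, consider the naive interval-arithmetic baseline: pushing a per-coordinate perturbation budget straight through the IoU formula gives sound but loose bounds, exactly the kind of precision loss the paper's coordinate transform is meant to avoid. A minimal sketch (my own illustration, not IoUCert's algorithm):

```python
def iou_bounds(pred, gt, eps):
    """Sound but loose interval (IBP-style) bounds on IoU when every
    predicted coordinate may shift by +-eps (ground truth held exact).
    The looseness comes from ignoring correlations between the
    intersection and union terms. Boxes are [x1, y1, x2, y2]."""
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt

    # Interval max/min of a perturbed coordinate against an exact one.
    ix1 = (max(px1 - eps, gx1), max(px1 + eps, gx1))
    iy1 = (max(py1 - eps, gy1), max(py1 + eps, gy1))
    ix2 = (min(px2 - eps, gx2), min(px2 + eps, gx2))
    iy2 = (min(py2 - eps, gy2), min(py2 + eps, gy2))

    # Intersection width/height intervals, clamped at zero overlap.
    w_lo, w_hi = max(0.0, ix2[0] - ix1[1]), max(0.0, ix2[1] - ix1[0])
    h_lo, h_hi = max(0.0, iy2[0] - iy1[1]), max(0.0, iy2[1] - iy1[0])
    inter_lo, inter_hi = w_lo * h_lo, w_hi * h_hi

    # Perturbed predicted-box area interval; ground-truth area is exact.
    pa_lo = max(0.0, (px2 - px1) - 2 * eps) * max(0.0, (py2 - py1) - 2 * eps)
    pa_hi = ((px2 - px1) + 2 * eps) * ((py2 - py1) + 2 * eps)
    ga = (gx2 - gx1) * (gy2 - gy1)

    union_lo = max(inter_lo, pa_lo + ga - inter_hi)
    union_hi = pa_hi + ga - inter_lo
    lo = inter_lo / union_hi if union_hi > 0 else 0.0
    hi = min(1.0, inter_hi / max(union_lo, 1e-12))
    return lo, hi
```

Even a small eps can push this naive lower bound well below the true worst-case IoU, which is why tighter relaxations are the paper's selling point.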
-
For frontier MoE inference, GB300 NVL72 FP4 decisively outperforms H100, even when both Blackwell Ultra and Hopper have all optimizations enabled, including disaggregated prefill/decode and wide expert parallelism. We see similar trends when comparing GB300 FP8 to H100 FP8. For pretraining, Blackwell and rack scale offer only 2-4x performance uplifts; inference is where Blackwell shines.
-
The path forward for storage is no longer a debate. Robert Terlizzi charted 70 years of storage evolution—from washing-machine-sized disks to fabric-native speed—in his blog series. Lightbits was designed with NVMe/TCP from day one because at 400G and 800G, every "translation layer" becomes a liability. Read the blog finale: The End of the Beginning 👉 https://ow.ly/WIVk50XRbU7 #NVMeTCP #ITInfrastructure #StorageEvolution
-
🚀 Final version of our ICLR 2026 paper is now available! I’m excited to share that the final version of our paper, “Batch Pruning by Activation Stability,” is now available. 📄 Paper: https://lnkd.in/gjJcmP9T 💻 Code: https://lnkd.in/g_4Hk9U7 In this work, we propose a dynamic data pruning framework that reduces training cost by leveraging activation stability as an internal signal to discard less informative batches, achieving significant savings in data usage and GPU node-hours while preserving accuracy. #ICLR2026 #MachineLearning #DeepLearning
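The paper and code carry the exact criterion; as a rough illustration of the idea, one can score each batch by how much its mean activations drift between epochs and prune the most stable (least informative) ones. A hypothetical sketch, not the authors' implementation:

```python
import numpy as np

def select_batches(prev_acts, curr_acts, keep_ratio=0.7):
    """Rank batches by activation stability: batches whose mean hidden
    activations barely move between consecutive epochs are deemed less
    informative and pruned. prev_acts/curr_acts have shape
    [n_batches, feat]. (Hypothetical scoring rule; the paper's
    criterion may differ.)"""
    drift = np.linalg.norm(curr_acts - prev_acts, axis=1)
    drift /= np.linalg.norm(prev_acts, axis=1) + 1e-8
    n_keep = max(1, int(round(keep_ratio * len(drift))))
    # Keep the batches with the largest relative drift (least stable).
    return np.argsort(drift)[::-1][:n_keep]
```

The training loop would then skip the forward/backward pass for pruned batches, which is where the GPU node-hour savings come from.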
-
Nvidia's Nemotron 3 Super model released:
• 120B total parameters (12B active), pretrained in NVFP4
• Mamba2 + GQA + latent MoE + MTP
• 1M context, 25T pretraining tokens
The closest thing you can get to a true "open source" model (weights, code, partial data, and recipe all open). Frontier performance in this weight class with unmatched throughput and inference speed. Awesome! https://lnkd.in/d-Juy6zM
-
Claude is very helpful. I wondered if my homegrown GP toolkit would be able to explore NN architectures. It does. In one day, I used Claude to write a toy grammar whose typed expressions produce Torch modules. The GP engine is able to combine module-producing operators, using the type system to match tensor dimensions (and what a stress test for polymorphic type matching routines!). Results are promising when compared to SOTA. More runs with more seeds will be needed, of course. Claude seems good at analyzing results too, assuming it does not hallucinate. All this needs to be thoroughly checked, and that is a task that sadly cannot be delegated to an LLM :-/.
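A toy version of the typed-grammar idea, with tensor widths as the types and backtracking search standing in for the GP engine (illustrative only, not the author's toolkit; the primitive names are made up):

```python
import random

# Toy typed grammar: each primitive maps an input width to an output
# width; composition is legal only when the widths line up.
PRIMITIVES = [
    ("linear_64_32", 64, 32),
    ("linear_32_16", 32, 16),
    ("relu_32", 32, 32),
    ("linear_64_16", 64, 16),
]

def grow(in_dim, out_dim, depth, rng):
    """Randomly grow a pipeline from in_dim to out_dim, using the type
    system (the widths) to reject ill-formed compositions and
    backtracking when a branch cannot reach the target type."""
    if in_dim == out_dim:
        return []          # typed base case: nothing left to produce
    if depth == 0:
        return None        # out of budget without matching the type
    for name, i, o in rng.sample(PRIMITIVES, len(PRIMITIVES)):
        if i == in_dim:
            rest = grow(o, out_dim, depth - 1, rng)
            if rest is not None:
                return [name] + rest
    return None

pipe = grow(64, 16, depth=4, rng=random.Random(0))
```

In the real toolkit each primitive would construct a Torch module, and the GP engine would mutate and recombine these typed expressions rather than sample them blindly.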
-
At 167 tok/s/user interactivity on DeepSeek 670B MoE at 8k context length, it would cost $0.96 per million output tokens on GB200 NVL72 FP4 versus $2.30 per million output tokens on B200, even with DeepSeek system optimizations like disaggregated PD & wide EP enabled.
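For context on how such $/Mtok figures are derived: cost per million output tokens falls out of system cost per hour divided by aggregate token throughput. A sketch with made-up inputs (the hourly price and user count below are illustrative assumptions, not the quoted measurements):

```python
def cost_per_million_tokens(system_cost_per_hour, users, tok_s_per_user):
    """$ per million output tokens from basic serving economics.
    All inputs are hypothetical, not vendor figures."""
    tokens_per_hour = users * tok_s_per_user * 3600
    return system_cost_per_hour * 1e6 / tokens_per_hour

# e.g. a hypothetical $60/hr system serving 1000 concurrent users
# at 167 tok/s/user each:
c = cost_per_million_tokens(60.0, 1000, 167)
```

The formula makes the trade-off explicit: at a fixed interactivity target (tok/s/user), whichever system sustains more concurrent users per dollar-hour wins on $/Mtok.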
-
This is one of those things that people don’t really understand when they buy compute based on early specs and data sheets designed for the C-suite instead of engineering.
At 167 tok/s/user interactivity on DeepSeek 670B MoE at 8k context length, it would cost $0.96 per million output tokens on GB200 NVL72 FP4 versus $2.30 per million output tokens on B200, even with DeepSeek system optimizations like disaggregated PD & wide EP enabled.
-
FLUX.2 [klein] 9B just got 2x faster at image editing, especially with multiple reference images. Same quality, no price increase. The update introduces KV-caching and FP8 quantized weights built with NVIDIA - faster inference, less VRAM, and the speedup grows with every reference image you add. Already on Klein 9B via API? Free upgrade, faster, same price. On Klein 4B and want better quality? 9B is now closer in speed.
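KV-caching is why the speedup grows with every reference image: the keys and values for reference-image tokens are computed once and reused, instead of being recomputed at each step. A generic single-head sketch of the mechanism, unrelated to FLUX.2's actual implementation:

```python
import numpy as np

def attend(q, K, V):
    # Single-head scaled dot-product attention for one query vector.
    s = q @ K.T / np.sqrt(K.shape[1])
    w = np.exp(s - s.max())
    w /= w.sum()
    return w @ V

class KVCache:
    """Keys/values for fixed context (e.g. reference-image tokens) are
    appended once and reused across steps, so only the new tokens'
    projections need computing each time."""
    def __init__(self, d):
        self.K = np.empty((0, d))
        self.V = np.empty((0, d))
    def append(self, k, v):
        self.K = np.vstack([self.K, k])
        self.V = np.vstack([self.V, v])
```

Attention over a cache populated incrementally is identical to attention over the full context computed from scratch; the cache just avoids redoing the projection work.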
-
New Post: Dynamic Layer‑wise Quantization Scaling for FP8 Inference of Generative Pre‑trained Transformers: A Practical Approach Toward 10× Energy Savings - https://lnkd.in/grkhRnFw

### Abstract

We present a novel *Dynamic Layer‑wise Quantization Scaling* (DLQS) framework that enables efficient FP8 inference of large transformer models with negligible loss in accuracy. DLQS adaptively selects per‑layer scaling factors and precision switches for feed‑forward, attention, and output projection sub‑graphs, guided by a lightweight *Range‑Aware Loss* (RAL) signal derived from forward activations. […]
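As a rough illustration of the mechanism DLQS builds on, per-tensor FP8 (E4M3) quantization with a dynamically chosen scale can be simulated in a few lines. This is a sketch under my own assumptions; the paper's RAL-guided per-layer selection is not shown:

```python
import numpy as np

E4M3_MAX = 448.0  # largest finite magnitude representable in FP8 E4M3

def quantize_fp8_sim(x, amax=None):
    """Simulated per-tensor FP8 (E4M3) quantization with a dynamic
    scale chosen from the observed activation range - the basic
    building block a layer-wise scheme would tune per sub-graph."""
    amax = np.abs(x).max() if amax is None else amax
    scale = E4M3_MAX / max(amax, 1e-12)
    xs = np.clip(x * scale, -E4M3_MAX, E4M3_MAX)
    m, e = np.frexp(xs)                      # xs = m * 2**e, |m| in [0.5, 1)
    q = np.ldexp(np.round(m * 16) / 16, e)   # keep ~3 mantissa bits
    return q / scale, scale
```

Rounding the mantissa to 3 bits mimics E4M3's resolution; the dynamic scale keeps the tensor's largest value pinned near the format's range, which is what per-layer scaling factors exist to do.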
Siden • 3K followers • 2mo
As part of the little HLS load testing tool, there is also a Nix config to create a deterministic HLS origin server with ffmpeg + nginx. So basically you can do "nix run .#test-origin-4k-abr", which will build and run a microVM with the origin. It's just the good old ffmpeg test pattern: https://github.com/randomizedcoder/go-ffmpeg-hls-swarm/blob/main/nix/test-origin/ffmpeg.nix