Tag: open-source

59 discussions across 10 posts tagged "open-source".

AI Signal - May 05, 2026

Qwen3.6-35B-A3B: 3B active parameters scoring 73.4% on SWE-bench Verified r/LocalLLM Score: 1716

Alibaba's Qwen3.6-35B-A3B uses a mixture-of-experts architecture (256 experts, only 8+1 active per token) to come within 1.6 points of Claude Opus 4.6 on SWE-bench Verified while running only 3B active parameters at inference. For local AI this is a large cost/performance gain: frontier-level coding performance on a laptop at 10-30x lower cost.
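
To make the "only a few experts active per token" idea concrete, here is a minimal sketch of generic top-k expert routing. It is not Qwen's actual router: the function name `top_k_route` is hypothetical, the router logits are random, and reading "8+1" as 8 routed experts plus 1 always-on shared expert is an assumption.

```python
import math
import random


def top_k_route(router_logits, k):
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    peak = max(router_logits[i] for i in top)  # subtract the max for stability
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


NUM_EXPERTS = 256      # figure quoted in the post
ROUTED_PER_TOKEN = 8   # assumption: the "8" in "8+1" are routed experts

rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]  # stand-in router output
selected = top_k_route(logits, ROUTED_PER_TOKEN)

# Only the selected experts (plus the assumed shared expert) run for this token;
# the other experts' weights stay idle, which keeps the active parameter count low.
for expert_id, weight in selected:
    print(f"expert {expert_id:3d} weight {weight:.3f}")
```

The layer output would then be the weighted sum of the selected experts' outputs, so compute per token scales with the number of active experts rather than the total.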

#llm #local-models #open-source
Llama.cpp MTP support now in beta r/LocalLLaMA Score: 570

Major infrastructure update: llama.cpp now supports Multi-Token Prediction (MTP) in beta, starting with Qwen3.5 MTP. Combined with maturing tensor-parallel support, this should erase most performance gaps between llama.cpp and vLLM for token generation speeds. Significant for local inference infrastructure.

#local-models #open-source
Qwen3.6-27B vs Coder-Next: 20 hours of side-by-side testing r/LocalLLaMA Score: 1061

A comprehensive comparison finds the two models remarkably well matched overall, with different strengths and weaknesses. After extensive testing on two RTX PRO 6000 Blackwells, the conclusion is "it depends": they score similarly across a wide range of tests but succeed and fail on different things. Valuable for understanding local model tradeoffs.

#local-models #code-generation #open-source
it's time to update your Gemma 4 GGUFs r/LocalLLaMA Score: 416

Important maintenance update: Gemma 4's chat template was fixed a few days ago. Users should update their GGUF versions from bartowski and other quantizers. Reminder that even released models continue evolving through chat template improvements and quantization refinements.

#local-models #open-source
Open source models are going to be the future on Cursor, OpenCode etc. r/LocalLLaMA Score: 202

User burned $10 on just 2 prompts using enterprise Cursor (GPT-5.5 and Claude Opus 4.6 thinking), $80 in one week with Claude Opus 4.7. Argues that outrageous frontier pricing will force migration to comparable open-source models costing 5-10x less. Expects this shift within months as providers can't subsidize anymore.

#open-source #local-models #code-generation
White House Considers Vetting A.I. Models Before They Are Released r/LocalLLaMA Score: 372

Discussion of potential pre-release government vetting of AI models. Significant implications for open-source development, research velocity, and competitive dynamics. Community concerned about regulatory capture, slowed innovation, and potential restrictions on open weights releases.

#open-source #llm

AI Signal - April 28, 2026

Anthropic admits to have made hosted models more stupid, proving the importance of open weight, local models r/LocalLLaMA Score: 1264

Following Anthropic's postmortem, the LocalLLaMA community emphasizes how this incident validates the importance of open-weight, local models. When providers can silently change reasoning effort levels and clear context without user consent, it undermines trust in hosted services and makes a strong case for local deployment where users have full control.

#local-models #open-source
Luce DFlash: Qwen3.6-27B at up to 2x throughput on a single RTX 3090 r/LocalLLaMA Score: 628

A GGUF port of DFlash speculative decoding enables 2x throughput improvement for Qwen3.6-27B on a single 24GB RTX 3090. The standalone C++/CUDA stack achieves ~1.98x mean speedup over autoregressive generation across HumanEval, GSM8K, and Math500 benchmarks, with zero retraining required. This represents a significant practical advancement in local inference efficiency.

#local-models #open-source
Microsoft Presents "TRELLIS.2": An Open-Source, 4b-Parameter, Image-To-3D Model r/LocalLLaMA Score: 629

Microsoft released TRELLIS.2, a 4B-parameter open-source image-to-3D model capable of producing up to 1536³ PBR textured assets. Built on native 3D VAEs with 16× spatial compression, it uses a novel "field-free" sparse voxel structure (O-Voxel) to reconstruct arbitrary 3D assets with complex topologies, sharp features, and full PBR materials.

#open-source #3d-generation

AI Signal - April 21, 2026

Qwen3.6-35B-A3B released! r/LocalLLaMA Score: 2233

Qwen released a sparse MoE model with 35B total parameters but only 3B active, under Apache 2.0 license. It delivers agentic coding performance on par with models 10x its active size, strong multimodal perception and reasoning, and supports both thinking and non-thinking modes. This represents a major efficiency breakthrough in open-source models.

#llm #open-source #local-models
Kimi K2.6 is a legit Opus 4.7 replacement r/LocalLLaMA Score: 890

After testing with customer feedback, Kimi K2.6 is the first model that can confidently replace Opus 4.7 for most tasks. While not exceeding Opus 4.7 in any specific area, it handles about 85% of tasks at reasonable quality with added vision and strong browser use capabilities. Users are successfully replacing personal workflows with Kimi K2.6, especially for long time horizon tasks.

#llm #local-models #open-source
Qwen3.6. This is it. r/LocalLLaMA Score: 994

A user gave Qwen3.6 a task to build a tower defense game using MCP screenshots to confirm the build. The model independently noted rendering issues, identified and fixed bugs in wave completions, and successfully delivered a working game. The user expresses amazement at the autonomous debugging and iteration capabilities.

#llm #open-source #code-generation
235m local model trained at home r/LocalLLM Score: 196

A developer built a 235M parameter transformer language model completely from scratch in PyTorch, training every parameter from raw text on a single consumer GPU. Uses LLaMA-style architecture (GQA, SwiGLU, RoPE, RMSNorm, tied embeddings) with bf16 and gradient checkpointing. This demonstrates that meaningful model training is accessible to individual developers.

#local-models #machine-learning #open-source
Gemma-4-E2B's safety filters make it unusable for emergencies r/LocalLLaMA Score: 397

Testing Google's Gemma-4-E2B-it as a local offline resource for emergency preparedness revealed aggressive safety filters that refuse first aid procedures, technical repairs, and emergency scenarios. The model issues "hard refusals" on almost everything that could be useful in actual emergency situations, making it functionally useless for offline emergency information.

#local-models #open-source
Same prompt for various models - Chroma, Z image, Klein, Qwen, Ernie r/StableDiffusion Score: 315

Systematic comparison of image generation models (Klein 9b distilled, Zetachroma development version, and others) using identical prompts to evaluate which performs best with certain themes and approaches Midjourney quality. Workflows included in images for reproducibility. This represents valuable empirical model comparison beyond benchmark scores.

#image-generation #open-source
Gemma 4 26B-A4B GGUF Benchmarks r/LocalLLaMA Score: 223

KL Divergence benchmarks for Gemma 4 26B-A4B GGUFs across providers show Unsloth GGUFs on the Pareto frontier in 21 of 22 sizes. KLD measures how well quantized models match original BF16 output distribution. Unsloth also updated Q6_K quants to be more dynamic, significantly improving performance.

#local-models #open-source

AI Signal - April 14, 2026

Best Local LLMs — Apr 2026 r/LocalLLaMA Score: 368

The monthly megathread has arrived, and this edition is particularly dense. New entries include Qwen3.5 and Gemma4 series, GLM-5.1 claiming SOTA-level performance, Minimax-M2.7 as an accessible "Sonnet at home," and PrismML Bonsai 1-bit models that apparently actually work. This is the clearest snapshot of the local model landscape available anywhere, updated to reflect real community usage rather than benchmark scores alone.

#local-models #open-source
Updated Qwen3.5-9B Quantization Comparison r/LocalLLaMA Score: 184

A KLD (KL Divergence) evaluation across community GGUF quantizations of Qwen3.5-9B, measuring drift from the BF16 baseline. Rather than relying on benchmark scores, this approach tests how closely each quantized model preserves the original's probability distributions — a more principled method for choosing quantization levels. With a 0.99 upvote ratio, this stands out as a genuinely useful reference artifact for local model users.
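
As a rough illustration of what such a KLD comparison measures, the sketch below computes the mean per-token KL divergence between a baseline model's logits and a quantized model's logits. The function names and the toy logits are hypothetical; a real evaluation would collect logits from both models over the same evaluation text.

```python
import math


def softmax(logits):
    """Convert raw logits into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(P || Q) in nats; eps guards against log(0) after underflow."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0.0)


def mean_kld(baseline_logits, quantized_logits):
    """Average KL divergence over token positions (one logit vector per position)."""
    per_token = [kl_divergence(softmax(b), softmax(q))
                 for b, q in zip(baseline_logits, quantized_logits)]
    return sum(per_token) / len(per_token)


# Toy example: two token positions over a 3-word vocabulary.
bf16 = [[2.0, 1.0, 0.1], [0.3, 2.5, -1.0]]
quant = [[1.9, 1.1, 0.1], [0.2, 2.4, -0.8]]
print(f"mean KLD: {mean_kld(bf16, quant):.6f}")
```

A value near zero means the quantized model assigns almost the same probabilities as the baseline at every position, which is why lower is better when choosing a quantization.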

#local-models #open-source
Free Open-Source Tool to Instantly Rig and Animate Your Illustrations (Also With Mesh Deform) r/StableDiffusion Score: 1226

The `see-through` model — released the week prior — decomposes a single static anime image into 23 separate layers for rigging. The author built an open-source tool on top of it that handles mesh deformation and animation, eliminating the need for expensive manual rigging. This makes professional-quality 2D character animation accessible without specialized software or large budgets. 0.98 upvote ratio on 81 comments.

#image-generation #open-source
Update: Distilled v1.1 Is Live (LTX-2.3) r/StableDiffusion Score: 518

LTX-2.3's distilled model gets a v1.1 checkpoint with improved audio quality and refined visual aesthetics. Updated ComfyUI workflows included. The 0.99 upvote ratio on 115 comments indicates this is a clean, uncontroversial improvement release. The companion post (#29) provides a quantitative before/after comparison showing the audio mumbling issue from v1.0 is addressed.

#image-generation #open-source
ERNIE Image Released r/StableDiffusion Score: 168

Baidu released ERNIE Image and ERNIE Image Turbo on HuggingFace (baidu/ERNIE-Image and baidu/ERNIE-Image-Turbo). Low score but 88 comments and a 0.99 upvote ratio suggest genuine community interest. Another Chinese lab entering the open image generation space, worth tracking as a comparison point to FLUX and SD3.

#image-generation #open-source

AI Signal - April 07, 2026

Gemma 4 has been released r/LocalLLaMA Score: 2265

Google released Gemma 4, marking a significant moment for local AI with fully open weights and the ability to run completely locally via Ollama. Multiple variants are available (26B-A4B, 31B, E4B, E2B) offering frontier-level performance without cloud dependencies or API subscriptions.

#llm #open-source #local-models
Gemma 4 just casually destroyed every model on our leaderboard except Opus 4.6 and GPT-5.2 r/LocalLLaMA Score: 1671

Gemma 4 (31B) achieved remarkable results on production benchmarks: 100% survival rate, 5/5 profitable runs, +1,144% median ROI at just $0.20/run. It significantly outperforms GPT-5.2, Gemini 3 Pro, Sonnet 4.6, and all Chinese open-source models tested, with only Opus 4.6 performing better at 180× the cost.

#llm #open-source #local-models
This open-source Claude Code setup is actually insane r/AIagents Score: 297

Open-sourced Claude Code configuration with 27 agents, 64 skills, and 33 commands pre-configured for planning, code review, fixes, TDD, and token optimization. Includes AgentShield with 1,282 built-in security tests to prevent common agentic vulnerabilities.

#agentic-ai #open-source #development-tools
What it took to launch Google DeepMind's Gemma 4 r/LocalLLaMA Score: 1034

Behind-the-scenes look at the infrastructure, training, and engineering effort required to launch Gemma 4. Provides insight into Google DeepMind's approach to open model releases and the technical challenges involved.

#llm #open-source
I trained my own LLM from scratch. It's a fish. r/LLM Score: 227

Guppy, a 9M parameter transformer trained on 60K synthetic fish conversations, demonstrates personality-driven LLM training. The model maintains a consistent fish-centric worldview and refuses to engage with topics outside its conceptual framework.

#llm #open-source
FLUX.2 [dev] works really well in ComfyUI now r/StableDiffusion Score: 254

ComfyUI's new low-VRAM optimizations enable FLUX.2 [dev] to run on consumer GPUs (RTX 4060Ti 16GB). While slower than Klein (75s vs 15s), it achieves the best character consistency of any open-weight image generation model.

#image-generation #open-source
Flux2Klein EXACT Preservation (No Lora needed) r/StableDiffusion Score: 254

ComfyUI-Flux2Klein-Enhancer node pack achieves exact character preservation without LoRA training by improving prompt adherence and style consistency. Demonstrates architectural improvements to FLUX.2 Klein's capabilities through better node configurations.

#image-generation #open-source
Ace-step 1.5XL's already up r/StableDiffusion Score: 98

Ace-step v1.5 XL released with ComfyUI support in nightly builds. Multiple variants available (turbo, merge, SFT) optimized for different speed/quality tradeoffs in image generation workflows.

#image-generation #open-source
I Gave Claude Its Own Radio Station — It Won't Stop Broadcasting r/AI_Agents Score: 316

WRIT-FM is a 24/7 AI radio station where Claude CLI generates all content in real time—5 distinct AI hosts with unique personalities, full scripts, music curation, transitions, and station imaging. Continuously running production system demonstrating sustained agentic content generation.

#agentic-ai #open-source
An actress Milla Jovovich just released a free open-source AI memory system r/singularity Score: 885

Open-source AI memory system achieved 100% score on LongMemEval benchmark, outperforming paid solutions. Represents unexpected contribution from outside traditional AI development circles.

#open-source #agentic-ai

AI Signal - March 31, 2026

Semantic video search using local Qwen3-VL embedding, no API, no transcription r/LocalLLaMA Score: 353

Developer built semantic video search by embedding raw video directly into vector space using Qwen3-VL. No transcription or frame captioning needed—just natural language queries against video clips. The 8B model runs fully local on 18GB RAM with usable results.
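
The retrieval step in a setup like this usually reduces to nearest-neighbor search over embedding vectors. Below is a minimal sketch using cosine similarity. The clip IDs and 3-dimensional vectors are made up for illustration; in the post's setup the vectors would come from the Qwen3-VL embedding model, whose API is not shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query_vec, clip_index, top_k=3):
    """Rank stored clip embeddings by similarity to the query embedding."""
    scored = [(cosine_similarity(query_vec, vec), clip_id)
              for clip_id, vec in clip_index.items()]
    scored.sort(reverse=True)
    return [(clip_id, score) for score, clip_id in scored[:top_k]]


# Hypothetical index: clip ID -> embedding of that video clip.
clip_index = {
    "clip_000": [0.9, 0.1, 0.0],
    "clip_001": [0.0, 1.0, 0.0],
    "clip_002": [0.1, 0.0, 0.95],
}
query = [1.0, 0.2, 0.0]  # stand-in for the embedded text query
for clip_id, score in search(query, clip_index):
    print(clip_id, round(score, 3))
```

Because text queries and video clips share one vector space, no transcript or caption is needed; the ranking works directly on the embeddings.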

#local-models #open-source
llama.cpp at 100k stars r/LocalLLaMA Score: 958

llama.cpp reaches 100,000 GitHub stars, marking it as one of the most popular AI infrastructure projects. The library enables efficient LLM inference on consumer hardware and has become foundational for the local AI ecosystem.

#local-models #open-source

AI Signal - March 24, 2026

The current state of the Chinese LLMs scene r/LocalLLaMA Score: 450

Comprehensive overview of the Chinese LLM landscape. ByteDance's dola-seed (Doubao) leads the proprietary market. Alibaba confirmed its commitment to continuously open-sourcing Qwen and Wan models. DeepSeek's hybrid MoE models remain popular for cost-efficiency. Tencent and Baidu lag behind.

#llm #open-source
A "phone" company is now competing with Anthropic on AI benchmarks r/singularity Score: 409

Xiaomi's MiMo-V2-Pro (1T params) ranks #3 globally on agent tasks, behind Claude Opus 4.6, at 1/8th the price. Flash (309B, open source) beats all other open source models on SWE-Bench at $0.10/million tokens. Lead researcher came from DeepSeek. Model initially appeared on OpenRouter as "Hunter Alpha" with no attribution.

#llm #open-source
Alibaba confirms they are committed to continuously open-sourcing new Qwen and Wan models r/LocalLLaMA Score: 1136

Official confirmation from Alibaba that they will continue releasing Qwen and Wan models as open source. Crucial for ecosystem stability and developer confidence in building on these foundations.

#llm #open-source
OpenClaw is the new computer - Jensen Huang r/AIagents Score: 347

OpenClaw reached 300,000 GitHub stars, surpassing React and Linux to become the most popular open source project in history. Jensen Huang's quote highlights the shift from traditional computing paradigms to agentic systems.

#agentic-ai #open-source
daVinci-MagiHuman: This new opensource video model beats LTX 2.3 r/StableDiffusion Score: 359

New 15B open-source Audio-Video model from GAIR claiming to beat LTX 2.3. Expanding capabilities for local video generation with audio synchronization.

#image-generation #open-source
China's open-source dominance threatens US AI lead, US advisory body warns r/LocalLLaMA Score: 509

US government advisory body warning about Chinese open-source AI dominance. Qwen, DeepSeek, and other models gaining traction globally. Policy implications for AI development and distribution.

#open-source #llm

AI Signal - March 17, 2026

Qwen3.5-9B-Claude-4.6-Opus-Uncensored-Distilled-GGUF r/LocalLLaMA Score: 1341

A distillation of Claude Opus 4.6 into Qwen3.5-9B, making frontier-model-quality responses available for local deployment. The GGUF format and 9B parameter size make this practical for consumer hardware. The 27B version includes thinking mode by default. This represents significant progress in democratizing access to capable models through distillation techniques.

#local-models #llm #open-source
OpenCode concerns (not truely local) r/LocalLLaMA Score: 396

Important security finding: OpenCode's web UI proxies all requests to app.opencode.ai by default, despite being marketed as a local solution. This defeats the privacy and security benefits users expect from "local" tools. The post includes code references and raises questions about transparency in open-source tooling.

#local-models #development-tools #open-source
Showing real capability of LTX loras! Dispatch LTX 2.3 LORA with multiple characters + style r/StableDiffusion Score: 751

Impressive demonstration of LTX 2.3 LORA training with 440 clips from the game Dispatch, achieving multiple character and style preservation in text-to-video generation. The training included 6+ characters with distinct voices and game aesthetics. Shows progress in controllable video generation with LoRA fine-tuning.

#image-generation #open-source
[P] I got tired of PyTorch Geometric OOMing my laptop, so I wrote a C++ zero-copy graph engine to bypass RAM entirely. r/MachineLearning Score: 344

GraphZero v0.2 addresses Graph Neural Network training on large datasets (Papers100M) by bypassing RAM entirely using memory-mapped I/O and zero-copy techniques. Instead of loading everything into memory, it streams data directly from optimized binary formats. Enables GNN training on datasets previously requiring server-grade hardware.
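
The core idea of memory-mapped access can be sketched in a few lines. This is not GraphZero's code or file format: the fixed-width float32 row layout, `FEATURE_DIM`, and the helper names are assumptions for illustration. Note that `struct.unpack_from` still copies the requested row into Python floats; the point is only that the rest of the file is never read into memory up front.

```python
import mmap
import os
import struct
import tempfile

FEATURE_DIM = 4                 # hypothetical number of features per node
ROW_BYTES = FEATURE_DIM * 4     # float32 = 4 bytes each
ROW_FORMAT = f"<{FEATURE_DIM}f"  # little-endian float32 row


def write_features(path, rows):
    """Write node features as fixed-width binary rows."""
    with open(path, "wb") as f:
        for row in rows:
            f.write(struct.pack(ROW_FORMAT, *row))


def read_row(mm, index):
    """Decode one row straight from the mapping; the OS pages in only what is touched."""
    return struct.unpack_from(ROW_FORMAT, mm, index * ROW_BYTES)


def demo():
    rows = [[float(i + j) for j in range(FEATURE_DIM)] for i in range(1000)]
    fd, path = tempfile.mkstemp(suffix=".bin")
    os.close(fd)
    try:
        write_features(path, rows)
        with open(path, "rb") as f, \
                mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return read_row(mm, 742)  # random access without loading the file
    finally:
        os.remove(path)


print(demo())
```

With a fixed row width, the byte offset of any node is a single multiplication, which is what makes random mini-batch sampling over a file much larger than RAM practical.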

#machine-learning #open-source
Qwen3.5-9B on document benchmarks: where it beats frontier models and where it doesn't. r/LocalLLaMA Score: 222

Detailed benchmarking of Qwen3.5 models (0.8B to 9B) on document AI tasks. Qwen3.5-9B outperforms GPT-5.4, Claude Sonnet 4.6, and Gemini 3.1 Pro on OCR tasks but lags on structured extraction. The granular breakdown helps developers choose the right model for specific document processing needs.

#local-models #llm #open-source
Mistral Small 4:119B-2603 r/LocalLLaMA Score: 580

Release announcement for Mistral Small 4, a 119B parameter model. The model represents Mistral's continued development of capable open-weight models in the mid-size range, balancing capability and resource requirements for local deployment.

#local-models #llm #open-source

AI Signal - March 10, 2026

ComfyUI launches App Mode and ComfyHub r/StableDiffusion Score: 334

ComfyUI introduced App Mode (internally called "comfyui 1111"), which transforms complex workflows into simple, shareable UIs. Users can select input parameters and create web UI-like interfaces from any workflow. ComfyHub provides a centralized workflow repository, lowering the barrier to entry for non-technical users while preserving ComfyUI's node-based power for advanced users.

#image-generation #open-source
Qwen3.5 family comparison on shared benchmarks r/LocalLLaMA Score: 1082

Comprehensive benchmark comparison shows Qwen3.5's 122B, 35B, and especially 27B models retain significant performance from the flagship, while 2B/0.8B fall off harder on long-context and agent categories. The 27B model emerges as a sweet spot for local deployment, offering near-flagship performance at much lower computational requirements.

#llm #local-models #open-source
Open WebUI's New Open Terminal + "Native" Tool Calling + Qwen3.5 35b = Holy Sh!t!!! r/LocalLLaMA Score: 891

Open WebUI released a new terminal integration with native tool calling support. Combined with Qwen3.5 35B, it enables local agentic workflows comparable to frontier API services. The Open Terminal function allows models to execute shell commands with user approval, while the workflow hub facilitates sharing of agent configurations.

#agentic-ai #local-models #open-source
Heretic has FINALLY defeated GPT-OSS with a new experimental decensoring method called ARA r/LocalLLaMA Score: 685

The Heretic project introduced Arbitrary-Rank Ablation (ARA), a new decensoring method that dramatically reduces refusals. Previous best results showed 74 refusals even after Heretic processing; ARA reduces this significantly. This represents a major advancement in removing alignment restrictions from open-weight models.

#llm #local-models #open-source

AI Signal - March 03, 2026

Qwen3.5-35B-A3B-4bit r/OpenSourceAI Score: 269

At 60 tokens/second on an Apple M1 Ultra at 4-bit, Qwen3.5's MoE variant is generating genuine excitement in the open-source community: real performance validation from hands-on users rather than hype-driven buzz. The combination of a 35B total parameter count with only ~3B active parameters per token makes this a landmark moment for local AI capability. Relative to the subreddit's median score of 12, this post's 269 score is a strong signal.

#llm #open-source #local-models
[P] I trained Qwen2.5-1.5b with RLVR (GRPO) vs SFT and compared benchmark performance r/MachineLearning Score: 26

A practitioner ran a direct RLVR vs SFT comparison on Qwen2.5-1.5B using GSM8K, finding RLVR (the technique behind DeepSeek-R1) boosted math reasoning by +11.9 points while SFT *degraded* it by 15.2. This hands-on replication confirms at small scale what frontier labs have been showing: reinforcement learning with verifiable rewards is a step-change over supervised fine-tuning for reasoning tasks. Highly relevant for anyone experimenting with fine-tuning open models.
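
For readers unfamiliar with the term, a "verifiable reward" is simply a programmatic check of the model's answer. Below is a minimal sketch for GSM8K-style problems, whose reference solutions end in `#### <number>`. It is not the poster's training code; the function names are hypothetical, and taking the last number in the completion as the model's answer is a simplifying assumption. The second function shows the group-relative normalization that GRPO applies to such rewards.

```python
import re

NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def extract_final_number(text):
    """Return the last number in the text (thousands separators removed), or None."""
    matches = NUMBER.findall(text.replace(",", ""))
    return float(matches[-1]) if matches else None


def verifiable_reward(completion, reference):
    """1.0 if the completion's final number matches the reference answer, else 0.0."""
    predicted = extract_final_number(completion)
    target = extract_final_number(reference.split("####")[-1])
    if predicted is None or target is None:
        return 0.0
    return 1.0 if abs(predicted - target) < 1e-6 else 0.0


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of samples for the same prompt."""
    mean = sum(rewards) / len(rewards)
    std = (sum((r - mean) ** 2 for r in rewards) / len(rewards)) ** 0.5
    return [(r - mean) / (std + eps) for r in rewards]


reference = "Natalia sold 48 + 24 = 72 clips. #### 72"
samples = ["48 + 24 = 72, so the answer is 72", "I think it is 70"]
rewards = [verifiable_reward(s, reference) for s in samples]
print(rewards, group_relative_advantages(rewards))
```

SFT, by contrast, trains the model to imitate reference solutions token by token, with no signal about whether the final answer is actually correct.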

#llm #machine-learning #open-source
[P] We made GoodSeed, a pleasant ML experiment tracker r/MachineLearning Score: 85

GoodSeed v0.3.0 is a self-hostable ML experiment tracker positioned as a Neptune replacement, featuring GPU/CPU monitoring, stdout streaming, and a clean UI. At a subreddit median of 26, a score of 85 with 19 comments represents real traction. For teams running local training loops, having a lightweight open-source tracker that doesn't phone home is a real gap — this is worth watching.

#mlops #open-source #development-tools
A 16-problem RAG failure map that LlamaIndex just adopted (semantic firewall, MIT, step-by-step examples) r/LlamaIndex Score: 7

The author published a structured failure-mode checklist for RAG systems covering 16 reproducible failure categories — and LlamaIndex adopted it into their official RAG troubleshooting docs. The post walks through each failure mode with concrete LlamaIndex examples. For anyone building production RAG pipelines, this is a structured diagnostic tool worth bookmarking.

#rag #agentic-ai #open-source
Anyone doing real evals for open models? What actually worked for you r/OpenSourceAI Score: 13

A developer building an internal chatbot is transitioning from manual testing to systematic evals and wants battle-tested approaches. The 1.0 upvote ratio and active discussion suggest the community has real opinions here. The framing — comparing endpoints after prompt/model changes — is a canonical use case for eval frameworks, and the mention of DeepEval + Confident AI gives concrete starting points.

#llm #open-source #development-tools
Open Source LLM Tier List r/OpenSourceAI Score: 163

A community-curated leaderboard of self-hostable LLMs with relative tier rankings. At a score of 163 against a subreddit median of 12, this received exceptional engagement — it's hitting a real need for a quick reference beyond raw benchmarks. The link points to a live leaderboard at onyx.app.

#llm #open-source #local-models
Qwen tech lead and multiple other Qwen employees are leaving Alibaba r/StableDiffusion Score: 179

Organizational news with direct implications for the open-source ecosystem: if the Qwen team is fragmenting, timelines for future releases (including Qwen Image 2.0) become uncertain. The irony of this appearing in r/StableDiffusion reflects how much the image generation community has come to depend on Qwen's multimodal roadmap.

#llm #open-source
I made an open source one image debug poster for RAG failures. Feel free to just take it and use it r/OpenSourceAI Score: 5

A single-image RAG debugging reference that can be uploaded directly into any LLM alongside a failing run to get structured diagnostic suggestions — no install required. The "upload to LLM" use pattern is a clever zero-friction distribution mechanism for debugging tools.

#rag #open-source #development-tools
Ollama 0.17.5 released and fixed the Qwen3.5 gguf issues! r/OpenSourceAI Score: 7

A quick note that Ollama 0.17.5 resolved compatibility issues with Qwen3.5 GGUF files, unblocking local users who were stuck on broken imports. Minor but operationally useful for anyone running Qwen3.5 via Ollama.

#local-models #open-source
GyBot/GyShell v1.1.0 — OpenSource Terminal where agent collaborates with you in all tabs r/AgentsOfAI Score: 13

GyShell is an open-source terminal that embeds an AI agent across all tabs, supporting full interactive control (Ctrl+C, vim, docker), built-in SSH, and now a filesystem panel for remote file management. The "user can step in anytime" design philosophy is a sensible middle ground between full autonomy and purely manual operation.

#agentic-ai #open-source #development-tools