Local AI Models with Joe Finney
🎯 Summary
This 55-minute episode of .NET Rocks features hosts Carl Franklin and Richard Campbell interviewing Joe Finney, a mobile product owner and Microsoft MVP, about the practical implementation of local AI models on Windows devices, moving beyond the hype of large language models (LLMs).
1. Focus Area
The primary focus is the integration of local, specialized AI/ML models (like OCR, image segmentation, and detection) into native Windows applications (WPF, WinUI, WinForms). The discussion contrasts the ease of using new, built-in Windows AI APIs with the complexity of managing custom or external models via WinML or direct integration.
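To illustrate the easiest of those paths, here is a minimal sketch of calling the OS-managed Phi Silica model through the Windows AI APIs. The namespace and method names below follow the experimental Windows App SDK preview from around the Copilot+ launch and have shifted between releases, so treat them as assumptions to verify against current documentation:

```csharp
using System;
using Microsoft.Windows.AI.Generative; // Windows App SDK, experimental channel

// Tier 1: the OS owns the model. Availability is gated on Copilot+ hardware,
// so check first and let Windows download/ready the model if needed.
if (!LanguageModel.IsAvailable())
{
    await LanguageModel.MakeAvailableAsync();
}

var languageModel = await LanguageModel.CreateAsync();
var response = await languageModel.GenerateResponseAsync(
    "Summarize: local AI models on Windows.");
Console.WriteLine(response.Response);
```

Note that nothing here ships with the app: the model download, memory, and updates are all handled by Windows.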
2. Key Technical Insights
- Three Tiers of Local AI Integration: Developers can engage with local AI via three complexity levels: 1) New Windows AI APIs (easiest, model managed by OS, requires Copilot+ PC hardware); 2) WinML (middle layer, allows running custom ONNX models, requires developer to manage model files); and 3) Custom Model Training/Integration (highest complexity, maximum fine-tuning).
- The Role of WinML and ONNX: WinML provides a standardized interface for running downloaded ONNX models across different hardware accelerators (CPU, GPU, NPU) without requiring hardware-specific optimization from the developer; see the sketch after this list.
- Beyond LLMs: The conversation strongly emphasizes that "AI" encompasses much more than LLMs, highlighting mature fields like OCR (Tesseract), image segmentation, and object detection, which are seeing improvements via local model deployment.
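To make the middle tier concrete, the following sketch loads a downloaded ONNX model using the Microsoft.ML.OnnxRuntime NuGet package rather than the WinML API surface itself; the model path, input name, and tensor shape are placeholders for whatever model you pull down:

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML.OnnxRuntime;
using Microsoft.ML.OnnxRuntime.Tensors;

// Tier 2: the app manages the model file itself (shipped or downloaded).
using var session = new InferenceSession(@"models\segmentation.onnx");

// Input name and shape depend entirely on the model; in real code, read
// them from session.InputMetadata instead of hard-coding.
var input = new DenseTensor<float>(new[] { 1, 3, 224, 224 });
var inputs = new List<NamedOnnxValue>
{
    NamedOnnxValue.CreateFromTensor("input", input)
};

// Run inference; outputs come back as named tensors.
using var results = session.Run(inputs);
float[] scores = results.First().AsEnumerable<float>().ToArray();
```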
3. Business/Investment Angle
- Hardware-Enabled Features: The rollout of Copilot+ PCs creates a new baseline for consumer expectations, where developers can easily "light up" advanced local AI features if the hardware is present (see the fallback sketch after this list), simplifying deployment for certain capabilities.
- Model Management Overhead: A key business consideration is the overhead of shipping large models (e.g., 5GB) with an application versus relying on OS-managed models, which impacts app size and update cycles.
- Niche Model Value: For specialized tasks requiring proprietary data or extreme accuracy, the investment in training and integrating custom models (Tier 3) remains viable for specific business applications.
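One way to reconcile "light up if present" with the reality of mixed hardware is to probe for an accelerator at startup and fall back to the CPU. A sketch, assuming the Microsoft.ML.OnnxRuntime.DirectML package and a hypothetical CreateSession helper:

```csharp
using System;
using Microsoft.ML.OnnxRuntime;

// Hypothetical helper: prefer a DirectML-accelerated session (GPU/NPU) when
// the hardware and driver support it, otherwise fall back to plain CPU so
// the feature still works on every machine.
static InferenceSession CreateSession(string modelPath)
{
    var options = new SessionOptions();
    try
    {
        // Requires the Microsoft.ML.OnnxRuntime.DirectML package; throws
        // where no DirectML-capable device is available.
        options.AppendExecutionProvider_DML();
    }
    catch (Exception)
    {
        // No accelerator: ONNX Runtime runs on the default CPU provider.
    }
    return new InferenceSession(modelPath, options);
}

// Usage: the same call works on a Copilot+ PC and a plain desktop.
using var session = CreateSession(@"models\detector.onnx");
```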
4. Notable Companies/People
- Joe Finney: Discussed his work on productivity apps like TextGrab (local OCR; see the sketch after this list) and how he is adapting them to leverage new local AI capabilities.
- Microsoft: Mentioned the new Windows AI APIs (released around the Copilot+ PC launch) and the AI Dev Gallery app (a local model playground).
- Hugging Face: Highlighted as a massive online repository for downloading various specialized ML models (not just LLMs).
- Kaggle: Mentioned as a platform for practicing ML skills and participating in data science competitions.
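Since TextGrab-style local OCR comes up throughout the episode, here is a minimal sketch of the long-standing in-box Windows.Media.Ocr API, which runs on any Windows 10/11 machine without Copilot+ hardware; the file path is a placeholder, and real code should null-check the engine:

```csharp
using System;
using Windows.Graphics.Imaging;
using Windows.Media.Ocr;
using Windows.Storage;
using Windows.Storage.Streams;

// Decode an image into a SoftwareBitmap in a format the OCR engine accepts.
StorageFile file = await StorageFile.GetFileFromPathAsync(@"C:\temp\capture.png");
using IRandomAccessStream stream = await file.OpenAsync(FileAccessMode.Read);
BitmapDecoder decoder = await BitmapDecoder.CreateAsync(stream);
SoftwareBitmap bitmap = await decoder.GetSoftwareBitmapAsync(
    BitmapPixelFormat.Bgra8, BitmapAlphaMode.Premultiplied);

// Returns null if no OCR language pack matches the user's profile languages.
OcrEngine engine = OcrEngine.TryCreateFromUserProfileLanguages();
OcrResult result = await engine.RecognizeAsync(bitmap);
Console.WriteLine(result.Text);
```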
5. Future Implications
The industry is moving toward ubiquitous, accessible local AI processing on consumer hardware. Developers will increasingly need to decide whether to leverage the simplified, OS-managed AI stack or delve into more complex, custom model integration. The conversation suggests a return to focusing on specific, functional AI tasks (like OCR) now that the foundational hardware support is being standardized.
6. Target Audience
This episode is highly valuable for Software Developers (especially Windows/Desktop developers), Product Owners, and AI/ML Engineers interested in the practical, low-level integration of machine learning capabilities directly into native applications, particularly those focused on productivity and utility software.
💬 Key Insights
"If you go to Hugging Face and you look at all the different categories, I mean, OCR, image segmentation, image detection, object detection. Hugging Face. Hugging Face. Hugging Face. Yeah, this is a, I think Facebook is kind of backing it and it's a big repository for models."
"The Copilot Plus PC has a local LLM built into it."
"The first one is the new Windows AI APIs... You don't have to manage models. You don't have to manage memory or downloading. And you don't have to worry about shipping a five-gig model with your app."
"And also you have to consider the big question of why would you build local ever?"
"I'd just reach for this stuff. While Microsoft is exposing in Windows and their Phi model, it's pretty good. It's pretty robust. And I would say it's a nice middle ground there for building on top of and fine-tuning."
"These LLMs don't read PDFs by default locally. You do have to get them into a text format. So if you're thinking about how you can apply this into your work, and I know a lot of enterprises, a lot of companies, a lot of their data is not in raw text format."