Pocster Posted 20 hours ago Author Posted 20 hours ago I emailed him, seeing if there was some wriggle room on the price (just for fun). It's a bit cheaper now! 🤣
SteamyTea Posted 16 hours ago Posted 16 hours ago (edited) At £32k, you could employ a few apprentices for several years, maybe from Thailand. I am sure they could turn you on. Whoops, meant lights on. I asked ChatGPT what the difference between light and hard was. It said 'you can sleep with a light on' Edited 16 hours ago by SteamyTea
Pocster Posted 12 hours ago Author Posted 12 hours ago 3 hours ago, SteamyTea said: At £32k, you could employ a few apprentices for several years, maybe from Thailand. I am sure they could turn you on. Whoops, meant lights on. I asked ChatGPT what the difference between light and hard was. It said 'you can sleep with a light on' I messaged him directly; he said he'd do it for £27k outside eBay 😂😂😂😂
Pocster Posted 12 hours ago Author Posted 12 hours ago Honestly! As an ex software engineer I'm stunned! We binned Qwen Coder; too many silly errors. The UMA-8 microphone array is up and recognises "hey Jeaves" - tomorrow, a custom wake word. It seems stable as is, but I'll leave it running overnight. I have a fully fledged subscription protocol running, so any new feature is easy to add and unlikely to bust anything else. So the skeleton seems good! STT should be easy. I was going the Porcupine route, but it's subscription-based, so went with openWakeWord and Whisper. My rules are no cost and no cloud - (expletive deleted) that! To build an entire system with me manually doing zero coding is still insane 🤯
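The "subscription protocol" described above could be sketched as a minimal publish/subscribe bus, where each new feature registers a handler for the topics it cares about, so adding one is unlikely to break another. The names and structure here are illustrative only, not the actual code from the project:

```python
from collections import defaultdict
from typing import Callable

class EventBus:
    """Minimal publish/subscribe bus: features subscribe to topics,
    so a new feature is one extra subscribe() call, nothing else changes."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[dict], object]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], object]) -> None:
        """Register a feature's handler for a topic, e.g. 'wake_word'."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: dict) -> list:
        """Deliver an event to every subscriber of its topic.
        Handlers run independently; one failing feature can't take
        the others down with it."""
        results = []
        for handler in self._subscribers[topic]:
            try:
                results.append(handler(payload))
            except Exception as exc:
                results.append(exc)  # record the failure and carry on
        return results
```

A feature like wake-word logging then plugs in with `bus.subscribe("wake_word", handler)`, and the microphone pipeline just calls `bus.publish("wake_word", {"phrase": ...})` without knowing who is listening.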
SimonD Posted 11 hours ago Posted 11 hours ago 42 minutes ago, Pocster said: To build an entire system with me manually doing zero coding is still insane 🤯 I know why you want to avoid the cloud, but for a little compromise and a good few grand saved, I have to say that Claude Code is probably at the front right now. Have you tried it? I've tried Deepseek, which is pretty bloody amazing too, but the way that Claude can build an entire stack, spit out all the relevant code in files, plus the debugging capability, just has me floored. The other advantage is persistent memory across conversations and chat history, including file retrieval - so you can put in all your skills and provide a context that gets updated as you work. Deepseek gets a bit of a pain because as soon as you notice the context memory starting to degrade, you've got to create a new prompt with the current context and paste it into a new chat. Really don't get on with ChatGPT. Last summer, when I was designing a DB schema and doing it the good old-fashioned way with manual normalisation etc., it was just taking forever. Put my requirements in along with a decent prompt and it spat out the schema in about 3 minutes, including all the relationships, keys and foreign keys, in an SVG too, plus it then writes all the SQL. What's amazing is how it then points you to available tools you'd never have heard about without weeks of trawling various tech sites, and takes you through how to integrate and implement them. It is absolutely incredible. Do you not find that you still have to do a little stitching in of code and a little nudging for debugging though?
So you need to make sure the code is properly commented - that's been the thing with Claude for me: it makes sure the necessary commenting is in there. Deepseek stitching was a bit more painful, as the line numbers it gave me were always quite a way off - well, actually a lot - and I still have a 3/4-finished app on there that I can't quite face going back to in order to resolve the bugs right now. But I think with all these tools you still have to properly keep them on track, as they do tend to forget stuff, and when you're dealing with mathematics and especially applied physics, you've got to be very careful.
Pocster Posted 11 hours ago Author Posted 11 hours ago 3 minutes ago, SimonD said: I know why you want to avoid the cloud, but for a little compromise and a good few grand saved, I have to say that Claude Code is probably at the front right now. Have you tried it? [...] Do you not find that you still have to do a little stitching in of code and a little nudging for debugging though? My issue was cost per month for the number of tokens I use. Also, I need the LLM local to do STT in real time. I'm not suggesting local models match frontier ones yet (if ever!), but with a bit of workflow adjustment you can get pretty far. ChatGPT Plus blows my mind for £20 a month. I use it as my co-boss to check everything. I think the media gets obsessed with "the best", but you get to a point where it's clearly good enough for your needs. I've done advanced physics / rendering / maths solely with ChatGPT and it was fantastic. It required work to get it to do things correctly and not (expletive deleted) up, so I've changed my workflow; it's now better and more efficient. Ultimately I'll ditch ChatGPT when my local reasoner is good enough. Like all cloud-based stuff, line drops, timeouts, stalls and delays - i.e. workflow rather than model issues - wasted more time than a slow model!
Pocster Posted 11 hours ago Author Posted 11 hours ago (edited) As an aside: a few months ago, whilst testing models with Chat, we defined a test spec and wrote a prompt for it. The coder produced OK code, but a bit amateurish. Chat then redid the prompt, and the coder model's code was much more robust and professional - on my level! We went back to Chat and did a full, deep prompt, and the coder then produced top-grade software engineering code. I could do that myself, but it would take weeks. Chat was demonstrating that a poor prompt is more problematic than a poor coding model. So I make sure prompts are solid! Edited 11 hours ago by Pocster
-rick- Posted 11 hours ago Posted 11 hours ago (edited) 15 minutes ago, Pocster said: My issue was cost per month for number of tokens I use . Anthropic just announced a big increase in the number of tokens included in their subscriptions, and OpenAI's Codex is (apparently) already generous, so maybe worth another look. I've seen plenty of people talking about techniques for minimising token usage. 15 minutes ago, Pocster said: Also I need llm local to do stt realtime. This is the thing that confuses me the most about your plan (not that you have told us much detail). Home Assistant can already do this, can't it? (Some features need a live cloud connection, but some don't, IIRC.) Even if the operational side of your plan needs a local LLM, the coding/development side is one of the hardest aspects of LLM use and the one the frontier providers are better at. So your best value for money might be using them to do the development, and then using a smaller, less memory-hungry local LLM to do the basic daily interactions. (It should be possible to design it so it can call out to a bigger subscription model for 'hard' problems.) Having said all that, if you are enjoying the process and have the spare money for the hardware, then do what makes you happy. Edited 11 hours ago by -rick-
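The "call out to a bigger model for 'hard' problems" idea above could be sketched as a simple router: everyday requests stay on the local model, and anything long or complex escalates to the subscription model. The heuristic, threshold and keyword list here are purely illustrative:

```python
def route_request(prompt: str, local_llm, cloud_llm,
                  hard_words=("prove", "derive", "refactor", "architecture")):
    """Crude local-vs-cloud router.

    local_llm / cloud_llm are any callables taking a prompt string.
    A request counts as 'hard' if it is long or mentions a keyword
    associated with heavyweight reasoning; the cut-offs are made up
    for illustration and would need tuning in practice.
    """
    is_hard = (len(prompt.split()) > 50
               or any(word in prompt.lower() for word in hard_words))
    return cloud_llm(prompt) if is_hard else local_llm(prompt)
```

In use, a quick "turn on the kitchen lights" stays local and free, while "refactor the scheduling module" goes out to the paid model; the router itself is model-agnostic.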
-rick- Posted 11 hours ago Posted 11 hours ago 6 minutes ago, Pocster said: As an aside a few months ago whilst testing models with chat we defined some test spec and did a prompt for it . [...] So I make sure prompts are solid ! Everything I'm seeing suggests that the world has changed somewhat over the last 4 months or so, so if you haven't tried with codex 5.3/opus 4.6 then you might have the wrong mental model of where things are.
Pocster Posted 10 hours ago Author Posted 10 hours ago (edited) I think there's confusion here 😊 I want local, even though (obviously) a cloud-based version is best. I use ChatGPT as my cloud-based reasoner, and sometimes coder, for (effectively) unlimited tokens at £20 a month. It's also my 'chat' for architecture / design / free-balling ideas. That, coupled with local Deepseek / Qwen (when needed), is working fine. Home Assistant can do some of this, yes; Alexa can do some of what I want. But I want to go beyond that... as examples: "It's hot in here, isn't it, Avalon?" "How often was the gate opened last Tuesday, Avalon?" "Avalon, play that album by Coldplay I played yesterday." Then we get reasoning and memory, for example on when to put the radio on, i.e. a weighted graph (doesn't require an LLM, admittedly) based on occupancy, time of day, who is in the room etc. An LLM linked to the camera above the kitchen worktop to see the ingredients and suggest / recognise the meal. Leading to STT running permanently: SWMBO mentions "bathroom light didn't come on again" - this is stored and can be referenced later. "I need to order a USB cable" - again stored / referenced, my email checked for the Amazon order, etc. Local LLM, Home Assistant, IP camera, voice/speech recognition - all part of the bigger picture. My "to do" list is long. Then we move into a humanoid robot to physically do tasks. You guys are focusing on "coding" - that bit's solved. I never need to code again. So any method that's cheap and/or local is perfect. The big picture is much, much bigger!! < evil laugh > 😆 Edited 10 hours ago by Pocster
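The weighted-graph decision for when to put the radio on might look something like this as a first cut: a weighted vote over occupancy, time of day and who is in the room. All of the weights, preferences and the threshold below are made-up placeholders for illustration:

```python
def radio_score(occupied: bool, hour: int, person: str,
                weights=None, threshold=0.6):
    """Weighted vote on whether to switch the radio on.

    Combines three signals into a score in [0, 1] and compares it
    against a threshold. The weights, the morning-radio window and
    the per-person preferences are all invented for this sketch.
    """
    if weights is None:
        weights = {"occupancy": 0.5, "time": 0.3, "person": 0.2}
    # Hypothetical per-person preference for background radio (0..1).
    prefers_radio = {"pocster": 1.0, "swmbo": 0.4}
    # Assume a morning-radio habit; late night scores low.
    time_factor = 1.0 if 7 <= hour <= 10 else 0.2
    score = (weights["occupancy"] * (1.0 if occupied else 0.0)
             + weights["time"] * time_factor
             + weights["person"] * prefers_radio.get(person.lower(), 0.5))
    return score, score >= threshold
```

In a real setup the weights would be learned from feedback ("Avalon, turn that off") rather than hard-coded, but the shape of the decision - weighted signals against a threshold - stays the same, with no LLM needed.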
SimonD Posted 10 hours ago Posted 10 hours ago 16 minutes ago, Pocster said: You guys are focusing on "coding " that bits solved . Not at all, I'm focused on using the most effective tool that requires the least amount of effort to get what I want done, working how I want it to work, and that's reliable without too many bugs - and it can be maintained with minimal input, including completing regular security reviews of the implementation and doing what's necessary to lock it down. Now, what I love about it is that quite regularly Claude has the audacity to add surprising things and functionality I hadn't thought about, making things even better than I'd asked! 19 minutes ago, Pocster said: I never need to code again . I can hardly contain my excitement about this too. No more brain hurt and squiffy eyes at 3am. 20 minutes ago, Pocster said: The big picture is much much bigger !! < evil laugh > 😆 And it gets bigger all the time, I hope. 46 minutes ago, Pocster said: My issue was cost per month for number of tokens I use . Also I need llm local to do stt realtime . [...] Totally agree.
I was listening to an AI podcast the other day and was surprised by the discussion of small-scale, efficient models running on very little computing power. The really interesting thing was that the speaker (one of the gurus of our time in this field) was saying that the focus on huge, resource-hungry data centres is more about stroking egos and the assumption that everything has to be bigger, better and more powerful all the time. It's a specific design decision rather than a necessity.
Pocster Posted 10 hours ago Author Posted 10 hours ago 1 minute ago, SimonD said: Not at all, I'm focussed on using the most effective tool that requires the least amount of effort to get what I what done [...] It's a specific design decision rather than a necessity. Oh absolutely! Remember, "I'm free" - not in an Are You Being Served way! 😂 No budget limit (well, maybe a bit), no time constraint, no boss (like my build), so I'm able to explore and experiment, and create what I want. Equally, my house is the sandbox and I'm the guinea pig! The perfect setup for the experiment!!
-rick- Posted 9 hours ago Posted 9 hours ago 48 minutes ago, Pocster said: Equally my house is the sandbox and I'm the guinea pig ! . Perfect setup for the experiment!! Careful where you might end up...
SimonD Posted 8 hours ago Posted 8 hours ago 2 hours ago, -rick- said: Everything I'm seeing suggests that the world has changed somewhat over the last 4 months or so, so if you haven't tried with codex 5.3/opus 4.6 then you might have the wrong mental model of where things are. Yes, there have definitely been some big changes over the last few months. It seems to have been exponential, and the quality of output and reasoning is noticeable. Prompting, though, is still really key. I did a comparison last week with my son, who'd put in a prompt and received some fairly generic stuff back. I sat down and wrote a prompt, and he was taken aback by how long and detailed mine was compared to his. Then he was blown away by the output compared to what he had received. I actually ask the AI to create prompts for me, or to guide me in how best to structure a prompt based on what I want and, like @Pocster's experience, the output can be like night and day. Often just writing my prompts takes much longer than the thinking and generating of the output.
Pocster Posted 1 hour ago Author Posted 1 hour ago 6 hours ago, SimonD said: Yes, there's definitely been some big changes over the last few months. [...] Often just writing my prompts takes much longer than the thinking and generating the output. Precisely! Not coding, not design - prompt writing! And even that, as you say, can be written by a more capable LLM. The limitation now is human imagination!!!!