Pocster

Members
  • Posts

    14273
  • Joined

  • Last visited

  • Days Won

    29

Pocster last won the day on August 4 2025

Pocster had the most liked content!

3 Followers

Personal Information

  • Location
    Bristol

Pocster's Achievements

Advanced Member (5/5)

2.5k

Reputation

  1. Oh yes! Well, as you know, you get good days and bad days! Mine's been epic. Massive speed increases. Local LLM "what's the capital of France?" working. Current affairs "what's the news?" gives headlines and options verbally if you want more detail. Will add history so you can have a conversation. Gained a further 82ms saving on STT (I know, I know!). Honestly, it's now so fast to respond to even complex stuff that I'm well impressed. Started on timers like Alexa (a SWMBO requirement!). TBH if I coded this by hand that's weeks of work for sure. But of course I never look at the code! G n T time now!
  2. Chat has been SO good today I might give it a promotion - nothing to do with me spending 90 quid......
  3. Saved another 250ms ... yeah I know. I'll stop now! sad.
  4. WOW oh wow! Never really looked into how an LLM generates its output, i.e. the cost. Assumed it's just generated at the end but it isn't. It's generated as it goes! Each output token needs its own pass through the model, so a shorter reply means fewer passes. Never thought of that! SO! 5 seconds with a moderately complex phrase is now 1.3 seconds after JSON compaction! BOOM! WHO'S THE MOFO!
  5. Really awful bug. Chat 5.5 thinking kept patching and we kept rolling back. I kept trying to think of other ways to deal with it so we could try different approaches. Been at it for 45 minutes. Rolled it back. Clicked "pro" and gave it all the info I could. Pro then spent 14 minutes thinking! Found a really obscure issue - MAGIC! FIXED!
  6. Don't trust any of the AI firms.... Apparently GLM5.2 local is really good - of course hardly anyone can run it ....
  7. After paying 90 quid even 5.5 thinking seems considerably better than before - funny that.....
  8. I've now asked it whether it's legal to offer a service where I have no idea what I'm getting or for how long, and yet be offered an upgrade to 5x or 20x of an unknown. It's been thinking the longest I've ever seen it think! It agrees OpenAI could be breaching UK consumer law.
  9. Not in chat. Just asked it. You just get warnings of "near limit". So you "do something" and hope you have credit left. Anyway, Pro gives me "thinking" back and of course everything is so much easier now!
  10. Gave up! Paid 90 for the month. It's unusable as it is - couldn't do the simplest of tasks even after being given multiple pieces of evidence etc. Didn't realise there was a 90 a month option! So that's something. Now, after an hour of old chat basically achieving nothing, I'm expecting Pro to fix this very quickly.
  11. After crippling Codex, OpenAI now restrict 5.5 "thinking" mode. I hate the AI companies doing this. There's no way in chat to know how much 'credit' you have or what you've used. So it's like paying for Netflix and then being told you've watched too many premium films and can't watch any more until the next window reset. Also chat+ is 20 a month or next tier 200!!! That's dumb quite frankly. There's clearly a middle ground there! Chat not very talkative today - miserable (expletive deleted)er. Now it's unusable again.... thick as (expletive deleted)!
  12. Had problems before with MLX models and prefill. Now though, after more experimenting... 50% speed increase!!
  13. BOOM!
      +008845ms Recording command WAV until silence...
      +012857ms Command WAV: /Users/ultram3/avalon/.out/avalon_command_turn_0001.wav (6.00s, SILENCE_AFTER_SPEECH)
      +013965ms One-shot command transcript: Avalon. Set chicken timer for 10 minutes and egg timer for 4 minutes.
      +013966ms Wake residue stripped command transcript: Set chicken timer for 10 minutes and egg timer for 4 minutes.
      +013967ms Ministral prompt: /Users/ultram3/avalon/prompts/avalon_media_intent_ministral.txt
      +017112ms Heard phrase: Set chicken timer for 10 minutes and egg timer for 4 minutes.
      +017112ms Final phrase: Set chicken timer for 10 minutes and egg timer for 4 minutes
      +017112ms Ministral intent JSON: {"actions": [{"album_hint": null, "artist_hint": null, "confidence": 0.95, "control": null, "domain": "timer", "duration_seconds": 600, "intent": "set_timer", "notes": "set chicken timer for 10 minutes", "query": null, "timer_label": "chicken", "track_hint": null, "value": null}, {"album_hint": null, "artist_hint": null, "confidence": 0.95, "control": null, "domain": "timer", "duration_seconds": 240, "intent": "set_timer", "notes": "set egg timer for 4 minutes", "query": null, "timer_label": "egg", "track_hint": null, "value": null}]}
      +017113ms Timer action: set_timer label=chicken duration=600
      +017113ms Real route: timer.set_timer; Label: chicken; Duration: 600; Execution: yes
      +017113ms Timer action: set_timer label=egg duration=240
      +017113ms Real route: timer.set_timer; Label: egg; Duration: 240; Execution: yes
      +017113ms TTS af_jessica: Chicken timer set for 10 minutes and egg timer for 4 minutes
      +022172ms Restore requested volume: 51
      +022185ms Restore verified: yes current_volume=51 expected_volume=51
      +022190ms Restored: yes
      +022190ms Listening...
  14. Natural speech processing is tricky. Originally I went for Alexa-style simplistic "play coldplay" etc. Deterministic wording. But that's crap, and apparently Alexa+ (never used one) allows natural speech. Now we have "Put some coldplay on and set an egg timer for 5 minutes". Wording and phrasing can be loose: "Play coldplay and some nice mumford and sons".
      Ministral does the parse but it can be funny! It might do "Play Coldplay." or it might do "play Coldplay", sometimes resulting in an empty JSON. It's random. So an empty JSON falls through to Gemma with the same prompt. So far this has not failed. I got chat to write a script that tested 1000 phrases with poor spelling or deliberately mangled wording like "mould clay"! Ministral's job is just to get intent, i.e. "music", "artist or album". Not to determine if they are real or correct. That goes to jellyfish to fuzzy match against my real LMS library.
      Regarding wake word, OpenAI/Porcupine are all crap TBH. Slow and useless. So we have 2 whisper tasks running per microphone. One is solely transcribing "avalon" (the wake word) and its misheard permutations (frequently "have a long"). At the same time we run a rolling 12 second window of WAV to text - this is surprisingly accurate. So once the wake word has been validated we already have what was said!
      Processing all this, plus Ministral and maybe Gemma, then TTS (Kokoro), is a bit slow. But I can't speed it up much. The models are the smallest reliable ones I can find. Compressing Ministral's prompt helps but then we get more "guesses". Another local LLM oddity is that even if you say "90 minutes", Ministral can sometimes convert it to seconds! Other times it's fine! So timers will enforce minutes (not a real issue tbh).
      So apart from a complex pipeline that's relatively slow (4 seconds for an average prompt) it's working well! 100000% better than Home Assistant voice shite. Currently offloading rendering to a separate NUC. Trying to give the M3 as much GPU girth for LLM processing as possible.
  15. For isolated things this is a super fast way to work. Tweaking the graphics for example, or an effect. I can literally say "let's try this" and 5 seconds later I'm viewing it. I love this part.
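
For anyone following along, the "empty JSON falls through to Gemma" idea from post 14, plus the timer routing seen in the post 13 log, could look roughly like the sketch below. This is only an illustration, not the actual Avalon code (which isn't shown in these posts). The function names `parse_actions`, `get_actions` and `timer_routes` are made up, and the two models are passed in as plain callables so no particular LLM library is assumed. The JSON field names are the ones from the logged intent JSON.

```python
import json
from typing import Callable

# Hypothetical stand-in for a model call: takes the heard phrase, returns raw reply text.
IntentModel = Callable[[str], str]


def parse_actions(raw: str) -> list[dict]:
    """Return the "actions" list from a model reply, or [] if the reply is unusable."""
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return []
    if not isinstance(data, dict):
        return []
    actions = data.get("actions")
    if not isinstance(actions, list):
        return []
    return [a for a in actions if isinstance(a, dict)]


def get_actions(phrase: str, primary: IntentModel, fallback: IntentModel) -> list[dict]:
    """Ask the primary model first; if nothing usable comes back, ask the fallback."""
    actions = parse_actions(primary(phrase))
    if actions:
        return actions
    return parse_actions(fallback(phrase))


def timer_routes(actions: list[dict]) -> list[tuple[str, int]]:
    """Pick out (label, seconds) pairs for set_timer actions, skipping bad durations."""
    routes = []
    for action in actions:
        if action.get("domain") != "timer" or action.get("intent") != "set_timer":
            continue
        seconds = action.get("duration_seconds")
        if isinstance(seconds, int) and seconds > 0:
            # "timer" as a default label is an assumption for unlabelled timers.
            routes.append((action.get("timer_label") or "timer", seconds))
    return routes


if __name__ == "__main__":
    # Stub models for demonstration only: the first returns empty JSON,
    # the second returns a cut-down version of the logged intent JSON.
    reply = json.dumps({"actions": [
        {"domain": "timer", "intent": "set_timer", "timer_label": "chicken", "duration_seconds": 600},
        {"domain": "timer", "intent": "set_timer", "timer_label": "egg", "duration_seconds": 240},
    ]})
    actions = get_actions("Set chicken timer for 10 minutes", lambda p: "{}", lambda p: reply)
    print(timer_routes(actions))
```

Treating "not valid JSON", "valid JSON but no actions" and "actions is empty" as the same failure keeps the fall-through rule to a single check, which seems to match the behaviour described in the post.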