Pocster

Pocster last won the day on August 4 2025

Pocster had the most liked content!

3 Followers

Personal Information

  • Location
    Bristol


  1. BOOM!
     +008845ms Recording command WAV until silence...
     +012857ms Command WAV: /Users/ultram3/avalon/.out/avalon_command_turn_0001.wav (6.00s, SILENCE_AFTER_SPEECH)
     +013965ms One-shot command transcript: Avalon. Set chicken timer for 10 minutes and egg timer for 4 minutes.
     +013966ms Wake residue stripped command transcript: Set chicken timer for 10 minutes and egg timer for 4 minutes.
     +013967ms Ministral prompt: /Users/ultram3/avalon/prompts/avalon_media_intent_ministral.txt
     +017112ms Heard phrase: Set chicken timer for 10 minutes and egg timer for 4 minutes.
     +017112ms Final phrase: Set chicken timer for 10 minutes and egg timer for 4 minutes
     +017112ms Ministral intent JSON: {"actions": [{"album_hint": null, "artist_hint": null, "confidence": 0.95, "control": null, "domain": "timer", "duration_seconds": 600, "intent": "set_timer", "notes": "set chicken timer for 10 minutes", "query": null, "timer_label": "chicken", "track_hint": null, "value": null}, {"album_hint": null, "artist_hint": null, "confidence": 0.95, "control": null, "domain": "timer", "duration_seconds": 240, "intent": "set_timer", "notes": "set egg timer for 4 minutes", "query": null, "timer_label": "egg", "track_hint": null, "value": null}]}
     +017113ms Timer action: set_timer label=chicken duration=600
     +017113ms Real route: timer.set_timer; Label: chicken; Duration: 600; Execution: yes
     +017113ms Timer action: set_timer label=egg duration=240
     +017113ms Real route: timer.set_timer; Label: egg; Duration: 240; Execution: yes
     +017113ms TTS af_jessica: Chicken timer set for 10 minutes and egg timer for 4 minutes
     +022172ms Restore requested volume: 51
     +022185ms Restore verified: yes current_volume=51 expected_volume=51
     +022190ms Restored: yes
     +022190ms Listening...
  2. Natural speech processing is tricky. Originally I went for the simplistic Alexa style, "play coldplay" etc. Deterministic wording. But that's crap, and apparently Alexa+ (never used one) allows natural speech. Now we have "Put some coldplay on and set an egg timer for 5 minutes". Wording and phrasing can be loose: "Play coldplay and some nice mumford and sons".
     Ministral does the parse but it can be funny! It might do "Play Coldplay." or it might do "play Coldplay", sometimes resulting in an empty JSON. It's random. So an empty JSON falls through to Gemma with the same prompt. So far this has not failed. I got chat to write a script that tested 1000 phrases with poor spelling or "mould clay" type deliberate wording messes! Ministral's job is just to get intent, i.e. "music", "artist or album". Not to determine if they are real or correct. That goes to jellyfish to fuzzy match against my real LMS library.
     Regarding wake word, OpenAI/Porcupine are all crap TBH. Slow and useless. So we have 2 Whisper tasks running per microphone. One is solely transcribing "avalon" (the wake word) and its misheard permutations (frequently "have a long"). At the same time we run a rolling 12 second window of WAV to text, which is surprisingly accurate. So once the wake word has been validated we already have what was said!
     Processing all this, plus Ministral and maybe Gemma, then TTS (Kokoro), is a bit slow. But I can't speed it up much. The models are the smallest reliable ones I can find. Compressing Ministral's prompt helps, but then we get more "guesses". Another local LLM oddity: even if you say "90 minutes", Ministral can sometimes convert it to seconds! Other times it's fine! So timers will enforce minutes (not a real issue tbh).
     So apart from a complex pipeline that's relatively slow (4 seconds for an average prompt) it's working well! 100000% better than Home Assistant voice shite. Currently offloading rendering to a separate NUC, trying to give the M3 as much GPU girth as possible for LLM processing.
  3. For isolated things this is a super fast way. Tweaking the graphics, for example, or an effect. I can literally say "let's try this" and 5 seconds later I'm viewing it. I love this part.
  4. Frequently it's "given up" today. Seems to go through hourly phases of amazing and shite, even with the same code. Chat's bodging again now. Gave it a specific render description and it did a cheap BBC B demo for me. It's like it's programmed to sometimes offer cheap solutions, i.e. less compute. I reckon at the end of the day when it goes home to Mrs Chat it bitches about this guy who's only on the 20 quid plan and wants the (expletive deleted)ing world! Now it's basically forcing me to use Codex cloud ... "I can't do it but Codex cloud can patch this". I'm going to conclude this is deliberate, as now each scout/patch uses around 5% of my 5 hr quota. These companies are shit. Need to use Codex cloud for a major revision or feature add, then tell chat to do one and use a local LLM for scout.
  5. The thing chat, and I suspect all coding AIs, are crap at is 'thinking' about the problem outside the box. For example, ask it for code to draw a circle (ignoring a circle primitive) and it would use sin/cos/pi because that's standard maths. But! That's really shit. Will work but slow. I can think of 20 ways that would be 1000% faster. Equally an A* algorithm, i.e. path finding from baddy to player around scenery. Standard methods will be used. Shite slow, not practical in a real game. So "understanding" the problem for efficient code is about the method, not the code. That requires a human. I was at 86% GPU usage because its code uses 'standard' methods. Guide it on different techniques and it visually looks the same but with halved GPU usage.
  6. I can't draw for toffee, so chat does code so everything is procedurally generated. Reflections, fresnel, blooming, all the PS5 effects I love. Lipsync on the bot's mouth. Fuzzy logic, because when you say "Birdy" it could be translated as "birdie". Also phonetic matching, e.g. "mould play" = "Coldplay". No hard coding of phrases, everything just open source. Love it.
  7. OH! My rendering with reflection/bloom (expletive deleted)s the M3 GPU even before the local LLM does work. Oh! Looks like I need a 2nd PC just for the dashboard!
  8. Kid in a sweetshop!
  9. Tell you something I love with chat. I upload screenshots when we have issues. For a while we had some old text/icons, no issue, just left there while I fixed other things. It would seem from repeated screenshots it decided to remove them. No "I'm getting rid of this", just gone. (expletive deleted)ING MAGIC I TELL YOU!
  10. Hmmmm, your project you're working on must have REALLY upset someone!
  11. Erm, watching your ££££ go. This is what I hate with cloud models. Not just the cost, but that they can change costs at any time. It's clear that the cloud model you run now may not be the cloud model you run in 1 minute, i.e. backend changes. I hate it. You become dependent on it like on a drug dealer and then they move the goal posts. This is my main reason for enjoying the 20 quid 'near' Claude experience with chat. It requires more work to set up, but I can't bitch (though I do!) for £20. But of course a local LLM will be consistent .... models just ain't quite there yet. But you know, even though it's not out, an M5 with 512GB is still tempting. Really need models to catch up rather than hardware.
  12. I often say to chat "Are you a BBC Model B? You are supposedly frontier cutting edge AI. So stop being a prick". Surprisingly it does frequently man up and produce something nearer what I requested. So the answer is insult it for better results. Like humans I guess!
  13. or (expletive deleted)ing token speed "test". Mac Mx vs RTX Y. Mac loses of course. That's what stopped me getting the M3 at first: always slower than RTX. BUT when you need a larger model or multiple models, Mac wins. For me flexibility over speed is the winner easily. (expletive deleted)ing 10k for an RTX 6000 with 96GB.... Nvidia make Macs look cheap!
  14. (expletive deleted)s me off when I see YouTube ChatGPT vs Opus vs fable, all given the same prompt to "write a flappy birds 3d game", and that's a test! Amazing any of them produce anything at all but (expletive deleted) me, it's not a TOY! It's super powerful.
  15. I also notice chat seems to offer default simple solutions. So if I ask for a 'face', as I've done, I did give some detail but I end up with a crude basic version. Even if I ask for an authentic water effect and specify it, I still get a very basic version. I queried why after I had spec'd it and it said because most people don't want the full thing. I asked for examples. People asking for "a minecraft game", which is obviously a massive project, so it gives a very crude simplistic response, because apparently people won't specify exact detail for more advancement. I assume this is where the "toy" reference comes from. Chat seems to believe 99% of coding tasks are simplistic and non-challenging: "fix this bug", "refactor this". Of course you could argue that it is not a chatbot's job to be a Claude-style complete "coding solution", but as I said, for 20 quid a month effectively unlimited, it's the best value for money ever.
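
The log in post 1 shows a "Wake residue stripped" step, and post 2 mentions that "avalon" is often misheard as "have a long". A minimal sketch of that stripping step in Python follows. The variant list, the regex, and the name `strip_wake` are assumptions for illustration only; the posts do not show how the real pipeline does it.

```python
import re

# Assumed variants: the posts only name "avalon" and "have a long".
WAKE_VARIANTS = ("avalon", "have a long")

# Match a wake variant at the start, then any trailing punctuation/whitespace.
_WAKE_RE = re.compile(
    r"^\s*(?:" + "|".join(re.escape(v) for v in WAKE_VARIANTS) + r")\b[\s.,!?:;-]*",
    re.IGNORECASE,
)


def strip_wake(transcript: str) -> tuple[bool, str]:
    """Return (wake_detected, transcript with the leading wake word removed)."""
    match = _WAKE_RE.match(transcript)
    if not match:
        return False, transcript.strip()
    return True, transcript[match.end():].strip()
```

Fed the one-shot transcript from the log, this would return the same text as the "Wake residue stripped" line, with the wake word and its trailing full stop removed.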
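
Post 2 describes two safety nets: an empty JSON from Ministral falls through to Gemma with the same prompt, and timers "enforce minutes" because the model is inconsistent about units. The sketch below shows one way that could look. The parser callables stand in for the real model calls, and re-reading the minutes from the `notes` field is only one possible reading of "enforce minutes"; both are assumptions, not the author's code.

```python
import json
import re
from typing import Callable, List

# Hypothetical: a parser takes the transcript and returns the model's raw JSON text.
Parser = Callable[[str], str]

MINUTES_RE = re.compile(r"(\d+)\s*minutes?", re.IGNORECASE)


def parse_intent(transcript: str, parsers: List[Parser]) -> dict:
    """Try each parser in order; return the first result with a non-empty 'actions' list."""
    for parser in parsers:
        try:
            result = json.loads(parser(transcript))
        except json.JSONDecodeError:
            continue  # malformed output is treated like an empty result
        if isinstance(result, dict) and result.get("actions"):
            return result
    return {"actions": []}


def enforce_minutes(action: dict) -> int:
    """Recompute seconds from 'N minutes' in the notes; otherwise trust the model's number."""
    match = MINUTES_RE.search(action.get("notes") or "")
    if match:
        return int(match.group(1)) * 60
    return int(action["duration_seconds"])
```

With the intent JSON from the log in post 1, `enforce_minutes` gives 600 and 240 for the chicken and egg timers, matching the logged durations.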
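
Posts 2 and 6 talk about fuzzy matching a heard name such as "mould play" or "birdie" against the real library. The posts name jellyfish for this; the sketch below swaps in Python's standard-library `difflib` so it runs with no extra installs. Note that `difflib` measures character similarity, not phonetics, so it is only a stand-in. The library list, the cutoff, and the function names are made up for the example.

```python
import difflib
from typing import List, Optional


def normalise(name: str) -> str:
    """Lowercase and drop spaces/punctuation so 'mould play' compares as 'mouldplay'."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def match_artist(heard: str, library: List[str], cutoff: float = 0.6) -> Optional[str]:
    """Return the closest library entry to the heard text, or None if nothing is close."""
    by_key = {normalise(name): name for name in library}
    hits = difflib.get_close_matches(normalise(heard), list(by_key), n=1, cutoff=cutoff)
    return by_key[hits[0]] if hits else None
```

The point of the design is the same as in the posts: the LLM only has to hand over a rough artist hint, and the matcher decides what it really is by checking against what actually exists in the library.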
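
Post 5 says a circle can be drawn without sin/cos/pi. One well-known alternative is the midpoint circle algorithm, which uses only integer adds and compares and mirrors one octant eight ways. This is just an illustration of the idea, not necessarily one of the methods the author had in mind, and how much faster it is depends on where the drawing happens (CPU raster vs GPU shader).

```python
from typing import Set, Tuple


def circle_points(cx: int, cy: int, r: int) -> Set[Tuple[int, int]]:
    """Midpoint circle algorithm: integer-only outline of a circle of radius r."""
    points = set()
    x, y = r, 0
    decision = 1 - r  # tracks whether the midpoint is inside or outside the circle
    while y <= x:
        # Mirror the current octant point into all eight octants.
        for dx, dy in ((x, y), (y, x), (-x, y), (-y, x),
                       (x, -y), (y, -x), (-x, -y), (-y, -x)):
            points.add((cx + dx, cy + dy))
        y += 1
        if decision <= 0:
            decision += 2 * y + 1
        else:
            x -= 1
            decision += 2 * (y - x) + 1
    return points
```

Calling `circle_points(0, 0, 5)` returns a set of pixel coordinates that can be plotted directly, with no trig calls in the loop.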