SimonD Posted Thursday at 14:55
2 hours ago, Pocster said: After paying 90 quid even 5.5 thinking seems considerably better - funny that.....
Wouldn't put it past them.
Pocster (Author) Posted Thursday at 15:02
6 minutes ago, SimonD said: Wouldn't put it past them.
Don't trust any of the AI firms.... Apparently GLM5.2 local is really good - of course hardly anyone can run it ....
Pocster (Author) Posted Thursday at 15:52
Really awful bug. Chat 5.5 thinking kept patching and we kept rolling back. I kept trying to think of other ways to deal with it so we could try different approaches. Been at it for 45 minutes. Rolled it back, clicked "pro" and gave it all the info I could. Pro then spent 14 minutes thinking! Found a really obscure issue - MAGIC! FIXED!
Edited Thursday at 16:00 by Pocster
Pocster (Author) Posted Friday at 12:27
WOW oh wow! Never really looked into how an LLM generates its output, i.e. the cost. Assumed it's just generated at the end but it isn't. It's generated as it goes! So every output token is another pass through the model, which means fewer tokens out = less time. Never thought of that! SO! A moderately complex phrase that took 5 seconds now takes 1.3 seconds after JSON compaction! BOOM! WHO'S THE MOFO!
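For anyone wondering why compaction helps: the post doesn't spell out what "JSON compaction" involved, so this is only a minimal sketch assuming it means getting the model to emit the same data in fewer characters. The payload, the 4-characters-per-token ratio and the 20 ms-per-token rate are all made-up placeholders, not measurements from the project.

```python
import json

# Hypothetical payload a voice assistant might ask the model to emit.
command = {
    "intent": "set_timer",
    "duration_seconds": 300,
    "label": "pasta",
    "confirm_aloud": True,
}


def pretty(obj) -> str:
    """Indented JSON: every newline and space is output the model must generate."""
    return json.dumps(obj, indent=2)


def compact(obj) -> str:
    """Same data with no optional whitespace."""
    return json.dumps(obj, separators=(",", ":"))


def estimated_seconds(text: str,
                      chars_per_token: float = 4.0,
                      seconds_per_token: float = 0.02) -> float:
    """Rough decode-time estimate. Both rates are placeholder assumptions."""
    return len(text) / chars_per_token * seconds_per_token


if __name__ == "__main__":
    for name, text in (("pretty", pretty(command)), ("compact", compact(command))):
        print(f"{name}: {len(text)} chars, ~{estimated_seconds(text):.2f}s to generate")
```

Because output is produced one token at a time, anything that shortens the text the model has to write (less whitespace, shorter keys, fewer fields) cuts response time roughly in proportion.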
Pocster (Author) Posted Friday at 14:52
2 hours ago, Pocster said: WOW oh wow! Never really looked into how an LLM generates its output, i.e. the cost. Assumed it's just generated at the end but it isn't. It's generated as it goes! So each token passes through the model. Never thought of that! SO! A moderately complex phrase that took 5 seconds now takes 1.3 seconds after JSON compaction! BOOM! WHO'S THE MOFO!
Saved another 250ms ... yeah I know. I'll stop now! Sad.
Pocster (Author) Posted Friday at 16:10
Chat has been SO good today I might give it a promotion - nothing to do with me spending 90 quid......
SimonD Posted Friday at 16:22
1 minute ago, Pocster said: Chat has been SO good today I might give it a promotion - nothing to do with me spending 90 quid......
Bastard! 😉 I'm having a shit day today. Realised it had left a glaring security hole in the Auth model. Decided to fix it, but instead it has broken the (expletive deleted)ing app. It fixed part of the problem and then tried to tell me the rest wasn't important until I told it that I could grab an id and post it into the browser in a string and it would expose the entire records for a user, even when not logged in! What was supposed to be a couple of hours at most has ended up taking all bloody day and I'm still trying to explain to it what's going wrong and it still misunderstands me! Thank (expletive deleted) I'm on a dev server.
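The hole described here (any id pasted into the browser returns that user's records, even when logged out) is the classic missing ownership check. A minimal sketch of the fix, assuming a session that carries the logged-in user's id; `RECORDS`, `get_user_record` and both exception names are invented for illustration and are not taken from the actual app.

```python
from typing import Optional

# Hypothetical in-memory store standing in for the real database.
RECORDS = {
    "user-1": {"name": "Alice", "email": "alice@example.com"},
    "user-2": {"name": "Bob", "email": "bob@example.com"},
}


class NotAuthenticated(Exception):
    """Raised when there is no logged-in user at all."""


class NotAuthorised(Exception):
    """Raised when a logged-in user asks for somebody else's records."""


def get_user_record(session_user_id: Optional[str], requested_id: str) -> dict:
    """Return a record only if the caller is logged in AND owns that record."""
    # Check 1: a request with no session must never reach the data.
    if session_user_id is None:
        raise NotAuthenticated("login required")
    # Check 2: knowing (or guessing) an id is not the same as owning it.
    if session_user_id != requested_id:
        raise NotAuthorised("cannot read another user's records")
    return RECORDS[requested_id]
```

The point is that both checks run on the server for every request; an id arriving from the browser is only ever treated as a request, never as proof of identity.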
Pocster (Author) Posted Friday at 16:35
13 minutes ago, SimonD said: Bastard! 😉 I'm having a shit day today. Realised it had left a glaring security hole in the Auth model. Decided to fix it, but instead it has broken the (expletive deleted)ing app. It fixed part of the problem and then tried to tell me the rest wasn't important until I told it that I could grab an id and post it into the browser in a string and it would expose the entire records for a user, even when not logged in! What was supposed to be a couple of hours at most has ended up taking all bloody day and I'm still trying to explain to it what's going wrong and it still misunderstands me! Thank (expletive deleted) I'm on a dev server.
Oh yes! Well, as you know, you get good days and bad days! Mine's been epic. Massive speed increases. Local LLM "what's the capital of France?" working. Current affairs "what's the news?" gives headlines, and options verbally if you want more detail. Will add history so you can have a conversation. Gained a further 82ms saving on STT (I know, I know!). Honestly it's now so fast to respond to even complex stuff I'm well impressed. Started on timers like Alexa (a SWMBO requirement!). TBH if I coded this by hand that's weeks of work for sure. But of course I never look at the code! G n T time now!
Edited Friday at 16:36 by Pocster
SimonD Posted yesterday at 10:32
17 hours ago, Pocster said: Oh yes!. Well as you know you get good days and bad days!
Oh yes. Finally got it sorted by about 9pm!!!!!
17 hours ago, Pocster said: But of course I never look at the code! G n T time now!
It's interesting, this and token burn rate. This morning I needed to do some financial modelling and analysis so selected Opus 4.8 for the conversation. 30 min later I looked at my session stats and I'd gone through nearly 80% of my session allocation. Then I selected Sonnet 4.6 for another conversation - it involves maths but nowhere near as complex, and the conversation only burned 3% of session allocation but provided the answers I need. It certainly pays to switch models for different needs.
Pocster (Author) Posted yesterday at 10:38
4 minutes ago, SimonD said: It certainly pays to switch models for different needs.
Though cloud tokens aren't an issue for me now I use ministral mlx for fast language parse. If it's a knowledge question it passes it to qwen. I suspect I'll be passing other stuff around to various models depending on the requirement. I'm having a break from that part and looking at realtime 3D scenery for the backdrop - I like my eye candy!
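The "small fast model first, bigger model for knowledge questions" setup boils down to a router. A rough sketch of the idea only: in the real setup the ministral parse makes the decision, whereas here a crude keyword heuristic stands in for it, and the function names and cue list are invented for illustration.

```python
# Crude stand-in for the fast parse step: question-like openers.
KNOWLEDGE_CUES = ("what", "who", "when", "where", "why", "how")


def looks_like_knowledge_question(text: str) -> bool:
    """Guess whether the utterance is asking for a fact rather than an action."""
    stripped = text.strip().lower()
    words = stripped.split()
    if not words:
        return False
    # "whats", "who's", "how" etc. all start with one of the cues.
    return words[0].startswith(KNOWLEDGE_CUES) or stripped.endswith("?")


def route(text: str) -> str:
    """Return the name of the model that should handle this utterance."""
    return "qwen" if looks_like_knowledge_question(text) else "ministral"


if __name__ == "__main__":
    for utterance in ("whats the capital of france?", "set a timer for five minutes"):
        print(f"{utterance!r} -> {route(utterance)}")
```

Adding another model later is then just another branch in `route`, which fits the plan of passing different requirements to different models.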
SimonD Posted yesterday at 11:20
4 minutes ago, Pocster said: v1
That's awesome!
Pocster (Author) Posted yesterday at 11:23
2 minutes ago, SimonD said: That's awesome!
All of 2 minutes work! 😀
Pocster (Author) Posted yesterday at 11:35
But I expect I'll go Minecraft style!
Pocster (Author) Posted yesterday at 11:46
Welcome to my BS YouTube channel. Today we are going to create a Minecraft-inspired game in 10 seconds! Hit subscribe! It's you viewers I need! Thanks to our sponsor buildhub.org.uk.
Pocster (Author) Posted yesterday at 15:54
SWMBO made me do some outdoor work! Anyway, welcome back to my YouTube channel. Chat is not very good at optimising draw calls. We started with 50k!!!!! Now down to 1. It needed a word. Struggling with the river at the moment. Hit that subscribe button NOW!
Edited yesterday at 16:17 by Pocster
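On the draw call drop: the post doesn't say which technique got it from 50k to 1, so this is only an illustrative sketch of the general idea, assuming a Minecraft-style scene where the naive version issues one draw call per block and the batched version issues one per material. The `Block` type, the layer layout and the material names are all invented here.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Block(NamedTuple):
    x: int
    y: int
    z: int
    material: str


def material_for_layer(y: int) -> str:
    """Hypothetical terrain: three stone layers, then dirt, then grass on top."""
    if y < 3:
        return "stone"
    return "dirt" if y == 3 else "grass"


# 100 x 100 x 5 = 50,000 blocks.
blocks: List[Block] = [
    Block(x, y, z, material_for_layer(y))
    for x in range(100) for z in range(100) for y in range(5)
]


def naive_draw_calls(scene: List[Block]) -> int:
    """One draw call per block."""
    return len(scene)


def build_batches(scene: List[Block]) -> Dict[str, List[Tuple[int, int, int]]]:
    """Group block positions by material so each group can be drawn in one go."""
    batches: Dict[str, List[Tuple[int, int, int]]] = defaultdict(list)
    for block in scene:
        batches[block.material].append((block.x, block.y, block.z))
    return dict(batches)


def batched_draw_calls(scene: List[Block]) -> int:
    """One draw call per material batch."""
    return len(build_batches(scene))


if __name__ == "__main__":
    print("naive:", naive_draw_calls(blocks), "batched:", batched_draw_calls(blocks))
```

Getting all the way down to a single call would additionally need every material in one shared mesh and texture, which this sketch doesn't attempt.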
Pocster (Author) Posted 9 hours ago
Common things I tell chat when coding:
"Use" when I upload a file. Sometimes it ignores the evidence even though it asked for it.
"Don't guess" to hopefully avoid blind patches.
"You a zx spectrum?", "this a bbc c demo?", "does that look like a ps5 effect?" to 'motivate' it.
And of course lots of abuse. It's so funny. It has replied the obvious "yes that's shit", but once it said "it is a load of wank".
Pocster (Author) Posted 7 hours ago
Abuse yields results! River with bed, ripples, refraction. Real shadow volumes. Rocks with proper 'realistic' placement. Wind so canopies can move. Look at that draw count 🙂 just 270-ish.
MikeSharp01 Posted 1 hour ago
Well I had a play today with my old Surface Pro - it has a GTX 1060 GPU with 6GB of RAM, it's about 9 years old! I thought, can I get Gemma4 4b into that and drawing? So I gave it a try and after some messing about I ended up with this - some HTML code output at about 25 tokens / second. Using an un-quantised Gemma4 4b model - not bad, I thought, without any flooding into CPU memory.
Pocster (Author) Posted 1 hour ago
24 minutes ago, MikeSharp01 said: Well I had a play today with my old Surface Pro - it has a GTX 1060 GPU with 6GB of RAM, it's about 9 years old! I thought, can I get Gemma4 4b into that and drawing? So I gave it a try and after some messing about I ended up with this - some HTML code output at about 25 tokens / second. Using an un-quantised Gemma4 4b model - not bad, I thought, without any flooding into CPU memory.
You need to go three.js .... man stuff 🙂