  #6  
01-29-2025, 22:33
chants
It would be nice to train an RE model. The good news is that training is now being shown to be feasible, possibly on an academic-grant-level budget. Someone should train a proper open-source RE model at some point.

1 PFLOP at FP4 is maybe a marketing gimmick, but that amount of RAM is a big plus. The new DeepSeek models use FP8 and have shown it's reliable for training, which is a good breakthrough. Sounds good enough to run good-sized models at moderate load.
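
For a rough sense of why the RAM matters more than the headline FLOP number, here is a back-of-the-envelope sketch of how much memory the weights alone need at different precisions. The function name and the 70e9 parameter count are made up for illustration, and it ignores KV cache, activations and runtime overhead, so treat the result as a lower bound rather than a real sizing.

```python
# Bits per weight for a few common storage formats.
BITS_PER_WEIGHT = {"fp16": 16, "fp8": 8, "fp4": 4}


def weight_memory_gib(num_params: float, fmt: str) -> float:
    """Approximate memory for the weights alone, in GiB.

    Assumption: every parameter is stored in the same format, with no
    per-block scale factors or other quantization metadata.
    """
    bits = BITS_PER_WEIGHT[fmt]
    return num_params * bits / 8 / 2**30


if __name__ == "__main__":
    # 70e9 parameters is a hypothetical model size picked for illustration.
    for fmt in BITS_PER_WEIGHT:
        print(f"{fmt}: {weight_memory_gib(70e9, fmt):.1f} GiB")
```

Halving the bits per weight halves the footprint, which is why FP8 or FP4 weights let a box with a fixed amount of RAM hold roughly two to four times the parameters it could at FP16.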

Alibaba sounds interesting; I haven't heard much about it.

By the way, from the demos I saw, the DeepSeek censorship is on the website. Running R1 locally, at least, it seems to censor those things very little or not at all.