mirror of https://github.com/nomic-ai/gpt4all
speedup: just use mat*vec shaders for mat*mat
So far my from-scratch mat*mat shaders are still slower than just running more invocations of the existing mat*vec shaders ported from Metal. It should be theoretically possible to make a mat*mat that is faster (for actual mat*mat cases) than an optimal mat*vec, but it will need to be at *least* as fast as the mat*vec op and then take special care to be cache-friendly and save memory bandwidth, since the number of compute ops is the same.
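The op-count claim above can be illustrated with a small sketch (not the project's shader code, and the function names here are made up for illustration): a mat*mat of shapes (M, K) x (K, N) decomposes into N independent mat*vec products of shapes (M, K) x (K,), so the multiply-add count is identical either way; a fused mat*mat kernel can only win through data reuse, not fewer operations.

```python
def matvec(A, x):
    # one multiply-add per (row, k) pair: M * K ops
    return [sum(A[i][k] * x[k] for k in range(len(x))) for i in range(len(A))]

def matmat_via_matvec(A, B):
    # run one mat*vec per column of B: N * (M * K) ops total,
    # the same count a fused mat*mat kernel would perform
    cols = [[B[k][j] for k in range(len(B))] for j in range(len(B[0]))]
    out_cols = [matvec(A, c) for c in cols]
    # transpose the column results back into a row-major (M, N) result
    return [[out_cols[j][i] for j in range(len(cols))] for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmat_via_matvec(A, B))  # [[19, 22], [43, 50]]
```

The performance gap on a GPU comes from the memory side: the per-column approach re-reads all of A once per column, while a tiled mat*mat keeps tiles of A in fast memory across many columns.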
parent 22de3c56bd
commit f79557d2aa
@@ -1 +1 @@
-Subproject commit 500689ad356a81a471a7fb68cc70f7aee5a5f56e
+Subproject commit 81c24d7b7df0d3564c8563bb769bd0302588fe1f