imaginAIry/docs/emad-qa-2020-10-10-raw-transcript.txt

[00:00.000 --> 00:08.680]  1.5 isn't that big an improvement over 1.4, but it's still an improvement.
[00:08.680 --> 00:14.240]  And as we go into version 3 and the Imagen models that are training away now, which is
[00:14.240 --> 00:18.520]  like we have a 4.3 billion parameter one and others, we're considering what is the best
[00:18.520 --> 00:19.520]  data for that?
[00:19.520 --> 00:24.000]  What's the best system for that to avoid extreme edge cases, because there's always people
[00:24.000 --> 00:26.440]  who want to spoil the party.
[00:26.440 --> 00:30.120]  This has caused the developers themselves, and again, kind of I haven't done a big push
[00:30.120 --> 00:35.640]  here, it has been from the developers, to ask for a bit more time to consult and come
[00:35.640 --> 00:40.520]  up with a proper roadmap for releasing this particular class of model.
[00:40.520 --> 00:44.560]  They will be released for research and other purposes, and again, I don't think the license
[00:44.560 --> 00:48.480]  is going to change from the OpenRAIL-M license, it's just that they want to make
[00:48.480 --> 00:53.360]  sure that all the boxes are ticked rather than rushing them out, given, you know, some
[00:53.360 --> 00:55.960]  of these edge cases of danger here.
[00:55.960 --> 01:01.440]  The other part is the movement of the repository and the taking over from CompVis, which is
[01:01.440 --> 01:06.760]  an academic research lab, again, who had full independence, relatively speaking, over the
[01:06.760 --> 01:11.120]  creation and decisions around the model, to Stability AI itself.
[01:11.120 --> 01:15.960]  Now this may seem like just hitting a fork button, but you know, we've taken in legal
[01:15.960 --> 01:20.480]  counsel and a whole bunch of other things, just making sure that we are doing the right
[01:20.480 --> 01:25.440]  thing and are fully protected around releasing some of these models in this way.
[01:25.440 --> 01:31.880]  I believe that that process is nearly complete, it certainly cost us a lot of money, but you
[01:31.880 --> 01:36.720]  know, it will either be ourselves or an independent charity maintaining that particular repository
[01:36.720 --> 01:41.040]  and releasing more of these generative models.
[01:41.040 --> 01:45.480]  Stability itself, and again, kind of our associated entities, have been releasing over half a
[01:45.480 --> 01:51.440]  dozen models in the last weeks, so a model a week effectively, and in the next couple
[01:51.440 --> 01:57.280]  of days we will be making three releases, so the Discord bot will be open sourced, there
[01:57.280 --> 02:03.440]  is a diffusion-based upscaler that is really quite snazzy that will be released as well,
[02:03.440 --> 02:08.600]  and then finally there will be a new decoder architecture that RiversHaveWings has been
[02:08.600 --> 02:14.360]  working on for better human faces and other elements trained on the aesthetic and humans
[02:14.360 --> 02:16.080]  thing.
[02:16.080 --> 02:19.240]  The core models themselves are still a little bit longer while we sort out some of these
[02:19.240 --> 02:22.880]  edge cases, but once that's in place, hopefully we should be able to release them as fast
[02:22.880 --> 02:28.080]  as our other models, such as for example the OpenCLIP model that we released, and there
[02:28.080 --> 02:33.040]  will be our CLIP guidance instructions released soon that will enable you to have Midjourney
[02:33.040 --> 02:42.080]  level results utilising those two, which took 1.2 million A100 hours, so like almost eight
[02:42.080 --> 02:45.720]  times as much as Stable Diffusion itself.
[02:45.720 --> 02:50.120]  Similarly, we released our language models and other things, and those are pretty straightforward,
[02:50.120 --> 02:54.800]  they are MIT, it's just again, this particular class of models needs to be released properly
[02:54.800 --> 02:59.440]  and responsibly, otherwise it's going to get very messy.
[02:59.440 --> 03:04.280]  Some of you will have seen Congresswoman Eshoo coming out and directly attacking us
[03:04.280 --> 03:09.440]  and asking us to be classified as dual-use technology and be banned by the NSA, there
[03:09.440 --> 03:13.960]  is European Parliament actions and others, because they just think the technology is
[03:13.960 --> 03:19.040]  simple, we are working hard to avoid that, and again, we'll continue from there.
[03:19.040 --> 03:24.000]  Okay, next question, oh wait, you've been pinning questions, thank you mods.
[03:24.000 --> 03:31.080]  Okay, the next question was, interested in hearing SD's views on artistic freedom versus
[03:31.080 --> 03:36.680]  censorship in models, so that's Cohen.
[03:36.680 --> 03:42.040]  My view is basically if it's legal, then it should be allowed, if it's illegal, then we
[03:42.040 --> 03:47.320]  should at least take some steps to try and adjust things around that, now that's obviously
[03:47.320 --> 03:50.720]  a very complicated thing, as legal is different in a lot of different countries, but there
[03:50.720 --> 03:58.120]  are certain things that you can look up the law, that's illegal to create anywhere.
[03:58.120 --> 04:02.000]  I'm in favour of more permissiveness, and you know, leaving it up to localised ethics
[04:02.000 --> 04:06.880]  and morality, because the reality is that that varies dramatically across many years,
[04:06.880 --> 04:10.840]  and I don't think it's our place to kind of police that, similarly, as you've seen with DreamBooth
[04:10.840 --> 04:15.320]  and all these other extensions on Stable Diffusion, these models are actually quite
[04:15.320 --> 04:20.160]  easy to train, so if something's not in the dataset, you can train it back in, if it doesn't
[04:20.160 --> 04:25.040]  fit in with the legal area of where we ourselves release from.
[04:25.040 --> 04:32.600]  So I think, you know, again, what's legal is legal, ethical varies, et cetera, the main
[04:32.600 --> 04:36.440]  thing that we want to try and do is that the model produces what you want it to produce, I think
[04:36.440 --> 04:37.440]  that's an important thing.
[04:37.440 --> 04:41.600]  I think you guys saw at the start, before we had all the filters in place, that stable
[04:41.600 --> 04:46.880]  diffusion was trained on a snapshot of the internet as it was, and when you typed in
[04:46.880 --> 04:51.140]  women, it had kind of toplessness for a lot of any type of artistic thing, because there are a lot
[04:51.140 --> 04:59.800]  of topless women in art, even though art is less than like, 4.5% of the dataset, you know,
[04:59.800 --> 05:02.760]  that's not what people wanted, and again, we're trying to make it so that it produces
[05:02.760 --> 05:06.680]  what you want, as long as it is legal, I think that's probably the core thing here.
[05:06.680 --> 05:11.400]  Okay, Sirius asks, any update on the updated credit pricing model that was mentioned a
[05:11.400 --> 05:14.120]  couple of days ago, as in, is it getting much cheaper?
[05:14.120 --> 05:21.960]  Yes, next week, there'll be a credit pricing, a credit pricing adjustment from our side.
[05:21.960 --> 05:27.000]  There have been lots of innovations around inference and a whole bunch of other things,
[05:27.000 --> 05:29.600]  and the team has been testing it in staging and hosting.
[05:29.600 --> 05:32.640]  You've seen this as well in the diffusers library and other things, Facebook recently
[05:32.640 --> 05:36.720]  came out with some really interesting fast attention kind of elements, and we'll be passing
[05:36.720 --> 05:38.920]  on all of those savings.
[05:38.920 --> 05:43.400]  The way that it'll probably be is that credits will remain as is, but you will be able to
[05:43.400 --> 05:47.880]  do a lot more with your credits, as opposed to the credits being changed in price, because
[05:47.880 --> 05:53.400]  I don't think that's fair to anyone if we change the price of the credits.
[05:53.400 --> 05:57.600]  Can we get an official statement on why Automatic was banned and why NovelAI used his code?
[05:57.600 --> 06:01.560]  Okay, so the official statement is as follows.
[06:01.560 --> 06:06.560]  I don't particularly like discussing individual user bans and things like that, but this was
[06:06.560 --> 06:12.920]  escalated to me because it's a very special case, and it comes at a time, again, of increased
[06:12.920 --> 06:16.120]  notice on the community and a lot of these other things.
[06:16.120 --> 06:18.740]  I've been working very hard around this.
[06:18.740 --> 06:24.320]  Automatic created a wonderful web UI that increased the accessibility of stable diffusion
[06:24.320 --> 06:25.320]  to a lot of different people.
[06:25.320 --> 06:28.320]  You can see that by the styles and other things.
[06:28.320 --> 06:34.040]  It's not open source, and I believe there is a copyright on it, but still, again, worked
[06:34.040 --> 06:35.040]  super hard.
[06:35.040 --> 06:38.920]  A lot of people kind of helped out with that, and it was great to see.
[06:38.920 --> 06:44.000]  However, we do have a very particular stance on community as to what's acceptable and what's
[06:44.000 --> 06:45.000]  not.
[06:45.000 --> 06:51.920]  I think it's important to kind of first take a step back and understand what stability
[06:51.920 --> 06:56.600]  is and what stable diffusion is and what this community is, right?
[06:56.600 --> 07:00.440]  Stability AI is a company that's trying to do good.
[07:00.440 --> 07:02.160]  We don't have profit as our main thing.
[07:02.160 --> 07:04.440]  We are completely independent.
[07:04.440 --> 07:08.320]  It does come a lot from me and me trying to do my best as I try to figure out governance
[07:08.320 --> 07:11.240]  structures to fit things, but I do listen to the devs.
[07:11.240 --> 07:13.680]  I do listen to my team members and other things.
[07:13.680 --> 07:16.960]  Obviously, we have a profit model and all of that, but to be honest, we don't really
[07:16.960 --> 07:21.600]  care about making revenue at the moment because it's more about the deep tech that we do.
[07:21.600 --> 07:22.980]  We don't just do image.
[07:22.980 --> 07:24.660]  We do protein folding.
[07:24.660 --> 07:28.000]  We release language models, code models, the whole gamut of things.
[07:28.000 --> 07:33.240]  In fact, we are the only multimodal AI company other than OpenAI, and we release just about
[07:33.240 --> 07:37.920]  everything with the exception of generative models until we figure out the processes for
[07:37.920 --> 07:38.920]  doing that.
[07:38.920 --> 07:39.920]  MIT open-sourced.
[07:39.920 --> 07:41.120]  What does that mean?
[07:41.120 --> 07:45.720]  It means that literally everything is open-sourced.
[07:45.720 --> 07:47.120]  Against that, we come under attack.
[07:47.120 --> 07:51.440]  So our model weights, when we released it for academia, were leaked.
[07:51.440 --> 07:55.280]  We collaborate with a lot of entities, so NovelAI is one of them, and their engineers
[07:55.280 --> 07:58.580]  have helped with various codebase things, and I think we've helped as well.
[07:58.580 --> 08:03.760]  They are very talented engineers, and you'll see they've just released a list of all the
[08:03.760 --> 08:06.880]  things that they did to improve stable diffusion because they were actually going to open-source
[08:06.880 --> 08:15.120]  it very soon, I believe it was next week, before the code was stolen from their system.
[08:15.120 --> 08:22.080]  We have a very strict no-support policy for stolen code because this is a very sensitive
[08:22.080 --> 08:23.080]  area for us.
[08:23.080 --> 08:25.960]  We do not have a commercial partnership with NovelAI.
[08:25.960 --> 08:27.000]  We do not pay them.
[08:27.000 --> 08:28.000]  They do not pay us.
[08:28.000 --> 08:31.440]  They're just members of the community like any other, but when you see these things,
[08:31.440 --> 08:36.480]  if someone stole our code and released it and it was dangerous, I wouldn't find that
[08:36.480 --> 08:37.480]  right.
[08:37.480 --> 08:39.400]  If someone stole their code, if someone stole other codes, I don't believe that's right
[08:39.400 --> 08:41.760]  either in terms of releasing.
[08:41.760 --> 08:46.320]  Now in this particular case, what happened is that the community member and person was
[08:46.320 --> 08:48.640]  contacted and there was a conversation made.
[08:48.640 --> 08:50.800]  He made some messages public.
[08:50.800 --> 08:52.520]  Other messages were not made public.
[08:52.520 --> 08:53.640]  I looked at all the facts.
[08:53.640 --> 08:58.800]  I decided that this was a bannable offense on the community.
[08:58.800 --> 09:00.240]  I'm not a stupid person.
[09:00.240 --> 09:01.440]  I am technical.
[09:01.440 --> 09:06.160]  I do understand a lot of things, and I put everyone there to kind of make this as a clear
[09:06.160 --> 09:07.160]  point.
[09:07.160 --> 09:11.120]  The Stable Diffusion community itself is one community of Stability AI, and it's one community
[09:11.120 --> 09:12.120]  of Stable Diffusion.
[09:12.120 --> 09:15.440]  Stable diffusion is a model that's available to the whole world, and you can build your
[09:15.440 --> 09:18.560]  own communities and take this in a million different ways.
[09:18.560 --> 09:23.360]  It is not healthy if Stability AI is at the center of everything that we do, and that's
[09:23.360 --> 09:25.240]  not what we're trying to create.
[09:25.240 --> 09:29.600]  We're trying to create a multiplicity of different areas that you can discuss and take things
[09:29.600 --> 09:35.200]  forward and communities that you feel you yourself are a stable part of.
[09:35.200 --> 09:41.880]  Now, this particular one is regulated, and it is not a free-for-all.
[09:41.880 --> 09:45.840]  It does have specific rules, and there are specific things within it.
[09:45.840 --> 09:49.360]  Again, it doesn't mean that you can't go elsewhere to have these discussions.
[09:49.360 --> 09:51.600]  We didn't take it down off GitHub or things like that.
[09:51.600 --> 09:55.360]  We leave it up to them, but the manner in which this was done and there are other things
[09:55.360 --> 10:00.240]  that aren't made public, I did not feel it was appropriate, and so I approved the banning
[10:00.240 --> 10:02.320]  and the buck stops with me there.
[10:02.320 --> 10:06.520]  If the individual in question wants to be unbanned and rejoin the community, there is
[10:06.520 --> 10:08.040]  a process for appealing bans.
[10:08.040 --> 10:12.080]  We have not received anything on that side, and I'd be willing to hear other stuff if
[10:12.080 --> 10:17.240]  maybe I didn't have the full picture, but as it is, that's where it stands, and again,
[10:17.240 --> 10:23.340]  like I said, we cannot support any illegal theft as direct theft in there.
[10:23.340 --> 10:27.960]  With regards to the specific code point, you can ask NovelAI themselves what happened
[10:27.960 --> 10:28.960]  there.
[10:28.960 --> 10:33.280]  They said that there was AGPL code copied over, and then they rescinded it as soon as
[10:33.280 --> 10:35.080]  they were notified, and they apologized.
[10:35.080 --> 10:39.740]  That did not happen in this case, and again, we cannot support any leaked models, and we
[10:39.740 --> 10:42.960]  cannot support that because, again, the safety issues around this and the fact that if you
[10:42.960 --> 10:48.560]  start using leaked and stolen code, there are some very dangerous liability concerns
[10:48.560 --> 10:50.560]  that we wish to protect the community from.
[10:50.560 --> 10:56.240]  We cannot support that particular code base at the moment, and we can't support that individual
[10:56.240 --> 10:57.240]  being a member of the community.
[10:57.240 --> 11:02.640]  Also, I would like to say that a lot of insulting things were said, and we let it slide this
[11:02.640 --> 11:03.640]  once.
[11:03.640 --> 11:05.140]  Don't be mean, man.
[11:05.140 --> 11:06.140]  Just talk responsibly.
[11:06.140 --> 11:12.120]  Again, we're happy to have considered and thought-out discussions offline and online.
[11:12.120 --> 11:16.760]  If you do start insulting other members, then please flag it to moderators, and there will
[11:16.760 --> 11:20.440]  be timeouts and bans because, again, what is this community meant to be?
[11:20.440 --> 11:26.360]  It's meant to be quite a broad but core and stable community that is our private community
[11:26.360 --> 11:32.760]  as Stability AI, but, like I said, the beauty of open source is that if this is not a community
[11:32.760 --> 11:34.800]  you're comfortable with, you can go to other communities.
[11:34.800 --> 11:36.600]  You can set up your own communities.
[11:36.600 --> 11:38.900]  You can set up your notebooks and others.
[11:38.900 --> 11:46.600]  In fact, when you look at it, just about every single web UI has a member of Stability contributing.
[11:46.600 --> 11:53.280]  From pharmapsychotic at Deforum through to Dango on Majesty through to gandamu at Disco,
[11:53.280 --> 11:58.600]  we have been trying to push open source front-ends with no real expectations of our own because
[11:58.600 --> 12:02.860]  we believe in the ability for people to remix and build their own communities around that.
[12:02.860 --> 12:06.600]  Stability has no presence in these other communities because those are not our communities.
[12:06.600 --> 12:07.600]  This one is.
[12:07.600 --> 12:13.480]  So, again, like I said, if Automatic does want to have a discussion, my inbox is open,
[12:13.480 --> 12:17.440]  and if anyone feels that they're unjustly timed out or banned, they can appeal them.
[12:17.440 --> 12:18.840]  Again, there is a process for that.
[12:18.840 --> 12:22.640]  That hasn't happened in this case, and, again, it's a call that I made looking at some publicly
[12:22.640 --> 12:26.920]  available information and some non-publicly available information, and I wish them all
[12:26.920 --> 12:29.120]  the best.
[12:29.120 --> 12:31.760]  I think that's it.
[12:31.760 --> 12:35.160]  Will Stability provide or fund a model to create new medicines?
[12:35.160 --> 12:39.320]  We're currently working on DNA diffusion that will be announced next week for some of the
[12:39.320 --> 12:41.880]  DNA expression things in our OpenBioML community.
[12:41.880 --> 12:42.880]  Feel free to join that.
[12:42.880 --> 12:47.240]  It's about two and a half thousand members, and currently I believe it's been announced
[12:47.240 --> 12:52.600]  LibreFold with Sergey Ovchinnikov's lab at Harvard and UCL, so that's probably going to be the
[12:52.600 --> 12:56.720]  most advanced protein folding model in the world, more advanced than AlphaFold.
[12:56.720 --> 12:59.840]  It's just currently undergoing ablations.
[12:59.840 --> 13:02.600]  Repurposing of medicines and discovery of new medicines is something that's very close
[13:02.600 --> 13:04.600]  to my heart.
[13:04.600 --> 13:11.280]  Many of you may know that basically the origins of Stability were leading and architecting
[13:11.280 --> 13:16.920]  and running the United Nations AI Initiative against COVID-19, so I was the lead architect
[13:16.920 --> 13:23.280]  of that to try and get a lot of this knowledge coordinated around that.
[13:23.280 --> 13:26.040]  We made all the COVID research in the world free and then helped organize it with the
[13:26.040 --> 13:31.400]  backing of UNESCO, the World Bank and others, so that's one of the geneses alongside education.
[13:31.400 --> 13:35.720]  For myself as well, if you listen to some of my podcasts, I quit being a hedge fund
[13:35.720 --> 13:41.840]  manager for five years to work on repurposing drugs for my son, doing AI-based lit review
[13:41.840 --> 13:45.560]  and repurposing of drugs through neurotransmitter analysis.
[13:45.560 --> 13:53.320]  So taking things like nazepam and others to treat the symptoms of ASD, the papers around
[13:53.320 --> 13:57.880]  that will be published and we have several initiatives in that area, again, to try and
[13:57.880 --> 14:00.880]  just catalyze it going forward, because that's all we are, we're a catalyst.
[14:00.880 --> 14:04.240]  Communities should take up what we do and run forward with that.
[14:04.240 --> 14:11.640]  Okay, RM, RF, removing everything, do you think the new AI models push us closer to
[14:11.640 --> 14:14.240]  a post-copyright world?
[14:14.240 --> 14:16.040]  I don't know, I think that's a very good question, it might.
[14:16.040 --> 14:19.360]  To be honest, no one knows what the copyright is around some of these things, like at what
[14:19.360 --> 14:24.160]  point does fair use stop and start and derivative works?
[14:24.160 --> 14:28.320]  It hasn't been tested, it will be tested, I'm pretty sure there will be all sorts of
[14:28.320 --> 14:33.080]  lawsuits and other things soon, again, that's something we're preparing for.
[14:33.080 --> 14:36.400]  But I think one of the first AI pieces of art was recently granted a copyright.
[14:36.400 --> 14:41.320]  I think the ability to create anything is an interesting one as well, because again,
[14:41.320 --> 14:45.920]  it makes content more valuable, so in an abundance scarcity is there, but I'm not exactly sure
[14:45.920 --> 14:47.080]  how this will play out.
[14:47.080 --> 14:50.160]  I do think you'll be able to create anything you want for yourselves, it just becomes,
[14:50.160 --> 14:53.760]  what happens when you put that into a social context and start selling that?
[14:53.760 --> 14:59.000]  This comes down to the personal agency side of the models that we build as well, you know,
[14:59.000 --> 15:02.760]  like you're responsible for the inputs and the outputs that result from that.
[15:02.760 --> 15:06.540]  And so this is where I think copyright law will be tested the most, because people usually
[15:06.540 --> 15:11.720]  did not have the means of creation, whereas now you have literally the means of creation.
[15:11.720 --> 15:17.800]  Okay, Trekstel asks, prompt engineering may well become an elective class in schools over
[15:17.800 --> 15:18.800]  the next decade.
[15:18.800 --> 15:22.800]  With extremely fast paced development, what do you foresee as the biggest barriers of
[15:22.800 --> 15:23.800]  entries?
[15:23.800 --> 15:27.720]  Some talking points might induce a reluctance to adoption, death of the concept artist and
[15:27.720 --> 15:30.360]  the dangers outweighing the benefits.
[15:30.360 --> 15:38.760]  Well, you know, the interesting thing here is that a large part of life is the ability
[15:38.760 --> 15:39.760]  to prompt.
[15:39.760 --> 15:44.800]  So, you know, prompting humans is kind of the key thing, like my wife tries to prompt
[15:44.800 --> 15:50.320]  me all the time, and she's not very successful, but she's been working on it for 16 years.
[15:50.320 --> 15:54.520]  I think that a lot of the technologies that you're seeing right now from AI, because it
[15:54.520 --> 15:57.980]  understands these latent spaces or hidden meanings, it also includes the hidden meanings
[15:57.980 --> 16:02.560]  in prompts, and I think what you see is you have these generalized models like Stable
[16:02.560 --> 16:07.440]  Diffusion and stable video diffusion and Dance Diffusion and all these other things.
[16:07.440 --> 16:11.760]  It pushes intelligence to the edge, but what you've done is you compressed 100,000 gigabytes
[16:11.760 --> 16:17.360]  of images into a two gigabyte file of knowledge that understands all those contextualities.
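
The compression claim here is simple arithmetic, and a short sketch makes the scale concrete. It assumes only the two round figures quoted in the answer (100,000 gigabytes of images, a two gigabyte model file); the variable names are illustrative and not taken from any Stability AI code.

```python
# Round figures as quoted in the answer; both are approximations.
DATASET_GB = 100_000   # size of the training images
MODEL_GB = 2           # size of the released model file

# How many times smaller the model file is than the data it was trained on.
compression_ratio = DATASET_GB / MODEL_GB

# The model file size as a percentage of the dataset size.
retained_percent = MODEL_GB / DATASET_GB * 100

print(f"{compression_ratio:,.0f}:1")   # 50,000:1
print(f"{retained_percent:.3f}%")      # 0.002%
```

On those figures the file is roughly fifty thousand times smaller than the data behind it, which is why it can be downloaded and adapted locally.
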
[16:17.360 --> 16:19.600]  The next step is adapting that to your local context.
[16:19.600 --> 16:25.160]  So that's what you guys do when you use DreamBooth, or when you do textual inversion, you're injecting
[16:25.160 --> 16:28.460]  a bit of yourself into that model so it understands your prompts better.
[16:28.460 --> 16:32.360]  And I think a combination of multiple models doing that will mean that prompt engineering
[16:32.360 --> 16:36.840]  isn't really the thing, it's just understanding how to chain these tools together, so more
[16:36.840 --> 16:38.680]  kind of context specific stuff.
[16:38.680 --> 16:42.540]  This is why we're partnered with, for example, Replit, so that people can build dynamic
[16:42.540 --> 16:46.440]  systems and we've got some very interesting things on the way there.
[16:46.440 --> 16:50.240]  I think the barriers to entry will drop dramatically, like do you really need a class on that?
[16:50.240 --> 16:53.960]  For the next few years, yeah, but then soon it will not require that.
[16:53.960 --> 16:57.000]  Okay, Ammonite says, how long does it usually take to train?
[16:57.000 --> 16:58.960]  Well, that's a piece of string.
[16:58.960 --> 17:00.200]  It depends.
[17:00.200 --> 17:05.800]  We have models, so Stable Diffusion was 150,000 A100 hours, and an A100 hour is about $4 on Amazon,
[17:05.800 --> 17:08.920]  which you need for the interconnect.
[17:08.920 --> 17:11.180]  OpenCLIP was 1.2 million hours.
[17:11.180 --> 17:12.860]  That's literally hours of compute.
[17:12.860 --> 17:16.640]  So for stable diffusion, can someone in the chat do this?
[17:16.640 --> 17:20.400]  It's 256 A100s over 150,000 hours.
[17:20.400 --> 17:22.520]  So divide one by the other.
[17:22.520 --> 17:23.520]  What's the number?
[17:23.520 --> 17:24.520]  Let me get it quick.
[17:24.520 --> 17:25.520]  Quickest.
[17:25.520 --> 17:26.520]  Ammonite?
[17:26.520 --> 17:28.920]  Ammonite, you guys kind of calculate slow.
[17:28.920 --> 17:30.640]  24 days, says Ninjaside.
[17:30.640 --> 17:32.360]  There we go.
[17:32.360 --> 17:34.340]  That's about how long it took to train the model.
[17:34.340 --> 17:38.160]  To do the tests and other stuff, it took a lot longer.
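
The figures in this answer can be checked with a short sketch. It assumes the numbers exactly as quoted (150,000 A100 hours, 256 A100s, about $4 per A100 hour, 1.2 million hours for the CLIP model) and perfectly even scaling across the GPUs, which is an idealization; the function and constant names are illustrative only.

```python
def wall_clock_days(gpu_hours: float, gpu_count: int) -> float:
    """Days of wall-clock time if gpu_hours are spread evenly over gpu_count GPUs."""
    return gpu_hours / gpu_count / 24


# Figures as quoted in the answer; all are approximate.
SD_GPU_HOURS = 150_000          # A100 hours for the Stable Diffusion run
SD_GPU_COUNT = 256              # A100s used in parallel
USD_PER_GPU_HOUR = 4.0          # rough on-demand price quoted
OPENCLIP_GPU_HOURS = 1_200_000  # A100 hours quoted for the CLIP model

days = wall_clock_days(SD_GPU_HOURS, SD_GPU_COUNT)
cost_usd = SD_GPU_HOURS * USD_PER_GPU_HOUR
clip_vs_sd = OPENCLIP_GPU_HOURS / SD_GPU_HOURS

print(f"{days:.1f} days")      # 24.4 days
print(f"${cost_usd:,.0f}")     # $600,000
print(f"{clip_vs_sd:.0f}x")    # 8x
```

This matches the 24 days given in the chat, and the same inputs put the training run at about $600,000 at the quoted price, with the CLIP model at eight times the GPU hours.
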
[17:38.160 --> 17:41.680]  And the bigger models, again, it depends because it doesn't really scale linearly.
[17:41.680 --> 17:45.600]  So it's not that you chuck 512 and it's more efficient.
[17:45.600 --> 17:51.200]  It is really a lot of the heavy lifting is done by the super compute.
[17:51.200 --> 17:56.640]  So what happens is that we're doing all this work up front, and then we release the model
[17:56.640 --> 17:57.640]  to everyone.
[17:57.640 --> 18:03.560]  And then as Joe said, DreamBooth takes about 15 minutes on an A100 to then fine tune.
[18:03.560 --> 18:08.680]  Because all the work of those years of knowledge, the thousands of gigabytes, are all done for
[18:08.680 --> 18:09.680]  you.
[18:09.680 --> 18:13.040]  And that's why you can take it and extend it and kind of do what you want with it.
[18:13.040 --> 18:16.760]  That's the beauty of this model over the old school internet, which was always computing
[18:16.760 --> 18:17.760]  all the time.
[18:17.760 --> 18:21.280]  So you can push intelligence to the edges.
[18:21.280 --> 18:22.280]  All right.
[18:22.280 --> 18:26.960]  So Mr. John Fingers asking, how close do you feel you might be able to show a full motion
[18:26.960 --> 18:29.160]  video model like Google or Meta showed off recently?
[18:29.160 --> 18:32.280]  We'll have it by the end of the year.
[18:32.280 --> 18:35.360]  But better.
[18:35.360 --> 18:39.220]  Reflyn Wolf asks, when do you think we will talk to an AI about the image?
[18:39.220 --> 18:42.540]  Like can you fix his nose a little bit or make a hair longer and stuff like that?
[18:42.540 --> 18:46.760]  To be honest, I'm kind of disappointed that the community has not built that yet.
[18:46.760 --> 18:47.760]  It's not complicated.
[18:47.760 --> 18:51.280]  All you have to do is whack Whisper on the front end.
[18:51.280 --> 18:52.280]  Thank you, OpenAI.
[18:52.280 --> 18:57.240]  You know, obviously, you know, that was a great benefit and then have that input into
[18:57.240 --> 19:00.700]  StyleCLIP or a kind of fit based thing.
[19:00.700 --> 19:08.960]  So if you look up, Max Woolf has this wonderful thing on StyleCLIP that you can see how to
[19:08.960 --> 19:13.960]  create various scary Zuckerbergs as if he wasn't scary himself.
[19:13.960 --> 19:17.640]  And so putting that into the pipeline basically allows you to do what it says there
[19:17.640 --> 19:18.640]  with a bit of targeting.
[19:18.640 --> 19:22.320]  So there's some StyleCLIP right there in the stage chat.
[19:22.320 --> 19:26.180]  And again, with the new CLIP models that we have and a bunch of the other bit models that
[19:26.180 --> 19:29.320]  Google have released recently, you should be able to do that literally now when you
[19:29.320 --> 19:31.320]  combine that with Whisper.
[19:31.320 --> 19:33.620]  All right.
[19:33.620 --> 19:38.560]  And Rev. Ivy Dorey, how do you feel about the use of generative technology being used
[19:38.560 --> 19:42.100]  by surveillance capitalists to further profit aimed goals?
[19:42.100 --> 19:47.800]  What can Stability AI do about this? The thing we can really do is offer alternatives, like
[19:47.800 --> 19:52.600]  do you really want to be in a Meta, what do they call it, Horizon Worlds, where you got
[19:52.600 --> 19:59.160]  no legs or genitals, not really, you know, like legs are good, genitals good.
[19:59.160 --> 20:02.480]  And so by providing open alternatives, we can basically out compete the rest like look
[20:02.480 --> 20:06.200]  at the amount of innovation that's happened on the back of stable diffusion.
[20:06.200 --> 20:11.120]  And again, you know, acknowledge our place in that we don't police it, we don't control
[20:11.120 --> 20:13.520]  it, you know, like people can take it and extend it.
[20:13.520 --> 20:14.880]  If you want to use our services, great.
[20:14.880 --> 20:16.160]  If you don't, it's fine.
[20:16.160 --> 20:20.760]  We're creating a brand new ecosystem that will out compete the legacy guys, because
[20:20.760 --> 20:24.920]  thousands millions of people will be building and developing on this.
[20:24.920 --> 20:29.160]  Like we are sponsoring the fast.ai course on Stable Diffusion, so that anyone who's
[20:29.160 --> 20:32.760]  a developer can rapidly learn to be a stable diffusion developer.
[20:32.760 --> 20:36.160]  And you know, this isn't just kind of interfaces and things like that.
[20:36.160 --> 20:37.760]  It's actually you'll be able to build your own models.
[20:37.760 --> 20:38.760]  And how crazy is that?
[20:38.760 --> 20:42.080]  Let's make it accessible to everyone and again, that's why we're working with Gradio and
[20:42.080 --> 20:44.080]  others on that.
[20:44.080 --> 20:50.760]  All right, we got David, how realistic do you think dynamically creating realistic 3d
[20:50.760 --> 20:53.720]  content with enough fidelity in a VR setting would be and what do you say the timeline
[20:53.720 --> 20:55.720]  on something like that is?
[20:55.720 --> 21:02.520]  You know, unless you're Elon Musk, self driving cars have always been five years away.
[21:02.520 --> 21:10.680]  Always always, you know, $100 billion has been spent on self driving cars, and the research
[21:10.680 --> 21:15.080]  and it's to me, it's not that much closer.
[21:15.080 --> 21:19.880]  The dream of photorealistic VR though is very different with generative AI.
[21:19.880 --> 21:29.000]  Like again, look at the 24 frames per second Imagen Video, look at the
[21:29.000 --> 21:35.960]  long Phenaki video as well and then consider Unreal Engine 5, what's Unreal Engine 6 going
[21:35.960 --> 21:36.960]  to look like?
[21:36.960 --> 21:40.920]  Well, it'll be photorealistic right and it'll be powered by NeRF technology.
[21:40.920 --> 21:46.120]  The same as Apple is pioneering for use on the neural engine chips that make up 16.8%
[21:46.120 --> 21:49.260]  of your MacBook M1 GPU.
[21:49.260 --> 21:55.620]  It's going to come within four to five years, fully high res, 2k in each eye resolution
[21:55.620 --> 22:02.680]  via even 4k or 8k actually, it just needs an M2 chip with the specialist transformer
[22:02.680 --> 22:04.240]  architecture in there.
[22:04.240 --> 22:06.600]  And that will be available to a lot of people.
[22:06.600 --> 22:10.680]  But then like I said, Unreal Engine 6 will also be out in about four or five years.
[22:10.680 --> 22:12.980]  And so that will also up the ante.
[22:12.980 --> 22:17.880]  There's a lot of amazing compression and customized stuff you can do around this.
[22:17.880 --> 22:21.360]  And so I think it's just gonna be insane when you can create entire worlds.
[22:21.360 --> 22:26.160]  And hopefully, it'll be built on the type of architectures that we help catalyze, whether
[22:26.160 --> 22:27.920]  it's built by ourselves or others.
[22:27.920 --> 22:32.840]  So we have a metric shit ton, I believe is the appropriate term, of partnerships that
[22:32.840 --> 22:37.840]  we'll be announcing over the next few months, where we're converting closed source AI companies
[22:37.840 --> 22:43.680]  into open source AI companies, because, you know, it's better to work together.
[22:43.680 --> 22:47.720]  And again, we shouldn't be at the center of all this with everything laying on our shoulders.
[22:47.720 --> 22:51.720]  But it should be a teamwork initiative, because this is cool technology that will help a lot
[22:51.720 --> 22:52.720]  of people.
[22:52.720 --> 22:55.720]  All right, what guarantees is Spit Fortress 2?
[22:55.720 --> 22:58.680]  What guarantees does the community have that Stability AI won't go down the same path
[22:58.680 --> 23:00.000]  as OpenAI?
[23:00.000 --> 23:03.780]  That one day you won't develop a good enough model, you decide to close things after benefiting
[23:03.780 --> 23:06.440]  from all the work of the community and the visibility generated by it?
[23:06.440 --> 23:07.440]  That's a good question.
[23:07.440 --> 23:10.040]  I mean, it kind of sucks what happened with OpenAI, right?
[23:10.040 --> 23:13.620]  You can say it's safety, you can say it's commercials, like whatever.
[23:13.620 --> 23:18.640]  The R&D team and the developers have in their contracts, except for one person that we need
[23:18.640 --> 23:24.680]  to send it to, that they can release any model that they work on open source.
[23:24.680 --> 23:27.680]  So legally, we can't stop them.
[23:27.680 --> 23:30.800]  Well, I think that's a pretty good kind of thing.
[23:30.800 --> 23:34.280]  I don't think there's any company in the world that does that.
[23:34.280 --> 23:39.380]  And again, if you look at it, the only thing that we haven't instantly released is this
[23:39.380 --> 23:44.040]  particular class of generative models, because it's not straightforward.
[23:44.040 --> 23:50.180]  And because you have frickin Congresswoman petitioning to ban us by the NSA.
[23:50.180 --> 23:54.480]  And a lot more stuff behind that.
[23:54.480 --> 24:00.960]  Look, you know, we're gonna get B Corp status soon, which puts in our official documents
[24:00.960 --> 24:06.320]  that we are mission focused, not profit focused.
[24:06.320 --> 24:10.320]  At the same time, I'm going to build a $100 billion company that helps a billion people.
[24:10.320 --> 24:13.420]  We have some other things around governance that we'll be introducing as well.
[24:13.420 --> 24:19.840]  But currently, the governance structure is simple, yet not ideal, which is that I personally
[24:19.840 --> 24:24.240]  have control of board, ordinary, common, everything.
[24:24.240 --> 24:27.020]  And so a lot is resting on my shoulders, which is not sustainable.
[24:27.020 --> 24:31.240]  As soon as we figure that out, and how to maintain the independence and how to maintain
[24:31.240 --> 24:35.080]  it so that we are dedicated to open, which I think is a superior business model that a lot
[24:35.080 --> 24:38.840]  of people agree with, we'll implement that posthaste. Any suggestions, please do send
[24:38.840 --> 24:39.840]  them our way.
[24:39.840 --> 24:45.600]  But like I said, one core thing is, if we stop being open source, and go down the OpenAI
[24:45.600 --> 24:50.120]  route, there's nothing we can do to stop the developers from releasing the code.
[24:50.120 --> 24:54.920]  And without developers, what are we, you know, a nice front end company that does a bit of
[24:54.920 --> 24:58.560]  model deployment, so it'd be killing ourselves.
[24:58.560 --> 25:01.160]  All right.
[25:01.160 --> 25:05.840]  Any plans for Stability to, this is pseudosilico, any plans for Stability to tackle open source
[25:05.840 --> 25:08.680]  alternatives to AI code generators, like Copilot and AlphaCode?
[25:08.680 --> 25:13.540]  Yeah, you can go over to carper.ai, and see our code generation model that's training
[25:13.540 --> 25:14.940]  right now.
[25:14.940 --> 25:18.760]  We released one of the FIM-based language models that will be core to that plus our
[25:18.760 --> 25:23.240]  instruct framework, so that you can have the ideal complement to that.
[25:23.240 --> 25:29.120]  So I think by Q1 of next year, we will have better code models than Copilot.
[25:29.120 --> 25:32.700]  And there's some very interesting things in the works there, you just look at our partners
[25:32.700 --> 25:33.700]  and other things.
[25:33.700 --> 25:38.000]  And again, there'll be open source available to everyone.
[25:38.000 --> 25:39.000]  Right.
[25:39.000 --> 25:43.880]  Sunbury, will support be added for training at sizes other than 512 by default?
[25:43.880 --> 25:44.880]  Training?
[25:44.880 --> 25:46.520]  I suppose you meant inference.
[25:46.520 --> 25:50.360]  Yeah, I mean, there are kind of things like that already.
[25:50.360 --> 25:56.680]  So like, if you look at the recently released NovelAI improvements to stable diffusion,
[25:56.680 --> 26:00.460]  you'll see that there are details there as to how to implement arbitrary resolutions
[26:00.460 --> 26:04.800]  similar to something like Midjourney, I'll just post it there.
[26:04.800 --> 26:08.960]  The model itself, like I said, enables that it's just that the kind of code wasn't there.
[26:08.960 --> 26:12.280]  It was part of our expected upgrades.
[26:12.280 --> 26:14.880]  And again, like different models have been trained at different sizes.
[26:14.880 --> 26:21.200]  So we have a 768 model, a 512 model, et cetera, a 1024 model, et cetera, coming in the pipeline.
[26:21.200 --> 26:24.320]  I mean, like, again, I think that not many people have actually tried to train models
[26:24.320 --> 26:25.320]  yet.
[26:25.320 --> 26:28.560]  You can get to grips with it, but you can train and extend this, again, view it as a
[26:28.560 --> 26:34.240]  base of knowledge onto which you can adjust a bunch of other stuff.
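
As a rough illustration of the arbitrary-resolution point above: a minimal sketch, assuming the commonly described Stable Diffusion v1 layout (an autoencoder that downsamples by a factor of 8, four latent channels, and image sizes kept to multiples of 64). The function names and defaults here are hypothetical and are not taken from the talk or from any particular codebase.

```python
from typing import Tuple


def snap_down(value: int, multiple: int = 64) -> int:
    """Round a requested pixel size down to a multiple (assumed: 64), with a floor of one multiple."""
    return max(multiple, (value // multiple) * multiple)


def latent_shape(width: int, height: int,
                 downsample: int = 8, channels: int = 4) -> Tuple[int, int, int]:
    """Return the (channels, latent_height, latent_width) a request would map to.

    downsample=8 and channels=4 are assumptions about the autoencoder,
    not values stated in the transcript.
    """
    w = snap_down(width)
    h = snap_down(height)
    return (channels, h // downsample, w // downsample)


if __name__ == "__main__":
    for size in [(512, 512), (768, 512), (600, 1000)]:
        print(size, "->", latent_shape(*size))
```

Under these assumptions, a non-square or larger request only changes the spatial size of the latent, which is consistent with the remark that the model itself allows other sizes and the missing piece was the surrounding code.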
[26:34.240 --> 26:38.000]  Krakos, do you have any plans to improve the model in terms of face, limbs, and hand generation?
[26:38.000 --> 26:40.520]  Is it possible to improve on specifics on this checkpoint?
[26:40.520 --> 26:41.960]  Yep, 100%.
[26:41.960 --> 26:48.680]  So I think in the next day or so, we'll be releasing a new fine-tuned decoder that's
[26:48.680 --> 26:54.160]  just a drop-in for any latent diffusion or stable diffusion model that is fine-tuned
[26:54.160 --> 27:00.200]  on the LAION-Face dataset, and that makes better faces.
[27:00.200 --> 27:05.680]  Then, as well, you can train it on, like, HaGRID, which is the hand dataset to create
[27:05.680 --> 27:07.880]  better hands, et cetera.
[27:07.880 --> 27:11.160]  Some of this architecture is known as a VAE architecture for doing that.
[27:11.160 --> 27:16.840]  And again, that's discussed a bit in the NovelAI thing, because they do have better hands.
[27:16.840 --> 27:22.400]  And again, this knowledge will proliferate around that.
[27:22.400 --> 27:23.400]  What is the next question?
[27:23.400 --> 27:27.240]  There's a lot of questions today.
[27:27.240 --> 27:36.980]  Any – I saw your partnership with AI Grant with Nat and Daniel.
[27:36.980 --> 27:40.760]  If you guys would support startups in case they aren't selected by them, any way startups
[27:40.760 --> 27:43.400]  can connect with you folks to get mentorship or guidance.
[27:43.400 --> 27:47.440]  We are building a grant program and more.
[27:47.440 --> 27:51.000]  It's just that we're currently hiring people to come and run it.
[27:51.000 --> 27:53.660]  That's the same as Bruce.Codes' question.
[27:53.660 --> 27:59.400]  In the next couple of weeks, there will be competitions and all sorts of grants announced
[27:59.400 --> 28:04.480]  to kind of stimulate the growth of some essential parts of infrastructure in the community.
[28:04.480 --> 28:07.640]  And we're going to try and get more community involvement in that, so people who do great
[28:07.640 --> 28:11.520]  things for the community are appropriately awarded.
[28:11.520 --> 28:13.960]  There's a lot of work being done there.
[28:13.960 --> 28:17.480]  All right.
[28:17.480 --> 28:22.160]  So Ivy Dorey, is Stability AI considering working on climate crisis via models in some
[28:22.160 --> 28:23.160]  way?
[28:23.160 --> 28:26.560]  Yes, and this will be announced in November.
[28:26.560 --> 28:27.840]  I can't announce it just yet.
[28:27.840 --> 28:30.560]  They want to do a big, grand thing, but you know.
[28:30.560 --> 28:31.560]  We're doing that.
[28:31.560 --> 28:35.300]  We're supporting several entities that are doing climate forecasting functions and working
[28:35.300 --> 28:40.160]  with a few governments on weather patterns using transformer-based technologies as well.
[28:40.160 --> 28:41.720]  There's that.
[28:41.720 --> 28:42.720]  Okay.
[28:42.720 --> 28:45.920]  What else have we got?
[28:45.920 --> 28:50.200]  We have Reflyn Wolf.
[28:50.200 --> 28:52.840]  Which jobs do you think are most in danger of being taken by AI?
[28:52.840 --> 28:54.840]  I don't know, man.
[28:54.840 --> 28:57.200]  It's a complex one.
[28:57.200 --> 29:01.440]  I think that probably the most dangerous ones are call center workers and anything that
[29:01.440 --> 29:02.960]  involves human-to-human interaction.
[29:02.960 --> 29:05.600]  I don't know if you guys have tried character.ai.
[29:05.600 --> 29:14.120]  I don't know if they've stopped it because you could create some questionable entities.
[29:14.120 --> 29:17.600]  The...
[29:17.600 --> 29:19.080]  It's very good.
[29:19.080 --> 29:21.960]  And it will just get better because I think you look at some of the voice models we have
[29:21.960 --> 29:26.200]  coming up, you can basically do emotionally accurate voices and all sorts of stuff and
[29:26.200 --> 29:29.440]  voice-to-voice, so you won't notice a call center worker.
[29:29.440 --> 29:32.280]  But that goes to a lot of different things.
[29:32.280 --> 29:34.440]  I think that's probably the first for disruption before anything else.
[29:34.440 --> 29:37.760]  I don't think that artists get disrupted that much, to be honest, by what's going on here.
[29:37.760 --> 29:41.100]  Unless you're a bad artist, in which case you can use this technology to become a great
[29:41.100 --> 29:45.060]  artist, and the great artist will become even greater.
[29:45.060 --> 29:47.400]  So I think that's probably my take on that.
[29:47.400 --> 29:50.920]  Liquid Rhino has a question in two parts.
[29:50.920 --> 29:55.240]  What work is being done to improve the attention mechanism of stable diffusion to better handle
[29:55.240 --> 29:58.960]  and interpret composition while preserving artistic style?
[29:58.960 --> 30:02.400]  There are natural language limitations when it comes to interpreting physics from simple
[30:02.400 --> 30:03.400]  statements.
[30:03.400 --> 30:06.520]  Artistic style further deforms and challenges this kind of interpretation.
[30:06.520 --> 30:09.880]  Is Stability AI working on high-level compositional language for use of generative models?
[30:09.880 --> 30:11.680]  The answer is yes.
[30:11.680 --> 30:17.200]  This is why we spent millions of dollars releasing the new CLIP.
[30:17.200 --> 30:18.360]  CLIP is at the core of these models.
[30:18.360 --> 30:22.280]  There's a generative component and there is a guidance component, and when you infuse
[30:22.280 --> 30:25.920]  the two together, you get models like they are right now.
[30:25.920 --> 30:32.280]  The guidance component, we used CLIP-L, which was CLIP-Large, which was the largest one that
[30:32.280 --> 30:33.280]  OpenAI released.
[30:33.280 --> 30:36.520]  They had two more, H and G, which I believe are huge and gigantic.
[30:36.520 --> 30:41.640]  We released H and the first version of G, which should take like a million A100 hours to do,
[30:41.640 --> 30:45.240]  and that improves compositional qualities so that as that gets integrated into a new
[30:45.240 --> 30:51.040]  version of stable diffusion, it will be at the level of DALL-E 2, just even with a small
[30:51.040 --> 30:52.040]  size.
[30:52.040 --> 30:57.040]  There are some problems around this in that the model learns from both things.
[30:57.040 --> 31:02.080]  It learns from the stuff the generative thing is fine-tuned on and from the CLIP models,
[31:02.080 --> 31:05.860]  and so we've been spending a lot of time over the last few weeks, and there's another reason
[31:05.860 --> 31:10.440]  for the delay, seeing what exactly does this thing know, because even if an artist isn't
[31:10.440 --> 31:13.720]  in our training dataset, it somehow knows about it, and it turns out it was CLIP all
[31:13.720 --> 31:14.720]  along.
[31:14.720 --> 31:18.120]  So we really want it to output what we think it outputs and not output what it shouldn't
[31:18.120 --> 31:20.440]  output, so we've been doing a lot of work around that.
[31:20.440 --> 31:24.720]  Similarly, what we found is that embedding pure language models like T5-XXL, and we
[31:24.720 --> 31:29.920]  tried UL2 and some of these other models, these are like pure language models like GPT-3,
[31:29.920 --> 31:33.120]  improves the understanding of these models, which is kind of crazy.
[31:33.120 --> 31:36.740]  And so there's some work being done around that for compositional accuracy, and again,
[31:36.740 --> 31:41.800]  you can look at the blog by NovelAI where they extended the context window so that it
[31:41.800 --> 31:46.920]  can accept three times the amount of input from this.
[31:46.920 --> 31:53.160]  So your prompts get longer from I think like 74 to 225 or something like that, and there
[31:53.160 --> 31:56.480]  are various things you can do once you do proper latent space exploration, which I
[31:56.480 --> 31:59.680]  think is probably another month away, to really hone down on this.
[31:59.680 --> 32:04.560]  I think again, a lot of these other interfaces from the ones that we support to others have
[32:04.560 --> 32:07.740]  already introduced negative prompting and all sorts of other stuff.
[32:07.740 --> 32:12.280]  You should have kind of some vector-based initialization, et cetera, coming soon.
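
The extended context window mentioned above (prompts going from roughly 75 to 225 tokens) can be pictured as splitting a long prompt into several encoder-sized chunks. The sketch below is only an illustration under stated assumptions: a text encoder with a 77-token context, of which 75 are usable once start and end markers are counted, and a cap of three chunks (3 x 75 = 225). The function name is hypothetical, and the way the encoded chunks are combined afterwards is not shown.

```python
from typing import List


def chunk_tokens(token_ids: List[int],
                 chunk_size: int = 75,
                 max_chunks: int = 3) -> List[List[int]]:
    """Split prompt token ids into at most max_chunks chunks of chunk_size.

    chunk_size=75 and max_chunks=3 are assumptions chosen to match the
    "about 75 to 225" figure; tokens beyond the cap are dropped.
    """
    limit = chunk_size * max_chunks
    trimmed = token_ids[:limit]
    return [trimmed[i:i + chunk_size] for i in range(0, len(trimmed), chunk_size)]


if __name__ == "__main__":
    fake_prompt = list(range(300))  # stand-in for real tokenizer output
    chunks = chunk_tokens(fake_prompt)
    print([len(c) for c in chunks])
```

Each chunk would then be padded and encoded on its own, so the text encoder itself never sees more than its native context length.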
[32:12.280 --> 32:13.280]  All right.
[32:13.280 --> 32:20.240]  We've got Mav, what are the technical limitations around recreating SD with a 1024 dataset rather
[32:20.240 --> 32:23.520]  than 512, and why not have varying resolutions for the dataset?
[32:23.520 --> 32:25.360]  Is the new model going to be a ton bigger?
[32:25.360 --> 32:29.360]  So version 3 right now has 1.4 billion parameters.
[32:29.360 --> 32:34.320]  We've got a 4.3 billion parameter Imagen training and a 900 million parameter Imagen
[32:34.320 --> 32:35.320]  training.
[32:35.320 --> 32:36.320]  We've got a lot of models training.
[32:36.320 --> 32:39.000]  We're just waiting to get these things right before we just start releasing them one after
[32:39.000 --> 32:40.000]  the other.
[32:40.000 --> 32:44.520]  The main limitation is the lack of 1024 images in the training dataset.
[32:44.520 --> 32:47.600]  Like LAION doesn't have a lot of high resolution images, and this is one of the things why
[32:47.600 --> 32:53.760]  what we've been working on the last few weeks is to basically negotiate and license amazing
[32:53.760 --> 32:59.840]  datasets that we can then put out to the world so that you can have much better models.
[32:59.840 --> 33:03.520]  And we're going to pay a crap load for that, but again, release it for free and open source
[33:03.520 --> 33:04.520]  to everyone.
[33:04.520 --> 33:06.320]  And I think that should do well.
[33:06.320 --> 33:09.400]  This is also why the upscaler that you're going to see is a two times upscaler.
[33:09.400 --> 33:10.400]  That's good.
[33:10.400 --> 33:12.760]  Four times upscaling is a bit difficult for us to do.
[33:12.760 --> 33:17.600]  Like it's still decent because we're just waiting on the licensing of those images.
[33:17.600 --> 33:19.720]  All right.
[33:19.720 --> 33:23.880]  What's next?
[33:23.880 --> 33:27.280]  Any plans for creating a worthy open source alternative, something like AI Dungeon or
[33:27.280 --> 33:28.280]  Character AI?
[33:28.280 --> 33:33.140]  Well, a lot of the CarperAI team's work around instruct models and contrastive learning should
[33:33.140 --> 33:37.840]  enable Carper Character AI type systems on chatbots.
[33:37.840 --> 33:41.400]  And you know, from narrative construction to others, again, it will be ideal there.
[33:41.400 --> 33:45.640]  The open source versions of NovelAI and AI Dungeon, I believe the leading one is KoboldAI.
[33:45.640 --> 33:46.640]  AI.
[33:46.640 --> 33:47.640]  So you might want to check that out.
[33:47.640 --> 33:50.760]  I haven't seen what the case has been with that recently.
[33:50.760 --> 33:51.760]  All right.
[33:51.760 --> 33:53.560]  We've got Joe Rogan.
[33:53.560 --> 33:56.080]  When will we be able to create full on movies with AI?
[33:56.080 --> 34:00.600]  I don't know, like five years again.
[34:00.600 --> 34:03.240]  I'm just digging that out there.
[34:03.240 --> 34:06.080]  Okay, if I was Elon Musk, I'd say one year.
[34:06.080 --> 34:07.960]  I mean, it depends what you mean by a feature like movies.
[34:07.960 --> 34:12.880]  So like animated movies, when you combine stable diffusion with some of the language
[34:12.880 --> 34:17.480]  models and some of the code models, you should be able to create those.
[34:17.480 --> 34:23.960]  Maybe not in a ufotable or Studio Bones style within two years, I'd say, but I'd say a five
[34:23.960 --> 34:28.840]  year time frame for being able to create those in high quality, like super high res is reasonable
[34:28.840 --> 34:34.560]  because that's the time it will take to create these high res dynamic VR kind of things.
[34:34.560 --> 34:38.480]  To create fully photorealistic proper people movies, I mean, you can look at EbSynth
[34:38.480 --> 34:44.560]  or some of these other kind of pathway analyses, it shouldn't be that long to be
[34:44.560 --> 34:45.560]  honest.
[34:45.560 --> 34:46.940]  It depends on how much budget and how quick you want to do it.
[34:46.940 --> 34:50.800]  Real time is difficult, but you're going to see some really amazing real time stuff in
[34:50.800 --> 34:51.800]  the next year.
[34:51.800 --> 34:52.800]  Touch wood.
[34:52.800 --> 34:53.800]  We're lining it up.
[34:53.800 --> 34:55.040]  It's going to blow everyone's socks away.
[34:55.040 --> 34:59.800]  That's going to require a freaking supercomputer, but it's not movie length.
[34:59.800 --> 35:01.640]  It's something a bit different.
[35:01.640 --> 35:02.840]  All right.
[35:02.840 --> 35:06.040]  Querielmotor, did you read the distillation of guided diffusion models paper?
[35:06.040 --> 35:07.360]  Do you have any thoughts on it?
[35:07.360 --> 35:10.480]  Like if it will improve things on consumer level hardware or just the high VRAM data
[35:10.480 --> 35:11.480]  centers?
[35:11.480 --> 35:16.520]  I mean, distillation and instructing these models is awesome.
[35:16.520 --> 35:21.960]  And the step counts they have for kind of reaching cohesion are kind of crazy.
[35:21.960 --> 35:26.120]  RiversHaveWings has done a lot of work on a kind of DPM fast solver that's already reduced
[35:26.120 --> 35:28.880]  the number of steps required to get to those stages.
[35:28.880 --> 35:34.160]  And again, like I keep telling everyone, once you start chaining these models together,
[35:34.160 --> 35:38.540]  you're going to get down really sub one second and further, because I think you guys have
[35:38.540 --> 35:43.480]  seen image to image work so much better if you just even give a basic sketch than text
[35:43.480 --> 35:44.480]  to image.
[35:44.480 --> 35:47.200]  So why don't you chain together different models, different modalities to kind of get
[35:47.200 --> 35:48.200]  them?
[35:48.200 --> 35:52.880]  And I think it'll be easier once we release our various model resolution sizes plus upscalers
[35:52.880 --> 35:55.140]  so you can dynamically switch between models.
[35:55.140 --> 36:01.140]  If you look at the DreamStudio kind of teaser that I posted six weeks ago, that's why we've
[36:01.140 --> 36:04.480]  got model chaining integrated right in there.
[36:04.480 --> 36:05.920]  All right.
[36:05.920 --> 36:10.200]  RefleonWolf, who do you think should own the copyright of an image video made by an AI
[36:10.200 --> 36:13.240]  or do you think there shouldn't be an owner?
[36:13.240 --> 36:18.360]  I think that if it isn't based on copyrighted content, it should be owned by the prompter
[36:18.360 --> 36:19.360]  of the AI.
[36:19.360 --> 36:23.880]  If the AI is a public model and not owned by someone else, otherwise it is almost like
[36:23.880 --> 36:26.680]  a co-creation type of thing.
[36:26.680 --> 36:34.920]  But I'm not a lawyer and I think this will be tested severely very soon.
[36:34.920 --> 36:39.200]  Question by Prue Prue, update on more payment methods for DreamStudio.
[36:39.200 --> 36:43.840]  I think we'll be introducing some alternate ones soon, the one that we won't introduce
[36:43.840 --> 36:44.840]  is PayPal.
[36:44.840 --> 36:49.880]  No, no PayPal, because that's just crazy what's going on there.
[36:49.880 --> 36:53.800]  Jason the Artist: with stable diffusion having been publicly released for over a month now
[36:53.800 --> 36:57.620]  and with the release of version 1.5 around the corner, what is the most impressive implementation
[36:57.620 --> 37:01.000]  you've seen someone create out of the application so far?
[37:01.000 --> 37:02.680]  I really love the DreamBooth stuff.
[37:02.680 --> 37:04.360]  I mean, come on, that shit's crazy.
[37:04.360 --> 37:09.020]  You know, even though some of you fine tuned me into kind of weird poses.
[37:09.020 --> 37:11.680]  I think it was pretty good.
[37:11.680 --> 37:13.600]  I didn't think we would get that level of quality.
[37:13.600 --> 37:16.240]  I thought it would be a textual inversion level quality.
[37:16.240 --> 37:23.520]  Beyond that, I think that, you know, there's been this well of creativity, like you're
[37:23.520 --> 37:27.680]  starting to see some of the 3D stuff come out and again, I didn't think we'd get quite
[37:27.680 --> 37:29.240]  there even with the chaining.
[37:29.240 --> 37:32.600]  I think that's pretty darn impressive.
[37:32.600 --> 37:36.560]  Okay, so what is next?
[37:36.560 --> 37:41.360]  Okay, so I've just been going through all of these chat things.
[37:41.360 --> 37:47.480]  Notepad, are there any areas of the industry that is currently overlooked that you'll be
[37:47.480 --> 37:51.080]  excited to see the effects of diffusion based AI being used?
[37:51.080 --> 37:55.640]  Again, like I can't get away from this PowerPoint thing.
[37:55.640 --> 38:00.520]  Like it's such a straightforward thing that causes so much real annoyance.
[38:00.520 --> 38:02.680]  I think we could kind of get it out there.
[38:02.680 --> 38:07.160]  I think it just requires kind of a few fine tuned models plus a code model plus a language
[38:07.160 --> 38:08.920]  model to kind of kick it together.
[38:08.920 --> 38:14.000]  I mean, diffusion is all about de-noising and information is about noise.
[38:14.000 --> 38:16.920]  So our brains filter out noise and de-noise all the time.
[38:16.920 --> 38:20.240]  So these models can be used in a ridiculous number of scenarios.
[38:20.240 --> 38:25.440]  Like I said, we've got a DNA diffusion model going on in OpenBioML, all that shit crazy,
[38:25.440 --> 38:26.440]  right?
[38:26.440 --> 38:30.240]  But I think right now I really want to see some of these practical high impact use cases
[38:30.240 --> 38:32.800]  like the PowerPoint kind of thing.
[38:32.800 --> 38:34.040]  All right.
[38:34.040 --> 38:39.760]  We've got S1, S2, do you have any plans to release a speech synthesis model, like script
[38:39.760 --> 38:40.760]  overdubbed voices?
[38:40.760 --> 38:46.560]  Yes, we have a plan to release a speech to speech model soon and some other ones around
[38:46.560 --> 38:47.560]  that.
[38:47.560 --> 38:50.720]  I think AudioLM by Google was super interesting recently.
[38:50.720 --> 38:55.480]  For those who don't know, that's basically you give it a snippet of a voice or of music
[38:55.480 --> 38:57.280]  or something and it just extends it.
[38:57.280 --> 38:58.280]  It's kind of crazy.
[38:58.280 --> 39:02.840]  But I think we get the arbitrary kind of length thing there and combined with some other models
[39:02.840 --> 39:05.600]  that could be really interesting.
[39:05.600 --> 39:13.800]  All right, maybe Dori, do you have any thoughts on increasing the awareness of generative
[39:13.800 --> 39:14.800]  models?
[39:14.800 --> 39:15.800]  Is this something you see as important?
[39:15.800 --> 39:19.760]  How long do you think until the mass global population becomes aware of these models?
[39:19.760 --> 39:26.000]  I think I can't keep up as it is and I don't want to die.
[39:26.000 --> 39:29.140]  But more realistically, we have a B2B2C model.
[39:29.140 --> 39:33.720]  So we're partnering with the leading brands in the world and content creators to both
[39:33.720 --> 39:38.120]  get their content so we can build better open models and to get this technology out to just
[39:38.120 --> 39:39.120]  everyone.
[39:39.120 --> 39:43.860]  Similar on a country basis, we have country level models coming out very soon.
[39:43.860 --> 39:46.840]  So on the language side of things, you can see we released Polyglot, which is the best
[39:46.840 --> 39:52.220]  Korean language model, for example, via EleutherAI and our support of them recently.
[39:52.220 --> 39:57.040]  So I think you will see a lot of models coming soon, a lot of different kind of elements
[39:57.040 --> 39:59.040]  around that.
[39:59.040 --> 40:06.000]  Okie dokie, will we always be limited by the hardware cost to run AI or do you expect something
[40:06.000 --> 40:07.000]  to change?
[40:07.000 --> 40:09.980]  Yeah, I mean, like this will run on the edge, it'll run on your iPhone in a year.
[40:09.980 --> 40:14.960]  Stable diffusion will run on an iPhone in probably seconds, that level of quality.
[40:14.960 --> 40:16.960]  That's again, a bit crazy.
[40:16.960 --> 40:22.160]  All right, Aziroshin, oh, this is a long one.
[40:22.160 --> 40:25.960]  I'm unsure how to release licensed images based on SD output.
[40:25.960 --> 40:30.400]  Some suggest creative commons zero is fine.
[40:30.400 --> 40:33.280]  Some say raw output, warning of license, suggest reality.
[40:33.280 --> 40:35.880]  Oh, sorry, that's just a really long question.
[40:35.880 --> 40:38.280]  My brain's a bit fried.
[40:38.280 --> 40:43.360]  Okay, so if someone takes a CC0 output image and violates the license, then something can
[40:43.360 --> 40:44.360]  be done around that.
[40:44.360 --> 40:49.280]  I would suggest that if you're worried about some of this stuff, you, CC0 licensing, and
[40:49.280 --> 40:54.280]  again, I am not a lawyer, please consult with a lawyer, does not preclude copyright.
[40:54.280 --> 40:57.560]  And there's a transformational element that incorporates that.
[40:57.560 --> 41:02.480]  If you look at artists like Necro 13 and Claire Selva and others, you will see that the outputs
[41:02.480 --> 41:05.520]  usually aren't one shot, they are multi-sesic.
[41:05.520 --> 41:09.280]  And then that means that this becomes one part of that, a CC0-licensed part that's part
[41:09.280 --> 41:10.280]  of your process.
[41:10.280 --> 41:14.880]  Like, even if you use GFPGAN or upscaling or something like that, again, I'm not a lawyer,
[41:14.880 --> 41:15.880]  please consult with one.
[41:15.880 --> 41:19.360]  I think that should be sufficiently transformative that you can assert full copyright over the
[41:19.360 --> 41:21.800]  output of your work.
[41:21.800 --> 41:25.000]  Kingping: is Stability going to give commissions to artists?
[41:25.000 --> 41:29.560]  We have some very exciting in-house artists coming online soon.
[41:29.560 --> 41:34.400]  Some very interesting ones, I'm afraid that's all I can say right now.
[41:34.400 --> 41:37.680]  But yeah, we will have more art programs and things like that as part of our community
[41:37.680 --> 41:38.680]  engagement.
[41:38.680 --> 41:43.680]  It's just that right now it's been a struggle even to keep Discord and other things going
[41:43.680 --> 41:44.680]  and growing the team.
[41:44.680 --> 41:48.320]  Like, we're just over a hundred people now, God knows how many we actually need.
[41:48.320 --> 41:50.600]  I think we probably need to hire another hundred more.
[41:50.600 --> 41:54.600]  All right, RMRF, a text-to-speech model too?
[41:54.600 --> 41:55.600]  Yep.
[41:55.600 --> 42:00.520]  I couldn't release it just yet as my sister-in-law was running Sonantic, but now that she's been
[42:00.520 --> 42:04.760]  absorbed by Spotify, we can release emotional text-to-speech.
[42:04.760 --> 42:10.240]  Not soon though, I think that we want to do some extra work around that and build that
[42:10.240 --> 42:11.240]  up.
[42:11.240 --> 42:12.240]  All right.
[42:12.240 --> 42:17.920]  Anisham, is it possible to get vector images like an SVG file from stable diffusion or
[42:17.920 --> 42:20.800]  related systems?
[42:20.800 --> 42:22.720]  Not at the moment.
[42:22.720 --> 42:28.800]  You can actually do that with a language model, as you'll find out probably in the next month.
[42:28.800 --> 42:32.240]  But right now I would say just use a converter, and that's probably going to be the best way
[42:32.240 --> 42:33.240]  to do that.
[42:33.240 --> 42:38.440]  All right, Ruffling Wolf, is there a place to find all Stability AI-made models in one place?
[42:38.440 --> 42:40.800]  No, there is not, because we are disorganized.
[42:40.800 --> 42:46.160]  We barely have a careers page up, and we're not really keeping a track of everything.
[42:46.160 --> 42:51.440]  We are employing someone as an AI librarian to come and help coordinate the community
[42:51.440 --> 42:53.000]  and some of these other things.
[42:53.000 --> 42:56.440]  Again, that's just a one-stop shop there.
[42:56.440 --> 43:01.080]  But yeah, also there's this collaborative thing where we're involved in a lot of stuff.
[43:01.080 --> 43:05.120]  There's a blurring line between what we need and what we don't need.
[43:05.120 --> 43:07.000]  We just are going to want to be the catalyst for all of this.
[43:07.000 --> 43:09.440]  I think the best models go viral anyway.
[43:09.440 --> 43:13.160]  All right, Infinite Monkey, where do you see Stability AI in five years?
[43:13.160 --> 43:17.400]  Hopefully with someone else leading the damn thing so I can finish Elden Ring.
[43:17.400 --> 43:23.480]  No, I mean, our aim is basically to build AI subsidiaries in every single country so
[43:23.480 --> 43:29.600]  that there's localized models for every country and race that are all open and to basically
[43:29.600 --> 43:33.240]  be the biggest, best company in the world that's actually aligned with you rather than
[43:33.240 --> 43:35.800]  trying to suck up your attention to serve you ads.
[43:35.800 --> 43:41.640]  I really don't like ads, honestly, unless they're artistic, I like artistic ads.
[43:41.640 --> 43:47.440]  So the aim is to build a big company to list and to give it back to the people so ultimately
[43:47.440 --> 43:48.560]  it's all owned by the people.
[43:48.560 --> 43:55.000]  For myself, my main aim is to ramp this up and spread as much profit as possible into
[43:55.000 --> 43:59.680]  Imagine Worldwide, our education arm run by our co-founder, which currently is teaching
[43:59.680 --> 44:05.120]  kids literacy and numeracy in refugee camps in 13 months on one hour a day.
[44:05.120 --> 44:10.800]  We've just been given the remit to extend this and incorporate AI to teach tens of millions
[44:10.800 --> 44:14.600]  of kids around the world that will be open source, hosted at the UN.
[44:14.600 --> 44:17.060]  One laptop per child, but really one AI per child.
[44:17.060 --> 44:20.680]  That's one of my main focuses because I think I did a podcast about this.
[44:20.680 --> 44:24.320]  A lot of people talk about human rights and ethics and morals and things like that.
[44:24.320 --> 44:29.160]  One of the frames I found really interesting from Vinay Gupta, who's a bit of a crazy guy,
[44:29.160 --> 44:33.160]  but a great thinker, was that we should think about human rights in terms of the rights
[44:33.160 --> 44:38.040]  of children because they don't have any agency and they can't control things and what is
[44:38.040 --> 44:42.200]  their right to have a climate, what is their right to food and education and other things.
[44:42.200 --> 44:46.320]  We should really provide for them and I'm going to use this technology to provide for
[44:46.320 --> 44:51.200]  them so there's literally no child left behind, they have access to all the tools and technology
[44:51.200 --> 44:52.200]  they need.
[44:52.200 --> 44:56.760]  That's why creativity was a core component of that and communication, education and healthcare.
[44:56.760 --> 45:01.760]  Again, it's not just us, all we are is the catalyst and it's the community that comes
[45:01.760 --> 45:06.680]  and helps and extends that.
[45:06.680 --> 45:10.920]  As Zeroshin, my question was about whether I have to pass down the rail license limitations
[45:10.920 --> 45:13.840]  when licensing SD based images or I can release as good.
[45:13.840 --> 45:18.480]  Ah yes, you don't have to do rail license, you can release as is.
[45:18.480 --> 45:22.400]  It's only if you are running the model or distributing the model to other people that
[45:22.400 --> 45:26.160]  you have to do that.
[45:26.160 --> 45:30.040]  If you'd like to learn more about our education initiative, they're at Imagine Worldwide.
[45:30.040 --> 45:34.720]  Lots more on that soon as we scale up to tens of millions of kids.
[45:34.720 --> 45:39.220]  We have Chuck Still, as a composer and audio engineer myself, I cannot imagine AI will
[45:39.220 --> 45:42.920]  approach the emotional intricacies and depths of complexity found in music by world class
[45:42.920 --> 45:45.080]  musicians, at least not anytime soon.
[45:45.080 --> 45:48.600]  That said, I'm interested in AI as a tool, would love to explore how it can be used to
[45:48.600 --> 45:50.400]  help in this production process.
[45:50.400 --> 45:51.400]  Are we involved in this?
[45:51.400 --> 45:56.180]  Yes we are, I think someone just linked to Harmonai Play and we will be releasing
[45:56.180 --> 46:02.440]  a whole suite of tools soon to extend the capability of musicians and make more people
[46:02.440 --> 46:03.440]  into musicians.
[46:03.440 --> 46:07.040]  And this is one of the interesting ones, like these models, they pay attention to the important
[46:07.040 --> 46:08.680]  parts of any media.
[46:08.680 --> 46:14.000]  So there's always this question about expressivity and humanity, I mean they are trained on humanity
[46:14.000 --> 46:18.840]  and so they resonate and I think that's something that you kind of have to acknowledge and then
[46:18.840 --> 46:25.160]  it's about aesthetics have been solved to a degree by this type of AI.
[46:25.160 --> 46:29.240]  So something can be aesthetically pleasing, but aesthetics are not enough.
[46:29.240 --> 46:35.680]  If you are an artist, a musician or otherwise, I'd say a coder, it's largely about narrative
[46:35.680 --> 46:36.680]  and story.
[46:36.680 --> 46:39.760]  And what does that look like around all of this?
[46:39.760 --> 46:44.920]  Because things don't exist in a vacuum, it can be a beautiful thing or a piece of music,
[46:44.920 --> 46:48.540]  but you remember it because you were driving a car when you were 18 with your best friends,
[46:48.540 --> 46:51.360]  you know, or it was at your wedding or something like that.
[46:51.360 --> 46:56.440]  That's when story matters, for music, for art, for other things as well like that.
[46:56.440 --> 46:58.960]  All right, one second.
[46:58.960 --> 47:03.960]  Man, I just drank a tea.
[47:03.960 --> 47:09.560]  All right, we've got GHP Kishore, are you guys working on LMs as well, something to
[47:09.560 --> 47:11.560]  compete with OpenAI GPT-3?
[47:11.560 --> 47:12.560]  Yes.
[47:12.560 --> 47:18.280]  We recently released from the CarperAI lab, the instruct framework and we are training to
[47:18.280 --> 47:25.840]  achieve Chinchilla-optimal models, which outperformed GPT-3 on a fraction of the parameters.
[47:25.840 --> 47:27.380]  They will get better and better and better.
[47:27.380 --> 47:32.140]  And then as we create localized data sets and the education data sets, those are ideal
[47:32.140 --> 47:39.360]  for training foundation models at ridiculous power relative to the parameters.
[47:39.360 --> 47:44.320]  So I think that it will be pretty great to say the least as we kind of focus on that.
[47:44.320 --> 47:49.600]  EleutherAI, which was the first community that we properly supported and a number of Stability
[47:49.600 --> 47:53.360]  employees help lead that community.
[47:53.360 --> 47:58.840]  The focus was GPT-Neo and GPT-J, which were the open source implementations of GPT-3 but
[47:58.840 --> 48:04.280]  on a smaller parameter scale, which had been downloaded 25 million times by developers,
[48:04.280 --> 48:07.280]  which I think is a lot more use than GPT-3 has got.
[48:07.280 --> 48:12.160]  But GPT-3 is fantastic or InstructGPT, which it really is.
[48:12.160 --> 48:14.920]  I think this instruct model that took it down a hundred times.
[48:14.920 --> 48:19.080]  Again, if you're technical, you can look at the Carper community and you can see the framework
[48:19.080 --> 48:21.080]  around that.
[48:21.080 --> 48:22.560]  All right.
[48:22.560 --> 48:24.960]  What is the next question here?
[48:24.960 --> 48:28.120]  Oh, no, I've tapped the wrong thing.
[48:28.120 --> 48:29.120]  I've lost the questions.
[48:29.120 --> 48:31.120]  I have found them.
[48:31.120 --> 48:33.120]  Yes.
[48:33.120 --> 48:34.400]  Gimmick from the FAQ.
[48:34.400 --> 48:38.160]  In the future for other models, we are building an opt-in and opt-out system for artists and
[48:38.160 --> 48:40.820]  others that will lead to use in partnerships leading organizations.
[48:40.820 --> 48:45.160]  This model has some principles, the outputs are not direct for any single piece or initiatives
[48:45.160 --> 48:46.160]  of motion with regards to this.
[48:46.160 --> 48:52.320]  There will be announcements next week about this and various entities that we're bringing
[48:52.320 --> 48:53.320]  in place for that.
[48:53.320 --> 48:57.000]  That's all I can say, because I'm not allowed to spoil announcements, but we've been working
[48:57.000 --> 48:59.440]  super hard on this.
[48:59.440 --> 49:05.720]  I think there's two or maybe three announcements, it'll be 17th and 18th will be the dates of
[49:05.720 --> 49:06.720]  those.
[49:06.720 --> 49:11.800]  Aha, I'm through the questions, I think.
[49:11.800 --> 49:16.320]  Mod team, are we through the questions?
[49:16.320 --> 49:19.640]  Okay.
[49:19.640 --> 49:22.560]  I think now go back to center stage.
[49:22.560 --> 49:26.800]  I do not know how, there are no requests, so I can't do requests.
[49:26.800 --> 49:29.320]  Are there any other questions from anyone?
[49:29.320 --> 49:30.920]  Okay.
[49:30.920 --> 49:34.560]  As the mod team are not posting, I'm going to look in the chat.
[49:34.560 --> 49:42.400]  When will Stability and Eleuther be able to translate geese to speech in real time?
[49:42.400 --> 49:46.280]  I think the kind of honking models are very complicated.
[49:46.280 --> 49:49.200]  Actually, this is actually very interesting.
[49:49.200 --> 49:53.640]  People have actually been using diffusion models to translate animal speech and understand
[49:53.640 --> 49:54.640]  it.
[49:54.640 --> 49:58.040]  If you look at something like Whisper, it might actually be in reach.
[49:58.040 --> 50:02.360]  Whisper by OpenAI, they open sourced it kindly, I wonder what caused them to do that, is a
[50:02.360 --> 50:05.240]  fantastic speech to text model.
[50:05.240 --> 50:07.920]  One of the interesting things about it is you can change the language you're speaking
[50:07.920 --> 50:10.700]  in the middle of a sentence and it'll still pick that up.
[50:10.700 --> 50:14.020]  So if you train it enough, then you'll be able to kind of do that.
[50:14.020 --> 50:17.880]  So one of the entities we're talking with wants to train based on whale song to understand
[50:17.880 --> 50:18.880]  whales.
[50:18.880 --> 50:21.720]  Now this sounds a bit like Star Trek, but that's okay, I like Star Trek.
[50:21.720 --> 50:25.800]  So we'll see how that goes.
[50:25.800 --> 50:29.400]  Will DreamStudio front-end be open source so it can be used on local GPUs?
[50:29.400 --> 50:32.360]  I do not believe there's any plans for that at the moment because DreamStudio is kind
[50:32.360 --> 50:36.700]  of our prosumer-end kind of thing, but you'll see more and more local GPU usage.
[50:36.700 --> 50:40.960]  So like, you know, you've got Visions of Chaos at the moment on Windows machines by Softology
[50:40.960 --> 50:46.160]  is fantastic, where you can run just about any of these notebooks like Deforum and others
[50:46.160 --> 50:49.360]  or hlky or whatever.
[50:49.360 --> 50:51.280]  And so I think that's kind of a good step.
[50:51.280 --> 50:55.280]  Similarly, if you look at the work being done on the Photoshop plugin, it will have local
[50:55.280 --> 50:57.560]  inference in a week or two.
[50:57.560 --> 51:01.720]  So you can use that directly from Photoshop and soon many other plugins.
[51:01.720 --> 51:07.400]  All right, Aldana says, what do you think of the situation where a Google engineer believed
[51:07.400 --> 51:09.240]  the AI chatbot achieved sentience?
[51:09.240 --> 51:10.240]  It did not.
[51:10.240 --> 51:11.240]  He was stupid.
[51:11.240 --> 51:17.120]  Um, unless you have a very low bar of sentience, I suppose you could, I mean, some people are barely
[51:17.120 --> 51:18.120]  sentient.
[51:18.120 --> 51:21.640]  It must be said, especially when they're arguing on the internet, never win an argument on
[51:21.640 --> 51:22.640]  the internet.
[51:22.640 --> 51:26.520]  That's another thing like facts don't really work on the internet.
[51:26.520 --> 51:28.120]  A lot of people have preconceived notions.
[51:28.120 --> 51:33.200]  Instead, you should try to just be like, you know, as open minded as possible and let people
[51:33.200 --> 51:34.200]  agree to disagree.
[51:34.200 --> 51:35.200]  All right.
[51:35.200 --> 51:40.700]  Andy Cochran says, thoughts on getting seamless equirectangular 360 degree and 180 degree
[51:40.700 --> 51:46.920]  and HDR outputs in one shot for image to text and text to image.
[51:46.920 --> 51:51.580]  I mean, you could use things like, I think I called it stream fusion, which was DreamFusion
[51:51.580 --> 51:54.540]  and Stable Diffusion kind of combined.
[51:54.540 --> 51:59.080]  There are a bunch of data sets that we're working on to enable this kind of thing, especially
[51:59.080 --> 52:00.080]  from GoPro and others.
[52:00.080 --> 52:04.680]  Um, but I think it'd probably be a year or two away still.
[52:04.680 --> 52:08.080]  Funky McShot, Emad, are there any plans for text-to-3D diffusion models?
[52:08.080 --> 52:09.080]  Yes, there are.
[52:09.080 --> 52:10.960]  And they are in the works.
[52:10.960 --> 52:13.920]  Malcontender with some of the recent backlash from artists.
[52:13.920 --> 52:17.200]  Is there anything you wish that SD did differently in the earliest stages that would have changed
[52:17.200 --> 52:19.080]  the framing around image synthesis?
[52:19.080 --> 52:20.680]  No, really.
[52:20.680 --> 52:24.680]  I mean, like the point is that these things can be fine tuned anyway.
[52:24.680 --> 52:26.920]  So I think people have attacked fine tuning.
[52:26.920 --> 52:33.840]  Um, I mean, ultimately it's like, I understand the fear, this is threatening to their jobs
[52:33.840 --> 52:38.440]  and the thing cause anyone can kind of do it, but it's not like ethically correct for
[52:38.440 --> 52:40.800]  them to say, actually, we don't want everyone to be artists.
[52:40.800 --> 52:45.560]  So instead they focus on, it's taken my art and trained on my art and you know, it's impossible
[52:45.560 --> 52:47.720]  for this to work without my art.
[52:47.720 --> 52:48.720]  Not really.
[52:48.720 --> 52:51.480]  So you train on ImageNet and it can still create just about any composition.
[52:51.480 --> 52:55.680]  Um, again, part of the problem was having the CLIP model embedded in there because the
[52:55.680 --> 52:57.080]  CLIP model knows a lot of stuff.
[52:57.080 --> 53:03.000]  We don't know what's in the OpenAI dataset, um, as should we do kind of, and it's interesting.
[53:03.000 --> 53:07.600]  Um, I think that all we can do is kind of learn from the feedback from the people that
[53:07.600 --> 53:13.400]  aren't shouting at us or like, uh, you know, members of the team have received death threats
[53:13.400 --> 53:15.680]  and other things which are completely over the line.
[53:15.680 --> 53:21.160]  Um, this is again, a reason why I think caution is the better part of what we're doing right
[53:21.160 --> 53:22.160]  now.
[53:22.160 --> 53:25.520]  Um, like, you know, we have put ourselves in harm's way, like my inbox does look a bit
[53:25.520 --> 53:30.560]  ugly, uh, in certain places, um, to try and calm things down and really listen to the
[53:30.560 --> 53:35.360]  calmer voices there and try and build systems so people can be represented appropriately.
[53:35.360 --> 53:36.360]  It's not an easy question.
[53:36.360 --> 53:42.640]  Um, but again, like I think it's incumbent on us to try and help facilitate this conversation
[53:42.640 --> 53:45.920]  because it's an important question.
[53:45.920 --> 53:50.480]  Um, all right.
[53:50.480 --> 53:51.760]  See what's next.
[53:51.760 --> 53:55.560]  How does are you looking to decentralize GPU AI compute?
[53:55.560 --> 54:01.980]  Uh, yeah, we've got kind of models that enable that, um, Hivemind that you'll see, um,
[54:01.980 --> 54:07.600]  on the decentralized learning side as an example whereby I'm trained on distributed GPUs, um,
[54:07.600 --> 54:08.600]  actually models.
[54:08.600 --> 54:13.720]  I think that we need the best version of that is on reinforcement learning models.
[54:13.720 --> 54:30.320]  I think those are deep learning models, especially when considering things like, uh, community
[54:30.320 --> 54:36.560]  models, et cetera, because as those proliferate and create their own custom models bind to
[54:36.560 --> 54:40.240]  your DreamBooth or others, there's no way that centralized systems can keep up.
[54:40.240 --> 54:43.680]  But I think decentralized compute is pretty cheap though.
[54:43.680 --> 54:45.520]  All right.
[54:45.520 --> 54:54.640]  Um, so, uh, oops, did I kind of disappear there for a second?
[54:54.640 --> 54:55.640]  Testing, testing.
[54:55.640 --> 54:56.640]  All right.
[54:56.640 --> 54:57.640]  I'm back.
[54:57.640 --> 54:59.640]  Can you hear me?
[54:59.640 --> 55:00.640]  All right.
[55:00.640 --> 55:01.640]  Sorry.
[55:01.640 --> 55:09.160]  Okay, um, are we going to do NeRF-type models?
[55:09.160 --> 55:10.160]  Yes.
[55:10.160 --> 55:12.820]  Um, I think NeRFs are going to be the big thing.
[55:12.820 --> 55:18.120]  They are, um, going to be supported by Apple and Apple hardware.
[55:18.120 --> 55:20.960]  So I think you'll see lots of NeRF-type models there.
[55:20.960 --> 55:21.960]  Oops, sorry.
[55:21.960 --> 55:23.960]  I need my laptop now.
[55:23.960 --> 55:27.120]  Do you guys hate it when there's like a lack of battery?
[55:27.120 --> 55:31.560]  I think it's so small, but I can't remember if it was a TV show or if it was in real life.
[55:31.560 --> 55:36.520]  But there was like this app called, um, like I'm dying or something like that, that you
[55:36.520 --> 55:41.600]  could only use to message people when your battery life was like below 5% or something
[55:41.600 --> 55:42.600]  like that.
[55:42.600 --> 55:47.120]  I think that's a great idea if it doesn't exist for someone to create an actual life,
[55:47.120 --> 55:53.040]  like, you know, feeling a solidarity for that tension that occurs, you know, I think makes
[55:53.040 --> 55:55.920]  you realize the fragility of the human condition.
[55:55.920 --> 55:56.920]  All right.
[55:56.920 --> 56:02.320]  Um, wait, sorry, I meant to be doing center stage.
[56:02.320 --> 56:05.920]  Well, there's nobody who can help me.
[56:05.920 --> 56:09.240]  Can't figure out how to get loud people up on the stage.
[56:09.240 --> 56:15.280]  So back to the questions, will AI lead to UBI, Casey Edwin, maybe it'll either lead
[56:15.280 --> 56:20.760]  to UBI and utopia or panopticon that we can never escape from because the models that
[56:20.760 --> 56:28.120]  were previously used to focus our attention and service ads will be used to control our
[56:28.120 --> 56:29.120]  brains instead.
[56:29.120 --> 56:30.920]  And they're really good at that.
[56:30.920 --> 56:35.960]  So, you know, no big deal, just two forks in the road.
[56:35.960 --> 56:39.960]  That's the way we kind of do.
[56:39.960 --> 56:43.240]  Um, let's see.
[56:43.240 --> 56:44.240]  Who's next?
[56:44.240 --> 56:47.280]  Joe Rogan, when will we be able to generate games with AI?
[56:47.280 --> 56:50.160]  You can already generate games with AI.
[56:50.160 --> 56:54.280]  So the code models allow you to create basic games, but then we've had generative games
[56:54.280 --> 56:55.640]  for many years already.
[56:55.640 --> 57:02.320]  Um, so I'm just trying to figure out how to get people on stage or do this.
[57:02.320 --> 57:04.600]  Maybe we don't.
[57:04.600 --> 57:05.600]  Okay.
[57:05.600 --> 57:11.160]  Um, Mars says, how's your faith influence your mission?
[57:11.160 --> 57:15.680]  I mean, it's just like all faiths are the same.
[57:15.680 --> 57:17.880]  Do unto others as you'd have done unto yourself, right?
[57:17.880 --> 57:20.800]  The golden rule, um, for all the stuff around there.
[57:20.800 --> 57:24.840]  I think people forget that we are just trying to do our best.
[57:24.840 --> 57:26.840]  Like it can lead to bad things though.
[57:26.840 --> 57:32.880]  So the Chief Rabbi, Jonathan Sacks, sadly passed, very smart guy, had this concept of altruistic
[57:32.880 --> 57:36.820]  evil, where people who try to do good can do the worst evil because they believe they're
[57:36.820 --> 57:37.820]  doing good.
[57:37.820 --> 57:41.120]  No one wants to be an arsehole and bad, even if we have our arguments and it makes us forget
[57:41.120 --> 57:42.120]  our humanity.
[57:42.120 --> 57:47.080]  So I think again, like what I really want to focus on is this idea of public interest
[57:47.080 --> 57:51.440]  and bring this technology to the masses because I don't want to have this world where I looked
[57:51.440 --> 57:56.520]  at the future and there's this AI God that is controlled by a private enterprise.
[57:56.520 --> 58:02.600]  Like that enterprise would be more powerful than any nation unelected and in control of
[58:02.600 --> 58:03.600]  everything.
[58:03.600 --> 58:05.560]  And that's not a future that I want for my children.
[58:05.560 --> 58:10.340]  I think, um, because again, I would not want that done unto me and I think it should be
[58:10.340 --> 58:13.760]  made available for people who have different viewpoints to me as well.
[58:13.760 --> 58:17.000]  This is why, like I said, look, I know that there was a lot of tension over the weekend
[58:17.000 --> 58:21.160]  and everything on the community, but we really shouldn't be the only community for this.
[58:21.160 --> 58:24.280]  And we don't want to be the sole arbiter of everything here.
[58:24.280 --> 58:27.800]  We're not open AI or deep mind or anyone like that.
[58:27.800 --> 58:31.840]  We're really trying to just be the catalyst to build ecosystems where you can find your
[58:31.840 --> 58:35.280]  own place, whether you agree with us or disagree with us.
[58:35.280 --> 58:40.840]  Um, having said that, I mean like the Stable Diffusion hashtag has been taken over by Waifu
[58:40.840 --> 58:44.000]  Diffusion, like big boobs.
[58:44.000 --> 58:45.000]  It's fine.
[58:45.000 --> 58:48.080]  Maybe just stick to the Waifu Diffusion tag, cause it's harder for me to find the
[58:48.080 --> 58:50.680]  stable diffusion pictures in my own media now.
[58:50.680 --> 58:55.640]  Um, so yeah, I think that also it'd be nice when people of other faiths or no faith can
[58:55.640 --> 58:57.000]  actually talk together reasonably.
[58:57.000 --> 59:00.440]  Um, and that's one of the reasons that we accelerated aiandfaith.org.
[59:00.440 --> 59:03.520]  Again, you don't have to agree with it, but just realize these are some of the stories
[59:03.520 --> 59:08.440]  that people subscribe to and everyone's got their own faith in something or other, literally
[59:08.440 --> 59:09.440]  not.
[59:09.440 --> 59:12.840]  Well, if he says, how are training speed and cost on TPUs versus A100s,
[59:12.840 --> 59:17.600]  or is the cost of switching to TensorFlow from PyTorch too great? We have code that works on both.
[59:17.600 --> 59:22.600]  And we have had great results on TPU v4s, the horizontal and vertical scaling works
[59:22.600 --> 59:23.600]  really nicely.
[59:23.600 --> 59:25.920]  And gosh, there is something called a v5 coming soon.
[59:25.920 --> 59:27.480]  That'd be interesting.
[59:27.480 --> 59:31.600]  Um, you will see models trained across a variety of different architectures and we're trying
[59:31.600 --> 59:33.600]  just about all the top ones there.
[59:33.600 --> 59:38.240]  Uh, Glincey says, does Stability AI have plans to take on investors at any point or have
[59:38.240 --> 59:39.240]  they already?
[59:39.240 --> 59:40.240]  We have taken on investors.
[59:40.240 --> 59:42.000]  There will be an announcement on that.
[59:42.000 --> 59:45.480]  We have given up zero control and we will not give up any control.
[59:45.480 --> 59:47.200]  I am very good at this.
[59:47.200 --> 59:53.240]  Um, as I mentioned previously, the original stable diffusion model was financed by some
[59:53.240 --> 59:56.280]  of the leading AI artists in the world and collectors.
[59:56.280 --> 59:58.320]  And so, you know, we've been kind of community focused.
[59:58.320 --> 01:00:03.360]  I wish that we could do a token sale or an IPO or something and be community focused,
[01:00:03.360 --> 01:00:05.080]  but it just doesn't fit with regulations right now.
[01:00:05.080 --> 01:00:09.080]  So the only thing that I can say is that we are and will always be independent.
[01:00:09.080 --> 01:00:14.960]  Uh, no one's going to tell us what to do because otherwise we can't pivot to waifus if it turns
[01:00:14.960 --> 01:00:17.520]  out that waifu diffusion is the next big thing.
[01:00:17.520 --> 01:00:18.520]  All right.
[01:00:18.520 --> 01:00:20.600]  Um, who have we got now?
[01:00:20.600 --> 01:00:24.680]  We've got Notepad.
[01:00:24.680 --> 01:00:28.360]  How much of an impact do you think AI will impact neural implant cybernetics?
[01:00:28.360 --> 01:00:34.000]  It appears one of the limiting factors of cybernetics is the input method, not necessarily the hardware.
[01:00:34.000 --> 01:00:35.000]  I don't know.
[01:00:35.000 --> 01:00:39.200]  I guess you have no idea too much, I never thought about that.
[01:00:39.200 --> 01:00:44.560]  Um, yeah, like I think that it's probably required for the interface layer.
[01:00:44.560 --> 01:00:48.400]  The way that you should look at this technology is that you've got the highly structured and
[01:00:48.400 --> 01:00:50.480]  the unstructured world, right?
[01:00:50.480 --> 01:00:52.400]  And this acts as a bridge between it.
[01:00:52.400 --> 01:00:57.720]  So like with stable diffusion, you can communicate in images that you couldn't do otherwise.
[01:00:57.720 --> 01:01:01.640]  Cybernetics is about the kind of interface layer between humans and computers.
[01:01:01.640 --> 01:01:05.160]  And again, you're removing that in one direction and the cybernetics allow you to remove it
[01:01:05.160 --> 01:01:06.160]  in the other direction.
[01:01:06.160 --> 01:01:08.360]  So you're going to have much better information flow.
[01:01:08.360 --> 01:01:11.400]  So I think it will have a massive impact from these foundation devices.
[01:01:11.400 --> 01:01:13.240]  All right.
[01:01:13.240 --> 01:01:18.560]  Um, over my, can AI make Cyberpunk 2077 not broken? No.
[01:01:18.560 --> 01:01:24.320]  I was the largest investor in CD Projekt at one point and it is a crying shame what happened
[01:01:24.320 --> 01:01:25.320]  there.
[01:01:25.320 --> 01:01:28.440]  Uh, I have a lot of viewpoints on that one.
[01:01:28.440 --> 01:01:33.200]  Um, but you know, we can create like cyberpunk worlds of our own in what did I say?
[01:01:33.200 --> 01:01:34.200]  Five years.
[01:01:34.200 --> 01:01:35.200]  Yeah.
[01:01:35.200 --> 01:01:36.200]  Not Elon Musk in there.
[01:01:36.200 --> 01:01:38.800]  So that's going to be pretty exciting.
[01:01:38.800 --> 01:01:43.000]  Um, do what is next?
[01:01:43.000 --> 01:01:48.080]  Uh, are you guys planning on creating any hardware devices?
[01:01:48.080 --> 01:01:51.120]  A consumer-oriented one, which has AI as OS.
[01:01:51.120 --> 01:01:55.880]  Uh, we have been looking into customized ones.
[01:01:55.880 --> 01:02:01.680]  Um, so some of the kind of edge architecture, but it won't be for a few years on the AI
[01:02:01.680 --> 01:02:02.680]  side.
[01:02:02.680 --> 01:02:05.120]  Actually, that will be, it'll probably be towards the next year because we've got that
[01:02:05.120 --> 01:02:06.520]  on our tablets.
[01:02:06.520 --> 01:02:10.640]  So we've got basically a fully integrated stack or tablets for education, healthcare,
[01:02:10.640 --> 01:02:11.640]  and others.
[01:02:11.640 --> 01:02:13.960]  And again, we were trying to open source as much as possible.
[01:02:13.960 --> 01:02:19.360]  So looking to RISC-V and alternative architectures there, um, probably announcement there in
[01:02:19.360 --> 01:02:25.840]  Q1, I think, um, anything specific you'd like to see out of the community, Emad?
[01:02:25.840 --> 01:02:28.960]  I just like people to be nice to each other, right?
[01:02:28.960 --> 01:02:31.440]  Like communities are hard.
[01:02:31.440 --> 01:02:32.840]  It's hard to scale community.
[01:02:32.840 --> 01:02:38.320]  Like humans are designed for one to 150 and what happens is that as we scale communities
[01:02:38.320 --> 01:02:45.080]  bigger than that, this dark monster of our being, Moloch, kind of comes out.
[01:02:45.080 --> 01:02:49.880]  People get like really angsty and there's always going to be education, there's always
[01:02:49.880 --> 01:02:50.880]  going to be drama.
[01:02:50.880 --> 01:02:54.120]  How many communities do you know that aren't drama and like, just consider what your aunts
[01:02:54.120 --> 01:02:55.980]  do and they chat all the time.
[01:02:55.980 --> 01:02:56.980]  It's all kind of drama.
[01:02:56.980 --> 01:03:02.640]  Um, I like to focus on being positive and constructive as much as possible and acknowledging
[01:03:02.640 --> 01:03:04.200]  that everyone is bored humans.
[01:03:04.200 --> 01:03:06.640]  But again, sometimes you make tough decisions.
[01:03:06.640 --> 01:03:08.200]  I made a tough decision this weekend.
[01:03:08.200 --> 01:03:09.200]  It might be right.
[01:03:09.200 --> 01:03:13.620]  It might be wrong, but you know, it's what I thought was best for the community.
[01:03:13.620 --> 01:03:17.080]  We wanted to have checks and balances and things, but it's a work in progress.
[01:03:17.080 --> 01:03:23.200]  Like I don't know how many people we've got in the community right now, um, like 60,000
[01:03:23.200 --> 01:03:24.200]  or something like that.
[01:03:24.200 --> 01:03:32.480]  Um, that's a lot of people and you know, I think it's, um, 78,000, that's a lot of fricking
[01:03:32.480 --> 01:03:33.480]  people.
[01:03:33.480 --> 01:03:38.560]  That's like a small town in the U S or like a city in Finland or something like that.
[01:03:38.560 --> 01:03:39.560]  Right.
[01:03:39.560 --> 01:03:44.920]  Um, so yeah, I just like people to be excellent to each other and Mr. M says, how are you
[01:03:44.920 --> 01:03:45.920]  Emad?
[01:03:45.920 --> 01:03:47.080]  I'm a bit tired.
[01:03:47.080 --> 01:03:52.000]  Back in London for the first time in a long time, I was traveling, trying to get the education
[01:03:52.000 --> 01:03:53.000]  thing set up.
[01:03:53.000 --> 01:03:54.800]  There's a stability Africa set up as well.
[01:03:54.800 --> 01:03:59.080]  Um, there's some work that we're doing in Lebanon, which unfortunately is really bad.
[01:03:59.080 --> 01:04:03.560]  Um, I said stability does a lot more than image and it's just been a bit of a stretch
[01:04:03.560 --> 01:04:05.640]  even now with a hundred people.
[01:04:05.640 --> 01:04:08.580]  But the reason that we're doing everything so aggressively is cause you kind of have
[01:04:08.580 --> 01:04:13.480]  to, um, because there's just a lot of unfortunateness in the world.
[01:04:13.480 --> 01:04:17.080]  And I think you'd feel worse about yourself if you don't have to.
[01:04:17.080 --> 01:04:22.760]  And there's an interesting piece I read recently, um, it's like, Sam Bankman-Fried, uh, FTX,
[01:04:22.760 --> 01:04:24.560]  you know, he's got this thing about effective altruism.
[01:04:24.560 --> 01:04:26.940]  He talks about this thing of expected utility.
[01:04:26.940 --> 01:04:28.440]  How much impact can you make on the world?
[01:04:28.440 --> 01:04:29.600]  And you have to make big bets.
[01:04:29.600 --> 01:04:31.000]  So I made some really big bets.
[01:04:31.000 --> 01:04:33.640]  I put all my money into fricking GPUs.
[01:04:33.640 --> 01:04:35.800]  I really created together a team.
[01:04:35.800 --> 01:04:41.160]  I got government international backing and a lot of stuff because I think you, everyone
[01:04:41.160 --> 01:04:45.120]  has agency and you have to figure out where you can add the most agency and accelerate
[01:04:45.120 --> 01:04:46.120]  things up there.
[01:04:46.120 --> 01:04:50.480]  Uh, we have to bring in the best systems and we've built this multivariate system with
[01:04:50.480 --> 01:04:55.460]  multiple communities and now we're doing joint ventures in every single country because we
[01:04:55.460 --> 01:04:57.240]  think that is a whole new world.
[01:04:57.240 --> 01:05:01.880]  Again, like there's another great piece Sequoia did recently about generative AI being a whole
[01:05:01.880 --> 01:05:03.360]  new world that will create trillions.
[01:05:03.360 --> 01:05:06.300]  We're at this tipping point right now.
[01:05:06.300 --> 01:05:09.280]  And so I think unfortunately you've got to work hard to do that because it's a once in
[01:05:09.280 --> 01:05:10.280]  a lifetime opportunity.
[01:05:10.280 --> 01:05:14.440]  Just like everyone in this community here has a once in a lifetime opportunity.
[01:05:14.440 --> 01:05:18.800]  You know about this technology that how many people in your community know about now?
[01:05:18.800 --> 01:05:22.680]  Everyone in the world, everyone that you know will be using this in a few years and no one
[01:05:22.680 --> 01:05:28.000]  knows the way it's going to go.
[01:05:28.000 --> 01:05:32.880]  Forced to feel and communities, what's a good way to handle possible tribalism, extremism?
[01:05:32.880 --> 01:05:38.480]  So if you Google me, my name, you'll see me writing in the Wall Street Journal
[01:05:38.480 --> 01:05:41.120]  and Reuters and all sorts of places about counter extremism.
[01:05:41.120 --> 01:05:45.800]  It's one of my expert topics and unfortunately it's difficult with the social media echo
[01:05:45.800 --> 01:05:50.720]  chambers to kind of get out of that and you find people going in loops because sometimes
[01:05:50.720 --> 01:05:51.720]  things aren't fair.
[01:05:51.720 --> 01:05:54.240]  Like, you know, again, let's take our community.
[01:05:54.240 --> 01:05:57.800]  For example, this weekend actions were taken, you know, the banning, that people considered
[01:05:57.800 --> 01:05:58.800]  unfair.
[01:05:58.800 --> 01:06:04.720]  And again, that's understandable because it's not a cut and dry, easy decision.
[01:06:04.720 --> 01:06:06.860]  You had kind of the discussions going on loop.
[01:06:06.860 --> 01:06:10.060]  You had people saying some really unpleasant things, you know, some of the stuff made me
[01:06:10.060 --> 01:06:13.980]  kind of sad because I was exhausted and you know, people questioning my motivations and
[01:06:13.980 --> 01:06:14.980]  things like that.
[01:06:14.980 --> 01:06:20.680]  And again, it's your prerogative, but as a community member myself, it made me feel bad.
[01:06:20.680 --> 01:06:23.600]  I think the only way that you can really fight extremism and some things like that is to
[01:06:23.600 --> 01:06:26.080]  have checks and balances and processes in place.
[01:06:26.080 --> 01:06:27.760]  The mod team have been working super hard on that.
[01:06:27.760 --> 01:06:32.840]  I think this community has been really well behaved, like, you know, it was super difficult
[01:06:32.840 --> 01:06:36.780]  and some of the community members got really burned out during the beta because they had
[01:06:36.780 --> 01:06:39.280]  to put up with a lot of shit, to put it quite simply.
[01:06:39.280 --> 01:06:44.000]  But getting people on the same page, getting a common mission and kind of having a degree
[01:06:44.000 --> 01:06:47.880]  of psychological safety where people can say what they want, which is really difficult
[01:06:47.880 --> 01:06:50.080]  in a community where you don't know where everyone is.
[01:06:50.080 --> 01:06:53.040]  That's the only way that you can get around some of this extremism and some of this hate
[01:06:53.040 --> 01:06:54.040]  element.
[01:06:54.040 --> 01:06:55.520]  Again, I think the common mission is the main thing.
[01:06:55.520 --> 01:06:59.560]  I think everyone here is in a common mission to build cool shit, create cool shit.
[01:06:59.560 --> 01:07:05.440]  And you know, like I said, the tagline kind of create, don't hate, right?
[01:07:05.440 --> 01:07:08.120]  People said, Emad, in-real-life meetup for us members?
[01:07:08.120 --> 01:07:12.640]  Yeah, we're going to have little stability societies all over the place and hackathons.
[01:07:12.640 --> 01:07:15.880]  We're just putting an events team together to really make sure they're well organized
[01:07:15.880 --> 01:07:17.800]  and not our usual disorganized shambles.
[01:07:17.800 --> 01:07:23.180]  But you know, feel free to do it yourselves, you know, like, we're happy to amplify it
[01:07:23.180 --> 01:07:25.680]  when community members take that forward.
[01:07:25.680 --> 01:07:28.840]  And the things we're trying to encourage are going to be like artistic oriented things,
[01:07:28.840 --> 01:07:32.240]  get into the real world, go and see galleries, go and understand things, go and paint, that's
[01:07:32.240 --> 01:07:34.040]  good painting lessons, etc.
[01:07:34.040 --> 01:07:41.320]  As well as hackathons and all this more techy stuff, techy kind of stuff.
[01:07:41.320 --> 01:07:44.640]  You can be part of the events team by messaging careers at stability.ai.
[01:07:44.640 --> 01:07:48.640]  Again, we will have a careers page up soon with all the roles, we'll probably go to like
[01:07:48.640 --> 01:07:52.280]  250 people in the next few months.
[01:07:52.280 --> 01:07:57.080]  And yeah, it's going very fast.
[01:07:57.080 --> 01:07:58.960]  Protrins says, any collaboration in China yet?
[01:07:58.960 --> 01:08:02.500]  Can we use Chinese CLIP to guide the current one or do we need to retrain the model, embed
[01:08:02.500 --> 01:08:04.440]  the language CLIP into the model?
[01:08:04.440 --> 01:08:09.000]  I think you'll see a Chinese variant of Stable Diffusion coming out very soon.
[01:08:09.000 --> 01:08:11.180]  Can't remember what the current status is.
[01:08:11.180 --> 01:08:15.200]  We do have a lot of plans in China, we're talking to some of the coolest entities there.
[01:08:15.200 --> 01:08:20.880]  As you know, it's difficult due to sanctions and the Chinese market, but it's been heartening
[01:08:20.880 --> 01:08:23.720]  to see the community expand in China so quickly.
[01:08:23.720 --> 01:08:32.560]  And again, as it's open source, it didn't need us to go in there to kind of do that.
[01:08:32.560 --> 01:08:37.400]  I'd say that on the community side, we're going to try and accelerate a lot of the engagement
[01:08:37.400 --> 01:08:38.400]  things.
[01:08:38.400 --> 01:08:44.760]  I think that the Doctor Fusion one's ongoing, you know, shout out to Dreitweik for Nerf
[01:08:44.760 --> 01:08:49.720]  Gun and Almost 80 for kind of the really amazing kind of output there.
[01:08:49.720 --> 01:08:54.120]  I don't think we do enough to appreciate the things that you guys post up and amplify
[01:08:54.120 --> 01:08:55.120]  them.
[01:08:55.120 --> 01:08:56.440]  And I really hope we can do better in future.
[01:08:56.440 --> 01:08:59.580]  The mod team are doing as much as they can right now.
[01:08:59.580 --> 01:09:05.120]  And again, we will try to amplify the voices of the artistic members of our community as
[01:09:05.120 --> 01:09:12.680]  well, more and more, and give support through grants, credits, events and other things as
[01:09:12.680 --> 01:09:15.680]  we go forward.
[01:09:15.680 --> 01:09:20.040]  All right, who's next?
[01:09:20.040 --> 01:09:21.040]  We've got Almark.
[01:09:21.040 --> 01:09:24.920]  Is there going to be a time when we have AI friends we create ourselves, personal companions
[01:09:24.920 --> 01:09:29.200]  speaking to us via our monitor, much of the same way a webcam call is done, high quality,
[01:09:29.200 --> 01:09:30.200]  et cetera?
[01:09:30.200 --> 01:09:34.680]  Yes, you will have her from Joaquin Phoenix's movie, Her, with Scarlett Johansson.
[01:09:34.680 --> 01:09:35.680]  Whispering in your ear.
[01:09:35.680 --> 01:09:40.680]  Hopefully she won't dump you at the end, but you can't guarantee that.
[01:09:40.680 --> 01:09:48.160]  If you look at some of the text to speech being emotionally resonant, then, you know,
[01:09:48.160 --> 01:09:50.420]  it's kind of creepy, but it's very immersive.
[01:09:50.420 --> 01:09:52.800]  So I think voice will definitely be there first.
[01:09:52.800 --> 01:09:56.160]  Again, try talking to a character.AI model and you'll see how good some of these chat
[01:09:56.160 --> 01:09:57.160]  bots can be.
[01:09:57.160 --> 01:09:59.040]  There are much better ones coming.
[01:09:59.040 --> 01:10:06.440]  We've seen this already with Xiaoice in China, so Alice, which a lot of people use for mental
[01:10:06.440 --> 01:10:09.480]  health support and then Elisa in Iran.
[01:10:09.480 --> 01:10:12.600]  So millions of people use these right now as their friends.
[01:10:12.600 --> 01:10:15.080]  Again, it's good to have friends.
[01:10:15.080 --> 01:10:20.080]  Again, we recommend 7Cups.com if you want to have someone to talk to, but it's not the
[01:10:20.080 --> 01:10:24.440]  same person each time or, you know, like just going out and making friends, but it's not
[01:10:24.440 --> 01:10:25.440]  easy.
[01:10:25.440 --> 01:10:28.960]  I think this will help a lot of people with their mental health, et cetera.
[01:10:28.960 --> 01:10:32.280]  He basically says, how early do you think we are in this AI wave that's emerging?
[01:10:32.280 --> 01:10:33.480]  How fast it's changing?
[01:10:33.480 --> 01:10:35.360]  Sometimes it's hard to feel FOMO.
[01:10:35.360 --> 01:10:38.440]  It is actually literally exponential.
[01:10:38.440 --> 01:10:45.660]  So like when you do a log normal return of the number of AI papers that are coming out,
[01:10:45.660 --> 01:10:47.240]  it's a straight line.
[01:10:47.240 --> 01:10:50.040]  So it's literally an exponential kind of curve.
[01:10:50.040 --> 01:10:51.780]  Like I can't keep up with it.
[01:10:51.780 --> 01:10:53.040]  No one can keep up with it.
[01:10:53.040 --> 01:10:54.440]  We have no idea what's going on.
[01:10:54.440 --> 01:10:58.760]  And the technology advances like there's that meme.
[01:10:58.760 --> 01:11:02.680]  Like one hour here is seven years on earth.
[01:11:02.680 --> 01:11:07.480]  Like from Interstellar, that's how life kind of feels. Like I was on top of it for a few
[01:11:07.480 --> 01:11:11.400]  years and now it's like, I don't even know what's happening.
[01:11:11.400 --> 01:11:12.800]  Here we go.
[01:11:12.800 --> 01:11:17.920]  It's a doubling rate of 24 months.
[01:11:17.920 --> 01:11:20.140]  It's a bit insane.
[01:11:20.140 --> 01:11:21.140]  So yeah.
[01:11:21.140 --> 01:11:22.960]  As wonky says, any comments on Harmonai?
[01:11:22.960 --> 01:11:26.000]  How close do you think we are to having music sound AI with the same accessibility afforded
[01:11:26.000 --> 01:11:27.360]  by stable diffusion?
[01:11:27.360 --> 01:11:31.520]  Now, Harmonai has done a slightly different model of releasing Dance Diffusion gradually.
[01:11:31.520 --> 01:11:37.680]  We're putting it out there as we license more and more data sets, some of the ONNX and
[01:11:37.680 --> 01:11:39.280]  other work that's going on.
[01:11:39.280 --> 01:11:43.840]  I mean, basically considering that you're at the VQGAN moment right now, if you guys
[01:11:43.840 --> 01:11:50.080]  can remember that from all of a year ago or 18 months ago, it'll go exponential again
[01:11:50.080 --> 01:11:55.720]  because the amount of stuff here is going to go crazy.
[01:11:55.720 --> 01:12:00.480]  Like generative AI, look at that Sequoia link I posted, is going to be the biggest investment
[01:12:00.480 --> 01:12:05.000]  theme of the next few years and literally tens of billions of dollars are going to be
[01:12:05.000 --> 01:12:09.100]  deployed like probably next year alone into this sector.
[01:12:09.100 --> 01:12:13.360]  And most of it will go to stupid stuff, some will go to good stuff, most will go to stupid
[01:12:13.360 --> 01:12:17.960]  stuff but a decent amount will go to forwarding music in particular because the interesting
[01:12:17.960 --> 01:12:22.760]  thing about musicians is that they're already digitally intermediated versus artists who
[01:12:22.760 --> 01:12:23.760]  are not.
[01:12:23.760 --> 01:12:27.480]  So artists, some of them use Procreate and Photoshop, a lot of them don't.
[01:12:27.480 --> 01:12:32.440]  But musicians use synthesizers and DSPs and software all the time.
[01:12:32.440 --> 01:12:34.960]  So it's a lot easier to introduce some of these things to their workflow and then make
[01:12:34.960 --> 01:12:37.040]  it accessible to the people.
[01:12:37.040 --> 01:12:40.200]  Yeah, musicians just want more snares.
[01:12:40.200 --> 01:12:41.560]  You see the drum and bass guy there.
[01:12:41.560 --> 01:12:45.920]  Safety mark, when do we launch the full DreamStudio and will it be able to do animations?
[01:12:45.920 --> 01:12:49.380]  If so, do you think it'll be more cost effective than using Colab?
[01:12:49.380 --> 01:12:53.640]  Very soon, yes, and yes, there we go.
[01:12:53.640 --> 01:12:55.680]  Keep an eye here.
[01:12:55.680 --> 01:13:01.480]  Then the next announcements won't be hopefully quite so controversial, but instead very exciting,
[01:13:01.480 --> 01:13:04.480]  shall we say.
[01:13:04.480 --> 01:13:09.240]  I'm running out of energy.
[01:13:09.240 --> 01:13:12.240]  So I think we're gonna take three more questions and then I'm going to be done.
[01:13:12.240 --> 01:13:14.520]  And then I'm going to go and have a nap.
[01:13:14.520 --> 01:13:18.900]  Do you think an AI therapist could be something to address the lack of access to qualified
[01:13:18.900 --> 01:13:21.600]  mental health experts, Racer X?
[01:13:21.600 --> 01:13:25.880]  I would rather have volunteers augmented by that.
[01:13:25.880 --> 01:13:31.160]  So again, with 7Cups.com, we have 480,000 volunteers helping 78 million people each
[01:13:31.160 --> 01:13:35.960]  month, trained on active listening, that hopefully we'll augment by AI as we help them build their
[01:13:35.960 --> 01:13:36.960]  models.
[01:13:36.960 --> 01:13:45.040]  AI can only go so far, but the edge cases and the failure cases I think are too strong.
[01:13:45.040 --> 01:13:47.400]  And I think again, a lot of care needs to be taken around that because people's mental
[01:13:47.400 --> 01:13:48.400]  health is super important.
[01:13:48.400 --> 01:13:55.920]  At the same time, we're trialing art therapy with stable diffusion as a mental health adjunct
[01:13:55.920 --> 01:14:02.280]  in various settings from survivors of domestic violence to veterans and others.
[01:14:02.280 --> 01:14:07.120]  And I think it will have amazing results because there's nothing quite like the magic of using
[01:14:07.120 --> 01:14:08.120]  this technology.
[01:14:08.120 --> 01:14:14.320]  And I think, again, magic is kind of the operative word here that we have.
[01:14:14.320 --> 01:14:20.000]  That's how you know technology is cool.
[01:14:20.000 --> 01:14:22.000]  There's a nice article on magic.
[01:14:22.000 --> 01:14:24.000]  Two more questions.
[01:14:24.000 --> 01:14:31.840]  Ah, Disco, what are your thoughts on Buckminster Fuller's work and his thoughts on how to build
[01:14:31.840 --> 01:14:33.080]  a world that doesn't destroy itself?
[01:14:33.080 --> 01:14:35.760]  To be honest, I'm not familiar with it.
[01:14:35.760 --> 01:14:39.100]  But I think the world is destroying itself at the moment and we've got to do everything
[01:14:39.100 --> 01:14:40.100]  we can to stop it.
[01:14:40.100 --> 01:14:43.960]  Again, I mentioned earlier, one of the nice frames I've thought about this is really thinking
[01:14:43.960 --> 01:14:46.640]  about the rights of children because they can't defend themselves.
[01:14:46.640 --> 01:14:50.440]  And are we doing our big actions with a view to the rights of those children?
[01:14:50.440 --> 01:14:53.780]  I think that children have a right to this technology and that's every child, not just
[01:14:53.780 --> 01:14:55.040]  ones in the West.
[01:14:55.040 --> 01:14:58.740]  And that's why I think we need to create personalized systems for them and infrastructure so they
[01:14:58.740 --> 01:15:02.080]  can go up and kind of get out.
[01:15:02.080 --> 01:15:07.240]  All right, Ira, how will generative models and unlimited custom tailored content to an
[01:15:07.240 --> 01:15:10.160]  audience of one impact how we value content?
[01:15:10.160 --> 01:15:13.640]  The paradox of choice is more options tend to make people more anxious.
[01:15:13.640 --> 01:15:15.840]  And we get infinite choice right now.
[01:15:15.840 --> 01:15:19.840]  How do we get adapted to our new god-like powers in this hedonic treadmill?
[01:15:19.840 --> 01:15:21.840]  Is it a net positive for humanity?
[01:15:21.840 --> 01:15:25.680]  How much consideration are we giving to potential bad outcomes?
[01:15:25.680 --> 01:15:30.440]  I think this is kind of one of those interesting things whereby, like I was talking to Alexandr
[01:15:30.440 --> 01:15:35.560]  Wang at Scale about this and he posted something on everyone being in their own echo chambers
[01:15:35.560 --> 01:15:40.600]  as you basically get hedonic to death, entertained to death.
[01:15:40.600 --> 01:15:43.760]  Kind of like this WALL-E, you remember the fat guys with their VR headsets?
[01:15:43.760 --> 01:15:44.760]  Yeah, kind of like that.
[01:15:44.760 --> 01:15:45.760]  I don't think that's the case.
[01:15:45.760 --> 01:15:49.720]  I think people will use this to create stories because we're prosocial narrative creatures
[01:15:49.720 --> 01:15:53.840]  and the n equals one echo chambers are a result of the existing internet without intelligence
[01:15:53.840 --> 01:15:54.920]  on the edge.
[01:15:54.920 --> 01:16:01.040]  We want to communicate unless you have Asperger's like me and social communication disorder,
[01:16:01.040 --> 01:16:05.960]  in which case communicating is actually quite hard, but we learned how to do it.
[01:16:05.960 --> 01:16:08.840]  And I think, again, we're prosocial creatures that love seeing people listen to what we
[01:16:08.840 --> 01:16:09.840]  do.
[01:16:09.840 --> 01:16:15.000]  You've got likes and, you know, you've got this kind of hook model where you input something
[01:16:15.000 --> 01:16:20.600]  you're triggered and then you wait for verification and validation.
[01:16:20.600 --> 01:16:24.400]  So I think actually this will allow us to create our stories better and create a more
[01:16:24.400 --> 01:16:29.320]  egalitarian internet because right now the internet itself is this intelligence amplifier
[01:16:29.320 --> 01:16:33.080]  that means that some of the voices are more heard than others because some people know
[01:16:33.080 --> 01:16:36.720]  how to use the internet and they drown out those who do not and a lot of people don't
[01:16:36.720 --> 01:16:40.000]  even have access to this, so yeah.
[01:16:40.000 --> 01:16:50.520]  Alrighty, I am going to answer one more question because I'm tired now.
[01:16:50.520 --> 01:16:55.280]  Ivy Dory, when do you think multimodal models will emerge combining language, video and image?
[01:16:55.280 --> 01:16:59.120]  I think they'll be here by Q1 of next year and they'll be good.
[01:16:59.120 --> 01:17:03.020]  I think that by 2024 they'll be truly excellent.
[01:17:03.020 --> 01:17:07.440]  You can look at the DeepMind Gato paper on the autoregression of different modalities
[01:17:07.440 --> 01:17:10.440]  on reinforcement learning to see some of the potential on this.
[01:17:10.440 --> 01:17:16.520]  So Gato is just a 1.3 billion parameter model that is a generalist agent.
[01:17:16.520 --> 01:17:21.120]  As we've kind of showed by merging image and others, these things can cross-learn just
[01:17:21.120 --> 01:17:25.000]  like humans and I think that's fascinating and that's why we have to create models for
[01:17:25.000 --> 01:17:28.980]  every culture, for every country, for every individual so we can learn from the diversity
[01:17:28.980 --> 01:17:33.240]  and plurality of humanity to create models that are aligned for us instead of against
[01:17:33.240 --> 01:17:34.240]  us.
[01:17:34.240 --> 01:17:38.280]  And I think that's much better than stack more layers and build giant freaking supercomputers
[01:17:38.280 --> 01:17:41.040]  to train models to serve ads or whatever.
[01:17:41.040 --> 01:17:43.680]  So with that, I bid you adieu.
[01:17:43.680 --> 01:17:48.560]  My apologies that I didn't bring anyone to the stage, the whole team is kind of busy
[01:17:48.560 --> 01:17:53.780]  right now and yeah, I am not good at technology right now and my brain is dead state.
[01:17:53.780 --> 01:17:56.800]  But hopefully it won't be too long until we kind of connect again, there will be a lot
[01:17:56.800 --> 01:18:00.040]  more community events coming up and engagement.
[01:18:00.040 --> 01:18:04.640]  Again I think it's been seven weeks, feels like seven years or seven minutes, I'm not
[01:18:04.640 --> 01:18:08.000]  even sure anymore, like I think we made a time machine.
[01:18:08.000 --> 01:18:11.400]  But hopefully we can start building stuff a lot more structured.
[01:18:11.400 --> 01:18:28.080]  So thanks all and you know, stay cool, rock on, bye.