No More Hustleporn: this clip from the gpt-4 demo was massively slept on

Tweet by Alex

      this clip from the gpt-4 demo was massively slept on
  gpt-4 can "see" your screen and describe the user interface of the application you are looking at
      getting gpt-4 to describe a screenshot of a discord server in painstaking detail
      my take is that text generation will not be the main value prop of LLMs very soon
  instead, it will be their ability to operate the tools we already use
  in this example, gpt-4 proves that it has a near-human level of understanding of discord's UI through just one screenshot
      i've said it before and it's been hinted at by OAI employees
  to everyone on AI twitter this may seem obvious but it's worth reiterating: ChatGPT is NOT the final product here... it will look like a toy soon enough
    the GPT-4 "iPhone" is going to be an app that uses the model's multimodal abilities to control your computer for you in a self-driving fashion
    the discord screenshot example in the gpt-4 demo made it too obvious that this is within its current capabilities
      OpenAI still has a few hurdles to clear, like speed, cost, and reliability
  but once these issues are ironed out, expect to see Microsoft's Edge transform into a full copilot-like system with Bing Chat being the portal that you guide it through
      some are already starting to get this to work
  in this example, GPT-4 breaks down complex browser-based tasks into actionable steps and navigates the web by evaluating custom browser-driving code that it itself generates
      the best part about that example is that it doesn’t even use gpt-4's visual abilities
  I can’t wait to see how much more powerful it will be when that is incorporated
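
a rough sketch of the loop described above, where the model turns a task into browser-driving steps and a runner executes them one at a time. everything here is an assumption for illustration: `fake_model` is a hypothetical stub that returns a canned script instead of calling GPT-4, `FakeBrowser` only records actions instead of driving a real browser, and the tiny `goto` / `type` / `click` command format is made up. it is not how the linked example is actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class FakeBrowser:
    """Stand-in for a real browser driver; it only records what it is asked to do."""
    log: list = field(default_factory=list)

    def goto(self, url):
        self.log.append(("goto", url))

    def type_text(self, selector, text):
        self.log.append(("type", selector, text))

    def click(self, selector):
        self.log.append(("click", selector))


def fake_model(task):
    """Hypothetical stub for the language model: returns a canned script for any task."""
    return "\n".join([
        "goto https://example.com",
        "type #search discord bots",
        "click #submit",
    ])


def run_script(script, browser):
    """Interpret each generated line as one browser action."""
    for line in script.splitlines():
        command, _, rest = line.strip().partition(" ")
        if command == "goto":
            browser.goto(rest)
        elif command == "type":
            # first token is the selector, the remainder is the text to enter
            selector, _, text = rest.partition(" ")
            browser.type_text(selector, text)
        elif command == "click":
            browser.click(rest)
        else:
            # refuse anything outside the small allowed command set
            raise ValueError(f"unknown command: {command!r}")


browser = FakeBrowser()
run_script(fake_model("search example.com for discord bots"), browser)
```

the sketch parses a restricted command list rather than evaluating arbitrary generated code, which is a deliberate simplification: it keeps the model's output inside a small set of allowed actions.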