No More Hustleporn: this clip from the gpt-4 demo was massively slept on
Tweet by Alex 
https://twitter.com/alexalbert__
    @alexalbert__:    
   
       
       this clip from the gpt-4 demo was massively slept on 
   
   gpt-4 can "see" your screen and describe the user interface of the application you are looking at 
   
    here's @gdb getting gpt-4 to describe a screenshot of a discord server in painstaking detail 
   
               @alexalbert__:    
   
       
       my take is that text generation will not be the main value prop of LLMs very soon 
   
   instead, it will be their ability to operate the tools we already use 
   
   in this example, gpt-4 proves that it has a near-human level of understanding of discord's UI through just one screenshot 
   
               @alexalbert__:    
   
       
       i've said it before and it's been hinted at by OAI employees 
   
   to everyone on AI twitter this may seem obvious but it's worth reiterating: chatGPT is NOT the final product here... it will look like a toy soon enough 
   
   
           twitter.com/alexalbert__/s…    
   
                            @alexalbert__:      
     
             
             prediction: 
     the GPT-4 iphone is going to be an app that uses the model's multimodal abilities to control your computer for you in a self-driving fashion 
     
      the discord screenshot example in the gpt4 demo made it too obvious that this is within its current capabilities 
     
                           
      twitter.com/rapha_gl/status/1636041957029060608      
     
                      @alexalbert__:    
   
       
       OpenAI still has a few hurdles to solve like speed, cost, and reliability 
   
   but once these issues are ironed out, expect to see Microsoft's Edge transform into a full copilot-like system with Bing Chat being the portal that you guide it through 
   
               @alexalbert__:    
   
       
       some are already starting to get this to work 
   
   in this example, GPT-4 breaks down complex browser-based tasks into actionable steps and navigates the web by evaluating custom browser-driving code that it itself generates 
   
   
           youtube.com/watch?v=Gndk9P…    
   
               @alexalbert__:    
   
       
       the best part about that example is that it doesn’t even use gpt-4's visual abilities 
   
   I can’t wait to see how much more powerful it will be when that is incorporated
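
The loop described a few tweets up (the model breaks a task into steps, generates browser-driving code, and that code gets evaluated) can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not how the linked demo is implemented: `FakeBrowser`, `stub_model`, and `run_task` are hypothetical names, the "model" is a hard-coded stub rather than a real GPT-4 call, and the browser only records actions instead of driving a real page.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class FakeBrowser:
    """Hypothetical stand-in for a real browser driver; it only records actions."""
    url: str = "about:blank"
    log: List[str] = field(default_factory=list)

    def goto(self, url: str) -> None:
        self.url = url
        self.log.append(f"goto {url}")

    def type_text(self, selector: str, text: str) -> None:
        self.log.append(f"type {selector!r} {text!r}")

    def click(self, selector: str) -> None:
        self.log.append(f"click {selector!r}")


def stub_model(task: str, history: List[str]) -> str:
    """Hypothetical stand-in for an LLM call.

    Returns the next snippet of browser-driving code, or "DONE" once the
    scripted steps are exhausted. A real system would prompt a model with
    the task and the history instead of indexing into a fixed list.
    """
    script = [
        'browser.goto("https://example.com/search")',
        f'browser.type_text("#q", {task!r})',
        'browser.click("#submit")',
    ]
    return script[len(history)] if len(history) < len(script) else "DONE"


def run_task(
    task: str,
    model: Callable[[str, List[str]], str],
    browser: FakeBrowser,
    max_steps: int = 10,
) -> List[str]:
    """Ask the model for one step at a time and evaluate the code it returns."""
    history: List[str] = []
    for _ in range(max_steps):
        code = model(task, history)
        if code.strip() == "DONE":
            break
        # Only `browser` is exposed to the generated code. Emptying
        # __builtins__ is NOT a real sandbox; model-written code should run
        # in an isolated process or container in anything beyond a toy.
        exec(code, {"__builtins__": {}}, {"browser": browser})
        history.append(code)
    return history


if __name__ == "__main__":
    browser = FakeBrowser()
    run_task("gpt-4 demo", stub_model, browser)
    for action in browser.log:
        print(action)
```

Feeding the history back into the model on every step is the design choice that matters here: it lets the model decide the next action based on what has already been executed, and the `max_steps` cap keeps a confused model from looping forever.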