OpenAI Releases GPT-4: A Multimodal AI That Can See and Write

OpenAI, a research organization backed by Microsoft, has just announced the release of its latest artificial intelligence model, GPT-4. GPT-4 is a large multimodal model that can accept both text and images as input and generate text outputs based on them. It exhibits human-level performance on various professional and academic benchmarks, such as law exams, code generation, and language translation.

What is GPT-4 and what can it do?

GPT-4 stands for Generative Pre-trained Transformer 4, the fourth iteration of OpenAI’s series of large language models that use deep learning to learn from massive amounts of text data. Unlike earlier releases, OpenAI has not disclosed GPT-4’s parameter count or other architectural details.

GPT-4 is trained on a diverse dataset of text and images from the web, covering topics such as news, books, social media, Wikipedia, and more. It can generate coherent and fluent text on almost any topic given a prompt or a query. For example, it can write an essay about climate change, a summary of a book, or a product review.

But what makes GPT-4 different from previous models is that it can also accept images as input and generate text based on them. This means that it can perform tasks such as:

  • captioning photos,
  • describing sketches,
  • creating websites from mock-ups,
  • combining text and images into multimodal outputs such as memes or comics.
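As a rough sketch of what an image-plus-text request might look like, the snippet below assembles a payload in the style of OpenAI’s chat-completions message format. The model name, image URL, and prompt are placeholders for illustration, and the actual network call (which requires an API key) is omitted.

```python
import json

def build_vision_request(prompt, image_url, model="gpt-4-vision-preview"):
    """Assemble a chat-completions payload that pairs text with an image."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }

# Example: ask the model to turn a hand-drawn mock-up into a web page.
payload = build_vision_request(
    "Describe this sketch as a simple HTML page.",
    "https://example.com/mockup.png",
)
print(json.dumps(payload, indent=2))
```

The key idea is that the user message’s content becomes a list of parts, so a single turn can mix text and one or more images.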

GPT-4’s capabilities are an improvement over the previous model, GPT-3.5, in terms of reliability, creativity, and handling of nuanced instructions.

OpenAI tested the model on various benchmarks, including simulated exams designed for humans, and found that GPT-4 outperformed existing large language models. It also performs well in languages other than English, including low-resource languages such as Latvian, Welsh, Hebrew and Swahili.

How can you use GPT-4?

OpenAI has made GPT-4 available to developers and researchers through its ChatGPT platform and its API (with a waitlist). ChatGPT is a chatbot service that allows users to interact with GPT-4 using natural language. Users can choose from different personas and topics to chat with GPT-4 or ask it questions. The API allows developers to integrate GPT-4 into their own applications and customize its behavior using parameters such as temperature (creativity), top-p (diversity), frequency penalty (repetition), presence penalty (topic drift), stop sequence (end token), etc.
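To make those parameters concrete, here is an illustrative request body showing how each one might be set. The values are placeholders chosen for demonstration, and the actual API call is omitted since it requires an API key.

```python
# Illustrative only: how the sampling parameters described above map onto
# a chat-completions request body.
request_body = {
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Write a haiku about spring."}],
    "temperature": 0.7,        # higher values = more creative / random output
    "top_p": 0.9,              # nucleus sampling: keep only the top 90% probability mass
    "frequency_penalty": 0.5,  # discourage verbatim repetition
    "presence_penalty": 0.3,   # nudge the model toward new topics
    "stop": ["\n\n"],          # cut generation off at this sequence
}
print(request_body["model"])
```

In practice, temperature and top_p are usually tuned one at a time rather than together, since both control how much probability mass the sampler explores.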

OpenAI has also been working on each aspect of the plan outlined in its post about defining the behavior of AIs,
including steerability. Developers can now describe the AI’s style and task in the “system” message, giving API users a great deal of room, within bounds, to customize their users’ experiences.
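A minimal sketch of this steering mechanism is shown below: the system message sets the persona and format, and the user message carries the actual question. The persona text is invented for illustration; sending the request would require the OpenAI client library and an API key.

```python
# The "system" message steers style and task; the "user" message asks the question.
messages = [
    {
        "role": "system",
        "content": "You are a patient tutor. Answer in short, numbered steps.",
    },
    {"role": "user", "content": "How does photosynthesis work?"},
]

for m in messages:
    print(m["role"], "->", m["content"][:40])
```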

What are the limitations and risks of GPT-4?

Even though GPT-4 offers many advantages over earlier models, it still has some limitations. It can still “hallucinate” facts and make reasoning errors, so its outputs should be used with great caution in high-stakes contexts.
GPT-4’s training data has a cutoff of September 2021, so it knows nothing about later events. It can also make simple reasoning mistakes, accept obviously false statements from a user as true, and, like human programmers, write code that introduces security vulnerabilities when confronted with challenging problems.

GPT-4 sometimes makes confident but incorrect predictions, and its work is not always double-checked. Interestingly, the base model is fairly good at predicting the accuracy of its own answers, but this calibration decreases after post-training. Despite GPT-4’s benefits, it poses new risks, including the generation of harmful advice, buggy code, or incomplete information.

OpenAI has made many changes to GPT-4 to make it safer than GPT-3.5 and has been working to mitigate risks.
For example:

* It uses filters to block harmful content such as hate speech or personal attacks;
* It uses alignment techniques to ensure that its outputs are consistent with human values;
* It uses transparency tools to provide provenance information for its outputs;
* It uses feedback mechanisms to allow users to report errors or abuse;
* It uses governance structures to oversee its development.


Nir Shein

Internet freak and technology geek; aspiring screenwriter, devoted tech blogger, and Technologer chief editor. Early adopter with a keen interest in gadgets, technology, the internet, and mobile.