OpenAI today announced that it is allowing third-party software developers to fine-tune — that is, modify the behavior of — custom versions of its signature new large multimodal model (LMM), GPT-4o, tailoring it to the needs of their applications or organizations.
Whether it’s adjusting the tone, following specific instructions, or improving accuracy in technical tasks, fine-tuning enables significant enhancements with even small datasets.
Developers interested in the new capability can visit OpenAI’s fine-tuning dashboard, click “create,” and select gpt-4o-2024-08-06 from the base model dropdown menu.
The news comes less than a month after the company made it possible for developers to fine-tune the model’s smaller, faster, cheaper variant, GPT-4o mini — which is, however, less powerful than the full GPT-4o.
“From coding to creative writing, fine-tuning can have a large impact on model performance across a variety of domains,” state OpenAI technical staff members John Allard and Steven Heidel in a blog post on the official company website. “This is just the start—we’ll continue to invest in expanding our model customization options for developers.”
Free tokens are offered now through September 23
The company notes that developers can achieve strong results with just a few dozen examples in their training data.
To kick off the new feature, OpenAI is offering up to 1 million tokens per day for free to use on fine-tuning GPT-4o for any third-party organization (customer) now through September 23, 2024.
Tokens are the numerical representations of character sequences — fragments of words, whole words, numbers, and punctuation — that encode the concepts an LLM or LMM has learned.
As such, they effectively function as an AI model’s “native language,” and they are the unit OpenAI and other model providers use to measure how much information a model ingests (input) or produces (output). To fine-tune an LLM or LMM such as GPT-4o, a developer must convert the data relevant to their organization, team, or individual use case into tokens the model can understand — that is, tokenize it — a step OpenAI’s fine-tuning tools handle.
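To make that concrete, OpenAI’s fine-tuning endpoints accept training data as a JSONL file in which each line is one chat-formatted example. A minimal sketch in Python of building such a file — the example contents here are hypothetical, and only the file structure reflects the documented format:

```python
import json

# Hypothetical training examples in the chat format OpenAI's fine-tuning
# endpoints expect: one JSON object per line (JSONL), each with a
# "messages" list of role/content pairs.
examples = [
    {
        "messages": [
            {"role": "system", "content": "You are a concise support assistant."},
            {"role": "user", "content": "How do I reset my password?"},
            {"role": "assistant", "content": "Go to Settings > Account > Reset Password."},
        ]
    },
    {
        "messages": [
            {"role": "system", "content": "You are a concise support assistant."},
            {"role": "user", "content": "Where can I download my invoice?"},
            {"role": "assistant", "content": "Open Billing > Invoices and click Download."},
        ]
    },
]

# Serialize to JSONL: each example becomes exactly one line of the file.
jsonl = "\n".join(json.dumps(ex) for ex in examples)
print(jsonl.count("\n") + 1)  # number of training examples: 2
```

Once uploaded, a file like this becomes the training input the dashboard (or API) points a fine-tuning job at; OpenAI tokenizes its contents on the server side.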
However, this comes at a cost: ordinarily, fine-tuning GPT-4o costs $25 per 1 million training tokens, while running inference on your fine-tuned version costs $3.75 per million input tokens and $15 per million output tokens.
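At those published rates, the cost math is simple per-million-token arithmetic. A small sketch — the token counts in the example are made-up illustrative figures, not OpenAI numbers:

```python
# Published per-million-token rates for fine-tuned GPT-4o (USD).
TRAINING_PER_M = 25.00   # fine-tuning (training) tokens
INPUT_PER_M = 3.75       # inference, input tokens
OUTPUT_PER_M = 15.00     # inference, output tokens

def fine_tune_cost(training_tokens: int) -> float:
    """Cost in USD to train on the given number of tokens."""
    return training_tokens / 1_000_000 * TRAINING_PER_M

def inference_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a batch of inference traffic."""
    return (input_tokens / 1_000_000 * INPUT_PER_M
            + output_tokens / 1_000_000 * OUTPUT_PER_M)

# Hypothetical workload: a 2M-token training run, then 10M input /
# 2M output tokens of production traffic.
print(fine_tune_cost(2_000_000))              # 50.0
print(inference_cost(10_000_000, 2_000_000))  # 37.5 + 30.0 = 67.5
```

Against numbers like these, the free daily allotment of 1 million training tokens through September 23 amounts to up to $25 per day in waived training cost.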
For those working with the smaller GPT-4o mini model, 2 million free training tokens are available daily until September 23.
This offering extends to all developers on paid usage tiers, ensuring broad access to fine-tuning capabilities.
The move to offer free tokens comes as OpenAI faces steep competition in price from other proprietary providers such as Google and Anthropic, as well as from open-source models such as the newly unveiled Hermes 3 from Nous Research, a variant of Meta’s Llama 3.1.
However, with OpenAI and other closed/proprietary models, developers don’t have to worry about hosting the model inference or training it on their servers — they can use OpenAI for those purposes, or link their own preferred servers to OpenAI’s API.
Success stories highlight fine-tuning potential
The launch of GPT-4o fine-tuning follows extensive testing with select partners, demonstrating the potential of custom-tuned models across various domains.
Cosine, an AI software engineering firm, has leveraged fine-tuning to achieve state-of-the-art (SOTA) results of 43.8% on the SWE-bench benchmark with its autonomous AI engineer agent Genie — the highest of any AI model or product publicly declared to date.
Another standout case is Distyl, an AI solutions partner to Fortune 500 companies, whose fine-tuned GPT-4o ranked first on the BIRD-SQL benchmark, achieving an execution accuracy of 71.83%.
The model excelled in tasks such as query reformulation, intent classification, chain-of-thought reasoning, and self-correction, particularly in SQL generation.
Emphasizing safety and data privacy even as customers fine-tune new models
OpenAI has reinforced that safety and data privacy remain top priorities even as it expands customization options for developers.
Customers of fine-tuned models retain full control over their business data, with no risk of inputs or outputs being used to train other models.
Additionally, the company has implemented layered safety mitigations, including automated evaluations and usage monitoring, to ensure that applications adhere to OpenAI’s usage policies.
Yet research has shown that fine-tuning can cause models to deviate from their guardrails and safeguards, and can reduce their overall performance. Whether that risk is worth taking is up to each organization — but OpenAI clearly thinks it is, and is encouraging customers to consider fine-tuning.
Indeed, when announcing new fine-tuning tools for developers back in April — such as epoch-based checkpoint creation — OpenAI stated: “We believe that in the future, the vast majority of organizations will develop customized models that are personalized to their industry, business, or use case.”
The release of new GPT-4o fine-tuning capabilities today underscores OpenAI’s ongoing commitment to that vision: a world in which every org has its own custom AI model.
Google is upgrading its Gemini writing tools in Gmail to help you polish drafts that you’ve already written. Now, among other Gemini-powered “Help me write” options like Formalize and Elaborate, you can tap “Polish” to refine your emails, Google says in a blog post. The company has also added shortcuts that appear in the body of your emails on Android and iOS, making it more obvious that there are AI writing tools to use.
The tools are available to people who pay for Google One AI Premium accounts or who have paid for Google’s Gemini add-on for Workspace. If that’s you, when you open an empty draft, you’ll see a “Help me write” shortcut appear that you can tap to have Gemini draft text for you. Once you have 12 or more words in a draft — AI-written or not — you should see a new “Refine my draft” shortcut, shown in gray letters below the words.
Swipe your thumb across the text, and you’ll be given the choice to Polish, Formalize, Elaborate, or Shorten, or to have Gemini just write a whole new draft for you. (And if the “Refine my draft” shortcut doesn’t appear, tapping the pencil icon does the same thing.)
Luma AI, a San Francisco-based startup, released Dream Machine 1.5 on Monday, marking a significant advancement in AI-powered video generation. This latest version of their text-to-video model offers enhanced realism, improved motion tracking, and more intuitive prompt understanding.
“Dream Machine 1.5 is here,” Luma AI announced on X.com. “Now with higher-quality text-to-video, smarter understanding of your prompts, custom text rendering, and improved image-to-video! Level up.”
The upgrade comes just two months after Dream Machine’s initial launch, highlighting the rapid pace of innovation in the AI video space.
One of the most notable improvements is the model’s ability to render text within generated videos, a feature that has traditionally challenged AI models. This advancement opens new possibilities for creating dynamic title sequences, animated logos, and on-screen graphics for presentations.
Breakthrough in text rendering: AI-generated videos now speak your language
One early access user (@aziz4ai) shared examples of the model’s capabilities on X.com, demonstrating its prowess in creating complex visual effects. In one instance, the model generated “Iridescent liquid 3D text” forming the word “LUMA,” showcasing smooth motion and clean execution.
Dream Machine 1.5 has also shown improved handling of non-English prompts. The same artist demonstrated this with Arabic language inputs, including a request for “a man cutting meat on a wooden board, transforming the pieces into the words ‘prepared daily’ in a cinematic way.”
The resulting video seamlessly blended text and imagery, indicating Dream Machine’s potential for multilingual content creation.
The upgrade boasts significant speed improvements, generating five seconds of high-quality video in approximately two minutes. This efficiency gain could prove crucial for content creators and marketers who need to iterate quickly on visual concepts.
Democratizing AI video: How Luma AI is outpacing giants like OpenAI and Kuaishou
Luma AI’s approach to making Dream Machine widely accessible has positioned it as a significant player in the rapidly evolving AI video generation market. While the field is becoming increasingly crowded, Luma’s strategy of continuous improvement and public availability sets it apart.
OpenAI’s Sora, while impressive in its capabilities, remains in a closed beta, accessible only to select partners. This exclusivity has limited its real-world testing and application. In contrast, Kuaishou’s Kling, which became publicly available about a month ago, has quickly gained traction. However, Luma AI’s Dream Machine has had a longer period of public accessibility, allowing it to build a substantial user base and gather extensive real-world feedback.
This head start has given Luma AI an edge in refining its model based on diverse use cases. The release of Dream Machine 1.5 demonstrates the company’s commitment to rapid iteration and improvement. By incorporating user feedback and real-world application data, Luma AI has been able to address specific pain points and enhance features that matter most to creators.
Industry analysts note that this approach of “democratized development” could lead to more robust and versatile AI video tools. The diverse range of content created by users across various industries provides Luma AI with a rich dataset for improvement, potentially accelerating its development cycle beyond what closed systems can achieve.
However, this open approach also brings challenges. As AI-generated video becomes more accessible and sophisticated, concerns about misuse, such as the creation of deepfakes or misleading content, have intensified. The industry is grappling with the need for robust detection methods and ethical guidelines. Luma AI’s position at the forefront of this democratization puts it in a unique position to lead discussions on responsible AI use, though the company has yet to publicly outline its stance on these critical issues.
As the AI video generation market continues to evolve, Luma AI’s strategy of openness and rapid iteration may prove to be a key differentiator. While competitors like Kling are catching up in terms of public availability, Luma’s longer track record and established user base could give it a sustained advantage in the race to define the future of AI-generated video content.
The future of visual content: Balancing innovation with ethical considerations
Despite these challenges, the release of Dream Machine 1.5 marks a significant milestone in the evolution of AI-generated video. As technology continues to improve, it has the potential to revolutionize industries ranging from entertainment and advertising to education and journalism.
For now, Luma AI seems focused on pushing the technical boundaries of what’s possible. As one user on Twitter noted, “The capabilities are stunning.” It remains to be seen how these capabilities will shape the future of visual content creation and consumption.
It’s not just you. The word “demure” is being used to describe just about everything online these days.
It all started earlier this month when TikTok creator Jools Lebron posted a video that would soon take social media by storm. The hair and makeup she’s wearing to work? Very demure. And paired with a vanilla perfume fragrance? How mindful.
In just weeks, Lebron’s words have become the latest vocabulary defining the internet this summer. In addition to her own viral content that continues to describe various day-to-day, arguably reserved or modest activities with adjectives like “demure,” “mindful” and “cutesy,” several big names have also hopped on the trend across social media platforms. Celebrities like Jennifer Lopez and Penn Badgley have shared their own playful takes, and even the White House used the words to tout the Biden-Harris administration’s recent student debt relief efforts.
The skyrocketing fame of Lebron’s “very mindful, very demure” influence also holds significance for the TikToker herself. Lebron, who identifies as a transgender woman, said in a post last week that she’s now able to finance the rest of her transition.
“One day, I was playing cashier and making videos on my break. And now, I’m flying across the country to host events,” Lebron said in the video, noting that her experience on the platform has changed her life.
She’s not alone. Over recent years, a handful of online creators have found meaningful income after gaining social media fame — but it’s still incredibly rare, and no easy feat.
Here’s what some experts say.
How can TikTok fame lead to meaningful sources of income?
There is no one recipe.
Finding resources to work as a creator full-time “is not as rare as it would have been years ago,” notes Erin Kristyniak, VP of global partnerships at marketing collaboration company Partnerize. But you still have to make content that meets the moment — and there’s a lot to juggle if you want to monetize.
On TikTok, most users who are making money pursue a combination of hustles. Brooke Erin Duffy, an associate professor of communication at Cornell University, explains that those granted admission into TikTok’s Creator Marketplace — the platform’s space for brand and creator collaborations — can “earn a kickback from views from TikTok expressly,” although that doesn’t typically pay very well.
Other avenues for monetization include more direct brand sponsorships, creating merchandise to sell, fundraising during livestreams, and collecting “tips” or “gifts” through features available to users who reach a certain following threshold. A lot of it also boils down to work outside of the platform.
And creators are increasingly working to build their social media presence across multiple platforms — particularly amid a potential U.S. ban of the ByteDance-owned app, which is currently being fought in court. Duffy adds that many are developing this wider online presence so they can “still have a financial lifeline” in case any one revenue stream goes away.
Is it difficult to sustain?
Gaining traction in the macrocosm that is the internet is difficult as is — and while some have both tapped into trends that resonate and found sources of compensation that allow them to quit their nine-to-five, it still takes a lot of work to keep it going.
“These viral bursts of fame don’t necessarily translate into a stable, long-term career,” Duffy said. “On the surface, it’s kind of widely hyped as a dream job ... But I see this as a very superficial understanding of how the career works.”
Duffy, who has been studying social media content creation for a decade, says that she’s heard from creators who have months where they’re reaping tremendous sums of money from various sources of income — but then also months with nothing. “It’s akin to a gig economy job, because of the lack of stability,” she explained.
“The majority of creators aren’t full-time,” Eric Dahan, the CEO and founder of influencer marketing agency Mighty Joy, added.
Burnout is also very common. It can take a lot of emotional labor to pull content from your life, Duffy said, and the pressure of maintaining brand relationships or the potential of losing viewers if you take a break can be a lot. Ongoing risks of potential exposure to hate or online harassment also persist.
Is the landscape changing?
Like all things online, the landscape for creators is constantly evolving.
Demand is also growing. More and more platforms are not only aiming to court users but specifically bring aspiring creators to their sites. And that coincides with an increased focus on marketing goods and brands in these spaces.
Companies are doubling down “to meet consumers where they are,” said Raji Srinivasan, a marketing professor at The University of Texas at Austin’s McCombs School of Business. YouTube and other social media platforms, such as Instagram, have also built out offerings to attract this kind of content in recent years, but — for now — it’s “TikTok’s day in the sun,” she added, pointing to the platform’s persisting dominance in the market.
And for aspiring creators hoping to strike it big, Dahan’s advice is just to start somewhere. As Lebron’s success shows, he added, “You don’t know what’s going to happen.”