Resources
That's Fresh! Newsletter
Read a selection of our past issues.
- 🙌 NumPy 2.0 is almost out! And: Our new data preprocessor with Polars | Interview with S2E at Italy Insurance Forum (June 5, 2024)
- 😮 What a month for new LLMs! And: DataCamp webinar with Shalini (May 22, 2024)
- ✨ GenAI true value lies beyond operational enhancements. And: The Future of Data Protection | New updates about AI Act (April 24, 2024)
- 👁 What are 1-bit Large Language Models? And: LinkedIn Live about AI Act | Mastercard's Country Manager interviewed our CEO (March 6, 2024)
- LLaMAntino - Effective Text Generation in Italian. And: Creating train and test datasets | Use case: Detecting money muling with the help of synthetic data (February 21, 2024)
- 🗞️ The NY Times sues OpenAI and Microsoft. And: Can AI work with little data? | La Stampa: AI means development (January 10, 2024)
- Synthetic Data 101 🚨 And: Why synthetic data? | New project with Poste Italiane (November 8, 2023)
- How easy is it for LLM to infer sensitive information? And: Why is data sharing important? | Our new partnership with S2E (October 25, 2023)
- Have you heard of Pythia? And: Data augmentation tutorial | Did you say AI apocalypse? (August 30, 2023)
- Google's answer to ChatGPT. And: Generating synthetic data within relational databases. Let's meet at WAICF! (February 8, 2023)
- Understanding ChatGPT better. And: How to deal with imbalanced data. More about our product (December 14, 2022)
- A curated list of failed ML projects. And: How to build a data strategy. Clearbox AI and Bearing Point partnership. (November 16, 2022)
- Our open source library is now on GitHub. And: Clearbox AI on Cybernews. (June 22, 2022)
- Discovering Dagster. And: Quantifying privacy risks. Use case: a synthetic data sandbox to freely share data. (June 8, 2022)
- Can interaction data be fully anonymized? And: Synthetic Data for privacy preservation: understanding privacy risks. Discover our Enterprise solution. (April 6, 2022)
- What are GFlow nets? And: Improve models with Synthetic Data. Use case: augment financial time series. (March 16, 2022)
- The European Commission selected us for Women TechEU pilot project! And: What is Synthetic Data. The new Synthetic Data platform. (March 09, 2022)
- The EDPS on Synthetic Data. And: From raw to good quality data. Changelogs: now you can upload unlabeled datasets. (February 23, 2022)
- 2022 Gartner’s Technology Trends. And: How to harness the power of AI in companies. Changelogs: new metrics available for your synthetic dataset. (February 09, 2022)
FROM THE AI WORLD
In less than one month, two state-of-the-art LLMs were released: OpenAI’s GPT-4o and Meta’s Llama 3, so today I'd like to share with you a brief comparison between them.
GPT-4o offers real-time, multimodal interaction, excels in multilingual tasks, and provides enhanced efficiency and safety features. Meta’s Llama 3, with models scaling up to 70 billion parameters, emphasizes open-source innovation and scalability, pushing the boundaries of what open-source AI can achieve.
Comparing the two, GPT-4o supports multimodal input and output, including text, images, audio, and video snapshots, though these features are not yet available via API. Llama 3, on the other hand, currently lacks multimodal capabilities, but Meta plans to add them later this year. In terms of context length, GPT-4o has a 128K-token window, while Llama 3 has an 8K-token window, which it uses efficiently.
On benchmarks, GPT-4o impresses with its speed despite the added capabilities, while Llama 3 performs well considering its size (8B). On pricing, GPT-4o is expensive, whereas Llama 3 offers the best performance/cost ratio for most tasks.
In conclusion, both GPT-4o and Llama 3 represent significant advancements in the field of language models, each with unique strengths. Choosing the right model will depend on specific needs, balancing advanced features, context handling, and cost efficiency.
Say hello to the new GPT-4o
The new flagship model by OpenAI can reason across audio, vision and text in real time. It can be "a step towards much more natural human-computer interaction."
Introducing Meta Llama 3
It's the next generation of Meta's state-of-the-art open source large language model. The company says it is "the most capable openly available LLM to date."
CLEARBOX AI
DataCamp Webinar with CEO Shalini
AI is advancing rapidly, and regulators worldwide are scrambling to understand and control the new technology. Do you want to know more?
YOUR PALS MAY ALSO LIKE...
Generate synthetic data in the blink of an eye, and for free: click on the image to find out more!