
Vice President, Engineering Ascendion
Generative AI has emerged as a transformative force in artificial intelligence, prompting individuals and organizations alike to pause and rethink how work gets done. At the heart of this shift lie large language models (LLMs), which use deep learning and massive datasets to generate human-quality text.
With over 325,000 models available on Hugging Face and countless more in development, the question arises: why should one consider using open source LLMs?
Generative AI models can be broadly categorized into two types: proprietary and open source.
Proprietary (or closed-source) LLMs,
Open source LLMs,
This is not true in every instance, but many proprietary LLMs are far larger than open source models, specifically in terms of parameter count. Some of the leading proprietary LLMs probably extend to thousands of billions of parameters, though we cannot know for certain because those parameter counts are themselves proprietary. In any case, bigger isn't necessarily better.
The open source generative AI ecosystem is showing promise in challenging proprietary LLM business models in many use cases.
Tech Note: Two of the most frequently referenced terms in the LLM space are parameters and tokens.
Parameters are the variables a model learns during training and then uses to generate new content. The more parameters a model has, the more computational resources training requires.
Tokens are the discrete units into which text is divided. A token can be as short as an individual character or as long as an entire word. The more tokens a model processes during training, the more memory it needs.
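To make the two terms concrete, here is a minimal Python sketch. It is an illustration only: the helper names (`char_tokens`, `word_tokens`, `dense_layer_params`) and the layer sizes are assumptions made for this example, and production LLMs typically use subword tokenizers that fall somewhere between the two extremes shown here.

```python
import re


def char_tokens(text: str) -> list[str]:
    """Split text into the shortest possible tokens: single characters."""
    return list(text)


def word_tokens(text: str) -> list[str]:
    """Split text into whole words, keeping punctuation as separate tokens."""
    return re.findall(r"\w+|[^\w\s]", text)


def dense_layer_params(n_inputs: int, n_outputs: int) -> int:
    """Count the learned variables in one fully connected layer.

    Each output has one weight per input plus one bias term.
    """
    return n_inputs * n_outputs + n_outputs


sentence = "Open source LLMs generate text."
print("character tokens:", len(char_tokens(sentence)))
print("word tokens:", len(word_tokens(sentence)))

# Hypothetical layer sizes, chosen only for illustration.
print("parameters in one 512 -> 2048 layer:", dense_layer_params(512, 2048))
```

The same sentence yields far more tokens at the character level than at the word level, and even a single modest layer already holds about a million parameters, which is why both counts drive resource needs.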
Transparency: Open source LLMs provide insights into their inner workings, allowing users to understand how they generate text. This transparency fosters trust and enables developers to identify and address potential biases.
Fine-tuning: Open source LLMs can be fine-tuned to specific tasks and domains, tailoring them to unique use cases. This makes them versatile tools for diverse applications.
Community Collaboration: Open source LLMs benefit from the collective knowledge and expertise of a global community of developers and researchers. This collaborative environment drives innovation and accelerates model development.
Healthcare: Open source LLMs are being used to develop diagnostic tools and optimize treatment plans, improving healthcare outcomes.
Finance: FinGPT, an open source LLM specifically tailored for the financial industry, is assisting with financial modeling and risk assessment.
Aerospace: NASA has developed an open source LLM trained on geospatial data, aiding in satellite imagery analysis and mission planning.
Many more industries are rapidly adopting LLMs to solve their own business problems.
Hugging Face maintains an open LLM leaderboard that tracks, ranks, and evaluates open source LLMs on various benchmarks. One example is TruthfulQA, which measures whether a language model is truthful in generating answers to questions.
The top spots on these leaderboards change frequently, and it is quite fun to watch the progress these generative AI models are making. Some examples of top models include:
Llama 2: Developed by Meta AI, Llama 2 encompasses a range of generative text models, from 7 billion to 70 billion parameters, offering flexibility for different applications.
Vicuna: Built upon the Llama model, it is specifically fine-tuned to follow instructions, making it well suited to task-oriented applications.
BLOOM: Created by BigScience, it is a multilingual LLM developed collaboratively by over 1000 AI researchers, demonstrating the power of community-driven innovation.
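Parameter counts like the 7 billion and 70 billion quoted for Llama 2 map fairly directly onto hardware needs. The sketch below is a rough back-of-the-envelope estimate, not a sizing guide. It assumes each parameter is stored as a 16-bit or 32-bit number and counts only the weights themselves, ignoring everything else a running model needs in memory. The function name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Estimate memory for model weights alone, in gigabytes (10^9 bytes).

    Assumption: every parameter takes the same number of bytes
    (2 for 16-bit storage, 4 for 32-bit storage).
    """
    return n_params * bytes_per_param / 1e9


# Sizes taken from the parameter range quoted above for Llama 2.
for label, n_params in [("7 billion", 7e9), ("70 billion", 70e9)]:
    half = weight_memory_gb(n_params, bytes_per_param=2)
    full = weight_memory_gb(n_params, bytes_per_param=4)
    print(f"{label} parameters: ~{half:.0f} GB at 16-bit, ~{full:.0f} GB at 32-bit")
```

Under these assumptions the smallest model fits on a single high-end GPU while the largest does not, which is one practical reason a family of sizes offers flexibility.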
Although LLM outputs often sound fluent and authoritative, they can be confidently wrong.
Hallucinations: LLMs can generate false or misleading information, especially when trained on incomplete or inaccurate data.
Bias: LLMs can reflect biases present in their training data, leading to discriminatory or unfair outcomes.
Security Concerns: LLMs can be misused for malicious purposes, such as leaking sensitive information or generating phishing scams.
As open source LLMs continue to mature and gain traction, it is evident that they are poised to play a transformative role in the future of AI. Their potential to democratize AI access, foster innovation, and solve real-world problems is immense. However, it is crucial to acknowledge and mitigate the associated risks to ensure responsible and ethical development and deployment of these powerful tools.
