Everyone knows about LLMs, aka Large Language Models.
Now let's talk about SLMs, aka Small Language Models.
As their name implies, SLMs are smaller in scale and scope than large language models.
Some examples of SLMs are:
- Phi-3.5
- TinyLlama
- MobileLLaMA
- Gemma 2
SLMs can be trained using two main techniques:
Knowledge distillation: a smaller student model learns to reproduce the outputs of a larger, already-trained teacher model (a minimal sketch follows below).
Pruning: weights and layers that contribute little are removed to make the model smaller and faster.
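To make the distillation idea concrete, here is a minimal sketch of a distillation loss in PyTorch. It assumes you already have a student model, a frozen teacher model, and labelled batches; the temperature and weighting values are illustrative, not prescriptions.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: the student matches the teacher's softened output distribution
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy against the true labels
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```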
Here are some characteristics of SLMs:
Smaller in size: SLMs have far fewer parameters than LLMs, typically from a few hundred million up to a few billion, compared with tens or hundreds of billions in LLMs.
More efficient: SLMs are more computationally efficient and can run on less powerful hardware.
Faster training: SLMs can be trained and developed faster than LLMs.
Specialized: SLMs are trained on curated data sources and can be specialized in specific tasks.
Fine-tunable: SLMs can be fine-tuned to do exactly what is needed for a specific task.
Cost-effective: SLMs can be more cost-effective than LLMs, making them a good option for integrating intelligent features when resources are limited.
AI Agents are about to change everything—and it’s happening now.
Here’s the cheat sheet:
1️⃣ Agentic RAG Routers: route each query to the right retriever, tool, or sub-workflow.
2️⃣ Query Planning RAG: break a complex question into sub-queries and plan how to answer them.
3️⃣ Adaptive RAG: adjust the retrieval strategy to the complexity of each query.
4️⃣ Corrective RAG: grade the retrieved documents and re-retrieve or fix them before answering.
5️⃣ Self-Reflective RAG: the model critiques its own retrievals and answers, then iterates.
6️⃣ Speculative RAG: a small drafter proposes answers that a larger model verifies.
7️⃣ Self-Route RAG: dynamically route each query between retrieval and direct long-context processing.
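To make the routing idea concrete, here is a minimal Python sketch of an agentic router. The destinations (web_search, vector_store, direct_llm) and the keyword rules are hypothetical; real agentic routers usually let an LLM make this decision.

```python
def route_query(query: str) -> str:
    """Pick the workflow a query should be sent to (toy keyword rules)."""
    q = query.lower()
    if any(k in q for k in ("latest", "today", "price", "news")):
        return "web_search"    # fresh facts -> live search tool
    if any(k in q for k in ("our", "internal", "policy", "invoice")):
        return "vector_store"  # private knowledge -> retrieval over indexed docs
    return "direct_llm"        # general knowledge -> answer without retrieval

for q in ["What is our refund policy?", "Latest NVDA price?", "Explain softmax"]:
    print(q, "->", route_query(q))
```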
How to build a real-time stock market data processing pipeline using AWS Lambda and Kinesis.
The complete video is available on YouTube. Like and subscribe to our YouTube channel for more content like this.
https://youtu.be/CNHvbGNGV1A?si=vecZlS3Fkbk5C4zp
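For a rough idea of the pipeline, here is a minimal boto3/Lambda sketch, assuming a Kinesis stream named "stock-ticks" (hypothetical) that triggers the Lambda; error handling, batching, and downstream storage are left out.

```python
import base64
import json
import boto3

kinesis = boto3.client("kinesis")

def publish_tick(symbol: str, price: float):
    # Producer side: push one stock tick into the (hypothetical) "stock-ticks" stream
    kinesis.put_record(
        StreamName="stock-ticks",
        Data=json.dumps({"symbol": symbol, "price": price}).encode("utf-8"),
        PartitionKey=symbol,
    )

def lambda_handler(event, context):
    # Consumer side: Lambda is triggered by the stream; record data arrives base64-encoded
    for record in event["Records"]:
        tick = json.loads(base64.b64decode(record["kinesis"]["data"]))
        print(f"{tick['symbol']}: {tick['price']}")  # replace with real processing / storage
    return {"processed": len(event["Records"])}
```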
🔥 BREAKING: OpenAI Launches Operator: The Future of AI Automation
OpenAI has introduced Operator, an AI agent that can complete tasks on its own using a web browser. It’s designed to make work easier by handling tasks for you.
Operator is powered by the new Computer-Using Agent (CUA) model. It combines GPT-4o's vision with advanced reasoning, allowing it to see, click, type, and interact with websites just like a person. No special integrations are needed.
⭐️ Want an open-source version of OpenAI's Operator?
There's a great open-source project called Browser Use that does similar things (and more).
It lets you plug in any model you want.
Love to see open source leading the way🚀
https://www.instagram.com/p/DFNKm_JSQUQ/?igsh=eXlodmVwbXdyaTUy
The complete Data Preprocessing video is available on our YouTube channel.
It covers two things:
1- Checking the quality of data
2- Doing data cleaning
Steps for checking the quality of data:
1- Check the data manually
2- Check for incorrect data types
3- Check for spelling errors in the column names
4- Check for spelling errors in the categorical column values
5- Check for negative values in the numerical columns
6- Check for missing values
7- Check for duplicate values
8- Check for outliers in the numerical columns
9- Check for data imbalance in the target column
10- Check for skewness in the numerical columns
11- Check for multicollinearity
12- Check for high cardinality in the categorical columns
13- Encode the categorical columns (a short pandas sketch of a few of these checks follows below)
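As a quick reference, here is a minimal pandas sketch covering a few of these checks, assuming a hypothetical data.csv with a "target" column; the numbers in the comments refer to the steps above.

```python
import pandas as pd

df = pd.read_csv("data.csv")              # hypothetical dataset

print(df.dtypes)                          # 2) incorrect data types
print(df.isna().sum())                    # 6) missing values
print(df.duplicated().sum())              # 7) duplicate rows

num = df.select_dtypes("number")
print((num < 0).sum())                    # 5) negative values in numerical columns
print(num.skew())                         # 10) skewness

# 8) outliers via the IQR rule
q1, q3 = num.quantile(0.25), num.quantile(0.75)
iqr = q3 - q1
print(((num < q1 - 1.5 * iqr) | (num > q3 + 1.5 * iqr)).sum())

print(df.select_dtypes("object").nunique())          # 12) cardinality of categorical columns
# print(df["target"].value_counts(normalize=True))   # 9) class imbalance (if "target" exists)
```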
Do watch it, like, and subscribe to our YouTube channel.
We are aiming for 100 likes on this video. Show your support so that we can keep uploading free content.
https://youtu.be/futAzAg99uA?si=NFx1BmSf-6V7xMtr
YouTube
Complete Data Preprocessing in Python
https://www.instagram.com/reel/DFWqxPsShSn/?igsh=bm1tZzE0dGs1aHpt
Learn how DeepSeek-V3 caused the stock market to crash.
Complete Exploratory Data Analysis in Python.
Do watch it, like, and subscribe to our channel.
Support our content by subscribing; we will upload more free content on data science.
https://youtu.be/CVIBd5x_O9k?si=L6JCi_KaEn-k664c
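If you want a taste before watching, here is a minimal EDA sketch in pandas and matplotlib, assuming a hypothetical data.csv; the video goes much deeper into graphical analysis.

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("data.csv")           # hypothetical dataset

print(df.info())                       # column types and non-null counts
print(df.describe(include="all"))      # summary statistics for every column

# Univariate: distribution of each numeric column
df.select_dtypes("number").hist(bins=30, figsize=(10, 6))
plt.tight_layout()
plt.show()

# Bivariate: correlations between numeric features
print(df.corr(numeric_only=True))
```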
YouTube
Complete Exploratory Data analysis in Python- Part 1
In this video you will learn about Exploratory Data Analysis in Python. Here we will talk about graphical data analysis.
Link to the code- https://github.com/DataSpoof/YouTube_materials
Follow us on Instagram
www.instagram.com/dataspoof
Join our telegram…
How to perform statistical data analysis in Python.
Do watch it, like, and subscribe to our channel.
Support our content by subscribing; we will upload more free content on data science.
https://youtu.be/VJF6qHAl6VQ?si=VTEQvjrDR_Qp4IUy
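As a small preview, here is a minimal descriptive statistics sketch with NumPy and SciPy on a toy sample (the numbers are made up for illustration).

```python
import numpy as np
from scipy import stats

data = np.array([12, 15, 15, 18, 20, 22, 22, 22, 25, 30])  # toy sample

print("mean     :", np.mean(data))
print("median   :", np.median(data))
print("mode     :", stats.mode(data, keepdims=False).mode)  # SciPy >= 1.9
print("std dev  :", np.std(data, ddof=1))
print("variance :", np.var(data, ddof=1))
print("quartiles:", np.percentile(data, [25, 50, 75]))
print("skewness :", stats.skew(data))
print("kurtosis :", stats.kurtosis(data))
```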
YouTube
How to perform Statistical Data analysis in Python (Descriptive Statistics)
In this video you will learn how to perform statistical data analysis in Python, specifically descriptive statistics.
Link to the code- https://github.com/DataSpoof/YouTube_materials
Follow us on Instagram
www.instagram.com/dataspoof
Join…
How to perform inferential statistics in Python.
Do watch it, like, and subscribe to our YouTube channel.
Support our content by subscribing; we will upload more free content on data science.
https://youtu.be/G-lgNshSmr0?si=P3SSG34nZMHZHOhA
YouTube
How to perform Inferential Statistics In Python
In this video you will learn how to perform inferential statistics in Python, covering parametric and non-parametric tests.
Parametric tests: t-test, z-test, F-test, ANOVA
Non-parametric tests: chi-square test, KS test
…
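As a small preview of those tests, here is a minimal SciPy sketch on synthetic data; the groups and the contingency table are made up for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
group_a = rng.normal(loc=50, scale=5, size=40)   # synthetic samples
group_b = rng.normal(loc=53, scale=5, size=40)
group_c = rng.normal(loc=55, scale=5, size=40)

# Parametric: independent two-sample t-test
print("t-test    :", stats.ttest_ind(group_a, group_b))

# Parametric: one-way ANOVA across three groups
print("ANOVA     :", stats.f_oneway(group_a, group_b, group_c))

# Non-parametric: chi-square test on a 2x2 contingency table
chi2, p, dof, _ = stats.chi2_contingency(np.array([[30, 10], [20, 25]]))
print("chi-square:", chi2, p)

# Non-parametric: Kolmogorov-Smirnov test against a fitted normal
print("KS test   :", stats.kstest(group_a, "norm", args=(group_a.mean(), group_a.std())))
```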
DM us on WhatsApp for real-time training.
+9183182 38637
These are the trainings we offer:
1- Data Science Training (5 months)
2- GenAI Training (40 days)
3- MLOps Training (40 days)
4- Data Analyst Training (45 days)
5- Big Data Training (60 days)
GenAI Curriculum (DataSpoof).pdf
264.7 KB
Training Details_data_science.docx
62.1 KB
Applications of 1-bit LLMs
1️⃣ In a remote village, a student can use a mobile device with a 1-bit LLM to get personalized tutoring without internet access.
2️⃣ In a low-resource clinic, healthcare workers use a mobile app with a 1-bit LLM to diagnose common diseases from symptoms or images offline.
3️⃣ Farmers use a 1-bit LLM app to diagnose crop diseases and receive personalized farming advice based on soil type and weather patterns.
4️⃣ In a disaster-prone area, a 1-bit LLM-powered app helps first responders and citizens communicate critical information in multiple languages, offline.
Many data scientists don't know how to push ML models to production. Here's the recipe 👇
𝗞𝗲𝘆 𝗜𝗻𝗴𝗿𝗲𝗱𝗶𝗲𝗻𝘁𝘀
🔹 𝗧𝗿𝗮𝗶𝗻 / 𝗧𝗲𝘀𝘁 𝗗𝗮𝘁𝗮𝘀𝗲𝘁 - Ensure Test is representative of Online data
🔹 𝗙𝗲𝗮𝘁𝘂𝗿𝗲 𝗘𝗻𝗴𝗶𝗻𝗲𝗲𝗿𝗶𝗻𝗴 𝗣𝗶𝗽𝗲𝗹𝗶𝗻𝗲 - Generate features in real-time
🔹 𝗠𝗼𝗱𝗲𝗹 𝗢𝗯𝗷𝗲𝗰𝘁 - Trained scikit-learn or TensorFlow model
🔹 𝗣𝗿𝗼𝗷𝗲𝗰𝘁 𝗖𝗼𝗱𝗲 𝗥𝗲𝗽𝗼 - Save the model project code to GitHub
🔹 𝗔𝗣𝗜 𝗙𝗿𝗮𝗺𝗲𝘄𝗼𝗿𝗸 - Use FastAPI or Flask to build a model API
🔹 𝗗𝗼𝗰𝗸𝗲𝗿 - Containerize the ML model API
🔹 𝗥𝗲𝗺𝗼𝘁𝗲 𝗦𝗲𝗿𝘃𝗲𝗿 - Choose a cloud service, e.g. AWS SageMaker
🔹 𝗨𝗻𝗶𝘁 𝗧𝗲𝘀𝘁𝘀 - Test inputs & outputs of functions and APIs
🔹 𝗠𝗼𝗱𝗲𝗹 𝗠𝗼𝗻𝗶𝘁𝗼𝗿𝗶𝗻𝗴 - Evidently AI, a simple open-source tool for ML monitoring
𝗣𝗿𝗼𝗰𝗲𝗱𝘂𝗿𝗲
𝗦𝘁𝗲𝗽 𝟭 - 𝗗𝗮𝘁𝗮 𝗣𝗿𝗲𝗽𝗮𝗿𝗮𝘁𝗶𝗼𝗻 & 𝗙𝗲𝗮𝘁𝘂𝗿𝗲 𝗘𝗻𝗴𝗶𝗻𝗲𝗲𝗿𝗶𝗻𝗴
Don't push a model just because it hits 90% accuracy on the training set. Judge it on the test set, and only if the test set is representative of the online data. Use a scikit-learn Pipeline to chain a series of preprocessing steps such as null handling.
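A minimal sketch of such a pipeline with scikit-learn, assuming hypothetical column names ("age", "income", "country"):

```python
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

numeric_cols = ["age", "income"]       # hypothetical columns
categorical_cols = ["country"]

preprocess = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric_cols),
    ("cat", Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                      ("encode", OneHotEncoder(handle_unknown="ignore"))]), categorical_cols),
])

model = Pipeline([("preprocess", preprocess),
                  ("clf", LogisticRegression(max_iter=1000))])
# model.fit(X_train, y_train); model.score(X_test, y_test)
```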
𝗦𝘁𝗲𝗽 𝟮 - 𝗠𝗼𝗱𝗲𝗹 𝗗𝗲𝘃𝗲𝗹𝗼𝗽𝗺𝗲𝗻𝘁
Train your model with frameworks like scikit-learn or TensorFlow. Push the model code, including the preprocessing, training, and validation scripts, to GitHub for reproducibility.
𝗦𝘁𝗲𝗽 𝟯 - 𝗔𝗣𝗜 𝗗𝗲𝘃𝗲𝗹𝗼𝗽𝗺𝗲𝗻𝘁 & 𝗖𝗼𝗻𝘁𝗮𝗶𝗻𝗲𝗿𝗶𝘇𝗮𝘁𝗶𝗼𝗻
Your model needs a "/predict" endpoint, which receives a JSON object in the request and returns a JSON object with the model score in the response. You can use frameworks like FastAPI or Flask. Containerize this API so that it is agnostic to the server environment.
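A minimal sketch of such an endpoint with FastAPI, assuming the pipeline above was saved with joblib and that Pydantic v2 is installed; the feature names and file path are hypothetical.

```python
import joblib
import pandas as pd
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
model = joblib.load("model.joblib")    # trained pipeline from Step 2 (hypothetical path)

class Features(BaseModel):
    age: float
    income: float
    country: str

@app.post("/predict")
def predict(features: Features):
    # Build a one-row frame so the ColumnTransformer sees named columns
    row = pd.DataFrame([features.model_dump()])
    return {"score": float(model.predict_proba(row)[0, 1])}
```

Run it locally with `uvicorn main:app` (assuming the file is named main.py); the Dockerfile then only needs to install the requirements and launch that same command.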
𝗦𝘁𝗲𝗽 𝟰 - 𝗧𝗲𝘀𝘁𝗶𝗻𝗴 & 𝗗𝗲𝗽𝗹𝗼𝘆𝗺𝗲𝗻𝘁
Write tests to validate the inputs and outputs of the API functions to prevent errors. Push the code to a remote service like AWS SageMaker.
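A minimal sketch of such tests with pytest and FastAPI's TestClient, assuming the API above lives in a hypothetical main.py:

```python
from fastapi.testclient import TestClient
from main import app   # hypothetical module containing the FastAPI app above

client = TestClient(app)

def test_predict_returns_score():
    payload = {"age": 35, "income": 52000, "country": "IN"}
    response = client.post("/predict", json=payload)
    assert response.status_code == 200
    assert 0.0 <= response.json()["score"] <= 1.0

def test_predict_rejects_bad_input():
    # Missing fields should be rejected by the Pydantic schema with a 422
    response = client.post("/predict", json={"age": 35})
    assert response.status_code == 422
```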
𝗦𝘁𝗲𝗽 𝟱 - 𝗠𝗼𝗻𝗶𝘁𝗼𝗿𝗶𝗻𝗴
Set up monitoring tools like Evidently AI, or use the built-in monitoring within AWS SageMaker. I use such tools to track performance metrics and data drift on online data.